[dpdk-dev,v3,1/7] net/mlx4: remove error flows from Tx fast path
Checks
Commit Message
Move unnecessary error flows to DEBUG mode.
Signed-off-by: Matan Azrad <matan@mellanox.com>
Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
---
drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
Comments
On Mon, Oct 30, 2017 at 10:07:23AM +0000, Matan Azrad wrote:
> Move unnecessary error flows to DEBUG mode.
>
> Signed-off-by: Matan Azrad <matan@mellanox.com>
> Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
I missed a couple of details while reviewing the original version, the first
one being mlx4_post_send()'s return value is still documented as updating
rte_errno in case of error, it's not the case anymore after this patch.
Please see below for the other one:
> ---
> drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
>
> diff --git a/drivers/net/mlx4/mlx4_rxtx.c b/drivers/net/mlx4/mlx4_rxtx.c
<snip>
> /**
> @@ -510,8 +508,6 @@ struct pv {
> assert(max <= elts_n);
> /* Always leave one free entry in the ring. */
> --max;
> - if (max == 0)
> - return 0;
> if (max > pkts_n)
> max = pkts_n;
> for (i = 0; (i != max); ++i) {
While minor, this change has nothing to do with this patch, right?
I think it can slightly degrade an application performance as it removes the
guarantee that subsequent code only needs to be run if there is at least one
packet to process in case the TX ring is constantly full (SW faster than
HW).
Can you remove it?
Hi Adrien
> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Monday, October 30, 2017 4:23 PM
> To: Matan Azrad <matan@mellanox.com>
> Cc: dev@dpdk.org; Ophir Munk <ophirmu@mellanox.com>
> Subject: Re: [PATCH v3 1/7] net/mlx4: remove error flows from Tx fast path
>
> On Mon, Oct 30, 2017 at 10:07:23AM +0000, Matan Azrad wrote:
> > Move unnecessary error flows to DEBUG mode.
> >
> > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
>
> I missed a couple of details while reviewing the original version, the first one
> being mlx4_post_send()'s return value is still documented as updating
> rte_errno in case of error, it's not the case anymore after this patch.
>
Good attention, Will be fixed in next version.
> Please see below for the other one:
>
> > ---
> > drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
> > 1 file changed, 6 insertions(+), 10 deletions(-)
> >
> > diff --git a/drivers/net/mlx4/mlx4_rxtx.c
> > b/drivers/net/mlx4/mlx4_rxtx.c
> <snip>
> > /**
> > @@ -510,8 +508,6 @@ struct pv {
> > assert(max <= elts_n);
> > /* Always leave one free entry in the ring. */
> > --max;
> > - if (max == 0)
> > - return 0;
> > if (max > pkts_n)
> > max = pkts_n;
> > for (i = 0; (i != max); ++i) {
>
> While minor, this change has nothing to do with this patch, right?
>
Yes you right, maybe it can be merged in patch 4/7.
> I think it can slightly degrade an application performance as it removes the
> guarantee that subsequent code only needs to be run if there is at least one
> packet to process in case the TX ring is constantly full (SW faster than HW).
>
In case the TX ring is full, the loop condition should fail in the start and then return with 0 because the packet counter is 0.(more 2 checks)
Since this case are less common (in my opinion) than at least 1 free space in ring, we can prevent this unnecessary check for all these common cases.
Are you sure the 2 extra check important for performance in this empty case? Doesn't the application will call us again?
> Can you remove it?
>
> --
> Adrien Mazarguil
> 6WIND
Hi Matan,
On Mon, Oct 30, 2017 at 06:11:31PM +0000, Matan Azrad wrote:
> Hi Adrien
>
> > -----Original Message-----
> > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > Sent: Monday, October 30, 2017 4:23 PM
> > To: Matan Azrad <matan@mellanox.com>
> > Cc: dev@dpdk.org; Ophir Munk <ophirmu@mellanox.com>
> > Subject: Re: [PATCH v3 1/7] net/mlx4: remove error flows from Tx fast path
> >
> > On Mon, Oct 30, 2017 at 10:07:23AM +0000, Matan Azrad wrote:
> > > Move unnecessary error flows to DEBUG mode.
> > >
> > > Signed-off-by: Matan Azrad <matan@mellanox.com>
> > > Acked-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
> >
> > I missed a couple of details while reviewing the original version, the first one
> > being mlx4_post_send()'s return value is still documented as updating
> > rte_errno in case of error, it's not the case anymore after this patch.
> >
> Good attention, Will be fixed in next version.
>
> > Please see below for the other one:
> >
> > > ---
> > > drivers/net/mlx4/mlx4_rxtx.c | 16 ++++++----------
> > > 1 file changed, 6 insertions(+), 10 deletions(-)
> > >
> > > diff --git a/drivers/net/mlx4/mlx4_rxtx.c
> > > b/drivers/net/mlx4/mlx4_rxtx.c
> > <snip>
> > > /**
> > > @@ -510,8 +508,6 @@ struct pv {
> > > assert(max <= elts_n);
> > > /* Always leave one free entry in the ring. */
> > > --max;
> > > - if (max == 0)
> > > - return 0;
> > > if (max > pkts_n)
> > > max = pkts_n;
> > > for (i = 0; (i != max); ++i) {
> >
> > While minor, this change has nothing to do with this patch, right?
> >
> Yes you right, maybe it can be merged in patch 4/7.
>
> > I think it can slightly degrade an application performance as it removes the
> > guarantee that subsequent code only needs to be run if there is at least one
> > packet to process in case the TX ring is constantly full (SW faster than HW).
> >
>
> In case the TX ring is full, the loop condition should fail in the start and then return with 0 because the packet counter is 0.(more 2 checks)
> Since this case are less common (in my opinion) than at least 1 free space in ring, we can prevent this unnecessary check for all these common cases.
>
> Are you sure the 2 extra check important for performance in this empty case? Doesn't the application will call us again?
No, I don't think they're important to performance, like the changes from
patch 4/7, I'm not certain they actually make any difference. My suggestion
was mainly to leave it alone because of that. It's OK if you want to keep
and move it to 4/7.
@@ -169,6 +169,7 @@ struct pv {
* Make sure we read the CQE after we read the ownership bit.
*/
rte_rmb();
+#ifndef NDEBUG
if (unlikely((cqe->owner_sr_opcode & MLX4_CQE_OPCODE_MASK) ==
MLX4_CQE_OPCODE_ERROR)) {
struct mlx4_err_cqe *cqe_err =
@@ -178,6 +179,7 @@ struct pv {
(void *)txq, cqe_err->vendor_err,
cqe_err->syndrome);
}
+#endif /* NDEBUG */
/* Get WQE index reported in the CQE. */
new_index =
rte_be_to_cpu_16(cqe->wqe_index) & sq->txbb_cnt_mask;
@@ -322,7 +324,6 @@ struct pv {
uint32_t byte_count;
int wqe_real_size;
int nr_txbbs;
- int rc;
struct pv *pv = (struct pv *)txq->bounce_buf;
int pv_counter = 0;
@@ -337,8 +338,7 @@ struct pv {
if (((sq->head - sq->tail) + nr_txbbs +
sq->headroom_txbbs) >= sq->txbb_cnt ||
nr_txbbs > MLX4_MAX_WQE_TXBBS) {
- rc = ENOSPC;
- goto err;
+ return -ENOSPC;
}
/* Get the control and data entries of the WQE. */
ctrl = (struct mlx4_wqe_ctrl_seg *)mlx4_get_send_wqe(sq, head_idx);
@@ -354,6 +354,7 @@ struct pv {
dseg->addr = rte_cpu_to_be_64(addr);
/* Memory region key for this memory pool. */
lkey = mlx4_txq_mp2mr(txq, mlx4_txq_mb2mp(buf));
+#ifndef NDEBUG
if (unlikely(lkey == (uint32_t)-1)) {
/* MR does not exist. */
DEBUG("%p: unable to get MP <-> MR association",
@@ -366,9 +367,9 @@ struct pv {
ctrl->fence_size = (wqe_real_size >> 4) & 0x3f;
mlx4_txq_stamp_freed_wqe(sq, head_idx,
(sq->head & sq->txbb_cnt) ? 0 : 1);
- rc = EFAULT;
- goto err;
+ return -EFAULT;
}
+#endif /* NDEBUG */
dseg->lkey = rte_cpu_to_be_32(lkey);
if (likely(buf->data_len)) {
byte_count = rte_cpu_to_be_32(buf->data_len);
@@ -471,9 +472,6 @@ struct pv {
MLX4_BIT_WQE_OWN : 0));
sq->head += nr_txbbs;
return 0;
-err:
- rte_errno = rc;
- return -rc;
}
/**
@@ -510,8 +508,6 @@ struct pv {
assert(max <= elts_n);
/* Always leave one free entry in the ring. */
--max;
- if (max == 0)
- return 0;
if (max > pkts_n)
max = pkts_n;
for (i = 0; (i != max); ++i) {