[v1] net/memif: remove extra mbuf refcnt update in zero copy Tx

Message ID 20231208023801.3156065-1-liangxing.wang@arm.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series [v1] net/memif: remove extra mbuf refcnt update in zero copy Tx

Checks

Context                       Check    Description
ci/checkpatch                 success  coding style OK
ci/loongarch-unit-testing     success  Unit Testing PASS
ci/github-robot: build        success  github build: passed
ci/loongarch-compilation      success  Compilation OK
ci/iol-intel-Performance      success  Performance Testing PASS
ci/iol-broadcom-Performance   success  Performance Testing PASS
ci/iol-intel-Functional       success  Functional Testing PASS
ci/iol-broadcom-Functional    success  Functional Testing PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/iol-unit-arm64-testing     success  Testing PASS
ci/iol-abi-testing            success  Testing PASS
ci/iol-unit-amd64-testing     success  Testing PASS
ci/iol-compile-amd64-testing  success  Testing PASS
ci/iol-compile-arm64-testing  success  Testing PASS
ci/iol-sample-apps-testing    success  Testing PASS

Commit Message

Liangxing Wang Dec. 8, 2023, 2:38 a.m. UTC
The refcnt update of stored mbufs in the memif driver is redundant, since
those mbufs are only freed in eth_memif_tx_zc(). No other place
can free those stored mbufs quietly. So remove the redundant mbuf
refcnt update in the DPDK memif driver to avoid its extra cost.
Performance of DPDK memif zero-copy Tx is improved with this change.

Signed-off-by: Liangxing Wang <liangxing.wang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 drivers/net/memif/rte_eth_memif.c | 6 ------
 1 file changed, 6 deletions(-)
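
The ownership argument in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_seg`, `toy_queue`, `toy_store()` and `toy_free_stored()` are hypothetical stand-ins, not the real `struct rte_mbuf`, `struct memif_queue` or driver functions. It assumes that the Tx path is the only code that frees stored segments, as the commit message states, so the single reference handed over by the application is enough and no extra increment/decrement pair is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* power of two, so (size - 1) works as a mask */

/* Hypothetical stand-in for one mbuf segment. */
struct toy_seg {
	int refcnt; /* 1 when freshly allocated */
	int freed;  /* set once the segment is released */
};

/* Hypothetical stand-in for the Tx queue bookkeeping. */
struct toy_queue {
	struct toy_seg *buffers[TOY_RING_SIZE]; /* segments handed to the peer */
	uint16_t head;      /* next slot the sender fills */
	uint16_t last_tail; /* first slot not yet reclaimed */
};

/* Sender side: remember the segment so it can be freed later.
 * No refcnt bump: the queue takes over the caller's single reference. */
static void toy_store(struct toy_queue *q, struct toy_seg *s)
{
	q->buffers[q->head & (TOY_RING_SIZE - 1)] = s;
	q->head++;
}

/* Reclaim every slot the peer has consumed, i.e. everything before cur_tail.
 * Returns the number of segments actually released. */
static unsigned int toy_free_stored(struct toy_queue *q, uint16_t cur_tail)
{
	unsigned int released = 0;

	while (q->last_tail != cur_tail) {
		struct toy_seg *s = q->buffers[q->last_tail & (TOY_RING_SIZE - 1)];

		/* Refcounted free: drop one reference, release at zero. */
		if (--s->refcnt == 0) {
			s->freed = 1;
			released++;
		}
		q->last_tail++;
	}
	return released;
}

/* Store three segments, let the peer consume two, reclaim them.
 * Returns 1 if exactly the two consumed segments were released. */
static int toy_ring_demo(void)
{
	struct toy_seg segs[3] = { { 1, 0 }, { 1, 0 }, { 1, 0 } };
	struct toy_queue q = { .head = 0, .last_tail = 0 };
	unsigned int released;
	int i;

	for (i = 0; i < 3; i++)
		toy_store(&q, &segs[i]);

	released = toy_free_stored(&q, 2);

	return released == 2 && segs[0].freed && segs[1].freed &&
	       !segs[2].freed && segs[2].refcnt == 1;
}
```

In this model each stored segment is released exactly once, when the peer's tail moves past it, and the segment that is still in flight keeps its single reference.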
  

Comments

Ferruh Yigit Dec. 8, 2023, 1:44 p.m. UTC | #1
On 12/8/2023 2:38 AM, Liangxing Wang wrote:
> The refcnt update of stored mbufs in the memif driver is redundant, since
> those mbufs are only freed in eth_memif_tx_zc(). No other place
> can free those stored mbufs quietly. So remove the redundant mbuf
> refcnt update in the DPDK memif driver to avoid its extra cost.
> Performance of DPDK memif zero-copy Tx is improved with this change.
> 

As mentioned above, since the free is called only from 'eth_memif_tx_zc()',
this change looks good to me.
Did you measure the performance improvement? If so, can you please share it?



In addition to being an optimization, this may be a required fix.
Can you please check the following case:

- When 'memif_tx_one_zc()' is called, it receives the number of free
slots as a parameter
- If the mbuf is a chained mbuf, only the first mbuf's reference count
is increased
- If the number of segments in the mbuf chain is bigger than the number
of free slots, the function returns 0
- In this error case 'eth_memif_tx_zc()' breaks the sending loop and returns
- In this scenario the application decides whether to free the mbuf or
re-send it. But in this case the application can't free the mbuf
because of the reference count, which may cause a memory leak
- If the application decides to re-send, the reference count is increased
again; I guess eventually 'memif_free_stored_mbufs()' will decrease the
refcount so that the mbuf can be freed

I am assuming the above is not done intentionally to make sure all mbufs
are sent.

This refcount prevents the application from using its discretion to drop
packets, so your change is required to fix this. Can you please double
check whether I am missing anything?
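
The scenario in the list above can be modelled with a small, self-contained sketch. Everything here is hypothetical: `toy_mbuf`, `toy_tx_one()` and `toy_free_chain()` are simplified stand-ins rather than the DPDK API or the memif driver code, and the model assumes that the refcount is bumped when a segment is stored and that the function returns 0 when the rest of the chain does not fit into the free slots. The `bump_refcnt` flag switches between the behaviour before the patch (1) and after it (0).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a chained mbuf segment. */
struct toy_mbuf {
	int refcnt;            /* 1 when freshly allocated */
	int freed;             /* set once the segment is released */
	struct toy_mbuf *next; /* next segment in the chain, or NULL */
};

/* Refcounted free of one segment: drop one reference, release at zero. */
static void toy_free_seg(struct toy_mbuf *m)
{
	if (--m->refcnt == 0)
		m->freed = 1;
}

/* What the application does when it decides to drop an unsent packet. */
static void toy_free_chain(struct toy_mbuf *m)
{
	while (m != NULL) {
		struct toy_mbuf *next = m->next;

		toy_free_seg(m);
		m = next;
	}
}

/* Simplified Tx of one chained packet into n_free slots.
 * Returns the number of slots used, or 0 if the chain does not fit. */
static int toy_tx_one(struct toy_mbuf *m, int n_free, int bump_refcnt)
{
	int used_slots = 1;

	for (;;) {
		if (bump_refcnt)
			m->refcnt++; /* the extra reference under discussion */
		if (m->next == NULL)
			return used_slots;
		if (n_free < 2)
			return 0; /* no room for the next segment */
		m = m->next;
		used_slots++;
		n_free--;
	}
}

/* Two-segment chain, one free slot: Tx fails and the application drops
 * the packet. Returns how many segments were never released. */
static int leaked_segments_after_failed_tx(int bump_refcnt)
{
	struct toy_mbuf seg1 = { .refcnt = 1, .freed = 0, .next = NULL };
	struct toy_mbuf seg0 = { .refcnt = 1, .freed = 0, .next = &seg1 };

	if (toy_tx_one(&seg0, 1, bump_refcnt) != 0)
		return -1; /* not expected in this scenario */

	toy_free_chain(&seg0);

	return (seg0.freed ? 0 : 1) + (seg1.freed ? 0 : 1);
}
```

In this model, `leaked_segments_after_failed_tx(1)` leaves the first segment unreleased because its count only drops from 2 to 1, while `leaked_segments_after_failed_tx(0)` releases both segments.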


> Signed-off-by: Liangxing Wang <liangxing.wang@arm.com>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/memif/rte_eth_memif.c | 6 ------
>  1 file changed, 6 deletions(-)
> 
> diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
> index 7cc8c0da91..962d390b90 100644
> --- a/drivers/net/memif/rte_eth_memif.c
> +++ b/drivers/net/memif/rte_eth_memif.c
> @@ -265,8 +265,6 @@ memif_free_stored_mbufs(struct pmd_process_private *proc_private, struct memif_q
>  	cur_tail = __atomic_load_n(&ring->tail, __ATOMIC_ACQUIRE);
>  	while (mq->last_tail != cur_tail) {
>  		RTE_MBUF_PREFETCH_TO_FREE(mq->buffers[(mq->last_tail + 1) & mask]);
> -		/* Decrement refcnt and free mbuf. (current segment) */
> -		rte_mbuf_refcnt_update(mq->buffers[mq->last_tail & mask], -1);
>  		rte_pktmbuf_free_seg(mq->buffers[mq->last_tail & mask]);
>  		mq->last_tail++;
>  	}
> @@ -825,10 +823,6 @@ memif_tx_one_zc(struct pmd_process_private *proc_private, struct memif_queue *mq
>  next_in_chain:
>  	/* store pointer to mbuf to free it later */
>  	mq->buffers[slot & mask] = mbuf;
> -	/* Increment refcnt to make sure the buffer is not freed before server
> -	 * receives it. (current segment)
> -	 */
> -	rte_mbuf_refcnt_update(mbuf, 1);
>  	/* populate descriptor */
>  	d0 = &ring->desc[slot & mask];
>  	d0->length = rte_pktmbuf_data_len(mbuf);
  
Ferruh Yigit Feb. 8, 2024, 1:06 a.m. UTC | #2

Hi Liangxing,

Let me summarize the two points above:

1. Can you quantify the performance improvement and document it in the
commit log of the next version?

2. In some cases this optimization may be required as a fix. Can you
please make this a fix patch, with a Fixes tag etc., in the next version?
  

Patch

diff --git a/drivers/net/memif/rte_eth_memif.c b/drivers/net/memif/rte_eth_memif.c
index 7cc8c0da91..962d390b90 100644
--- a/drivers/net/memif/rte_eth_memif.c
+++ b/drivers/net/memif/rte_eth_memif.c
@@ -265,8 +265,6 @@  memif_free_stored_mbufs(struct pmd_process_private *proc_private, struct memif_q
 	cur_tail = __atomic_load_n(&ring->tail, __ATOMIC_ACQUIRE);
 	while (mq->last_tail != cur_tail) {
 		RTE_MBUF_PREFETCH_TO_FREE(mq->buffers[(mq->last_tail + 1) & mask]);
-		/* Decrement refcnt and free mbuf. (current segment) */
-		rte_mbuf_refcnt_update(mq->buffers[mq->last_tail & mask], -1);
 		rte_pktmbuf_free_seg(mq->buffers[mq->last_tail & mask]);
 		mq->last_tail++;
 	}
@@ -825,10 +823,6 @@  memif_tx_one_zc(struct pmd_process_private *proc_private, struct memif_queue *mq
 next_in_chain:
 	/* store pointer to mbuf to free it later */
 	mq->buffers[slot & mask] = mbuf;
-	/* Increment refcnt to make sure the buffer is not freed before server
-	 * receives it. (current segment)
-	 */
-	rte_mbuf_refcnt_update(mbuf, 1);
 	/* populate descriptor */
 	d0 = &ring->desc[slot & mask];
 	d0->length = rte_pktmbuf_data_len(mbuf);