[dpdk-dev,2/2] net/virtio: optimize header reset on any layout

Message ID 1484108832-19907-3-git-send-email-yuanhan.liu@linux.intel.com (mailing list archive)
State Accepted, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel compilation success Compilation OK

Commit Message

Yuanhan Liu Jan. 11, 2017, 4:27 a.m. UTC
  When any layout is used, the header is stored in the head room of mbuf.
mbuf is allocated and filled by user, means there is no gurateen the
header is all zero for non TSO case. Therefore, we have to do the reset
by ourself:

    memest(hdr, 0, head_size);

The memset has two impacts on performance:

- memset could not be inlined, which is a bit costly.
- more importantly, it touches the mbuf, which could introduce severe
  cache issues as described by former patch.

Similiary, we could do the same trick: reset just when necessary, when
the corresponding field is already 0, which is likely true for a simple
l2 forward case. It could boost the performance up to 20+% in micro
benchmarking.

Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: stable@dpdk.org
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
---
 drivers/net/virtio/virtio_rxtx.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)
  

Comments

Maxime Coquelin Jan. 11, 2017, 8:01 a.m. UTC | #1
On 01/11/2017 05:27 AM, Yuanhan Liu wrote:
> When any layout is used, the header is stored in the head room of mbuf.
> mbuf is allocated and filled by user, means there is no gurateen the
> header is all zero for non TSO case. Therefore, we have to do the reset
> by ourself:
>
>     memest(hdr, 0, head_size);
>
> The memset has two impacts on performance:
>
> - memset could not be inlined, which is a bit costly.
> - more importantly, it touches the mbuf, which could introduce severe
>   cache issues as described by former patch.
>
> Similiary, we could do the same trick: reset just when necessary, when
> the corresponding field is already 0, which is likely true for a simple
> l2 forward case. It could boost the performance up to 20+% in micro
> benchmarking.
>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: stable@dpdk.org
> Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
> ---
>  drivers/net/virtio/virtio_rxtx.c | 10 ++++++++--
>  1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
> index 8ec2f1a..5ca3a88 100644
> --- a/drivers/net/virtio/virtio_rxtx.c
> +++ b/drivers/net/virtio/virtio_rxtx.c
> @@ -292,8 +292,14 @@
>  		hdr = (struct virtio_net_hdr *)
>  			rte_pktmbuf_prepend(cookie, head_size);
>  		/* if offload disabled, it is not zeroed below, do it now */
> -		if (offload == 0)
> -			memset(hdr, 0, head_size);
> +		if (offload == 0) {
> +			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
> +			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
> +		}
>  	} else if (use_indirect) {
>  		/* setup tx ring slot to point to indirect
>  		 * descriptor list stored in reserved region.
>

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks!
Maxime
  

Patch

diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index 8ec2f1a..5ca3a88 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -292,8 +292,14 @@ 
 		hdr = (struct virtio_net_hdr *)
 			rte_pktmbuf_prepend(cookie, head_size);
 		/* if offload disabled, it is not zeroed below, do it now */
-		if (offload == 0)
-			memset(hdr, 0, head_size);
+		if (offload == 0) {
+			ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->flags, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0);
+			ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0);
+		}
 	} else if (use_indirect) {
 		/* setup tx ring slot to point to indirect
 		 * descriptor list stored in reserved region.