[dpdk-dev] pmd: Add generic support for TCP TSO (Transmit Segmentation Offload)

Message ID 20141020094252.14456.58891.stgit@gklab-18-011.igk.intel.com (mailing list archive)
State Superseded, archived

Commit Message

miroslaw.walukiewicz@intel.com Oct. 20, 2014, 9:42 a.m. UTC
From: Miroslaw Walukiewicz <miroslaw.walukiewicz@intel.com>

Add a new PKT_TX_TCP_SEG flag.
Add new fields to the TX offload fields, indicating the MSS and the L4 header length.

Signed-off-by: Mirek Walukiewicz <miroslaw.walukiewicz@intel.com>
---
 lib/librte_mbuf/rte_mbuf.h |   23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)
  

Comments

Thomas Monjalon Oct. 20, 2014, 11:30 a.m. UTC | #1
Hi Miroslaw,

I'll try to comment on your patch, but I don't know if you'll receive it.
Indeed, you didn't reply to the previous comments.
Please configure your email client to receive these emails.
This is not a write-only list.

2014-10-20 05:42, miroslaw.walukiewicz@intel.com:
> Add new  PKT_TX_TCP_SEG flag
> Add new fields in the tx offload fields indicating MSS and L4 len

You should explain why these additions are needed.

>  	/* fields to support TX offloads */
> -	union {
> -		uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var */
> -		struct {
> -			uint16_t l3_len:9;      /**< L3 (IP) Header Length. */
> -			uint16_t l2_len:7;      /**< L2 (MAC) Header Length. */
> +	/* two bytes - l2/l3 len for compatibility (endian issues)
> +	 * two bytes - reseved for alignment
> +	 * two bytes - l4 len (TCP/UDP) header len
> +	 * two bytes - TCP tso segment size 
> + 	 */
> +	struct {
> +		union {
> +			uint16_t l2_l3_len; /**< combined l2/l3 len */
> +			struct {
> +				uint16_t l3_len:9; /**< L3 (IP) Header */
> +				uint16_t l2_len:7; /**< L2 (MAC) Header */
> +			};
>  		};

Why nest these fields in an anonymous structure?

> +		uint16_t reserved_tx_offload;
> +		uint16_t l4_len;            /**< TCP/UDP header len */
> +		uint16_t tso_segsz;         /**< TCP TSO segment size */
>  	};

What does reserved_tx_offload mean?

Is there an impact on the performance of the current drivers?

How is this patch related to the previous work in progress on TSO?
  
miroslaw.walukiewicz@intel.com Oct. 20, 2014, 12:45 p.m. UTC | #2
Hi Thomas, 

Thanks for your comments. My responses are inline.

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, October 20, 2014 1:30 PM
> To: Walukiewicz, Miroslaw
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] pmd: Add generic support for TCP TSO
> (Transmit Segmentation Offload)
> 
> Hi Miroslaw,
> 
> I'll try to comment your patch, but I don't know if you'll receive it.
> Indeed, you didn't reply to the previous comments.
> Please configure your email client to receive these emails.
> This is not a write-only list.
> 
> 2014-10-20 05:42, miroslaw.walukiewicz@intel.com:
> > Add new  PKT_TX_TCP_SEG flag
> > Add new fields in the tx offload fields indicating MSS and L4 len
> 
> You should explain why these additions are needed.

I will resend the patch with a better description of the new fields.

> 
> >  	/* fields to support TX offloads */
> > -	union {
> > -		uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var
> */
> > -		struct {
> > -			uint16_t l3_len:9;      /**< L3 (IP) Header Length. */
> > -			uint16_t l2_len:7;      /**< L2 (MAC) Header Length.
> */
> > +	/* two bytes - l2/l3 len for compatibility (endian issues)
> > +	 * two bytes - reseved for alignment
> > +	 * two bytes - l4 len (TCP/UDP) header len
> > +	 * two bytes - TCP tso segment size
> > + 	 */
> > +	struct {
> > +		union {
> > +			uint16_t l2_l3_len; /**< combined l2/l3 len */
> > +			struct {
> > +				uint16_t l3_len:9; /**< L3 (IP) Header */
> > +				uint16_t l2_len:7; /**< L2 (MAC) Header */
> > +			};
> >  		};
> 
> Why nesting these fields in an anonymous structure?

I want to keep source compatibility with non-TSO applications that already use these fields, for example for IP checksum computation by the NIC.
By keeping this structure anonymous, I do not require changes in old applications that do not need TSO support.

The second argument is that in the original patch extending rte_mbuf to 128 bytes, Bruce made this structure anonymous, and I follow that approach too.
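
To illustrate the source-compatibility argument: members of anonymous unions and structs (standard in C11, a GNU extension before that) are accessed as if they belonged to the enclosing structure. The sketch below uses a hypothetical `struct demo_mbuf` that mirrors only the layout proposed in this patch, not the real rte_mbuf; the function name and the header lengths are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the TX offload part of rte_mbuf (not the real type). */
struct demo_mbuf {
	struct {
		union {
			uint16_t l2_l3_len;        /* combined l2/l3 lengths */
			struct {
				uint16_t l3_len:9; /* L3 (IP) header length */
				uint16_t l2_len:7; /* L2 (MAC) header length */
			};
		};
		uint16_t reserved_tx_offload;      /* padding up to 8 bytes */
		uint16_t l4_len;                   /* new: TCP/UDP header length */
		uint16_t tso_segsz;                /* new: TSO segment size */
	};
};

/*
 * "Old" application code that only knows about l2_len and l3_len.
 * It compiles unchanged because every wrapper around the fields is anonymous.
 */
static inline void demo_set_ip_cksum_lens(struct demo_mbuf *m)
{
	assert(m);
	m->l2_len = 14; /* Ethernet header, assumed for the example */
	m->l3_len = 20; /* IPv4 header without options, assumed */
}
```

Had the outer structure been given a name, the same code would need to spell out that name in every access to l2_len and l3_len.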
> 
> > +		uint16_t reserved_tx_offload;
> > +		uint16_t l4_len;            /**< TCP/UDP header len */
> > +		uint16_t tso_segsz;         /**< TCP TSO segment size */
> >  	};
> 
> What means reserved_tx_offload?

It is only there for alignment. I want to keep this whole structure 8 bytes long.

Actually, I found an issue in my patch. I think all the TX offload fields should be accessible as a single 64-bit word so that pkt_mbuf_reset and pkt_mbuf_attach operate on them correctly.

Today these macros use only the first 32 bits of the structure and leave l4_len and tso_segsz untouched.

I will also modify my patch in this direction.
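
A minimal sketch of that idea, under stated assumptions: the type `struct demo_tx_offload`, the member name `tx_offload`, and both helper functions are hypothetical and only illustrate how a 64-bit overlay lets reset and attach-style code handle every offload field with one store or one copy.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: all TX offload fields overlaid on one 64-bit word. */
struct demo_tx_offload {
	union {
		uint64_t tx_offload;               /* assumed name: all fields at once */
		struct {
			union {
				uint16_t l2_l3_len;
				struct {
					uint16_t l3_len:9;
					uint16_t l2_len:7;
				};
			};
			uint16_t reserved_tx_offload;
			uint16_t l4_len;
			uint16_t tso_segsz;
		};
	};
};

/* The overlay only works if the fields really fit in 64 bits. */
_Static_assert(sizeof(struct demo_tx_offload) == sizeof(uint64_t),
	       "TX offload fields must fit in one 64-bit word");

/* Reset-style helper: clears l2/l3/l4 lengths and tso_segsz in one store. */
static inline void demo_reset_tx_offload(struct demo_tx_offload *m)
{
	assert(m);
	m->tx_offload = 0;
}

/* Attach-style helper: copies every offload field in one assignment. */
static inline void demo_copy_tx_offload(struct demo_tx_offload *dst,
					const struct demo_tx_offload *src)
{
	assert(dst && src);
	dst->tx_offload = src->tx_offload;
}
```

With this shape, adding another offload field inside the 64 bits would not require touching the reset or copy helpers.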
> 
> Is there an impact on performance of actual drivers ?
> 
I did not observe any significant difference on my machine between the aligned and non-aligned structures.
I agree that alignment is important for small packets, but TSO is usually used for very long TCP segments.

> How this patch is related with previous work in progress about TSO?
> 

Since Bruce's original patch defining the new rte_mbuf structure did not exactly follow the concept proposed by Olivier Matz, I made the closest approximation.

I defined PKT_TX_TCP_SEG, l4_len, and tso_segsz (in place of mss).

The name mss could be misinterpreted; I think tso_segsz describes the meaning of this field much better.

I completely agree that the pseudo-header checksum can be computed outside the driver, and I followed that assumption as well.
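
As an illustration of what computing the pseudo-header checksum outside the driver could look like, here is a minimal sketch for IPv4. The function name and its parameters are hypothetical and not part of any DPDK API. Addresses and the length are taken in host byte order, and the result is the folded, non-complemented one's-complement sum; which length value a given NIC expects for TSO is hardware-specific and is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: one's-complement sum over the IPv4 pseudo-header
 * (source address, destination address, zero byte + protocol, L4 length).
 * All inputs are in host byte order. The sum is folded to 16 bits and
 * NOT complemented, since it is only a seed for the final checksum.
 */
static inline uint16_t demo_ipv4_phdr_cksum(uint32_t src_ip, uint32_t dst_ip,
					    uint8_t proto, uint16_t l4_len)
{
	uint32_t sum = 0;

	sum += (src_ip >> 16) + (src_ip & 0xffff); /* two 16-bit words */
	sum += (dst_ip >> 16) + (dst_ip & 0xffff);
	sum += proto;                              /* zero byte + protocol */
	sum += l4_len;                             /* TCP header + payload */

	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}
```

For 192.168.0.1 (0xC0A80001) to 192.168.0.2 (0xC0A80002), protocol 6 (TCP) and a length of 0, the function returns 0x815A.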

Mirek

> --
> Thomas
  
Thomas Monjalon Oct. 20, 2014, 1:51 p.m. UTC | #3
2014-10-20 12:45, Walukiewicz, Miroslaw:
> > >  	/* fields to support TX offloads */
> > > -	union {
> > > -		uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var
> > */
> > > -		struct {
> > > -			uint16_t l3_len:9;      /**< L3 (IP) Header Length. */
> > > -			uint16_t l2_len:7;      /**< L2 (MAC) Header Length.
> > */
> > > +	/* two bytes - l2/l3 len for compatibility (endian issues)
> > > +	 * two bytes - reseved for alignment
> > > +	 * two bytes - l4 len (TCP/UDP) header len
> > > +	 * two bytes - TCP tso segment size
> > > + 	 */
> > > +	struct {
> > > +		union {
> > > +			uint16_t l2_l3_len; /**< combined l2/l3 len */
> > > +			struct {
> > > +				uint16_t l3_len:9; /**< L3 (IP) Header */
> > > +				uint16_t l2_len:7; /**< L2 (MAC) Header */
> > > +			};
> > >  		};
> > 
> > Why nesting these fields in an anonymous structure?
> 
> I want to keep a source compatibility with non-TSO applications using that
> field for example IP checksum computing by NIC. 
> Keeping this structure anonymous I do not require changes in old
> applications that do not need TSO support.
> 
> The second argument is that in original patch extending the rte_mbuf to 128
> bytes made by Bruce the author made this structure anonymous and I follow
> this assumption too.

Excuse me, maybe I missed something, but I still don't understand why you are
embedding the union into a struct?
  
miroslaw.walukiewicz@intel.com Oct. 20, 2014, 2:03 p.m. UTC | #4
> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Monday, October 20, 2014 3:51 PM
> To: Walukiewicz, Miroslaw
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] pmd: Add generic support for TCP TSO
> (Transmit Segmentation Offload)
> 
> 2014-10-20 12:45, Walukiewicz, Miroslaw:
> > > >  	/* fields to support TX offloads */
> > > > -	union {
> > > > -		uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var
> > > */
> > > > -		struct {
> > > > -			uint16_t l3_len:9;      /**< L3 (IP) Header Length. */
> > > > -			uint16_t l2_len:7;      /**< L2 (MAC) Header Length.
> > > */
> > > > +	/* two bytes - l2/l3 len for compatibility (endian issues)
> > > > +	 * two bytes - reseved for alignment
> > > > +	 * two bytes - l4 len (TCP/UDP) header len
> > > > +	 * two bytes - TCP tso segment size
> > > > + 	 */
> > > > +	struct {
> > > > +		union {
> > > > +			uint16_t l2_l3_len; /**< combined l2/l3 len */
> > > > +			struct {
> > > > +				uint16_t l3_len:9; /**< L3 (IP) Header */
> > > > +				uint16_t l2_len:7; /**< L2 (MAC) Header */
> > > > +			};
> > > >  		};
> > >
> > > Why nesting these fields in an anonymous structure?
> >
> > I want to keep a source compatibility with non-TSO applications using that
> > field for example IP checksum computing by NIC.
> > Keeping this structure anonymous I do not require changes in old
> > applications that do not need TSO support.
> >
> > The second argument is that in original patch extending the rte_mbuf to
> 128
> > bytes made by Bruce the author made this structure anonymous and I
> follow
> > this assumption too.
> 
> Excuse me, maybe I missed something, but I still don't understand why you
> are
> embedding the union into a struct?

You are right. It makes no sense.
Let me send a new version of the patch with a new structure definition and a better description.

> 
> --
> Thomas
  

Patch

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index ddadc21..bcb09b9 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -114,6 +114,9 @@  extern "C" {
 /* Bit 51 - IEEE1588*/
 #define PKT_TX_IEEE1588_TMST (1ULL << 51) /**< TX IEEE1588 packet to timestamp. */
 
+/* Bit 49 - TCP transmit segmentation offload */
+#define PKT_TX_TCP_SEG (1ULL << 49) /**< TX TSO offload */
+
 /* Use final bit of flags to indicate a control mbuf */
 #define CTRL_MBUF_FLAG       (1ULL << 63) /**< Mbuf contains control data */
 
@@ -189,12 +192,22 @@  struct rte_mbuf {
 	struct rte_mbuf *next;    /**< Next segment of scattered packet. */
 
 	/* fields to support TX offloads */
-	union {
-		uint16_t l2_l3_len; /**< combined l2/l3 lengths as single var */
-		struct {
-			uint16_t l3_len:9;      /**< L3 (IP) Header Length. */
-			uint16_t l2_len:7;      /**< L2 (MAC) Header Length. */
+	/* two bytes - l2/l3 len for compatibility (endian issues)
+	 * two bytes - reserved for alignment
+	 * two bytes - l4 len (TCP/UDP) header len
+	 * two bytes - TCP tso segment size
+	 */
+	struct {
+		union {
+			uint16_t l2_l3_len; /**< combined l2/l3 len */
+			struct {
+				uint16_t l3_len:9; /**< L3 (IP) Header */
+				uint16_t l2_len:7; /**< L2 (MAC) Header */
+			};
 		};
+		uint16_t reserved_tx_offload;
+		uint16_t l4_len;            /**< TCP/UDP header len */
+		uint16_t tso_segsz;         /**< TCP TSO segment size */
 	};
 } __rte_cache_aligned;
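
For context, a sketch of how an application might use the flag and fields added by this patch. Everything here is an assumption for illustration: `struct demo_pkt` is a simplified stand-in rather than rte_mbuf, `DEMO_PKT_TX_TCP_SEG` only mirrors the bit value proposed above, and both helpers are hypothetical. The segment count is the TCP payload length divided by tso_segsz, rounded up.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors the bit proposed in the patch; the name is hypothetical. */
#define DEMO_PKT_TX_TCP_SEG (1ULL << 49)

/* Simplified stand-in for the mbuf fields involved in TSO. */
struct demo_pkt {
	uint64_t ol_flags;  /* offload flags */
	uint32_t pkt_len;   /* total packet length, headers included */
	uint16_t l2_len;    /* L2 (MAC) header length */
	uint16_t l3_len;    /* L3 (IP) header length */
	uint16_t l4_len;    /* TCP header length */
	uint16_t tso_segsz; /* payload bytes per emitted segment */
};

/* Mark a packet for TSO and record the header lengths the NIC needs. */
static inline void demo_request_tso(struct demo_pkt *p, uint16_t l2,
				    uint16_t l3, uint16_t l4, uint16_t segsz)
{
	assert(p && segsz > 0);
	p->l2_len = l2;
	p->l3_len = l3;
	p->l4_len = l4;
	p->tso_segsz = segsz;
	p->ol_flags |= DEMO_PKT_TX_TCP_SEG;
}

/* Number of wire segments: ceil(payload / tso_segsz), 0 if nothing to split. */
static inline uint32_t demo_tso_nb_segs(const struct demo_pkt *p)
{
	uint32_t hdr_len = (uint32_t)p->l2_len + p->l3_len + p->l4_len;

	if (p->tso_segsz == 0 || p->pkt_len <= hdr_len)
		return 0;

	uint32_t payload = p->pkt_len - hdr_len;
	return (payload + p->tso_segsz - 1) / p->tso_segsz;
}
```

For example, a packet with 54 bytes of headers (14 + 20 + 20) and 4000 bytes of payload, sent with tso_segsz set to 1460, would leave the NIC as 3 segments.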