[dpdk-dev,v5,3/3] mbuf:replace the inner_l2_len and the inner_l3_len fields

Message ID 1417532767-1309-4-git-send-email-jijiang.liu@intel.com (mailing list archive)
State Accepted, archived
Headers

Commit Message

Jijiang Liu Dec. 2, 2014, 3:06 p.m. UTC
  Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field, and rework csum forward engine and i40e PMD due to  these changes.

Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
---
 app/test-pmd/csumonly.c         |   58 +++++++++++++++++++++------------------
 lib/librte_mbuf/rte_mbuf.h      |    4 +-
 lib/librte_pmd_i40e/i40e_rxtx.c |   38 +++++++++++++------------
 3 files changed, 53 insertions(+), 47 deletions(-)
  

Comments

Olivier Matz Dec. 3, 2014, 11:45 a.m. UTC | #1
Hi Jijiang,

On 12/02/2014 04:06 PM, Jijiang Liu wrote:
> Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field, and rework csum forward engine and i40e PMD due to  these changes.
>
> Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> ---
>   app/test-pmd/csumonly.c         |   58 +++++++++++++++++++++------------------
>   lib/librte_mbuf/rte_mbuf.h      |    4 +-
>   lib/librte_pmd_i40e/i40e_rxtx.c |   38 +++++++++++++------------
>   3 files changed, 53 insertions(+), 47 deletions(-)
>

One more thing about the arguments of testpmd, thay may be changed
a bit to make it clearer. We may also remove the distinction between
udp/tcp/sctp and just have l4 instead.

We don't need to add a tunnel-type argument in the testpmd
command line, because the application is already able to parse
the packet and can set the proper tunnel flag by itself.

This is the current description of the csumonly forward engine:

  * Receive a burst of packets, and for each packet:
  *  - parse packet, and try to recognize a supported packet type (1)
  *  - if it's not a supported packet type, don't touch the packet, else:
  *  - modify the IPs in inner headers and in outer headers if any
  *  - reprocess the checksum of all supported layers. This is done in SW
  *    or HW, depending on testpmd command line configuration
  *  - if TSO is enabled in testpmd command line, also flag the mbuf for
  *    TCP segmentation offload (this implies HW TCP checksum)
  * Then transmit packets on the output port.
  *
  * (1) Supported packets are:
  *   Ether / (vlan) / IP|IP6 / UDP|TCP|SCTP .
  *   Ether / (vlan) / outer IP|IP6 / outer UDP / VxLAN / Ether / IP|IP6
  *            / UDP|TCP|SCTP
  *
  * The testpmd command line for this forward engine sets the flags
  * TESTPMD_TX_OFFLOAD_* in ports[tx_port].tx_ol_flags. They control
  * wether a checksum must be calculated in software or in hardware. The
  * IP, UDP, TCP and SCTP flags always concern the inner layer.  The
  * VxLAN flag concerns the outer IP and UDP layer (if packet is
  * recognized as a vxlan packet).


This would give:

   tx_cksum set (outer-ip|outer-l4|ip|l4) (hw|sw) (port_id)


A/ outer-ip=sw, outer-l4=sw, ip=sw, l4=sw

   Ether / IP / UDP
     IP, UDP checksum reprocessed in sw

   Ether / IP1 / UDP1 / VxLAN / Ether / IP2 / UDP2:
     IP1, UDP1, IP2, UDP2 checksums reprocessed in sw



B/ outer-ip=hw, outer-l4=hw, ip=sw, l4=sw

   Ether / IP / UDP
     IP, UDP checksum reprocessed in sw

   Ether / IP1 / UDP1 / VxLAN / Ether / IP2 / UDP2:
     IP1, UDP1 checksums reprocessed in hw (using l2_len, l3_len)
     IP2, UDP2 checksums reprocessed in sw



C/ outer-ip=hw, outer-l4=hw, ip=hw, l4=hw

   Ether / IP / UDP
     IP, UDP checksum reprocessed in hw (using l2_len, l3_len)

   Ether / IP1 / UDP1 / VxLAN / Ether / IP2 / UDP2:
     IP1 checksums reprocessed in hw (using outer_l2_len, outer_l3_len)
     UDP1 checksum must be 0 in packet (I think it's not supported by hw
        or driver today)
     IP2, UDP2 checksums reprocessed in hw (using l2_len, l3_len)



D/ outer-ip=sw, outer-l4=sw, ip=hw, l4=hw

   Ether / IP / UDP
     IP, UDP checksum reprocessed in hw (using l2_len and l3_len)

   Ether / IP1 / UDP1 / VxLAN / Ether / IP2 / UDP2:
     not supported, we are not able to calculate the outer
     checksum in sw as some inner fields will be filled by the hw


What is your opinion?

Regards,
Olivier
  
Olivier Matz Dec. 5, 2014, 11:12 a.m. UTC | #2
Hi,

On 12/02/2014 04:06 PM, Jijiang Liu wrote:
> Replace the inner_l2_len and the inner_l3_len field with the outer_l2_len and outer_l3_len field, and rework csum forward engine and i40e PMD due to  these changes.
> 
> Signed-off-by: Jijiang Liu <jijiang.liu@intel.com>
> ---
>  app/test-pmd/csumonly.c         |   58 +++++++++++++++++++++------------------
>  lib/librte_mbuf/rte_mbuf.h      |    4 +-
>  lib/librte_pmd_i40e/i40e_rxtx.c |   38 +++++++++++++------------
>  3 files changed, 53 insertions(+), 47 deletions(-)
> 

Acked-by: Olivier Matz <olivier.matz@6wind.com>

One minor comment below:

> @@ -303,8 +305,7 @@ process_outer_cksums(void *outer_l3_hdr, uint16_t outer_ethertype,
>   * TESTPMD_TX_OFFLOAD_* in ports[tx_port].tx_ol_flags. They control
>   * wether a checksum must be calculated in software or in hardware. The
>   * IP, UDP, TCP and SCTP flags always concern the inner layer.  The
> - * VxLAN flag concerns the outer IP and UDP layer (if packet is
> - * recognized as a vxlan packet).
> + * VxLAN flag concerns the outer IP(if packet is recognized as a vxlan packet).
>   */
>  static void
>  pkt_burst_checksum_forward(struct fwd_stream *fs)

Typo (no space before parenthesis)

Regards,
Olivier
  

Patch

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 9094967..33ef377 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -189,11 +189,12 @@  process_inner_cksums(void *l3_hdr, uint16_t ethertype, uint16_t l3_len,
 		} else {
 			if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_IP_CKSUM)
 				ol_flags |= PKT_TX_IP_CKSUM;
-			else
+			else {
 				ipv4_hdr->hdr_checksum =
 					rte_ipv4_cksum(ipv4_hdr);
+				ol_flags |= PKT_TX_IPV4;
+			}
 		}
-		ol_flags |= PKT_TX_IPV4;
 	} else if (ethertype == _htons(ETHER_TYPE_IPv6))
 		ol_flags |= PKT_TX_IPV6;
 	else
@@ -262,22 +263,23 @@  process_outer_cksums(void *outer_l3_hdr, uint16_t outer_ethertype,
 	if (outer_ethertype == _htons(ETHER_TYPE_IPv4)) {
 		ipv4_hdr->hdr_checksum = 0;
 
-		if ((testpmd_ol_flags & TESTPMD_TX_OFFLOAD_VXLAN_CKSUM) == 0)
+		if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_VXLAN_CKSUM)
+			ol_flags |= PKT_TX_OUTER_IP_CKSUM;
+		else
 			ipv4_hdr->hdr_checksum = rte_ipv4_cksum(ipv4_hdr);
-	}
+	} else if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_VXLAN_CKSUM)
+		ol_flags |= PKT_TX_OUTER_IPV6;
 
 	udp_hdr = (struct udp_hdr *)((char *)outer_l3_hdr + outer_l3_len);
 	/* do not recalculate udp cksum if it was 0 */
 	if (udp_hdr->dgram_cksum != 0) {
 		udp_hdr->dgram_cksum = 0;
-		if ((testpmd_ol_flags & TESTPMD_TX_OFFLOAD_VXLAN_CKSUM) == 0) {
-			if (outer_ethertype == _htons(ETHER_TYPE_IPv4))
-				udp_hdr->dgram_cksum =
-					rte_ipv4_udptcp_cksum(ipv4_hdr, udp_hdr);
-			else
-				udp_hdr->dgram_cksum =
-					rte_ipv6_udptcp_cksum(ipv6_hdr, udp_hdr);
-		}
+		if (outer_ethertype == _htons(ETHER_TYPE_IPv4))
+			udp_hdr->dgram_cksum =
+				rte_ipv4_udptcp_cksum(ipv4_hdr, udp_hdr);
+		else
+			udp_hdr->dgram_cksum =
+				rte_ipv6_udptcp_cksum(ipv6_hdr, udp_hdr);
 	}
 
 	return ol_flags;
@@ -303,8 +305,7 @@  process_outer_cksums(void *outer_l3_hdr, uint16_t outer_ethertype,
  * TESTPMD_TX_OFFLOAD_* in ports[tx_port].tx_ol_flags. They control
  * wether a checksum must be calculated in software or in hardware. The
  * IP, UDP, TCP and SCTP flags always concern the inner layer.  The
- * VxLAN flag concerns the outer IP and UDP layer (if packet is
- * recognized as a vxlan packet).
+ * VxLAN flag concerns the outer IP(if packet is recognized as a vxlan packet).
  */
 static void
 pkt_burst_checksum_forward(struct fwd_stream *fs)
@@ -320,7 +321,7 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint16_t i;
 	uint64_t ol_flags;
 	uint16_t testpmd_ol_flags;
-	uint8_t l4_proto;
+	uint8_t l4_proto, l4_tun_len = 0;
 	uint16_t ethertype = 0, outer_ethertype = 0;
 	uint16_t l2_len = 0, l3_len = 0, l4_len = 0;
 	uint16_t outer_l2_len = 0, outer_l3_len = 0;
@@ -360,6 +361,7 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 
 		ol_flags = 0;
 		tunnel = 0;
+		l4_tun_len = 0;
 		m = pkts_burst[i];
 
 		/* Update the L3/L4 checksum error packet statistics */
@@ -378,14 +380,16 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 		if (l4_proto == IPPROTO_UDP) {
 			udp_hdr = (struct udp_hdr *)((char *)l3_hdr + l3_len);
 
+			/* check udp destination port, 4789 is the default
+			 * vxlan port (rfc7348) */
+			if (udp_hdr->dst_port == _htons(4789)) {
+				l4_tun_len = ETHER_VXLAN_HLEN;
+				tunnel = 1;
+
 			/* currently, this flag is set by i40e only if the
 			 * packet is vxlan */
-			if (((m->ol_flags & PKT_RX_TUNNEL_IPV4_HDR) ||
-					(m->ol_flags & PKT_RX_TUNNEL_IPV6_HDR)))
-				tunnel = 1;
-			/* else check udp destination port, 4789 is the default
-			 * vxlan port (rfc7348) */
-			else if (udp_hdr->dst_port == _htons(4789))
+			} else if (m->ol_flags & (PKT_RX_TUNNEL_IPV4_HDR |
+					PKT_RX_TUNNEL_IPV6_HDR))
 				tunnel = 1;
 
 			if (tunnel == 1) {
@@ -432,10 +436,10 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 
 		if (tunnel == 1) {
 			if (testpmd_ol_flags & TESTPMD_TX_OFFLOAD_VXLAN_CKSUM) {
-				m->l2_len = outer_l2_len;
-				m->l3_len = outer_l3_len;
-				m->inner_l2_len = l2_len;
-				m->inner_l3_len = l3_len;
+				m->outer_l2_len = outer_l2_len;
+				m->outer_l3_len = outer_l3_len;
+				m->l2_len = l4_tun_len + l2_len;
+				m->l3_len = l3_len;
 			}
 			else {
 				/* if we don't do vxlan cksum in hw,
@@ -503,8 +507,8 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 					m->l2_len, m->l3_len, m->l4_len);
 			if ((tunnel == 1) &&
 				(testpmd_ol_flags & TESTPMD_TX_OFFLOAD_VXLAN_CKSUM))
-				printf("tx: m->inner_l2_len=%d m->inner_l3_len=%d\n",
-					m->inner_l2_len, m->inner_l3_len);
+				printf("tx: m->outer_l2_len=%d m->outer_l3_len=%d\n",
+					m->outer_l2_len, m->outer_l3_len);
 			if (tso_segsz != 0)
 				printf("tx: m->tso_segsz=%d\n", m->tso_segsz);
 			printf("tx: flags=");
diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 6eb898f..0404261 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -276,8 +276,8 @@  struct rte_mbuf {
 			uint64_t tso_segsz:16; /**< TCP TSO segment size */
 
 			/* fields for TX offloading of tunnels */
-			uint64_t inner_l3_len:9; /**< inner L3 (IP) Hdr Length. */
-			uint64_t inner_l2_len:7; /**< inner L2 (MAC) Hdr Length. */
+			uint64_t outer_l3_len:9; /**< Outer L3 (IP) Hdr Length. */
+			uint64_t outer_l2_len:7; /**< Outer L2 (MAC) Hdr Length. */
 
 			/* uint64_t unused:8; */
 		};
diff --git a/lib/librte_pmd_i40e/i40e_rxtx.c b/lib/librte_pmd_i40e/i40e_rxtx.c
index 078e973..9f0d1eb 100644
--- a/lib/librte_pmd_i40e/i40e_rxtx.c
+++ b/lib/librte_pmd_i40e/i40e_rxtx.c
@@ -462,41 +462,43 @@  i40e_txd_enable_checksum(uint64_t ol_flags,
 			uint32_t *td_offset,
 			uint8_t l2_len,
 			uint16_t l3_len,
-			uint8_t inner_l2_len,
-			uint16_t inner_l3_len,
+			uint8_t outer_l2_len,
+			uint16_t outer_l3_len,
 			uint32_t *cd_tunneling)
 {
 	if (!l2_len) {
 		PMD_DRV_LOG(DEBUG, "L2 length set to 0");
 		return;
 	}
-	*td_offset |= (l2_len >> 1) << I40E_TX_DESC_LENGTH_MACLEN_SHIFT;
 
 	if (!l3_len) {
 		PMD_DRV_LOG(DEBUG, "L3 length set to 0");
 		return;
 	}
 
-	/* VXLAN packet TX checksum offload */
+	/* UDP tunneling packet TX checksum offload */
 	if (unlikely(ol_flags & PKT_TX_UDP_TUNNEL_PKT)) {
-		uint8_t l4tun_len;
 
-		l4tun_len = ETHER_VXLAN_HLEN + inner_l2_len;
+		*td_offset |= (outer_l2_len >> 1)
+				<< I40E_TX_DESC_LENGTH_MACLEN_SHIFT;
 
-		if (ol_flags & PKT_TX_IPV4_CSUM)
+		if (ol_flags & PKT_TX_OUTER_IP_CKSUM)
 			*cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV4;
-		else if (ol_flags & PKT_TX_IPV6)
+		else if (ol_flags & PKT_TX_OUTER_IPV4)
+			*cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV4_NO_CSUM;
+		else if (ol_flags & PKT_TX_OUTER_IPV6)
 			*cd_tunneling |= I40E_TX_CTX_EXT_IP_IPV6;
 
 		/* Now set the ctx descriptor fields */
-		*cd_tunneling |= (l3_len >> 2) <<
+		*cd_tunneling |= (outer_l3_len >> 2) <<
 				I40E_TXD_CTX_QW0_EXT_IPLEN_SHIFT |
 				I40E_TXD_CTX_UDP_TUNNELING |
-				(l4tun_len >> 1) <<
+				(l2_len >> 1) <<
 				I40E_TXD_CTX_QW0_NATLEN_SHIFT;
 
-		l3_len = inner_l3_len;
-	}
+	} else
+		*td_offset |= (l2_len >> 1)
+			<< I40E_TX_DESC_LENGTH_MACLEN_SHIFT;
 
 	/* Enable L3 checksum offloads */
 	if (ol_flags & PKT_TX_IPV4_CSUM) {
@@ -1190,8 +1192,8 @@  i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 	uint64_t ol_flags;
 	uint8_t l2_len;
 	uint16_t l3_len;
-	uint8_t inner_l2_len;
-	uint16_t inner_l3_len;
+	uint8_t outer_l2_len;
+	uint16_t outer_l3_len;
 	uint16_t nb_used;
 	uint16_t nb_ctx;
 	uint16_t tx_last;
@@ -1219,9 +1221,9 @@  i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 
 		ol_flags = tx_pkt->ol_flags;
 		l2_len = tx_pkt->l2_len;
-		inner_l2_len = tx_pkt->inner_l2_len;
 		l3_len = tx_pkt->l3_len;
-		inner_l3_len = tx_pkt->inner_l3_len;
+		outer_l2_len = tx_pkt->outer_l2_len;
+		outer_l3_len = tx_pkt->outer_l3_len;
 
 		/* Calculate the number of context descriptors needed. */
 		nb_ctx = i40e_calc_context_desc(ol_flags);
@@ -1271,8 +1273,8 @@  i40e_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 		/* Enable checksum offloading */
 		cd_tunneling_params = 0;
 		i40e_txd_enable_checksum(ol_flags, &td_cmd, &td_offset,
-						l2_len, l3_len, inner_l2_len,
-						inner_l3_len,
+						l2_len, l3_len, outer_l2_len,
+						outer_l3_len,
 						&cd_tunneling_params);
 
 		if (unlikely(nb_ctx)) {