[dpdk-dev,v3,03/10] net/enic: heed the requested max Rx packet size

Message ID 20180308024702.25974-4-johndale@cisco.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

John Daley (johndale) March 8, 2018, 2:46 a.m. UTC
  From: Hyong Youb Kim <hyonkim@cisco.com>

Currently, enic completely ignores the requested max Rx packet size
(rxmode.max_rx_pkt_len). The desired behavior is that the NIC hardware
drops packets larger than the requested size, even though they are
still smaller than MTU.

Cisco VIC does not have such a feature. But, we can accomplish a
similar (not same) effect by reducing the size of posted receive
buffers. Packets larger than the posted size get truncated, and the
receive handler drops them. This is also how the kernel enic driver
enforces the Rx side MTU.

This workaround works only when scatter mode is *not* used. When
scatter is used, there is currently no way to support
rxmode.max_rx_pkt_len, as the NIC always receives packets up to MTU.

For posterity, add a copious amount of comments regarding the
hardware's drop/receive behavior with respect to max/current MTU.

Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
Reviewed-by: John Daley <johndale@cisco.com>
---
 doc/guides/nics/enic.rst       |  1 +
 drivers/net/enic/enic.h        |  7 ++++++
 drivers/net/enic/enic_ethdev.c |  9 +++++++-
 drivers/net/enic/enic_main.c   | 49 ++++++++++++++++++++++++++++++++++++------
 4 files changed, 58 insertions(+), 8 deletions(-)
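The buffer-sizing workaround described in the commit message can be sketched in a few lines. This is a simplified illustration only; the function name and parameters below are hypothetical, not the driver's actual API:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical sketch of the workaround: with scatter off, post receive
 * buffers no larger than the requested max Rx packet length. The NIC then
 * truncates anything bigger, and the Rx handler can detect the truncation
 * and drop the packet, approximating a hardware size limit.
 */
static uint16_t
posted_rx_buf_len(uint16_t mbuf_data_room, uint32_t max_rx_pkt_len,
		  int scatter_enabled)
{
	if (!scatter_enabled && max_rx_pkt_len < mbuf_data_room)
		return (uint16_t)max_rx_pkt_len; /* oversized frames truncate */
	return mbuf_data_room; /* scatter: NIC receives up to MTU anyway */
}
```

With scatter enabled the full mbuf data room is always posted, which is why the commit message notes the workaround only applies when scatter mode is off.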
  

Comments

Ferruh Yigit March 9, 2018, 3:03 p.m. UTC | #1
On 3/8/2018 2:46 AM, John Daley wrote:
> From: Hyong Youb Kim <hyonkim@cisco.com>
> 
> Currently, enic completely ignores the requested max Rx packet size
> (rxmode.max_rx_pkt_len). The desired behavior is that the NIC hardware
> drops packets larger than the requested size, even though they are
> still smaller than MTU.

Your description looks reasonable, but is there a reason for two levels of
limits, max_rx_pkt_len and MTU? Why not just use MTU? There is already a mail
thread to clarify max_rx_pkt_len [1].

Is this work based on an application that uses max_rx_pkt_len, to make the PMD
compatible with that application? If so we can continue with the patch, but if
the patch is meant to implement DPDK properly, I suggest postponing it until
max_rx_pkt_len is clarified.

[1]
https://dpdk.org/ml/archives/dev/2018-March/092178.html

> 
> Cisco VIC does not have such a feature. But, we can accomplish a
> similar (not same) effect by reducing the size of posted receive
> buffers. Packets larger than the posted size get truncated, and the
> receive handler drops them. This is also how the kernel enic driver
> enforces the Rx side MTU.
> 
> This workaround works only when scatter mode is *not* used. When
> scatter is used, there is currently no way to support
> rxmode.max_rx_pkt_len, as the NIC always receives packets up to MTU.
> 
> For posterity, add a copious amount of comments regarding the
> hardware's drop/receive behavior with respect to max/current MTU.
> 
> Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
> Reviewed-by: John Daley <johndale@cisco.com>

<...>
  
Ananyev, Konstantin March 9, 2018, 3:51 p.m. UTC | #2
Hi everyone,

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> Sent: Friday, March 9, 2018 3:04 PM
> To: John Daley <johndale@cisco.com>
> Cc: dev@dpdk.org; Hyong Youb Kim <hyonkim@cisco.com>
> Subject: Re: [dpdk-dev] [PATCH v3 03/10] net/enic: heed the requested max Rx packet size
>
> On 3/8/2018 2:46 AM, John Daley wrote:
> > From: Hyong Youb Kim <hyonkim@cisco.com>
> >
> > Currently, enic completely ignores the requested max Rx packet size
> > (rxmode.max_rx_pkt_len). The desired behavior is that the NIC hardware
> > drops packets larger than the requested size, even though they are
> > still smaller than MTU.
>
> Your description looks reasonable but is there reason of two levels of limits,
> max_rx_pkt_len and MTU, why not just use MTU. There is already a mail thread to
> clarify max_rx_pkt_len [1].
>
> Is this work based on an application that uses max_rx_pkt_len and to make PMD
> compatible with that application? If so we can continue with patch, but if the
> patch is to implement DPDK properly I suggest postponing this until
> max_rx_pkt_len clarified.
>
> [1]
> https://dpdk.org/ml/archives/dev/2018-March/092178.html

I think there are quite a lot of apps these days that might rely on setting the
MTU via rxmode.max_rx_pkt_len.
I think we need to support them until we (ever) deprecate rxmode.max_rx_pkt_len.
Konstantin

>
> >
> > Cisco VIC does not have such a feature. But, we can accomplish a
> > similar (not same) effect by reducing the size of posted receive
> > buffers. Packets larger than the posted size get truncated, and the
> > receive handler drops them. This is also how the kernel enic driver
> > enforces the Rx side MTU.
> >
> > This workaround works only when scatter mode is *not* used. When
> > scatter is used, there is currently no way to support
> > rxmode.max_rx_pkt_len, as the NIC always receives packets up to MTU.
> >
> > For posterity, add a copious amount of comments regarding the
> > hardware's drop/receive behavior with respect to max/current MTU.
> >
> > Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
> > Reviewed-by: John Daley <johndale@cisco.com>
>
> <...>
  
Ferruh Yigit March 9, 2018, 3:57 p.m. UTC | #3
On 3/9/2018 3:51 PM, Ananyev, Konstantin wrote:
> 
> Hi everyone,
> 
>> -----Original Message-----
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>> Sent: Friday, March 9, 2018 3:04 PM
>> To: John Daley <johndale@cisco.com>
>> Cc: dev@dpdk.org; Hyong Youb Kim <hyonkim@cisco.com>
>> Subject: Re: [dpdk-dev] [PATCH v3 03/10] net/enic: heed the requested max Rx packet size
>>
>> On 3/8/2018 2:46 AM, John Daley wrote:
>>> From: Hyong Youb Kim <hyonkim@cisco.com>
>>>
>>> Currently, enic completely ignores the requested max Rx packet size
>>> (rxmode.max_rx_pkt_len). The desired behavior is that the NIC hardware
>>> drops packets larger than the requested size, even though they are
>>> still smaller than MTU.
>>
>> Your description looks reasonable but is there reason of two levels of limits,
>> max_rx_pkt_len and MTU, why not just use MTU. There is already a mail thread to
>> clarify max_rx_pkt_len [1].
>>
>> Is this work based on an application that uses max_rx_pkt_len and to make PMD
>> compatible with that application? If so we can continue with patch, but if the
>> patch is to implement DPDK properly I suggest postponing this until
>> max_rx_pkt_len clarified.
>>
>> [1]
>> https://dpdk.org/ml/archives/dev/2018-March/092178.html
> 
> I think there are quite a lot apps these days that might rely on setting MTU via
> rxmode.max_rx_pkt_len.
> I think we need to support them till we (ever) deprecate rxmode.max_rx_pkt_len.

Right. I was trying to save effort in case something changes related to
max_rx_pkt_len, but since it is not clear yet, we will continue with this patch.

> Konstantin
> 
>>
>>>
>>> Cisco VIC does not have such a feature. But, we can accomplish a
>>> similar (not same) effect by reducing the size of posted receive
>>> buffers. Packets larger than the posted size get truncated, and the
>>> receive handler drops them. This is also how the kernel enic driver
>>> enforces the Rx side MTU.
>>>
>>> This workaround works only when scatter mode is *not* used. When
>>> scatter is used, there is currently no way to support
>>> rxmode.max_rx_pkt_len, as the NIC always receives packets up to MTU.
>>>
>>> For posterity, add a copious amount of comments regarding the
>>> hardware's drop/receive behavior with respect to max/current MTU.
>>>
>>> Signed-off-by: Hyong Youb Kim <hyonkim@cisco.com>
>>> Reviewed-by: John Daley <johndale@cisco.com>
>>
>> <...>
>
  
Hyong Youb Kim (hyonkim) March 10, 2018, 12:37 a.m. UTC | #4
On Fri, Mar 09, 2018 at 03:57:46PM +0000, Ferruh Yigit wrote:
> On 3/9/2018 3:51 PM, Ananyev, Konstantin wrote:
[...]
> >> Is this work based on an application that uses max_rx_pkt_len and to make PMD
> >> compatible with that application? If so we can continue with patch, but if the
> >> patch is to implement DPDK properly I suggest postponing this until
> >> max_rx_pkt_len clarified.
> >>
> >> [1]
> >> https://dpdk.org/ml/archives/dev/2018-March/092178.html
> > 
> > I think there are quite a lot apps these days that might rely on setting MTU via
> > rxmode.max_rx_pkt_len.
> > I think we need to support them till we (ever) deprecate rxmode.max_rx_pkt_len.
> 
> Right. I was trying to save effort in case something changes related
> max_rx_pkt_len, but since it is not clear yet, will continue with this patch.
> 
> > Konstantin
> > 

testpmd has a command to change max_rx_pkt_len, a few DTS test cases
rely on this feature to see if packets of certain sizes get dropped,
and so on. We worked on this patch to support these cases.

I prefer using only MTU, following the convention of most (all?)
OSes. This feature (max_rx_pkt_len) seems to come straight from an
Intel 82599 feature; see "8.2.3.22.13 Max Frame Size -- MAXFRS" in its
datasheet. I'd like to understand the use cases for that, if anyone
can share.

-Hyong
  
Ferruh Yigit March 14, 2018, 12:56 a.m. UTC | #5
On 3/10/2018 12:37 AM, Hyong Youb Kim wrote:
> On Fri, Mar 09, 2018 at 03:57:46PM +0000, Ferruh Yigit wrote:
>> On 3/9/2018 3:51 PM, Ananyev, Konstantin wrote:
> [...]
>>>> Is this work based on an application that uses max_rx_pkt_len and to make PMD
>>>> compatible with that application? If so we can continue with patch, but if the
>>>> patch is to implement DPDK properly I suggest postponing this until
>>>> max_rx_pkt_len clarified.
>>>>
>>>> [1]
>>>> https://dpdk.org/ml/archives/dev/2018-March/092178.html
>>>
>>> I think there are quite a lot apps these days that might rely on setting MTU via
>>> rxmode.max_rx_pkt_len.
>>> I think we need to support them till we (ever) deprecate rxmode.max_rx_pkt_len.
>>
>> Right. I was trying to save effort in case something changes related
>> max_rx_pkt_len, but since it is not clear yet, will continue with this patch.
>>
>>> Konstantin
>>>
> 
> testpmd has a command to change max_rx_pkt_len, a few DTS test cases
> rely on this feature to see if packets of certain sizes get dropped,
> and so on. We worked on this patch to support these cases.
> 
> I prefer using only MTU, to follow the convention of most (all?)
> OSes. Though, this feature (max_rx_pkt_len) seems to come straight
> from an Intel 82599 feature. In its datasheet, see "8.2.3.22.13 Max
> Frame Size -- MAXFRS". Like to understand use cases for that, if
> anyone can share.

The ixgbe driver updates the MAXFRS register in ixgbe_dev_mtu_set(), so it
seems MTU can replace max_rx_pkt_len.

MAXFRS is for rx only, from datasheet description of it:
"This value has no effect on transmit frames; it is the responsibility of
software to limit the size of transmit frames"

This may be the reason a new variable was created for rx_frames, to
differentiate it from the MTU. I am not sure whether the max Rx and Tx sizes
can be different values for ixgbe.

> 
> -Hyong
>
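For reference, the ethdev/MTU accounting discussed in this thread is simple arithmetic: the ethdev max packet length counts the 14-byte Ethernet header and 4-byte CRC, which the NIC MTU does not. A standalone sketch mirroring the patch's enic_mtu_to_max_rx_pktlen() helper:

```c
#include <assert.h>
#include <stdint.h>

#define ETHER_HDR_LEN 14 /* dst MAC + src MAC + EtherType */
#define ETHER_CRC_LEN 4  /* frame check sequence */

/* ethdev max packet size includes the Ethernet header and CRC,
 * whereas the NIC MTU covers only the L2 payload. */
static uint32_t mtu_to_max_rx_pktlen(uint32_t mtu)
{
	return mtu + ETHER_HDR_LEN + ETHER_CRC_LEN;
}
```

For the standard 1500-byte MTU this yields the familiar 1518-byte maximum frame size, which is also why the old driver code hard-coded "+ ETHER_HDR_LEN + 4".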
  

Patch

diff --git a/doc/guides/nics/enic.rst b/doc/guides/nics/enic.rst
index 4dffce1a6..0e655e9e3 100644
--- a/doc/guides/nics/enic.rst
+++ b/doc/guides/nics/enic.rst
@@ -371,6 +371,7 @@  Known bugs and unsupported features in this release
 - Setting of extended VLAN
 - UDP RSS hashing
 - MTU update only works if Scattered Rx mode is disabled
+- Maximum receive packet length is ignored if Scattered Rx mode is used
 
 Prerequisites
 -------------
diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index d29939c94..1b3813a58 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -162,6 +162,13 @@  struct enic {
 	union vnic_rss_cpu rss_cpu;
 };
 
+/* Compute ethdev's max packet size from MTU */
+static inline uint32_t enic_mtu_to_max_rx_pktlen(uint32_t mtu)
+{
+	/* ethdev max size includes eth and crc whereas NIC MTU does not */
+	return mtu + ETHER_HDR_LEN + ETHER_CRC_LEN;
+}
+
 /* Get the CQ index from a Start of Packet(SOP) RQ index */
 static inline unsigned int enic_sop_rq_idx_to_cq_idx(unsigned int sop_idx)
 {
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index cbab7029b..bdbaf4cdf 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -470,7 +470,14 @@  static void enicpmd_dev_info_get(struct rte_eth_dev *eth_dev,
 	device_info->max_rx_queues = enic->conf_rq_count / 2;
 	device_info->max_tx_queues = enic->conf_wq_count;
 	device_info->min_rx_bufsize = ENIC_MIN_MTU;
-	device_info->max_rx_pktlen = enic->max_mtu + ETHER_HDR_LEN + 4;
+	/* "Max" mtu is not a typo. HW receives packet sizes up to the
+	 * max mtu regardless of the current mtu (vNIC's mtu). vNIC mtu is
+	 * a hint to the driver to size receive buffers accordingly so that
+	 * larger-than-vnic-mtu packets get truncated. For DPDK, we let
+	 * the user decide the buffer size via rxmode.max_rx_pkt_len, basically
+	 * ignoring vNIC mtu.
+	 */
+	device_info->max_rx_pktlen = enic_mtu_to_max_rx_pktlen(enic->max_mtu);
 	device_info->max_mac_addrs = ENIC_MAX_MAC_ADDR;
 	device_info->rx_offload_capa =
 		DEV_RX_OFFLOAD_VLAN_STRIP |
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index f00e816a1..d4f478b5e 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -266,6 +266,8 @@  enic_alloc_rx_queue_mbufs(struct enic *enic, struct vnic_rq *rq)
 	struct rq_enet_desc *rqd = rq->ring.descs;
 	unsigned i;
 	dma_addr_t dma_addr;
+	uint32_t max_rx_pkt_len;
+	uint16_t rq_buf_len;
 
 	if (!rq->in_use)
 		return 0;
@@ -273,6 +275,18 @@  enic_alloc_rx_queue_mbufs(struct enic *enic, struct vnic_rq *rq)
 	dev_debug(enic, "queue %u, allocating %u rx queue mbufs\n", rq->index,
 		  rq->ring.desc_count);
 
+	/*
+	 * If *not* using scatter and the mbuf size is smaller than the
+	 * requested max packet size (max_rx_pkt_len), then reduce the
+	 * posted buffer size to max_rx_pkt_len. HW still receives packets
+	 * larger than max_rx_pkt_len, but they will be truncated, which we
+	 * drop in the rx handler. Not ideal, but better than returning
+	 * large packets when the user is not expecting them.
+	 */
+	max_rx_pkt_len = enic->rte_dev->data->dev_conf.rxmode.max_rx_pkt_len;
+	rq_buf_len = rte_pktmbuf_data_room_size(rq->mp) - RTE_PKTMBUF_HEADROOM;
+	if (max_rx_pkt_len < rq_buf_len && !rq->data_queue_enable)
+		rq_buf_len = max_rx_pkt_len;
 	for (i = 0; i < rq->ring.desc_count; i++, rqd++) {
 		mb = rte_mbuf_raw_alloc(rq->mp);
 		if (mb == NULL) {
@@ -287,7 +301,7 @@  enic_alloc_rx_queue_mbufs(struct enic *enic, struct vnic_rq *rq)
 		rq_enet_desc_enc(rqd, dma_addr,
 				(rq->is_sop ? RQ_ENET_TYPE_ONLY_SOP
 				: RQ_ENET_TYPE_NOT_SOP),
-				mb->buf_len - RTE_PKTMBUF_HEADROOM);
+				rq_buf_len);
 		rq->mbuf_ring[i] = mb;
 	}
 
@@ -581,7 +595,7 @@  int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,
 	unsigned int mbuf_size, mbufs_per_pkt;
 	unsigned int nb_sop_desc, nb_data_desc;
 	uint16_t min_sop, max_sop, min_data, max_data;
-	uint16_t mtu = enic->rte_dev->data->mtu;
+	uint32_t max_rx_pkt_len;
 
 	rq_sop->is_sop = 1;
 	rq_sop->data_queue_idx = data_queue_idx;
@@ -599,22 +613,42 @@  int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,
 
 	mbuf_size = (uint16_t)(rte_pktmbuf_data_room_size(mp) -
 			       RTE_PKTMBUF_HEADROOM);
+	/* max_rx_pkt_len includes the ethernet header and CRC. */
+	max_rx_pkt_len = enic->rte_dev->data->dev_conf.rxmode.max_rx_pkt_len;
 
 	if (enic->rte_dev->data->dev_conf.rxmode.offloads &
 	    DEV_RX_OFFLOAD_SCATTER) {
 		dev_info(enic, "Rq %u Scatter rx mode enabled\n", queue_idx);
-		/* ceil((mtu + ETHER_HDR_LEN + 4)/mbuf_size) */
-		mbufs_per_pkt = ((mtu + ETHER_HDR_LEN + 4) +
-				 (mbuf_size - 1)) / mbuf_size;
+		/* ceil((max pkt len)/mbuf_size) */
+		mbufs_per_pkt = (max_rx_pkt_len + mbuf_size - 1) / mbuf_size;
 	} else {
 		dev_info(enic, "Scatter rx mode disabled\n");
 		mbufs_per_pkt = 1;
+		if (max_rx_pkt_len > mbuf_size) {
+			dev_warning(enic, "The maximum Rx packet size (%u) is"
+				    " larger than the mbuf size (%u), and"
+				    " scatter is disabled. Larger packets will"
+				    " be truncated.\n",
+				    max_rx_pkt_len, mbuf_size);
+		}
 	}
 
 	if (mbufs_per_pkt > 1) {
 		dev_info(enic, "Rq %u Scatter rx mode in use\n", queue_idx);
 		rq_sop->data_queue_enable = 1;
 		rq_data->in_use = 1;
+		/*
+		 * HW does not directly support rxmode.max_rx_pkt_len. HW always
+		 * receives packet sizes up to the "max" MTU.
+		 * If not using scatter, we can achieve the effect of dropping
+		 * larger packets by reducing the size of posted buffers.
+		 * See enic_alloc_rx_queue_mbufs().
+		 */
+		if (max_rx_pkt_len <
+		    enic_mtu_to_max_rx_pktlen(enic->rte_dev->data->mtu)) {
+			dev_warning(enic, "rxmode.max_rx_pkt_len is ignored"
+				    " when scatter rx mode is in use.\n");
+		}
 	} else {
 		dev_info(enic, "Rq %u Scatter rx mode not being used\n",
 			 queue_idx);
@@ -654,8 +688,9 @@  int enic_alloc_rq(struct enic *enic, uint16_t queue_idx,
 		nb_data_desc = max_data;
 	}
 	if (mbufs_per_pkt > 1) {
-		dev_info(enic, "For mtu %d and mbuf size %d valid rx descriptor range is %d to %d\n",
-			 mtu, mbuf_size, min_sop + min_data,
+		dev_info(enic, "For max packet size %u and mbuf size %u valid"
+			 " rx descriptor range is %u to %u\n",
+			 max_rx_pkt_len, mbuf_size, min_sop + min_data,
 			 max_sop + max_data);
 	}
 	dev_info(enic, "Using %d rx descriptors (sop %d, data %d)\n",
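In scatter mode the patch sizes the ring as ceil(max_rx_pkt_len / mbuf_size) buffers per packet, as noted in the "ceil((max pkt len)/mbuf_size)" comment in enic_alloc_rq(). The integer ceiling division can be checked in isolation:

```c
#include <assert.h>
#include <stdint.h>

/* Number of mbufs needed per packet in scatter mode:
 * ceil(max_rx_pkt_len / mbuf_size) via integer arithmetic. */
static unsigned int
mbufs_per_pkt(uint32_t max_rx_pkt_len, unsigned int mbuf_size)
{
	return (max_rx_pkt_len + mbuf_size - 1) / mbuf_size;
}
```

For example, with a typical 2048-byte mbuf data room, a 1518-byte frame needs one buffer, while a 9018-byte jumbo frame (9000-byte MTU plus header and CRC) needs five.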