From patchwork Fri Jun 29 09:29:40 2018
X-Patchwork-Submitter: "John Daley (johndale)"
X-Patchwork-Id: 41924
X-Patchwork-Delegate: ferruh.yigit@amd.com
From: John Daley
To: ferruh.yigit@intel.com
Cc: dev@dpdk.org, Hyong Youb Kim
Date: Fri, 29 Jun 2018 02:29:40 -0700
Message-Id: <20180629092944.15576-12-johndale@cisco.com>
In-Reply-To: <20180629092944.15576-1-johndale@cisco.com>
References: <20180628031940.17397-1-johndale@cisco.com>
 <20180629092944.15576-1-johndale@cisco.com>
Subject: [dpdk-dev] [PATCH v2 11/15] net/enic: add the simple version of Tx handler

From: Hyong Youb Kim

Add a much-simplified Tx handler that is used when all offloads except
mbuf fast free are disabled. Compared against the default handler under
ideal conditions, cycles per packet drop by more than 60%. The driver
tries to use the simple handler first. The idea of using
specialized/simplified handlers comes from the Intel and Mellanox
drivers.
Signed-off-by: Hyong Youb Kim
Reviewed-by: John Daley
---
v2: removed devarg to disable handler

 drivers/net/enic/enic.h        |  2 ++
 drivers/net/enic/enic_compat.h |  5 +++
 drivers/net/enic/enic_ethdev.c |  4 ---
 drivers/net/enic/enic_main.c   | 36 ++++++++++++++++++++
 drivers/net/enic/enic_rxtx.c   | 77 ++++++++++++++++++++++++++++++++++++++++++
 5 files changed, 120 insertions(+), 4 deletions(-)

diff --git a/drivers/net/enic/enic.h b/drivers/net/enic/enic.h
index af790fc2e..a40ea790e 100644
--- a/drivers/net/enic/enic.h
+++ b/drivers/net/enic/enic.h
@@ -318,6 +318,8 @@ uint16_t enic_dummy_recv_pkts(void *rx_queue,
 		uint16_t nb_pkts);
 uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	uint16_t nb_pkts);
+uint16_t enic_simple_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+	uint16_t nb_pkts);
 uint16_t enic_prep_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	uint16_t nb_pkts);
 int enic_set_mtu(struct enic *enic, uint16_t new_mtu);
diff --git a/drivers/net/enic/enic_compat.h b/drivers/net/enic/enic_compat.h
index c0af1ed29..ceb1b0962 100644
--- a/drivers/net/enic/enic_compat.h
+++ b/drivers/net/enic/enic_compat.h
@@ -56,6 +56,11 @@
 #define dev_debug(x, args...) dev_printk(DEBUG, args)
 
 extern int enicpmd_logtype_flow;
+extern int enicpmd_logtype_init;
+
+#define PMD_INIT_LOG(level, fmt, args...) \
+	rte_log(RTE_LOG_ ## level, enicpmd_logtype_init, \
+		"%s" fmt "\n", __func__, ##args)
 
 #define __le16 u16
 #define __le32 u32
diff --git a/drivers/net/enic/enic_ethdev.c b/drivers/net/enic/enic_ethdev.c
index ef18f8802..15d93711d 100644
--- a/drivers/net/enic/enic_ethdev.c
+++ b/drivers/net/enic/enic_ethdev.c
@@ -24,10 +24,6 @@
 int enicpmd_logtype_init;
 int enicpmd_logtype_flow;
 
-#define PMD_INIT_LOG(level, fmt, args...) \
-	rte_log(RTE_LOG_ ## level, enicpmd_logtype_init, \
-		"%s" fmt "\n", __func__, ##args)
-
 #define ENICPMD_FUNC_TRACE() PMD_INIT_LOG(DEBUG, " >>")
 
 /*
diff --git a/drivers/net/enic/enic_main.c b/drivers/net/enic/enic_main.c
index c03ec247a..e89105d96 100644
--- a/drivers/net/enic/enic_main.c
+++ b/drivers/net/enic/enic_main.c
@@ -493,6 +493,27 @@ static void enic_rxq_intr_deinit(struct enic *enic)
 	}
 }
 
+static void enic_prep_wq_for_simple_tx(struct enic *enic, uint16_t queue_idx)
+{
+	struct wq_enet_desc *desc;
+	struct vnic_wq *wq;
+	unsigned int i;
+
+	/*
+	 * Fill WQ descriptor fields that never change. Every descriptor is
+	 * one packet, so set EOP. Also set CQ_ENTRY every ENIC_WQ_CQ_THRESH
+	 * descriptors (i.e. request one completion update every 32 packets).
+	 */
+	wq = &enic->wq[queue_idx];
+	desc = (struct wq_enet_desc *)wq->ring.descs;
+	for (i = 0; i < wq->ring.desc_count; i++, desc++) {
+		desc->header_length_flags = 1 << WQ_ENET_FLAGS_EOP_SHIFT;
+		if (i % ENIC_WQ_CQ_THRESH == ENIC_WQ_CQ_THRESH - 1)
+			desc->header_length_flags |=
+				(1 << WQ_ENET_FLAGS_CQ_ENTRY_SHIFT);
+	}
+}
+
 int enic_enable(struct enic *enic)
 {
 	unsigned int index;
@@ -535,6 +556,21 @@ int enic_enable(struct enic *enic)
 		}
 	}
 
+	/*
+	 * Use the simple TX handler if possible. All offloads must be disabled
+	 * except mbuf fast free.
+	 */
+	if ((eth_dev->data->dev_conf.txmode.offloads &
+	     ~DEV_TX_OFFLOAD_MBUF_FAST_FREE) == 0) {
+		PMD_INIT_LOG(DEBUG, " use the simple tx handler");
+		eth_dev->tx_pkt_burst = &enic_simple_xmit_pkts;
+		for (index = 0; index < enic->wq_count; index++)
+			enic_prep_wq_for_simple_tx(enic, index);
+	} else {
+		PMD_INIT_LOG(DEBUG, " use the default tx handler");
+		eth_dev->tx_pkt_burst = &enic_xmit_pkts;
+	}
+
 	for (index = 0; index < enic->wq_count; index++)
 		enic_start_wq(enic, index);
 	for (index = 0; index < enic->rq_count; index++)
diff --git a/drivers/net/enic/enic_rxtx.c b/drivers/net/enic/enic_rxtx.c
index 7cddb53d9..7dec486fe 100644
--- a/drivers/net/enic/enic_rxtx.c
+++ b/drivers/net/enic/enic_rxtx.c
@@ -741,4 +741,81 @@ uint16_t enic_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
 	return index;
 }
 
+static void enqueue_simple_pkts(struct rte_mbuf **pkts,
+				struct wq_enet_desc *desc,
+				uint16_t n,
+				struct enic *enic)
+{
+	struct rte_mbuf *p;
+
+	while (n) {
+		n--;
+		p = *pkts++;
+		desc->address = p->buf_iova + p->data_off;
+		desc->length = p->pkt_len;
+		/*
+		 * The app should not send oversized
+		 * packets. tx_pkt_prepare includes a check as
+		 * well. But some apps ignore the device max size and
+		 * tx_pkt_prepare. Oversized packets cause WQ errors
+		 * and the NIC ends up disabling the whole WQ. So
+		 * truncate packets.
+		 */
+		if (unlikely(p->pkt_len > ENIC_TX_MAX_PKT_SIZE)) {
+			desc->length = ENIC_TX_MAX_PKT_SIZE;
+			rte_atomic64_inc(&enic->soft_stats.tx_oversized);
+		}
+		desc++;
+	}
+}
+
+uint16_t enic_simple_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
+			       uint16_t nb_pkts)
+{
+	unsigned int head_idx, desc_count;
+	struct wq_enet_desc *desc;
+	struct vnic_wq *wq;
+	struct enic *enic;
+	uint16_t rem, n;
+
+	wq = (struct vnic_wq *)tx_queue;
+	enic = vnic_dev_priv(wq->vdev);
+	enic_cleanup_wq(enic, wq);
+	/* Will enqueue this many packets in this call */
+	nb_pkts = RTE_MIN(nb_pkts, wq->ring.desc_avail);
+	if (nb_pkts == 0)
+		return 0;
+
+	head_idx = wq->head_idx;
+	desc_count = wq->ring.desc_count;
+
+	/* Descriptors until the end of the ring */
+	n = desc_count - head_idx;
+	n = RTE_MIN(nb_pkts, n);
+
+	/* Save mbuf pointers to free later */
+	memcpy(wq->bufs + head_idx, tx_pkts, sizeof(struct rte_mbuf *) * n);
+
+	/* Enqueue until the ring end */
+	rem = nb_pkts - n;
+	desc = ((struct wq_enet_desc *)wq->ring.descs) + head_idx;
+	enqueue_simple_pkts(tx_pkts, desc, n, enic);
+
+	/* Wrap to the start of the ring */
+	if (rem) {
+		tx_pkts += n;
+		memcpy(wq->bufs, tx_pkts, sizeof(struct rte_mbuf *) * rem);
+		desc = (struct wq_enet_desc *)wq->ring.descs;
+		enqueue_simple_pkts(tx_pkts, desc, rem, enic);
+	}
+	rte_wmb();
+
+	/* Update head_idx and desc_avail */
+	wq->ring.desc_avail -= nb_pkts;
+	head_idx += nb_pkts;
+	if (head_idx >= desc_count)
+		head_idx -= desc_count;
+	wq->head_idx = head_idx;
+	iowrite32_relaxed(head_idx, &wq->ctrl->posted_index);
+	return nb_pkts;
+}
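
For illustration only, and not part of this patch: the selection logic added
to enic_enable() installs enic_simple_xmit_pkts() only when the port-level Tx
offload mask contains nothing beyond mbuf fast free. The sketch below shows
one way an application might configure a port so that check passes. It
assumes the 18.08-era ethdev API (rte_eth_dev_configure() and
DEV_TX_OFFLOAD_MBUF_FAST_FREE); the function name and queue counts are
placeholders, not anything defined by this patch.

	#include <rte_ethdev.h>

	/* Hypothetical helper: configure a port so the enic PMD can pick
	 * the simple Tx handler. Queue counts (1/1) are illustrative only.
	 */
	static int configure_for_simple_tx(uint16_t port_id)
	{
		struct rte_eth_conf conf = { 0 };

		/*
		 * Request only mbuf fast free on the Tx side. enic_enable()
		 * tests (txmode.offloads & ~DEV_TX_OFFLOAD_MBUF_FAST_FREE) == 0
		 * and, when true, sets tx_pkt_burst to enic_simple_xmit_pkts
		 * and pre-fills the WQ descriptors via
		 * enic_prep_wq_for_simple_tx(). Any other Tx offload falls
		 * back to the default handler.
		 */
		conf.txmode.offloads = DEV_TX_OFFLOAD_MBUF_FAST_FREE;

		return rte_eth_dev_configure(port_id, 1, 1, &conf);
	}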