From patchwork Wed Apr 29 07:28:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69515 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id AB4A5A00BE; Wed, 29 Apr 2020 09:28:36 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id C76981D904; Wed, 29 Apr 2020 09:28:32 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id BD52B1D8F5 for ; Wed, 29 Apr 2020 09:28:28 +0200 (CEST) IronPort-SDR: sft57ZKxiNxLUre9SlTPkL8pBLUT/gfogDDqde4NkiL83/l52HFLItPZlef4A3NTAbGT46JMko 9fdkf71huzzQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:28 -0700 IronPort-SDR: XQDfwXVIhrywXuAh/le0J+kWmD7Vrb4jZ+0F7qDY4fh1RRV/AA8K/HHhPLD0YeU7q0SXPMPhc2 NCJJq3f7AFwA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415075" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:26 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:14 +0800 Message-Id: <20200429072822.102745-2-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 1/9] net/virtio: add Rx free threshold setting X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Introduce free threshold setting in Rx queue, its default value is 32. Limit the threshold size to multiple of four as only vectorized packed Rx function will utilize it. Virtio driver will rearm Rx queue when more than rx_free_thresh descs were dequeued. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 060410577..94ba7a3ec 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -936,6 +936,7 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev, struct virtio_hw *hw = dev->data->dev_private; struct virtqueue *vq = hw->vqs[vtpci_queue_idx]; struct virtnet_rx *rxvq; + uint16_t rx_free_thresh; PMD_INIT_FUNC_TRACE(); @@ -944,6 +945,28 @@ virtio_dev_rx_queue_setup(struct rte_eth_dev *dev, return -EINVAL; } + rx_free_thresh = rx_conf->rx_free_thresh; + if (rx_free_thresh == 0) + rx_free_thresh = + RTE_MIN(vq->vq_nentries / 4, DEFAULT_RX_FREE_THRESH); + + if (rx_free_thresh & 0x3) { + RTE_LOG(ERR, PMD, "rx_free_thresh must be multiples of four." + " (rx_free_thresh=%u port=%u queue=%u)\n", + rx_free_thresh, dev->data->port_id, queue_idx); + return -EINVAL; + } + + if (rx_free_thresh >= vq->vq_nentries) { + RTE_LOG(ERR, PMD, "rx_free_thresh must be less than the " + "number of RX entries (%u)." 
+ " (rx_free_thresh=%u port=%u queue=%u)\n", + vq->vq_nentries, + rx_free_thresh, dev->data->port_id, queue_idx); + return -EINVAL; + } + vq->vq_free_thresh = rx_free_thresh; + if (nb_desc == 0 || nb_desc > vq->vq_nentries) nb_desc = vq->vq_nentries; vq->vq_free_cnt = RTE_MIN(vq->vq_free_cnt, nb_desc); diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h index 58ad7309a..6301c56b2 100644 --- a/drivers/net/virtio/virtqueue.h +++ b/drivers/net/virtio/virtqueue.h @@ -18,6 +18,8 @@ struct rte_mbuf; +#define DEFAULT_RX_FREE_THRESH 32 + /* * Per virtio_ring.h in Linux. * For virtio_pci on SMP, we don't need to order with respect to MMIO From patchwork Wed Apr 29 07:28:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69516 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 91545A00BE; Wed, 29 Apr 2020 09:28:46 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 4C2D11D90B; Wed, 29 Apr 2020 09:28:34 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 6DBC51D901 for ; Wed, 29 Apr 2020 09:28:30 +0200 (CEST) IronPort-SDR: hyFDKUHmWIvLSfem/6rt4d/eocJ36qQptooISx1tD7Rj7XlZvQHAOPJ4UcX3IC5u4F+u3dlj1c Otki7dJZXOrg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:29 -0700 IronPort-SDR: bOh1Yly16v1eoz05wsMiVp8QNGt5oVfYxYgUu1lM5orKIzyd02usBIwFDGpTdbg3RS674HNP8Q mzbY+VmCKv2Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415080" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:28 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:15 +0800 Message-Id: <20200429072822.102745-3-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 2/9] net/virtio: inorder should depend on feature bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Ring initialization is different when inorder feature negotiated. This action should dependent on negotiated feature bits. 
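The change below gates ring initialization on the negotiated feature bits rather than on the cached use_inorder_rx/use_inorder_tx flags. A minimal sketch of that pattern, reusing the vtpci_with_feature() and vtpci_packed_queue() helpers that appear in the diff (illustrative only, not part of the patch):

#include <stdbool.h>

#include "virtio_pci.h"

/*
 * Illustrative sketch: derive the in-order decision for split ring
 * initialization directly from the negotiated feature bits, rather
 * than from a cached driver flag such as hw->use_inorder_tx.
 */
static inline bool
split_ring_init_in_order(struct virtio_hw *hw)
{
	/* Only the split ring path is concerned here, and only when
	 * VIRTIO_F_IN_ORDER was actually negotiated with the device.
	 */
	return !vtpci_packed_queue(hw) &&
	       vtpci_with_feature(hw, VIRTIO_F_IN_ORDER);
}

A direct feature-bit test always reflects what was negotiated with the device, which is the rationale stated in the commit message above.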
Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 94ba7a3ec..e450477e8 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -989,6 +989,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx) struct rte_mbuf *m; uint16_t desc_idx; int error, nbufs, i; + bool in_order = vtpci_with_feature(hw, VIRTIO_F_IN_ORDER); PMD_INIT_FUNC_TRACE(); @@ -1018,7 +1019,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx) virtio_rxq_rearm_vec(rxvq); nbufs += RTE_VIRTIO_VPMD_RX_REARM_THRESH; } - } else if (hw->use_inorder_rx) { + } else if (!vtpci_packed_queue(vq->hw) && in_order) { if ((!virtqueue_full(vq))) { uint16_t free_cnt = vq->vq_free_cnt; struct rte_mbuf *pkts[free_cnt]; @@ -1133,7 +1134,7 @@ virtio_dev_tx_queue_setup_finish(struct rte_eth_dev *dev, PMD_INIT_FUNC_TRACE(); if (!vtpci_packed_queue(hw)) { - if (hw->use_inorder_tx) + if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER)) vq->vq_split.ring.desc[vq->vq_nentries - 1].next = 0; } @@ -2046,7 +2047,7 @@ virtio_xmit_pkts_packed(void *tx_queue, struct rte_mbuf **tx_pkts, struct virtio_hw *hw = vq->hw; uint16_t hdr_size = hw->vtnet_hdr_size; uint16_t nb_tx = 0; - bool in_order = hw->use_inorder_tx; + bool in_order = vtpci_with_feature(hw, VIRTIO_F_IN_ORDER); if (unlikely(hw->started == 0 && tx_pkts != hw->inject_pkts)) return nb_tx; From patchwork Wed Apr 29 07:28:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69517 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id ADEFEA00BE; Wed, 29 Apr 2020 09:28:56 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 8F6B51D911; Wed, 29 Apr 2020 09:28:35 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id AC5921D901 for ; Wed, 29 Apr 2020 09:28:31 +0200 (CEST) IronPort-SDR: TY651javx27q/32pbyqPNx2jkW1ov8lwAjD1RheHblLMeDejWOtc1iP5B0scIaQwhmV0bgeDAo NdI73r9HcSGw== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:31 -0700 IronPort-SDR: UhtPBYuH7Mse5IIIW7P5wa4e3FUMEx4Ui3Ln024Eubj5fgyWH9AIbfBofPlLspqG7kGXXXDffd xTy9hVXnX18w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415086" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:29 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:16 +0800 Message-Id: <20200429072822.102745-4-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 3/9] net/virtio: add vectorized devarg X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Previously, virtio split ring vectorized path was enabled by default. This is not suitable for everyone because that path dose not follow virtio spec. Add new devarg for virtio vectorized path selection. By default vectorized path is disabled. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst index 6286286db..a67774e91 100644 --- a/doc/guides/nics/virtio.rst +++ b/doc/guides/nics/virtio.rst @@ -363,6 +363,13 @@ Below devargs are supported by the PCI virtio driver: rte_eth_link_get_nowait function. (Default: 10000 (10G)) +#. ``vectorized``: + + It is used to specify whether virtio device perfers to use vectorized path. + Afterwards, dependencies of vectorized path will be checked in path + election. + (Default: 0 (disabled)) + Below devargs are supported by the virtio-user vdev: #. ``path``: diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 37766cbb6..0a69a4db1 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -48,7 +48,8 @@ static int virtio_dev_allmulticast_disable(struct rte_eth_dev *dev); static uint32_t virtio_dev_speed_capa_get(uint32_t speed); static int virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa, - uint32_t *speed); + uint32_t *speed, + int *vectorized); static int virtio_dev_info_get(struct rte_eth_dev *dev, struct rte_eth_dev_info *dev_info); static int virtio_dev_link_update(struct rte_eth_dev *dev, @@ -1551,8 +1552,8 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev) eth_dev->rx_pkt_burst = &virtio_recv_pkts_packed; } } else { - if (hw->use_simple_rx) { - PMD_INIT_LOG(INFO, "virtio: using simple Rx path on port %u", + if (hw->use_vec_rx) { + PMD_INIT_LOG(INFO, "virtio: using vectorized Rx path on port %u", eth_dev->data->port_id); eth_dev->rx_pkt_burst = virtio_recv_pkts_vec; } else if (hw->use_inorder_rx) { @@ -1886,6 +1887,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) { struct virtio_hw *hw = eth_dev->data->dev_private; uint32_t speed = SPEED_UNKNOWN; + int vectorized = 0; int ret; if (sizeof(struct virtio_net_hdr_mrg_rxbuf) > RTE_PKTMBUF_HEADROOM) { @@ -1912,7 +1914,7 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) return 0; } ret = virtio_dev_devargs_parse(eth_dev->device->devargs, - NULL, &speed); + NULL, &speed, &vectorized); if (ret < 0) return ret; hw->speed = speed; @@ -1949,6 +1951,11 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) if (ret < 0) goto err_virtio_init; + if (vectorized) { + if (!vtpci_packed_queue(hw)) + hw->use_vec_rx = 1; + } + hw->opened = true; return 0; @@ -2021,9 +2028,20 @@ virtio_dev_speed_capa_get(uint32_t speed) } } +static int vectorized_check_handler(__rte_unused const char *key, + const char *value, void *ret_val) +{ + if (strcmp(value, "1") == 0) + *(int *)ret_val = 1; + else + *(int *)ret_val = 0; + + return 0; +} #define VIRTIO_ARG_SPEED "speed" #define VIRTIO_ARG_VDPA "vdpa" +#define VIRTIO_ARG_VECTORIZED "vectorized" static int @@ -2045,7 +2063,7 @@ link_speed_handler(const char *key __rte_unused, static int virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa, - uint32_t *speed) + uint32_t *speed, int *vectorized) { struct rte_kvargs *kvlist; int ret = 0; @@ -2081,6 +2099,18 @@ virtio_dev_devargs_parse(struct rte_devargs *devargs, int *vdpa, } } + if (vectorized && + rte_kvargs_count(kvlist, VIRTIO_ARG_VECTORIZED) == 1) { + ret = rte_kvargs_process(kvlist, + 
VIRTIO_ARG_VECTORIZED, + vectorized_check_handler, vectorized); + if (ret < 0) { + PMD_INIT_LOG(ERR, "Failed to parse %s", + VIRTIO_ARG_VECTORIZED); + goto exit; + } + } + exit: rte_kvargs_free(kvlist); return ret; @@ -2092,7 +2122,8 @@ static int eth_virtio_pci_probe(struct rte_pci_driver *pci_drv __rte_unused, int vdpa = 0; int ret = 0; - ret = virtio_dev_devargs_parse(pci_dev->device.devargs, &vdpa, NULL); + ret = virtio_dev_devargs_parse(pci_dev->device.devargs, &vdpa, NULL, + NULL); if (ret < 0) { PMD_INIT_LOG(ERR, "devargs parsing is failed"); return ret; @@ -2257,33 +2288,31 @@ virtio_dev_configure(struct rte_eth_dev *dev) return -EBUSY; } - hw->use_simple_rx = 1; - if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER)) { hw->use_inorder_tx = 1; hw->use_inorder_rx = 1; - hw->use_simple_rx = 0; + hw->use_vec_rx = 0; } if (vtpci_packed_queue(hw)) { - hw->use_simple_rx = 0; + hw->use_vec_rx = 0; hw->use_inorder_rx = 0; } #if defined RTE_ARCH_ARM64 || defined RTE_ARCH_ARM if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON)) { - hw->use_simple_rx = 0; + hw->use_vec_rx = 0; } #endif if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { - hw->use_simple_rx = 0; + hw->use_vec_rx = 0; } if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM | DEV_RX_OFFLOAD_TCP_CKSUM | DEV_RX_OFFLOAD_TCP_LRO | DEV_RX_OFFLOAD_VLAN_STRIP)) - hw->use_simple_rx = 0; + hw->use_vec_rx = 0; return 0; } diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h index bd89357e4..668e688e1 100644 --- a/drivers/net/virtio/virtio_pci.h +++ b/drivers/net/virtio/virtio_pci.h @@ -253,7 +253,8 @@ struct virtio_hw { uint8_t vlan_strip; uint8_t use_msix; uint8_t modern; - uint8_t use_simple_rx; + uint8_t use_vec_rx; + uint8_t use_vec_tx; uint8_t use_inorder_rx; uint8_t use_inorder_tx; uint8_t weak_barriers; diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index e450477e8..84f4cf946 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -996,7 +996,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx) /* Allocate blank mbufs for the each rx descriptor */ nbufs = 0; - if (hw->use_simple_rx) { + if (hw->use_vec_rx && !vtpci_packed_queue(hw)) { for (desc_idx = 0; desc_idx < vq->vq_nentries; desc_idx++) { vq->vq_split.ring.avail->ring[desc_idx] = desc_idx; @@ -1014,7 +1014,7 @@ virtio_dev_rx_queue_setup_finish(struct rte_eth_dev *dev, uint16_t queue_idx) &rxvq->fake_mbuf; } - if (hw->use_simple_rx) { + if (hw->use_vec_rx && !vtpci_packed_queue(hw)) { while (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { virtio_rxq_rearm_vec(rxvq); nbufs += RTE_VIRTIO_VPMD_RX_REARM_THRESH; diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c index 953f00d72..150a8d987 100644 --- a/drivers/net/virtio/virtio_user_ethdev.c +++ b/drivers/net/virtio/virtio_user_ethdev.c @@ -525,7 +525,7 @@ virtio_user_eth_dev_alloc(struct rte_vdev_device *vdev) */ hw->use_msix = 1; hw->modern = 0; - hw->use_simple_rx = 0; + hw->use_vec_rx = 0; hw->use_inorder_rx = 0; hw->use_inorder_tx = 0; hw->virtio_user_dev = dev; diff --git a/drivers/net/virtio/virtqueue.c b/drivers/net/virtio/virtqueue.c index 0b4e3bf3e..ca23180de 100644 --- a/drivers/net/virtio/virtqueue.c +++ b/drivers/net/virtio/virtqueue.c @@ -32,7 +32,8 @@ virtqueue_detach_unused(struct virtqueue *vq) end = (vq->vq_avail_idx + vq->vq_free_cnt) & (vq->vq_nentries - 1); for (idx = 0; idx < vq->vq_nentries; idx++) { - if (hw->use_simple_rx && type == VTNET_RQ) { + if 
(hw->use_vec_rx && !vtpci_packed_queue(hw) && + type == VTNET_RQ) { if (start <= end && idx >= start && idx < end) continue; if (start > end && (idx >= start || idx < end)) @@ -97,7 +98,7 @@ virtqueue_rxvq_flush_split(struct virtqueue *vq) for (i = 0; i < nb_used; i++) { used_idx = vq->vq_used_cons_idx & (vq->vq_nentries - 1); uep = &vq->vq_split.ring.used->ring[used_idx]; - if (hw->use_simple_rx) { + if (hw->use_vec_rx) { desc_idx = used_idx; rte_pktmbuf_free(vq->sw_ring[desc_idx]); vq->vq_free_cnt++; @@ -121,7 +122,7 @@ virtqueue_rxvq_flush_split(struct virtqueue *vq) vq->vq_used_cons_idx++; } - if (hw->use_simple_rx) { + if (hw->use_vec_rx) { while (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) { virtio_rxq_rearm_vec(rxq); if (virtqueue_kick_prepare(vq)) From patchwork Wed Apr 29 07:28:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69518 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0B999A00BE; Wed, 29 Apr 2020 09:29:09 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 68A5A1D919; Wed, 29 Apr 2020 09:28:40 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 1EB861D907 for ; Wed, 29 Apr 2020 09:28:32 +0200 (CEST) IronPort-SDR: lhQ9/edwz/VXnsYylwRweT8rrEIiSqhNNL3iYiaPcs0FvWL8PPGuGZsnJ7y5kNbmWR5RVT2NLZ yfsY0WBevA6A== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:32 -0700 IronPort-SDR: D/9S8Rb8U4d6BYE0Dj8BDrd/zA94snG612s79ea1admouSoXCShAoUU6r0v/VHsWJvXa+tXmzt RtFggvhx++6A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415092" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:31 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:17 +0800 Message-Id: <20200429072822.102745-5-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 4/9] net/virtio-user: add vectorized devarg X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Add new devarg for virtio user device vectorized path selection. By default vectorized path is disabled. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst index a67774e91..fdd0790e0 100644 --- a/doc/guides/nics/virtio.rst +++ b/doc/guides/nics/virtio.rst @@ -424,6 +424,12 @@ Below devargs are supported by the virtio-user vdev: rte_eth_link_get_nowait function. (Default: 10000 (10G)) +#. ``vectorized``: + + It is used to specify whether virtio device perfers to use vectorized path. 
+ Afterwards, dependencies of vectorized path will be checked in path + election. + (Default: 0 (disabled)) Virtio paths Selection and Usage -------------------------------- diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c index 150a8d987..40ad786cc 100644 --- a/drivers/net/virtio/virtio_user_ethdev.c +++ b/drivers/net/virtio/virtio_user_ethdev.c @@ -452,6 +452,8 @@ static const char *valid_args[] = { VIRTIO_USER_ARG_PACKED_VQ, #define VIRTIO_USER_ARG_SPEED "speed" VIRTIO_USER_ARG_SPEED, +#define VIRTIO_USER_ARG_VECTORIZED "vectorized" + VIRTIO_USER_ARG_VECTORIZED, NULL }; @@ -559,6 +561,7 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev) uint64_t mrg_rxbuf = 1; uint64_t in_order = 1; uint64_t packed_vq = 0; + uint64_t vectorized = 0; char *path = NULL; char *ifname = NULL; char *mac_addr = NULL; @@ -675,6 +678,15 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev) } } + if (rte_kvargs_count(kvlist, VIRTIO_USER_ARG_VECTORIZED) == 1) { + if (rte_kvargs_process(kvlist, VIRTIO_USER_ARG_VECTORIZED, + &get_integer_arg, &vectorized) < 0) { + PMD_INIT_LOG(ERR, "error to parse %s", + VIRTIO_USER_ARG_VECTORIZED); + goto end; + } + } + if (queues > 1 && cq == 0) { PMD_INIT_LOG(ERR, "multi-q requires ctrl-q"); goto end; @@ -727,6 +739,9 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev) goto end; } + if (vectorized) + hw->use_vec_rx = 1; + rte_eth_dev_probing_finish(eth_dev); ret = 0; @@ -785,4 +800,5 @@ RTE_PMD_REGISTER_PARAM_STRING(net_virtio_user, "mrg_rxbuf=<0|1> " "in_order=<0|1> " "packed_vq=<0|1> " - "speed="); + "speed= " + "vectorized=<0|1>"); From patchwork Wed Apr 29 07:28:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69519 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C08F3A00BE; Wed, 29 Apr 2020 09:29:16 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B65211D922; Wed, 29 Apr 2020 09:28:41 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 308531D90F for ; Wed, 29 Apr 2020 09:28:35 +0200 (CEST) IronPort-SDR: v60WQT8sqDtwqPuFHeoz8GwR+H1esI3OSJykN/Ve9XsjXj/GgrSpZDvUxB2iLD6cyYMjg693E2 Zw1AgRJErPvw== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:34 -0700 IronPort-SDR: ZV1MZ40LXQf3MNFH5DymDgp0hJKKCCOPYz4M4n2DarCN8w+fl5Diz8O/2qaXSFdoC13wHvs4v6 cBKzOn/mP8qg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415099" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:32 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:18 +0800 Message-Id: <20200429072822.102745-6-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 5/9] net/virtio: reuse packed ring functions 
X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Move offload, xmit cleanup and packed xmit enqueue function to header file. These functions will be reused by packed ring vectorized path. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index 84f4cf946..a549991aa 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -89,23 +89,6 @@ vq_ring_free_chain(struct virtqueue *vq, uint16_t desc_idx) dp->next = VQ_RING_DESC_CHAIN_END; } -static void -vq_ring_free_id_packed(struct virtqueue *vq, uint16_t id) -{ - struct vq_desc_extra *dxp; - - dxp = &vq->vq_descx[id]; - vq->vq_free_cnt += dxp->ndescs; - - if (vq->vq_desc_tail_idx == VQ_RING_DESC_CHAIN_END) - vq->vq_desc_head_idx = id; - else - vq->vq_descx[vq->vq_desc_tail_idx].next = id; - - vq->vq_desc_tail_idx = id; - dxp->next = VQ_RING_DESC_CHAIN_END; -} - void virtio_update_packet_stats(struct virtnet_stats *stats, struct rte_mbuf *mbuf) { @@ -264,130 +247,6 @@ virtqueue_dequeue_rx_inorder(struct virtqueue *vq, return i; } -#ifndef DEFAULT_TX_FREE_THRESH -#define DEFAULT_TX_FREE_THRESH 32 -#endif - -static void -virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, int num) -{ - uint16_t used_idx, id, curr_id, free_cnt = 0; - uint16_t size = vq->vq_nentries; - struct vring_packed_desc *desc = vq->vq_packed.ring.desc; - struct vq_desc_extra *dxp; - - used_idx = vq->vq_used_cons_idx; - /* desc_is_used has a load-acquire or rte_cio_rmb inside - * and wait for used desc in virtqueue. - */ - while (num > 0 && desc_is_used(&desc[used_idx], vq)) { - id = desc[used_idx].id; - do { - curr_id = used_idx; - dxp = &vq->vq_descx[used_idx]; - used_idx += dxp->ndescs; - free_cnt += dxp->ndescs; - num -= dxp->ndescs; - if (used_idx >= size) { - used_idx -= size; - vq->vq_packed.used_wrap_counter ^= 1; - } - if (dxp->cookie != NULL) { - rte_pktmbuf_free(dxp->cookie); - dxp->cookie = NULL; - } - } while (curr_id != id); - } - vq->vq_used_cons_idx = used_idx; - vq->vq_free_cnt += free_cnt; -} - -static void -virtio_xmit_cleanup_normal_packed(struct virtqueue *vq, int num) -{ - uint16_t used_idx, id; - uint16_t size = vq->vq_nentries; - struct vring_packed_desc *desc = vq->vq_packed.ring.desc; - struct vq_desc_extra *dxp; - - used_idx = vq->vq_used_cons_idx; - /* desc_is_used has a load-acquire or rte_cio_rmb inside - * and wait for used desc in virtqueue. - */ - while (num-- && desc_is_used(&desc[used_idx], vq)) { - id = desc[used_idx].id; - dxp = &vq->vq_descx[id]; - vq->vq_used_cons_idx += dxp->ndescs; - if (vq->vq_used_cons_idx >= size) { - vq->vq_used_cons_idx -= size; - vq->vq_packed.used_wrap_counter ^= 1; - } - vq_ring_free_id_packed(vq, id); - if (dxp->cookie != NULL) { - rte_pktmbuf_free(dxp->cookie); - dxp->cookie = NULL; - } - used_idx = vq->vq_used_cons_idx; - } -} - -/* Cleanup from completed transmits. 
*/ -static inline void -virtio_xmit_cleanup_packed(struct virtqueue *vq, int num, int in_order) -{ - if (in_order) - virtio_xmit_cleanup_inorder_packed(vq, num); - else - virtio_xmit_cleanup_normal_packed(vq, num); -} - -static void -virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num) -{ - uint16_t i, used_idx, desc_idx; - for (i = 0; i < num; i++) { - struct vring_used_elem *uep; - struct vq_desc_extra *dxp; - - used_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1)); - uep = &vq->vq_split.ring.used->ring[used_idx]; - - desc_idx = (uint16_t) uep->id; - dxp = &vq->vq_descx[desc_idx]; - vq->vq_used_cons_idx++; - vq_ring_free_chain(vq, desc_idx); - - if (dxp->cookie != NULL) { - rte_pktmbuf_free(dxp->cookie); - dxp->cookie = NULL; - } - } -} - -/* Cleanup from completed inorder transmits. */ -static __rte_always_inline void -virtio_xmit_cleanup_inorder(struct virtqueue *vq, uint16_t num) -{ - uint16_t i, idx = vq->vq_used_cons_idx; - int16_t free_cnt = 0; - struct vq_desc_extra *dxp = NULL; - - if (unlikely(num == 0)) - return; - - for (i = 0; i < num; i++) { - dxp = &vq->vq_descx[idx++ & (vq->vq_nentries - 1)]; - free_cnt += dxp->ndescs; - if (dxp->cookie != NULL) { - rte_pktmbuf_free(dxp->cookie); - dxp->cookie = NULL; - } - } - - vq->vq_free_cnt += free_cnt; - vq->vq_used_cons_idx = idx; -} - static inline int virtqueue_enqueue_refill_inorder(struct virtqueue *vq, struct rte_mbuf **cookies, @@ -562,68 +421,7 @@ virtio_tso_fix_cksum(struct rte_mbuf *m) } -/* avoid write operation when necessary, to lessen cache issues */ -#define ASSIGN_UNLESS_EQUAL(var, val) do { \ - if ((var) != (val)) \ - (var) = (val); \ -} while (0) - -#define virtqueue_clear_net_hdr(_hdr) do { \ - ASSIGN_UNLESS_EQUAL((_hdr)->csum_start, 0); \ - ASSIGN_UNLESS_EQUAL((_hdr)->csum_offset, 0); \ - ASSIGN_UNLESS_EQUAL((_hdr)->flags, 0); \ - ASSIGN_UNLESS_EQUAL((_hdr)->gso_type, 0); \ - ASSIGN_UNLESS_EQUAL((_hdr)->gso_size, 0); \ - ASSIGN_UNLESS_EQUAL((_hdr)->hdr_len, 0); \ -} while (0) - -static inline void -virtqueue_xmit_offload(struct virtio_net_hdr *hdr, - struct rte_mbuf *cookie, - bool offload) -{ - if (offload) { - if (cookie->ol_flags & PKT_TX_TCP_SEG) - cookie->ol_flags |= PKT_TX_TCP_CKSUM; - - switch (cookie->ol_flags & PKT_TX_L4_MASK) { - case PKT_TX_UDP_CKSUM: - hdr->csum_start = cookie->l2_len + cookie->l3_len; - hdr->csum_offset = offsetof(struct rte_udp_hdr, - dgram_cksum); - hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; - break; - - case PKT_TX_TCP_CKSUM: - hdr->csum_start = cookie->l2_len + cookie->l3_len; - hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum); - hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; - break; - - default: - ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0); - ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0); - ASSIGN_UNLESS_EQUAL(hdr->flags, 0); - break; - } - /* TCP Segmentation Offload */ - if (cookie->ol_flags & PKT_TX_TCP_SEG) { - hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ? 
- VIRTIO_NET_HDR_GSO_TCPV6 : - VIRTIO_NET_HDR_GSO_TCPV4; - hdr->gso_size = cookie->tso_segsz; - hdr->hdr_len = - cookie->l2_len + - cookie->l3_len + - cookie->l4_len; - } else { - ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0); - ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0); - ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0); - } - } -} static inline void virtqueue_enqueue_xmit_inorder(struct virtnet_tx *txvq, @@ -725,102 +523,6 @@ virtqueue_enqueue_xmit_packed_fast(struct virtnet_tx *txvq, virtqueue_store_flags_packed(dp, flags, vq->hw->weak_barriers); } -static inline void -virtqueue_enqueue_xmit_packed(struct virtnet_tx *txvq, struct rte_mbuf *cookie, - uint16_t needed, int can_push, int in_order) -{ - struct virtio_tx_region *txr = txvq->virtio_net_hdr_mz->addr; - struct vq_desc_extra *dxp; - struct virtqueue *vq = txvq->vq; - struct vring_packed_desc *start_dp, *head_dp; - uint16_t idx, id, head_idx, head_flags; - int16_t head_size = vq->hw->vtnet_hdr_size; - struct virtio_net_hdr *hdr; - uint16_t prev; - bool prepend_header = false; - - id = in_order ? vq->vq_avail_idx : vq->vq_desc_head_idx; - - dxp = &vq->vq_descx[id]; - dxp->ndescs = needed; - dxp->cookie = cookie; - - head_idx = vq->vq_avail_idx; - idx = head_idx; - prev = head_idx; - start_dp = vq->vq_packed.ring.desc; - - head_dp = &vq->vq_packed.ring.desc[idx]; - head_flags = cookie->next ? VRING_DESC_F_NEXT : 0; - head_flags |= vq->vq_packed.cached_flags; - - if (can_push) { - /* prepend cannot fail, checked by caller */ - hdr = rte_pktmbuf_mtod_offset(cookie, struct virtio_net_hdr *, - -head_size); - prepend_header = true; - - /* if offload disabled, it is not zeroed below, do it now */ - if (!vq->hw->has_tx_offload) - virtqueue_clear_net_hdr(hdr); - } else { - /* setup first tx ring slot to point to header - * stored in reserved region. - */ - start_dp[idx].addr = txvq->virtio_net_hdr_mem + - RTE_PTR_DIFF(&txr[idx].tx_hdr, txr); - start_dp[idx].len = vq->hw->vtnet_hdr_size; - hdr = (struct virtio_net_hdr *)&txr[idx].tx_hdr; - idx++; - if (idx >= vq->vq_nentries) { - idx -= vq->vq_nentries; - vq->vq_packed.cached_flags ^= - VRING_PACKED_DESC_F_AVAIL_USED; - } - } - - virtqueue_xmit_offload(hdr, cookie, vq->hw->has_tx_offload); - - do { - uint16_t flags; - - start_dp[idx].addr = VIRTIO_MBUF_DATA_DMA_ADDR(cookie, vq); - start_dp[idx].len = cookie->data_len; - if (prepend_header) { - start_dp[idx].addr -= head_size; - start_dp[idx].len += head_size; - prepend_header = false; - } - - if (likely(idx != head_idx)) { - flags = cookie->next ? 
VRING_DESC_F_NEXT : 0; - flags |= vq->vq_packed.cached_flags; - start_dp[idx].flags = flags; - } - prev = idx; - idx++; - if (idx >= vq->vq_nentries) { - idx -= vq->vq_nentries; - vq->vq_packed.cached_flags ^= - VRING_PACKED_DESC_F_AVAIL_USED; - } - } while ((cookie = cookie->next) != NULL); - - start_dp[prev].id = id; - - vq->vq_free_cnt = (uint16_t)(vq->vq_free_cnt - needed); - vq->vq_avail_idx = idx; - - if (!in_order) { - vq->vq_desc_head_idx = dxp->next; - if (vq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END) - vq->vq_desc_tail_idx = VQ_RING_DESC_CHAIN_END; - } - - virtqueue_store_flags_packed(head_dp, head_flags, - vq->hw->weak_barriers); -} - static inline void virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct rte_mbuf *cookie, uint16_t needed, int use_indirect, int can_push, @@ -1246,7 +948,6 @@ virtio_rx_offload(struct rte_mbuf *m, struct virtio_net_hdr *hdr) return 0; } -#define VIRTIO_MBUF_BURST_SZ 64 #define DESC_PER_CACHELINE (RTE_CACHE_LINE_SIZE / sizeof(struct vring_desc)) uint16_t virtio_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts) diff --git a/drivers/net/virtio/virtqueue.h b/drivers/net/virtio/virtqueue.h index 6301c56b2..ca1c10499 100644 --- a/drivers/net/virtio/virtqueue.h +++ b/drivers/net/virtio/virtqueue.h @@ -10,6 +10,7 @@ #include #include #include +#include #include "virtio_pci.h" #include "virtio_ring.h" @@ -18,8 +19,10 @@ struct rte_mbuf; +#define DEFAULT_TX_FREE_THRESH 32 #define DEFAULT_RX_FREE_THRESH 32 +#define VIRTIO_MBUF_BURST_SZ 64 /* * Per virtio_ring.h in Linux. * For virtio_pci on SMP, we don't need to order with respect to MMIO @@ -560,4 +563,303 @@ virtqueue_notify(struct virtqueue *vq) #define VIRTQUEUE_DUMP(vq) do { } while (0) #endif +/* avoid write operation when necessary, to lessen cache issues */ +#define ASSIGN_UNLESS_EQUAL(var, val) do { \ + typeof(var) var_ = (var); \ + typeof(val) val_ = (val); \ + if ((var_) != (val_)) \ + (var_) = (val_); \ +} while (0) + +#define virtqueue_clear_net_hdr(hdr) do { \ + typeof(hdr) hdr_ = (hdr); \ + ASSIGN_UNLESS_EQUAL((hdr_)->csum_start, 0); \ + ASSIGN_UNLESS_EQUAL((hdr_)->csum_offset, 0); \ + ASSIGN_UNLESS_EQUAL((hdr_)->flags, 0); \ + ASSIGN_UNLESS_EQUAL((hdr_)->gso_type, 0); \ + ASSIGN_UNLESS_EQUAL((hdr_)->gso_size, 0); \ + ASSIGN_UNLESS_EQUAL((hdr_)->hdr_len, 0); \ +} while (0) + +static inline void +virtqueue_xmit_offload(struct virtio_net_hdr *hdr, + struct rte_mbuf *cookie, + bool offload) +{ + if (offload) { + if (cookie->ol_flags & PKT_TX_TCP_SEG) + cookie->ol_flags |= PKT_TX_TCP_CKSUM; + + switch (cookie->ol_flags & PKT_TX_L4_MASK) { + case PKT_TX_UDP_CKSUM: + hdr->csum_start = cookie->l2_len + cookie->l3_len; + hdr->csum_offset = offsetof(struct rte_udp_hdr, + dgram_cksum); + hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; + break; + + case PKT_TX_TCP_CKSUM: + hdr->csum_start = cookie->l2_len + cookie->l3_len; + hdr->csum_offset = offsetof(struct rte_tcp_hdr, cksum); + hdr->flags = VIRTIO_NET_HDR_F_NEEDS_CSUM; + break; + + default: + ASSIGN_UNLESS_EQUAL(hdr->csum_start, 0); + ASSIGN_UNLESS_EQUAL(hdr->csum_offset, 0); + ASSIGN_UNLESS_EQUAL(hdr->flags, 0); + break; + } + + /* TCP Segmentation Offload */ + if (cookie->ol_flags & PKT_TX_TCP_SEG) { + hdr->gso_type = (cookie->ol_flags & PKT_TX_IPV6) ? 
+ VIRTIO_NET_HDR_GSO_TCPV6 : + VIRTIO_NET_HDR_GSO_TCPV4; + hdr->gso_size = cookie->tso_segsz; + hdr->hdr_len = + cookie->l2_len + + cookie->l3_len + + cookie->l4_len; + } else { + ASSIGN_UNLESS_EQUAL(hdr->gso_type, 0); + ASSIGN_UNLESS_EQUAL(hdr->gso_size, 0); + ASSIGN_UNLESS_EQUAL(hdr->hdr_len, 0); + } + } +} + +static inline void +virtqueue_enqueue_xmit_packed(struct virtnet_tx *txvq, struct rte_mbuf *cookie, + uint16_t needed, int can_push, int in_order) +{ + struct virtio_tx_region *txr = txvq->virtio_net_hdr_mz->addr; + struct vq_desc_extra *dxp; + struct virtqueue *vq = txvq->vq; + struct vring_packed_desc *start_dp, *head_dp; + uint16_t idx, id, head_idx, head_flags; + int16_t head_size = vq->hw->vtnet_hdr_size; + struct virtio_net_hdr *hdr; + uint16_t prev; + bool prepend_header = false; + + id = in_order ? vq->vq_avail_idx : vq->vq_desc_head_idx; + + dxp = &vq->vq_descx[id]; + dxp->ndescs = needed; + dxp->cookie = cookie; + + head_idx = vq->vq_avail_idx; + idx = head_idx; + prev = head_idx; + start_dp = vq->vq_packed.ring.desc; + + head_dp = &vq->vq_packed.ring.desc[idx]; + head_flags = cookie->next ? VRING_DESC_F_NEXT : 0; + head_flags |= vq->vq_packed.cached_flags; + + if (can_push) { + /* prepend cannot fail, checked by caller */ + hdr = rte_pktmbuf_mtod_offset(cookie, struct virtio_net_hdr *, + -head_size); + prepend_header = true; + + /* if offload disabled, it is not zeroed below, do it now */ + if (!vq->hw->has_tx_offload) + virtqueue_clear_net_hdr(hdr); + } else { + /* setup first tx ring slot to point to header + * stored in reserved region. + */ + start_dp[idx].addr = txvq->virtio_net_hdr_mem + + RTE_PTR_DIFF(&txr[idx].tx_hdr, txr); + start_dp[idx].len = vq->hw->vtnet_hdr_size; + hdr = (struct virtio_net_hdr *)&txr[idx].tx_hdr; + idx++; + if (idx >= vq->vq_nentries) { + idx -= vq->vq_nentries; + vq->vq_packed.cached_flags ^= + VRING_PACKED_DESC_F_AVAIL_USED; + } + } + + virtqueue_xmit_offload(hdr, cookie, vq->hw->has_tx_offload); + + do { + uint16_t flags; + + start_dp[idx].addr = VIRTIO_MBUF_DATA_DMA_ADDR(cookie, vq); + start_dp[idx].len = cookie->data_len; + if (prepend_header) { + start_dp[idx].addr -= head_size; + start_dp[idx].len += head_size; + prepend_header = false; + } + + if (likely(idx != head_idx)) { + flags = cookie->next ? 
VRING_DESC_F_NEXT : 0; + flags |= vq->vq_packed.cached_flags; + start_dp[idx].flags = flags; + } + prev = idx; + idx++; + if (idx >= vq->vq_nentries) { + idx -= vq->vq_nentries; + vq->vq_packed.cached_flags ^= + VRING_PACKED_DESC_F_AVAIL_USED; + } + } while ((cookie = cookie->next) != NULL); + + start_dp[prev].id = id; + + vq->vq_free_cnt = (uint16_t)(vq->vq_free_cnt - needed); + vq->vq_avail_idx = idx; + + if (!in_order) { + vq->vq_desc_head_idx = dxp->next; + if (vq->vq_desc_head_idx == VQ_RING_DESC_CHAIN_END) + vq->vq_desc_tail_idx = VQ_RING_DESC_CHAIN_END; + } + + virtqueue_store_flags_packed(head_dp, head_flags, + vq->hw->weak_barriers); +} + +static void +vq_ring_free_id_packed(struct virtqueue *vq, uint16_t id) +{ + struct vq_desc_extra *dxp; + + dxp = &vq->vq_descx[id]; + vq->vq_free_cnt += dxp->ndescs; + + if (vq->vq_desc_tail_idx == VQ_RING_DESC_CHAIN_END) + vq->vq_desc_head_idx = id; + else + vq->vq_descx[vq->vq_desc_tail_idx].next = id; + + vq->vq_desc_tail_idx = id; + dxp->next = VQ_RING_DESC_CHAIN_END; +} + +static void +virtio_xmit_cleanup_inorder_packed(struct virtqueue *vq, int num) +{ + uint16_t used_idx, id, curr_id, free_cnt = 0; + uint16_t size = vq->vq_nentries; + struct vring_packed_desc *desc = vq->vq_packed.ring.desc; + struct vq_desc_extra *dxp; + + used_idx = vq->vq_used_cons_idx; + /* desc_is_used has a load-acquire or rte_cio_rmb inside + * and wait for used desc in virtqueue. + */ + while (num > 0 && desc_is_used(&desc[used_idx], vq)) { + id = desc[used_idx].id; + do { + curr_id = used_idx; + dxp = &vq->vq_descx[used_idx]; + used_idx += dxp->ndescs; + free_cnt += dxp->ndescs; + num -= dxp->ndescs; + if (used_idx >= size) { + used_idx -= size; + vq->vq_packed.used_wrap_counter ^= 1; + } + if (dxp->cookie != NULL) { + rte_pktmbuf_free(dxp->cookie); + dxp->cookie = NULL; + } + } while (curr_id != id); + } + vq->vq_used_cons_idx = used_idx; + vq->vq_free_cnt += free_cnt; +} + +static void +virtio_xmit_cleanup_normal_packed(struct virtqueue *vq, int num) +{ + uint16_t used_idx, id; + uint16_t size = vq->vq_nentries; + struct vring_packed_desc *desc = vq->vq_packed.ring.desc; + struct vq_desc_extra *dxp; + + used_idx = vq->vq_used_cons_idx; + /* desc_is_used has a load-acquire or rte_cio_rmb inside + * and wait for used desc in virtqueue. + */ + while (num-- && desc_is_used(&desc[used_idx], vq)) { + id = desc[used_idx].id; + dxp = &vq->vq_descx[id]; + vq->vq_used_cons_idx += dxp->ndescs; + if (vq->vq_used_cons_idx >= size) { + vq->vq_used_cons_idx -= size; + vq->vq_packed.used_wrap_counter ^= 1; + } + vq_ring_free_id_packed(vq, id); + if (dxp->cookie != NULL) { + rte_pktmbuf_free(dxp->cookie); + dxp->cookie = NULL; + } + used_idx = vq->vq_used_cons_idx; + } +} + +/* Cleanup from completed transmits. 
*/ +static inline void +virtio_xmit_cleanup_packed(struct virtqueue *vq, int num, int in_order) +{ + if (in_order) + virtio_xmit_cleanup_inorder_packed(vq, num); + else + virtio_xmit_cleanup_normal_packed(vq, num); +} + +static inline void +virtio_xmit_cleanup(struct virtqueue *vq, uint16_t num) +{ + uint16_t i, used_idx, desc_idx; + for (i = 0; i < num; i++) { + struct vring_used_elem *uep; + struct vq_desc_extra *dxp; + + used_idx = (uint16_t)(vq->vq_used_cons_idx & + (vq->vq_nentries - 1)); + uep = &vq->vq_split.ring.used->ring[used_idx]; + + desc_idx = (uint16_t)uep->id; + dxp = &vq->vq_descx[desc_idx]; + vq->vq_used_cons_idx++; + vq_ring_free_chain(vq, desc_idx); + + if (dxp->cookie != NULL) { + rte_pktmbuf_free(dxp->cookie); + dxp->cookie = NULL; + } + } +} + +/* Cleanup from completed inorder transmits. */ +static __rte_always_inline void +virtio_xmit_cleanup_inorder(struct virtqueue *vq, uint16_t num) +{ + uint16_t i, idx = vq->vq_used_cons_idx; + int16_t free_cnt = 0; + struct vq_desc_extra *dxp = NULL; + + if (unlikely(num == 0)) + return; + + for (i = 0; i < num; i++) { + dxp = &vq->vq_descx[idx++ & (vq->vq_nentries - 1)]; + free_cnt += dxp->ndescs; + if (dxp->cookie != NULL) { + rte_pktmbuf_free(dxp->cookie); + dxp->cookie = NULL; + } + } + + vq->vq_free_cnt += free_cnt; + vq->vq_used_cons_idx = idx; +} #endif /* _VIRTQUEUE_H_ */ From patchwork Wed Apr 29 07:28:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69520 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B4D14A00BE; Wed, 29 Apr 2020 09:29:28 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 3CF291D929; Wed, 29 Apr 2020 09:28:43 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id DF23F1D917 for ; Wed, 29 Apr 2020 09:28:36 +0200 (CEST) IronPort-SDR: wV9k3J8a1uawfU72zcL8MCyKhtpyTULSY1VkzzkmBngry2mNrdnbFduQeBHJaskoeXp3qqX0wC a0ur18iVVlQg== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:36 -0700 IronPort-SDR: fNlMxNAD7nJkRWlZoIQVPujy1sM8spgQpYCx/Z8U/1uzngEggJHKwQxvZjg4ECmmaV4/1suY+4 JLWrWiZstQ9Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415104" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:34 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:19 +0800 Message-Id: <20200429072822.102745-7-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 6/9] net/virtio: add vectorized packed ring Rx path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Optimize packed ring Rx path with 
SIMD instructions. Solution of optimization is pretty like vhost, is that split path into batch and single functions. Batch function is further optimized by AVX512 instructions. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/drivers/net/virtio/Makefile b/drivers/net/virtio/Makefile index c9edb84ee..102b1deab 100644 --- a/drivers/net/virtio/Makefile +++ b/drivers/net/virtio/Makefile @@ -36,6 +36,41 @@ else ifneq ($(filter y,$(CONFIG_RTE_ARCH_ARM) $(CONFIG_RTE_ARCH_ARM64)),) SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_simple_neon.c endif +ifneq ($(FORCE_DISABLE_AVX512), y) + CC_AVX512_SUPPORT=\ + $(shell $(CC) -march=native -dM -E - &1 | \ + sed '/./{H;$$!d} ; x ; /AVX512F/!d; /AVX512BW/!d; /AVX512VL/!d' | \ + grep -q AVX512 && echo 1) +endif + +ifeq ($(CC_AVX512_SUPPORT), 1) +CFLAGS += -DCC_AVX512_SUPPORT +SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_rxtx_packed_avx.c + +ifeq ($(RTE_TOOLCHAIN), gcc) +ifeq ($(shell test $(GCC_VERSION) -ge 83 && echo 1), 1) +CFLAGS += -DVIRTIO_GCC_UNROLL_PRAGMA +endif +endif + +ifeq ($(RTE_TOOLCHAIN), clang) +ifeq ($(shell test $(CLANG_MAJOR_VERSION)$(CLANG_MINOR_VERSION) -ge 37 && echo 1), 1) +CFLAGS += -DVIRTIO_CLANG_UNROLL_PRAGMA +endif +endif + +ifeq ($(RTE_TOOLCHAIN), icc) +ifeq ($(shell test $(ICC_MAJOR_VERSION) -ge 16 && echo 1), 1) +CFLAGS += -DVIRTIO_ICC_UNROLL_PRAGMA +endif +endif + +CFLAGS_virtio_rxtx_packed_avx.o += -mavx512f -mavx512bw -mavx512vl +ifeq ($(shell test $(GCC_VERSION) -ge 100 && echo 1), 1) +CFLAGS_virtio_rxtx_packed_avx.o += -Wno-zero-length-bounds +endif +endif + ifeq ($(CONFIG_RTE_VIRTIO_USER),y) SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_user.c SRCS-$(CONFIG_RTE_LIBRTE_VIRTIO_PMD) += virtio_user/vhost_kernel.c diff --git a/drivers/net/virtio/meson.build b/drivers/net/virtio/meson.build index 15150eea1..8e68c3039 100644 --- a/drivers/net/virtio/meson.build +++ b/drivers/net/virtio/meson.build @@ -9,6 +9,20 @@ sources += files('virtio_ethdev.c', deps += ['kvargs', 'bus_pci'] if arch_subdir == 'x86' + if '-mno-avx512f' not in machine_args + if cc.has_argument('-mavx512f') and cc.has_argument('-mavx512vl') and cc.has_argument('-mavx512bw') + cflags += ['-mavx512f', '-mavx512bw', '-mavx512vl'] + cflags += ['-DCC_AVX512_SUPPORT'] + if (toolchain == 'gcc' and cc.version().version_compare('>=8.3.0')) + cflags += '-DVHOST_GCC_UNROLL_PRAGMA' + elif (toolchain == 'clang' and cc.version().version_compare('>=3.7.0')) + cflags += '-DVHOST_CLANG_UNROLL_PRAGMA' + elif (toolchain == 'icc' and cc.version().version_compare('>=16.0.0')) + cflags += '-DVHOST_ICC_UNROLL_PRAGMA' + endif + sources += files('virtio_rxtx_packed_avx.c') + endif + endif sources += files('virtio_rxtx_simple_sse.c') elif arch_subdir == 'ppc' sources += files('virtio_rxtx_simple_altivec.c') diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h index febaf17a8..5c112cac7 100644 --- a/drivers/net/virtio/virtio_ethdev.h +++ b/drivers/net/virtio/virtio_ethdev.h @@ -105,6 +105,9 @@ uint16_t virtio_xmit_pkts_inorder(void *tx_queue, struct rte_mbuf **tx_pkts, uint16_t virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts); +uint16_t virtio_recv_pkts_packed_vec(void *rx_queue, struct rte_mbuf **rx_pkts, + uint16_t nb_pkts); + int eth_virtio_dev_init(struct rte_eth_dev *eth_dev); void virtio_interrupt_handler(void *param); diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index a549991aa..c50980c82 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ 
b/drivers/net/virtio/virtio_rxtx.c @@ -2030,3 +2030,13 @@ virtio_xmit_pkts_inorder(void *tx_queue, return nb_tx; } + +#ifndef CC_AVX512_SUPPORT +uint16_t +virtio_recv_pkts_packed_vec(void *rx_queue __rte_unused, + struct rte_mbuf **rx_pkts __rte_unused, + uint16_t nb_pkts __rte_unused) +{ + return 0; +} +#endif /* ifndef CC_AVX512_SUPPORT */ diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.c b/drivers/net/virtio/virtio_rxtx_packed_avx.c new file mode 100644 index 000000000..88831a786 --- /dev/null +++ b/drivers/net/virtio/virtio_rxtx_packed_avx.c @@ -0,0 +1,374 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright(c) 2010-2020 Intel Corporation + */ + +#include +#include +#include +#include +#include + +#include + +#include "virtio_logs.h" +#include "virtio_ethdev.h" +#include "virtio_pci.h" +#include "virtqueue.h" + +#define BYTE_SIZE 8 +/* flag bits offset in packed ring desc higher 64bits */ +#define FLAGS_BITS_OFFSET ((offsetof(struct vring_packed_desc, flags) - \ + offsetof(struct vring_packed_desc, len)) * BYTE_SIZE) + +#define PACKED_FLAGS_MASK ((0ULL | VRING_PACKED_DESC_F_AVAIL_USED) << \ + FLAGS_BITS_OFFSET) + +#define PACKED_BATCH_SIZE (RTE_CACHE_LINE_SIZE / \ + sizeof(struct vring_packed_desc)) +#define PACKED_BATCH_MASK (PACKED_BATCH_SIZE - 1) + +#ifdef VIRTIO_GCC_UNROLL_PRAGMA +#define virtio_for_each_try_unroll(iter, val, size) _Pragma("GCC unroll 4") \ + for (iter = val; iter < size; iter++) +#endif + +#ifdef VIRTIO_CLANG_UNROLL_PRAGMA +#define virtio_for_each_try_unroll(iter, val, size) _Pragma("unroll 4") \ + for (iter = val; iter < size; iter++) +#endif + +#ifdef VIRTIO_ICC_UNROLL_PRAGMA +#define virtio_for_each_try_unroll(iter, val, size) _Pragma("unroll (4)") \ + for (iter = val; iter < size; iter++) +#endif + +#ifndef virtio_for_each_try_unroll +#define virtio_for_each_try_unroll(iter, val, num) \ + for (iter = val; iter < num; iter++) +#endif + +static inline void +virtio_update_batch_stats(struct virtnet_stats *stats, + uint16_t pkt_len1, + uint16_t pkt_len2, + uint16_t pkt_len3, + uint16_t pkt_len4) +{ + stats->bytes += pkt_len1; + stats->bytes += pkt_len2; + stats->bytes += pkt_len3; + stats->bytes += pkt_len4; +} + +/* Optionally fill offload information in structure */ +static inline int +virtio_vec_rx_offload(struct rte_mbuf *m, struct virtio_net_hdr *hdr) +{ + struct rte_net_hdr_lens hdr_lens; + uint32_t hdrlen, ptype; + int l4_supported = 0; + + /* nothing to do */ + if (hdr->flags == 0) + return 0; + + /* GSO not support in vec path, skip check */ + m->ol_flags |= PKT_RX_IP_CKSUM_UNKNOWN; + + ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK); + m->packet_type = ptype; + if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP || + (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP || + (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP) + l4_supported = 1; + + if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) { + hdrlen = hdr_lens.l2_len + hdr_lens.l3_len + hdr_lens.l4_len; + if (hdr->csum_start <= hdrlen && l4_supported) { + m->ol_flags |= PKT_RX_L4_CKSUM_NONE; + } else { + /* Unknown proto or tunnel, do sw cksum. We can assume + * the cksum field is in the first segment since the + * buffers we provided to the host are large enough. + * In case of SCTP, this will be wrong since it's a CRC + * but there's nothing we can do. 
+ */ + uint16_t csum = 0, off; + + rte_raw_cksum_mbuf(m, hdr->csum_start, + rte_pktmbuf_pkt_len(m) - hdr->csum_start, + &csum); + if (likely(csum != 0xffff)) + csum = ~csum; + off = hdr->csum_offset + hdr->csum_start; + if (rte_pktmbuf_data_len(m) >= off + 1) + *rte_pktmbuf_mtod_offset(m, uint16_t *, + off) = csum; + } + } else if (hdr->flags & VIRTIO_NET_HDR_F_DATA_VALID && l4_supported) { + m->ol_flags |= PKT_RX_L4_CKSUM_GOOD; + } + + return 0; +} + +static inline uint16_t +virtqueue_dequeue_batch_packed_vec(struct virtnet_rx *rxvq, + struct rte_mbuf **rx_pkts) +{ + struct virtqueue *vq = rxvq->vq; + struct virtio_hw *hw = vq->hw; + uint16_t hdr_size = hw->vtnet_hdr_size; + uint64_t addrs[PACKED_BATCH_SIZE]; + uint16_t id = vq->vq_used_cons_idx; + uint8_t desc_stats; + uint16_t i; + void *desc_addr; + + if (id & PACKED_BATCH_MASK) + return -1; + + if (unlikely((id + PACKED_BATCH_SIZE) > vq->vq_nentries)) + return -1; + + /* only care avail/used bits */ + __m512i v_mask = _mm512_maskz_set1_epi64(0xaa, PACKED_FLAGS_MASK); + desc_addr = &vq->vq_packed.ring.desc[id]; + + __m512i v_desc = _mm512_loadu_si512(desc_addr); + __m512i v_flag = _mm512_and_epi64(v_desc, v_mask); + + __m512i v_used_flag = _mm512_setzero_si512(); + if (vq->vq_packed.used_wrap_counter) + v_used_flag = _mm512_maskz_set1_epi64(0xaa, PACKED_FLAGS_MASK); + + /* Check all descs are used */ + desc_stats = _mm512_cmpneq_epu64_mask(v_flag, v_used_flag); + if (desc_stats) + return -1; + + virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) { + rx_pkts[i] = (struct rte_mbuf *)vq->vq_descx[id + i].cookie; + rte_packet_prefetch(rte_pktmbuf_mtod(rx_pkts[i], void *)); + + addrs[i] = (uintptr_t)rx_pkts[i]->rx_descriptor_fields1; + } + + /* + * load len from desc, store into mbuf pkt_len and data_len + * len limiated by l6bit buf_len, pkt_len[16:31] can be ignored + */ + const __mmask16 mask = 0x6 | 0x6 << 4 | 0x6 << 8 | 0x6 << 12; + __m512i values = _mm512_maskz_shuffle_epi32(mask, v_desc, 0xAA); + + /* reduce hdr_len from pkt_len and data_len */ + __m512i mbuf_len_offset = _mm512_maskz_set1_epi32(mask, + (uint32_t)-hdr_size); + + __m512i v_value = _mm512_add_epi32(values, mbuf_len_offset); + + /* assert offset of data_len */ + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_len) != + offsetof(struct rte_mbuf, rx_descriptor_fields1) + 8); + + __m512i v_index = _mm512_set_epi64(addrs[3] + 8, addrs[3], + addrs[2] + 8, addrs[2], + addrs[1] + 8, addrs[1], + addrs[0] + 8, addrs[0]); + /* batch store into mbufs */ + _mm512_i64scatter_epi64(0, v_index, v_value, 1); + + if (hw->has_rx_offload) { + virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) { + char *addr = (char *)rx_pkts[i]->buf_addr + + RTE_PKTMBUF_HEADROOM - hdr_size; + virtio_vec_rx_offload(rx_pkts[i], + (struct virtio_net_hdr *)addr); + } + } + + virtio_update_batch_stats(&rxvq->stats, rx_pkts[0]->pkt_len, + rx_pkts[1]->pkt_len, rx_pkts[2]->pkt_len, + rx_pkts[3]->pkt_len); + + vq->vq_free_cnt += PACKED_BATCH_SIZE; + + vq->vq_used_cons_idx += PACKED_BATCH_SIZE; + if (vq->vq_used_cons_idx >= vq->vq_nentries) { + vq->vq_used_cons_idx -= vq->vq_nentries; + vq->vq_packed.used_wrap_counter ^= 1; + } + + return 0; +} + +static uint16_t +virtqueue_dequeue_single_packed_vec(struct virtnet_rx *rxvq, + struct rte_mbuf **rx_pkts) +{ + uint16_t used_idx, id; + uint32_t len; + struct virtqueue *vq = rxvq->vq; + struct virtio_hw *hw = vq->hw; + uint32_t hdr_size = hw->vtnet_hdr_size; + struct virtio_net_hdr *hdr; + struct vring_packed_desc *desc; + struct rte_mbuf *cookie; + + desc = 
vq->vq_packed.ring.desc; + used_idx = vq->vq_used_cons_idx; + if (!desc_is_used(&desc[used_idx], vq)) + return -1; + + len = desc[used_idx].len; + id = desc[used_idx].id; + cookie = (struct rte_mbuf *)vq->vq_descx[id].cookie; + if (unlikely(cookie == NULL)) { + PMD_DRV_LOG(ERR, "vring descriptor with no mbuf cookie at %u", + vq->vq_used_cons_idx); + return -1; + } + rte_prefetch0(cookie); + rte_packet_prefetch(rte_pktmbuf_mtod(cookie, void *)); + + cookie->data_off = RTE_PKTMBUF_HEADROOM; + cookie->ol_flags = 0; + cookie->pkt_len = (uint32_t)(len - hdr_size); + cookie->data_len = (uint32_t)(len - hdr_size); + + hdr = (struct virtio_net_hdr *)((char *)cookie->buf_addr + + RTE_PKTMBUF_HEADROOM - hdr_size); + if (hw->has_rx_offload) + virtio_vec_rx_offload(cookie, hdr); + + *rx_pkts = cookie; + + rxvq->stats.bytes += cookie->pkt_len; + + vq->vq_free_cnt++; + vq->vq_used_cons_idx++; + if (vq->vq_used_cons_idx >= vq->vq_nentries) { + vq->vq_used_cons_idx -= vq->vq_nentries; + vq->vq_packed.used_wrap_counter ^= 1; + } + + return 0; +} + +static inline void +virtio_recv_refill_packed_vec(struct virtnet_rx *rxvq, + struct rte_mbuf **cookie, + uint16_t num) +{ + struct virtqueue *vq = rxvq->vq; + struct vring_packed_desc *start_dp = vq->vq_packed.ring.desc; + uint16_t flags = vq->vq_packed.cached_flags; + struct virtio_hw *hw = vq->hw; + struct vq_desc_extra *dxp; + uint16_t idx, i; + uint16_t batch_num, total_num = 0; + uint16_t head_idx = vq->vq_avail_idx; + uint16_t head_flag = vq->vq_packed.cached_flags; + uint64_t addr; + + do { + idx = vq->vq_avail_idx; + + batch_num = PACKED_BATCH_SIZE; + if (unlikely((idx + PACKED_BATCH_SIZE) > vq->vq_nentries)) + batch_num = vq->vq_nentries - idx; + if (unlikely((total_num + batch_num) > num)) + batch_num = num - total_num; + + virtio_for_each_try_unroll(i, 0, batch_num) { + dxp = &vq->vq_descx[idx + i]; + dxp->cookie = (void *)cookie[total_num + i]; + + addr = VIRTIO_MBUF_ADDR(cookie[total_num + i], vq) + + RTE_PKTMBUF_HEADROOM - hw->vtnet_hdr_size; + start_dp[idx + i].addr = addr; + start_dp[idx + i].len = cookie[total_num + i]->buf_len + - RTE_PKTMBUF_HEADROOM + hw->vtnet_hdr_size; + if (total_num || i) { + virtqueue_store_flags_packed(&start_dp[idx + i], + flags, hw->weak_barriers); + } + } + + vq->vq_avail_idx += batch_num; + if (vq->vq_avail_idx >= vq->vq_nentries) { + vq->vq_avail_idx -= vq->vq_nentries; + vq->vq_packed.cached_flags ^= + VRING_PACKED_DESC_F_AVAIL_USED; + flags = vq->vq_packed.cached_flags; + } + total_num += batch_num; + } while (total_num < num); + + virtqueue_store_flags_packed(&start_dp[head_idx], head_flag, + hw->weak_barriers); + vq->vq_free_cnt = (uint16_t)(vq->vq_free_cnt - num); +} + +uint16_t +virtio_recv_pkts_packed_vec(void *rx_queue, + struct rte_mbuf **rx_pkts, + uint16_t nb_pkts) +{ + struct virtnet_rx *rxvq = rx_queue; + struct virtqueue *vq = rxvq->vq; + struct virtio_hw *hw = vq->hw; + uint16_t num, nb_rx = 0; + uint32_t nb_enqueued = 0; + uint16_t free_cnt = vq->vq_free_thresh; + + if (unlikely(hw->started == 0)) + return nb_rx; + + num = RTE_MIN(VIRTIO_MBUF_BURST_SZ, nb_pkts); + if (likely(num > PACKED_BATCH_SIZE)) + num = num - ((vq->vq_used_cons_idx + num) % PACKED_BATCH_SIZE); + + while (num) { + if (!virtqueue_dequeue_batch_packed_vec(rxvq, + &rx_pkts[nb_rx])) { + nb_rx += PACKED_BATCH_SIZE; + num -= PACKED_BATCH_SIZE; + continue; + } + if (!virtqueue_dequeue_single_packed_vec(rxvq, + &rx_pkts[nb_rx])) { + nb_rx++; + num--; + continue; + } + break; + }; + + PMD_RX_LOG(DEBUG, "dequeue:%d", num); + + 
rxvq->stats.packets += nb_rx; + + if (likely(vq->vq_free_cnt >= free_cnt)) { + struct rte_mbuf *new_pkts[free_cnt]; + if (likely(rte_pktmbuf_alloc_bulk(rxvq->mpool, new_pkts, + free_cnt) == 0)) { + virtio_recv_refill_packed_vec(rxvq, new_pkts, + free_cnt); + nb_enqueued += free_cnt; + } else { + struct rte_eth_dev *dev = + &rte_eth_devices[rxvq->port_id]; + dev->data->rx_mbuf_alloc_failed += free_cnt; + } + } + + if (likely(nb_enqueued)) { + if (unlikely(virtqueue_kick_prepare_packed(vq))) { + virtqueue_notify(vq); + PMD_RX_LOG(DEBUG, "Notified"); + } + } + + return nb_rx; +} diff --git a/drivers/net/virtio/virtio_user_ethdev.c b/drivers/net/virtio/virtio_user_ethdev.c index 40ad786cc..c54698ad1 100644 --- a/drivers/net/virtio/virtio_user_ethdev.c +++ b/drivers/net/virtio/virtio_user_ethdev.c @@ -528,6 +528,7 @@ virtio_user_eth_dev_alloc(struct rte_vdev_device *vdev) hw->use_msix = 1; hw->modern = 0; hw->use_vec_rx = 0; + hw->use_vec_tx = 0; hw->use_inorder_rx = 0; hw->use_inorder_tx = 0; hw->virtio_user_dev = dev; @@ -739,8 +740,19 @@ virtio_user_pmd_probe(struct rte_vdev_device *dev) goto end; } - if (vectorized) - hw->use_vec_rx = 1; + if (vectorized) { + if (packed_vq) { +#if defined(CC_AVX512_SUPPORT) + hw->use_vec_rx = 1; + hw->use_vec_tx = 1; +#else + PMD_INIT_LOG(INFO, + "building environment do not support packed ring vectorized"); +#endif + } else { + hw->use_vec_rx = 1; + } + } rte_eth_dev_probing_finish(eth_dev); ret = 0; From patchwork Wed Apr 29 07:28:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69521 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 73252A00BE; Wed, 29 Apr 2020 09:29:39 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id DE6761D92C; Wed, 29 Apr 2020 09:28:44 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 884891D919 for ; Wed, 29 Apr 2020 09:28:38 +0200 (CEST) IronPort-SDR: OkW3nug2Bpt2HC7M0mk8Q8GzCXC//o8R37/B/XEGEBs2/p/Olbup8PShbGH5uAap9M7E03HMqi ysLdd///56iA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:38 -0700 IronPort-SDR: +Q2QCXwcvqgjd3BtsCYS3Ak2E2kEBlpYVmWFn7XIZ3kVYngdhJs9BbHoMHWTG6kzRtf9Q8dtPu dTi4lriYVhaA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415108" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:36 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:20 +0800 Message-Id: <20200429072822.102745-8-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 7/9] net/virtio: add vectorized packed ring Tx path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: 
List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Optimize packed ring Tx path like Rx path. Split Tx path into batch and single Tx functions. Batch function is further optimized by AVX512 instructions. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/drivers/net/virtio/virtio_ethdev.h b/drivers/net/virtio/virtio_ethdev.h index 5c112cac7..b7d52d497 100644 --- a/drivers/net/virtio/virtio_ethdev.h +++ b/drivers/net/virtio/virtio_ethdev.h @@ -108,6 +108,9 @@ uint16_t virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t virtio_recv_pkts_packed_vec(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts); +uint16_t virtio_xmit_pkts_packed_vec(void *tx_queue, struct rte_mbuf **tx_pkts, + uint16_t nb_pkts); + int eth_virtio_dev_init(struct rte_eth_dev *eth_dev); void virtio_interrupt_handler(void *param); diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c index c50980c82..050541a10 100644 --- a/drivers/net/virtio/virtio_rxtx.c +++ b/drivers/net/virtio/virtio_rxtx.c @@ -2039,4 +2039,12 @@ virtio_recv_pkts_packed_vec(void *rx_queue __rte_unused, { return 0; } + +uint16_t +virtio_xmit_pkts_packed_vec(void *tx_queue __rte_unused, + struct rte_mbuf **tx_pkts __rte_unused, + uint16_t nb_pkts __rte_unused) +{ + return 0; +} #endif /* ifndef CC_AVX512_SUPPORT */ diff --git a/drivers/net/virtio/virtio_rxtx_packed_avx.c b/drivers/net/virtio/virtio_rxtx_packed_avx.c index 88831a786..d130d68bf 100644 --- a/drivers/net/virtio/virtio_rxtx_packed_avx.c +++ b/drivers/net/virtio/virtio_rxtx_packed_avx.c @@ -23,6 +23,24 @@ #define PACKED_FLAGS_MASK ((0ULL | VRING_PACKED_DESC_F_AVAIL_USED) << \ FLAGS_BITS_OFFSET) +/* reference count offset in mbuf rearm data */ +#define REFCNT_BITS_OFFSET ((offsetof(struct rte_mbuf, refcnt) - \ + offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE) +/* segment number offset in mbuf rearm data */ +#define SEG_NUM_BITS_OFFSET ((offsetof(struct rte_mbuf, nb_segs) - \ + offsetof(struct rte_mbuf, rearm_data)) * BYTE_SIZE) + +/* default rearm data */ +#define DEFAULT_REARM_DATA (1ULL << SEG_NUM_BITS_OFFSET | \ + 1ULL << REFCNT_BITS_OFFSET) + +/* id bits offset in packed ring desc higher 64bits */ +#define ID_BITS_OFFSET ((offsetof(struct vring_packed_desc, id) - \ + offsetof(struct vring_packed_desc, len)) * BYTE_SIZE) + +/* net hdr short size mask */ +#define NET_HDR_MASK 0x3F + #define PACKED_BATCH_SIZE (RTE_CACHE_LINE_SIZE / \ sizeof(struct vring_packed_desc)) #define PACKED_BATCH_MASK (PACKED_BATCH_SIZE - 1) @@ -60,6 +78,221 @@ virtio_update_batch_stats(struct virtnet_stats *stats, stats->bytes += pkt_len4; } +static inline int +virtqueue_enqueue_batch_packed_vec(struct virtnet_tx *txvq, + struct rte_mbuf **tx_pkts) +{ + struct virtqueue *vq = txvq->vq; + uint16_t head_size = vq->hw->vtnet_hdr_size; + uint16_t idx = vq->vq_avail_idx; + struct virtio_net_hdr *hdr; + struct vq_desc_extra *dxp; + uint16_t i, cmp; + + if (vq->vq_avail_idx & PACKED_BATCH_MASK) + return -1; + + if (unlikely((idx + PACKED_BATCH_SIZE) > vq->vq_nentries)) + return -1; + + /* Load four mbufs rearm data */ + RTE_BUILD_BUG_ON(REFCNT_BITS_OFFSET >= 64); + RTE_BUILD_BUG_ON(SEG_NUM_BITS_OFFSET >= 64); + __m256i mbufs = _mm256_set_epi64x(*tx_pkts[3]->rearm_data, + *tx_pkts[2]->rearm_data, + *tx_pkts[1]->rearm_data, + *tx_pkts[0]->rearm_data); + + /* refcnt=1 and nb_segs=1 */ + __m256i mbuf_ref = _mm256_set1_epi64x(DEFAULT_REARM_DATA); + __m256i head_rooms = _mm256_set1_epi16(head_size); + + /* Check refcnt and nb_segs */ + const 
__mmask16 mask = 0x6 | 0x6 << 4 | 0x6 << 8 | 0x6 << 12; + cmp = _mm256_mask_cmpneq_epu16_mask(mask, mbufs, mbuf_ref); + if (unlikely(cmp)) + return -1; + + /* Check headroom is enough */ + const __mmask16 data_mask = 0x1 | 0x1 << 4 | 0x1 << 8 | 0x1 << 12; + RTE_BUILD_BUG_ON(offsetof(struct rte_mbuf, data_off) != + offsetof(struct rte_mbuf, rearm_data)); + cmp = _mm256_mask_cmplt_epu16_mask(data_mask, mbufs, head_rooms); + if (unlikely(cmp)) + return -1; + + virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) { + dxp = &vq->vq_descx[idx + i]; + dxp->ndescs = 1; + dxp->cookie = tx_pkts[i]; + } + + virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) { + tx_pkts[i]->data_off -= head_size; + tx_pkts[i]->data_len += head_size; + } + + __m512i descs_base = _mm512_set_epi64(tx_pkts[3]->data_len, + VIRTIO_MBUF_ADDR(tx_pkts[3], vq), + tx_pkts[2]->data_len, + VIRTIO_MBUF_ADDR(tx_pkts[2], vq), + tx_pkts[1]->data_len, + VIRTIO_MBUF_ADDR(tx_pkts[1], vq), + tx_pkts[0]->data_len, + VIRTIO_MBUF_ADDR(tx_pkts[0], vq)); + + /* id offset and data offset */ + __m512i data_offsets = _mm512_set_epi64((uint64_t)3 << ID_BITS_OFFSET, + tx_pkts[3]->data_off, + (uint64_t)2 << ID_BITS_OFFSET, + tx_pkts[2]->data_off, + (uint64_t)1 << ID_BITS_OFFSET, + tx_pkts[1]->data_off, + 0, tx_pkts[0]->data_off); + + __m512i new_descs = _mm512_add_epi64(descs_base, data_offsets); + + uint64_t flags_temp = (uint64_t)idx << ID_BITS_OFFSET | + (uint64_t)vq->vq_packed.cached_flags << FLAGS_BITS_OFFSET; + + /* flags offset and guest virtual address offset */ + __m128i flag_offset = _mm_set_epi64x(flags_temp, 0); + __m512i v_offset = _mm512_broadcast_i32x4(flag_offset); + __m512i v_desc = _mm512_add_epi64(new_descs, v_offset); + + if (!vq->hw->has_tx_offload) { + __m128i all_mask = _mm_set1_epi16(0xFFFF); + virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) { + hdr = rte_pktmbuf_mtod_offset(tx_pkts[i], + struct virtio_net_hdr *, -head_size); + __m128i v_hdr = _mm_loadu_si128((void *)hdr); + if (unlikely(_mm_mask_test_epi16_mask(NET_HDR_MASK, + v_hdr, all_mask))) { + __m128i all_zero = _mm_setzero_si128(); + _mm_mask_storeu_epi16((void *)hdr, + NET_HDR_MASK, all_zero); + } + } + } else { + virtio_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) { + hdr = rte_pktmbuf_mtod_offset(tx_pkts[i], + struct virtio_net_hdr *, -head_size); + virtqueue_xmit_offload(hdr, tx_pkts[i], true); + } + } + + /* Enqueue Packet buffers */ + _mm512_storeu_si512((void *)&vq->vq_packed.ring.desc[idx], v_desc); + + virtio_update_batch_stats(&txvq->stats, tx_pkts[0]->pkt_len, + tx_pkts[1]->pkt_len, tx_pkts[2]->pkt_len, + tx_pkts[3]->pkt_len); + + vq->vq_avail_idx += PACKED_BATCH_SIZE; + vq->vq_free_cnt -= PACKED_BATCH_SIZE; + + if (vq->vq_avail_idx >= vq->vq_nentries) { + vq->vq_avail_idx -= vq->vq_nentries; + vq->vq_packed.cached_flags ^= + VRING_PACKED_DESC_F_AVAIL_USED; + } + + return 0; +} + +static inline int +virtqueue_enqueue_single_packed_vec(struct virtnet_tx *txvq, + struct rte_mbuf *txm) +{ + struct virtqueue *vq = txvq->vq; + struct virtio_hw *hw = vq->hw; + uint16_t hdr_size = hw->vtnet_hdr_size; + uint16_t slots, can_push; + int16_t need; + + /* How many main ring entries are needed to this Tx? 
+ * any_layout => number of segments + * default => number of segments + 1 + */ + can_push = rte_mbuf_refcnt_read(txm) == 1 && + RTE_MBUF_DIRECT(txm) && + txm->nb_segs == 1 && + rte_pktmbuf_headroom(txm) >= hdr_size; + + slots = txm->nb_segs + !can_push; + need = slots - vq->vq_free_cnt; + + /* Positive value indicates it need free vring descriptors */ + if (unlikely(need > 0)) { + virtio_xmit_cleanup_inorder_packed(vq, need); + need = slots - vq->vq_free_cnt; + if (unlikely(need > 0)) { + PMD_TX_LOG(ERR, + "No free tx descriptors to transmit"); + return -1; + } + } + + /* Enqueue Packet buffers */ + virtqueue_enqueue_xmit_packed(txvq, txm, slots, can_push, 1); + + txvq->stats.bytes += txm->pkt_len; + return 0; +} + +uint16_t +virtio_xmit_pkts_packed_vec(void *tx_queue, struct rte_mbuf **tx_pkts, + uint16_t nb_pkts) +{ + struct virtnet_tx *txvq = tx_queue; + struct virtqueue *vq = txvq->vq; + struct virtio_hw *hw = vq->hw; + uint16_t nb_tx = 0; + uint16_t remained; + + if (unlikely(hw->started == 0 && tx_pkts != hw->inject_pkts)) + return nb_tx; + + if (unlikely(nb_pkts < 1)) + return nb_pkts; + + PMD_TX_LOG(DEBUG, "%d packets to xmit", nb_pkts); + + if (vq->vq_free_cnt <= vq->vq_nentries - vq->vq_free_thresh) + virtio_xmit_cleanup_inorder_packed(vq, vq->vq_free_thresh); + + remained = RTE_MIN(nb_pkts, vq->vq_free_cnt); + + while (remained) { + if (remained >= PACKED_BATCH_SIZE) { + if (!virtqueue_enqueue_batch_packed_vec(txvq, + &tx_pkts[nb_tx])) { + nb_tx += PACKED_BATCH_SIZE; + remained -= PACKED_BATCH_SIZE; + continue; + } + } + if (!virtqueue_enqueue_single_packed_vec(txvq, + tx_pkts[nb_tx])) { + nb_tx++; + remained--; + continue; + } + break; + }; + + txvq->stats.packets += nb_tx; + + if (likely(nb_tx)) { + if (unlikely(virtqueue_kick_prepare_packed(vq))) { + virtqueue_notify(vq); + PMD_TX_LOG(DEBUG, "Notified backend after xmit"); + } + } + + return nb_tx; +} + /* Optionally fill offload information in structure */ static inline int virtio_vec_rx_offload(struct rte_mbuf *m, struct virtio_net_hdr *hdr) From patchwork Wed Apr 29 07:28:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69522 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 8669BA00BE; Wed, 29 Apr 2020 09:29:51 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 9623B1D933; Wed, 29 Apr 2020 09:28:50 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id 326251D90A for ; Wed, 29 Apr 2020 09:28:40 +0200 (CEST) IronPort-SDR: HwH7X+yzI6tLgTC4Sgt1dKIYDEZJNbRHsKHLb/vO6dd6sqnF89T689/IY2vYaROGSKYevIhzkJ vg8/AD8Epo8Q== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:39 -0700 IronPort-SDR: gfior5Rse5YstR/opyfspYPzUsiuWUQN4ywT/ffPtwtolWbD8RBPNVMloZxzfj0Eb5krLPQlvw KP9sjWkr8YhQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415113" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:38 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, 
zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:21 +0800 Message-Id: <20200429072822.102745-9-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 8/9] net/virtio: add election for vectorized path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Rewrite the vectorized path selection logic. The default setting comes from the vectorized devarg, then each criterion is checked. The packed ring vectorized path needs: AVX512F and required extensions supported by compiler and host; VERSION_1 and IN_ORDER features negotiated; mergeable feature not negotiated; LRO offloading disabled. The split ring vectorized Rx path needs: mergeable and IN_ORDER features not negotiated; LRO, checksum and VLAN strip offloads disabled. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/drivers/net/virtio/virtio_ethdev.c b/drivers/net/virtio/virtio_ethdev.c index 0a69a4db1..e86d4e08f 100644 --- a/drivers/net/virtio/virtio_ethdev.c +++ b/drivers/net/virtio/virtio_ethdev.c @@ -1523,9 +1523,12 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev) if (vtpci_packed_queue(hw)) { PMD_INIT_LOG(INFO, "virtio: using packed ring %s Tx path on port %u", - hw->use_inorder_tx ? "inorder" : "standard", + hw->use_vec_tx ? "vectorized" : "standard", eth_dev->data->port_id); - eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed; + if (hw->use_vec_tx) + eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed_vec; + else + eth_dev->tx_pkt_burst = virtio_xmit_pkts_packed; } else { if (hw->use_inorder_tx) { PMD_INIT_LOG(INFO, "virtio: using inorder Tx path on port %u", @@ -1539,7 +1542,13 @@ set_rxtx_funcs(struct rte_eth_dev *eth_dev) } if (vtpci_packed_queue(hw)) { - if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { + if (hw->use_vec_rx) { + PMD_INIT_LOG(INFO, + "virtio: using packed ring vectorized Rx path on port %u", + eth_dev->data->port_id); + eth_dev->rx_pkt_burst = + &virtio_recv_pkts_packed_vec; + } else if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { PMD_INIT_LOG(INFO, "virtio: using packed ring mergeable buffer Rx path on port %u", eth_dev->data->port_id); @@ -1952,8 +1961,17 @@ eth_virtio_dev_init(struct rte_eth_dev *eth_dev) goto err_virtio_init; if (vectorized) { - if (!vtpci_packed_queue(hw)) + if (!vtpci_packed_queue(hw)) { + hw->use_vec_rx = 1; + } else { +#if !defined(CC_AVX512_SUPPORT) + PMD_DRV_LOG(INFO, + "building environment do not support packed ring vectorized"); +#else hw->use_vec_rx = 1; + hw->use_vec_tx = 1; +#endif + } } hw->opened = true; @@ -2288,31 +2306,66 @@ virtio_dev_configure(struct rte_eth_dev *dev) return -EBUSY; } - if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER)) { - hw->use_inorder_tx = 1; - hw->use_inorder_rx = 1; - hw->use_vec_rx = 0; - } - if (vtpci_packed_queue(hw)) { +#if defined(RTE_ARCH_X86_64) && defined(CC_AVX512_SUPPORT) + if ((hw->use_vec_rx || hw->use_vec_tx) && + (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX512F) || + !vtpci_with_feature(hw, VIRTIO_F_IN_ORDER) || + !vtpci_with_feature(hw, VIRTIO_F_VERSION_1))) { + PMD_DRV_LOG(INFO, + "disabled packed ring vectorized path for requirements not met"); + hw->use_vec_rx = 0; + hw->use_vec_tx = 0; + } +#else hw->use_vec_rx = 0; - hw->use_inorder_rx = 0; 
- } + hw->use_vec_tx = 0; +#endif + + if (hw->use_vec_rx) { + if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { + PMD_DRV_LOG(INFO, + "disabled packed ring vectorized rx for mrg_rxbuf enabled"); + hw->use_vec_rx = 0; + } + if (rx_offloads & DEV_RX_OFFLOAD_TCP_LRO) { + PMD_DRV_LOG(INFO, + "disabled packed ring vectorized rx for TCP_LRO enabled"); + hw->use_vec_rx = 0; + } + } + } else { + if (vtpci_with_feature(hw, VIRTIO_F_IN_ORDER)) { + hw->use_inorder_tx = 1; + hw->use_inorder_rx = 1; + hw->use_vec_rx = 0; + } + + if (hw->use_vec_rx) { #if defined RTE_ARCH_ARM64 || defined RTE_ARCH_ARM - if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON)) { - hw->use_vec_rx = 0; - } + if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_NEON)) { + PMD_DRV_LOG(INFO, + "disabled split ring vectorized path for requirement not met"); + hw->use_vec_rx = 0; + } #endif - if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { - hw->use_vec_rx = 0; - } + if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF)) { + PMD_DRV_LOG(INFO, + "disabled split ring vectorized rx for mrg_rxbuf enabled"); + hw->use_vec_rx = 0; + } - if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM | - DEV_RX_OFFLOAD_TCP_CKSUM | - DEV_RX_OFFLOAD_TCP_LRO | - DEV_RX_OFFLOAD_VLAN_STRIP)) - hw->use_vec_rx = 0; + if (rx_offloads & (DEV_RX_OFFLOAD_UDP_CKSUM | + DEV_RX_OFFLOAD_TCP_CKSUM | + DEV_RX_OFFLOAD_TCP_LRO | + DEV_RX_OFFLOAD_VLAN_STRIP)) { + PMD_DRV_LOG(INFO, + "disabled split ring vectorized rx for offloading enabled"); + hw->use_vec_rx = 0; + } + } + } return 0; } From patchwork Wed Apr 29 07:28:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Marvin Liu X-Patchwork-Id: 69523 X-Patchwork-Delegate: maxime.coquelin@redhat.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C2A82A00BE; Wed, 29 Apr 2020 09:30:01 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id D9C101D93F; Wed, 29 Apr 2020 09:28:52 +0200 (CEST) Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by dpdk.org (Postfix) with ESMTP id D61CE1D924 for ; Wed, 29 Apr 2020 09:28:41 +0200 (CEST) IronPort-SDR: Mg6qXkIKlrCvuVtpX+Fd97Hp5+jVEcJbDeGdNsI9rQ4ye9w3VxbQIZLuhFDmAa3gkZvSWjrchl E9S/RA/YpyyA== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Apr 2020 00:28:41 -0700 IronPort-SDR: SY8Rfxl06UH6pL56/qp7q2EwyJ6yvlihyhlCY8Se2+l/syv4q4deyh2I02MvgLl7LyIXN7aN91 /8oUxMbgvjLw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.73,330,1583222400"; d="scan'208";a="282415119" Received: from npg-dpdk-virtual-marvin-dev.sh.intel.com ([10.67.119.56]) by fmsmga004.fm.intel.com with ESMTP; 29 Apr 2020 00:28:39 -0700 From: Marvin Liu To: maxime.coquelin@redhat.com, xiaolong.ye@intel.com, zhihong.wang@intel.com Cc: dev@dpdk.org, Marvin Liu Date: Wed, 29 Apr 2020 15:28:22 +0800 Message-Id: <20200429072822.102745-10-yong.liu@intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20200429072822.102745-1-yong.liu@intel.com> References: <20200313174230.74661-1-yong.liu@intel.com> <20200429072822.102745-1-yong.liu@intel.com> Subject: [dpdk-dev] [PATCH v12 9/9] doc: add packed vectorized path X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions 
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Document packed virtqueue vectorized path selection logic in virtio net PMD. Signed-off-by: Marvin Liu Reviewed-by: Maxime Coquelin diff --git a/doc/guides/nics/virtio.rst b/doc/guides/nics/virtio.rst index fdd0790e0..226f4308d 100644 --- a/doc/guides/nics/virtio.rst +++ b/doc/guides/nics/virtio.rst @@ -482,6 +482,13 @@ according to below configuration: both negotiated, this path will be selected. #. Packed virtqueue in-order non-mergeable path: If in-order feature is negotiated and Rx mergeable is not negotiated, this path will be selected. +#. Packed virtqueue vectorized Rx path: If building and running environment support + AVX512 && in-order feature is negotiated && Rx mergeable is not negotiated && + TCP_LRO Rx offloading is disabled && vectorized option enabled, + this path will be selected. +#. Packed virtqueue vectorized Tx path: If building and running environment support + AVX512 && in-order feature is negotiated && vectorized option enabled, + this path will be selected. Rx/Tx callbacks of each Virtio path ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -504,6 +511,8 @@ are shown in below table: Packed virtqueue non-meregable path virtio_recv_pkts_packed virtio_xmit_pkts_packed Packed virtqueue in-order mergeable path virtio_recv_mergeable_pkts_packed virtio_xmit_pkts_packed Packed virtqueue in-order non-mergeable path virtio_recv_pkts_packed virtio_xmit_pkts_packed + Packed virtqueue vectorized Rx path virtio_recv_pkts_packed_vec virtio_xmit_pkts_packed + Packed virtqueue vectorized Tx path virtio_recv_pkts_packed virtio_xmit_pkts_packed_vec ============================================ ================================= ======================== Virtio paths Support Status from Release to Release @@ -521,20 +530,22 @@ All virtio paths support status are shown in below table: .. 
table:: Virtio Paths and Releases - ============================================ ============= ============= ============= - Virtio paths 16.11 ~ 18.05 18.08 ~ 18.11 19.02 ~ 19.11 - ============================================ ============= ============= ============= - Split virtqueue mergeable path Y Y Y - Split virtqueue non-mergeable path Y Y Y - Split virtqueue vectorized Rx path Y Y Y - Split virtqueue simple Tx path Y N N - Split virtqueue in-order mergeable path Y Y - Split virtqueue in-order non-mergeable path Y Y - Packed virtqueue mergeable path Y - Packed virtqueue non-mergeable path Y - Packed virtqueue in-order mergeable path Y - Packed virtqueue in-order non-mergeable path Y - ============================================ ============= ============= ============= + ============================================ ============= ============= ============= ======= + Virtio paths 16.11 ~ 18.05 18.08 ~ 18.11 19.02 ~ 19.11 20.05 ~ + ============================================ ============= ============= ============= ======= + Split virtqueue mergeable path Y Y Y Y + Split virtqueue non-mergeable path Y Y Y Y + Split virtqueue vectorized Rx path Y Y Y Y + Split virtqueue simple Tx path Y N N N + Split virtqueue in-order mergeable path Y Y Y + Split virtqueue in-order non-mergeable path Y Y Y + Packed virtqueue mergeable path Y Y + Packed virtqueue non-mergeable path Y Y + Packed virtqueue in-order mergeable path Y Y + Packed virtqueue in-order non-mergeable path Y Y + Packed virtqueue vectorized Rx path Y + Packed virtqueue vectorized Tx path Y + ============================================ ============= ============= ============= ======= QEMU Support Status ~~~~~~~~~~~~~~~~~~~
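The path election rules documented in the two patches above can be summarized with the following standalone sketch. It is a simplified illustration only, not the PMD's actual implementation; the struct and helper names (path_inputs, use_packed_vec_tx, use_packed_vec_rx) are hypothetical stand-ins for the driver's own feature, CPU-flag and offload checks.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical summary of the election inputs listed in the guide. */
struct path_inputs {
	bool vectorized_devarg;  /* "vectorized" devarg was passed */
	bool avx512f_ok;         /* AVX512F supported by build environment and host CPU */
	bool feat_version_1;     /* VERSION_1 negotiated */
	bool feat_in_order;      /* IN_ORDER negotiated */
	bool feat_mrg_rxbuf;     /* Rx mergeable negotiated */
	bool lro_enabled;        /* TCP_LRO Rx offload requested */
};

/* Packed virtqueue vectorized Tx path: devarg + AVX512 + VERSION_1 + IN_ORDER. */
static bool use_packed_vec_tx(const struct path_inputs *in)
{
	return in->vectorized_devarg && in->avx512f_ok &&
	       in->feat_version_1 && in->feat_in_order;
}

/* Packed virtqueue vectorized Rx path: same as Tx, plus no mergeable buffers and no LRO. */
static bool use_packed_vec_rx(const struct path_inputs *in)
{
	return use_packed_vec_tx(in) && !in->feat_mrg_rxbuf && !in->lro_enabled;
}

int main(void)
{
	struct path_inputs in = {
		.vectorized_devarg = true,
		.avx512f_ok = true,
		.feat_version_1 = true,
		.feat_in_order = true,
		.feat_mrg_rxbuf = false,
		.lro_enabled = false,
	};

	printf("packed vectorized Rx: %d, Tx: %d\n",
	       use_packed_vec_rx(&in), use_packed_vec_tx(&in));
	return 0;
}

When any of these conditions is false, the driver falls back to the regular packed ring paths, which matches the "disabled packed ring vectorized path for requirements not met" log added in patch 8/9.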