From patchwork Tue Oct 26 13:14:25 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kevin Laatz X-Patchwork-Id: 102916 X-Patchwork-Delegate: thomas@monjalon.net Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D8011A0547; Tue, 26 Oct 2021 15:15:36 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 83F6E410F0; Tue, 26 Oct 2021 15:15:33 +0200 (CEST) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by mails.dpdk.org (Postfix) with ESMTP id 5713B407FF for ; Tue, 26 Oct 2021 15:15:30 +0200 (CEST) X-IronPort-AV: E=McAfee;i="6200,9189,10148"; a="293360808" X-IronPort-AV: E=Sophos;i="5.87,184,1631602800"; d="scan'208";a="293360808" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 26 Oct 2021 06:14:42 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.87,184,1631602800"; d="scan'208";a="497331608" Received: from silpixa00401122.ir.intel.com ([10.55.128.10]) by orsmga008.jf.intel.com with ESMTP; 26 Oct 2021 06:14:40 -0700 From: Kevin Laatz To: dev@dpdk.org Cc: thomas@monjalon.net, bruce.richardson@intel.com, fengchengwen@huawei.com, conor.walsh@intel.com, Konstantin Ananyev , Kevin Laatz Date: Tue, 26 Oct 2021 13:14:25 +0000 Message-Id: <20211026131432.2734145-2-kevin.laatz@intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20211026131432.2734145-1-kevin.laatz@intel.com> References: <20210910172737.2561156-1-kevin.laatz@intel.com> <20211026131432.2734145-1-kevin.laatz@intel.com> MIME-Version: 1.0 Subject: [dpdk-dev] [PATCH v5 1/8] examples/ioat: always use same lcore for both DMA requests enqueue and dequeue X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" From: Konstantin Ananyev Few changes in ioat sample behaviour: - Always do SW copy for packet metadata (mbuf fields) - Always use same lcore for both DMA requests enqueue and dequeue Main reasons for that: a) it is safer, as idxd PMD doesn't support MT safe enqueue/dequeue (yet). b) sort of more apples to apples comparison with sw copy. c) from my testing things are faster that way. Documentation updates to reflect these changes are also included. Signed-off-by: Konstantin Ananyev Signed-off-by: Kevin Laatz Reviewed-by: Conor Walsh --- doc/guides/sample_app_ug/ioat.rst | 39 +++---- examples/ioat/ioatfwd.c | 185 ++++++++++++++++-------------- 2 files changed, 117 insertions(+), 107 deletions(-) diff --git a/doc/guides/sample_app_ug/ioat.rst b/doc/guides/sample_app_ug/ioat.rst index ee0a627b06..2e9d3d6258 100644 --- a/doc/guides/sample_app_ug/ioat.rst +++ b/doc/guides/sample_app_ug/ioat.rst @@ -183,10 +183,8 @@ After that each port application assigns resources needed. :end-before: >8 End of assigning each port resources. :dedent: 1 -Depending on mode set (whether copy should be done by software or by hardware) -special structures are assigned to each port. If software copy was chosen, -application have to assign ring structures for packet exchanging between lcores -assigned to ports. +Ring structures are assigned for exchanging packets between lcores for both SW +and HW copy modes. .. literalinclude:: ../../../examples/ioat/ioatfwd.c :language: c @@ -275,12 +273,8 @@ copying device's buffer using ``ioat_enqueue_packets()`` which calls buffer the copy operations are started by calling ``rte_ioat_perform_ops()``. Function ``rte_ioat_enqueue_copy()`` operates on physical address of the packet. Structure ``rte_mbuf`` contains only physical address to -start of the data buffer (``buf_iova``). Thus the address is adjusted -by ``addr_offset`` value in order to get the address of ``rearm_data`` -member of ``rte_mbuf``. That way both the packet data and metadata can -be copied in a single operation. This method can be used because the mbufs -are direct mbufs allocated by the apps. If another app uses external buffers, -or indirect mbufs, then multiple copy operations must be used. +start of the data buffer (``buf_iova``). Thus the ``rte_pktmbuf_iova()`` API is +used to get the address of the start of the data within the mbuf. .. literalinclude:: ../../../examples/ioat/ioatfwd.c :language: c @@ -289,12 +283,13 @@ or indirect mbufs, then multiple copy operations must be used. :dedent: 0 -All completed copies are processed by ``ioat_tx_port()`` function. When using -hardware copy mode the function invokes ``rte_ioat_completed_ops()`` -on each assigned IOAT channel to gather copied packets. If software copy -mode is used the function dequeues copied packets from the rte_ring. Then each -packet MAC address is changed if it was enabled. After that copies are sent -in burst mode using `` rte_eth_tx_burst()``. +Once the copies have been completed (this includes gathering the completions in +HW copy mode), the copied packets are enqueued to the ``rx_to_tx_ring``, which +is used to pass the packets to the TX function. + +All completed copies are processed by ``ioat_tx_port()`` function. This function +dequeues copied packets from the ``rx_to_tx_ring``. Then each packet MAC address is changed +if it was enabled. After that copies are sent in burst mode using ``rte_eth_tx_burst()``. .. literalinclude:: ../../../examples/ioat/ioatfwd.c @@ -306,11 +301,9 @@ in burst mode using `` rte_eth_tx_burst()``. The Packet Copying Functions ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ -In order to perform packet copy there is a user-defined function -``pktmbuf_sw_copy()`` used. It copies a whole packet by copying -metadata from source packet to new mbuf, and then copying a data -chunk of source packet. Both memory copies are done using -``rte_memcpy()``: +In order to perform SW packet copy, there are user-defined functions to first copy +the packet metadata (``pktmbuf_metadata_copy()``) and then the packet data +(``pktmbuf_sw_copy()``): .. literalinclude:: ../../../examples/ioat/ioatfwd.c :language: c @@ -318,8 +311,8 @@ chunk of source packet. Both memory copies are done using :end-before: >8 End of perform packet copy there is a user-defined function. :dedent: 0 -The metadata in this example is copied from ``rearm_data`` member of -``rte_mbuf`` struct up to ``cacheline1``. +The metadata in this example is copied from ``rx_descriptor_fields1`` marker of +``rte_mbuf`` struct up to ``buf_len`` member. In order to understand why software packet copying is done as shown above please refer to the "Mbuf Library" section of the diff --git a/examples/ioat/ioatfwd.c b/examples/ioat/ioatfwd.c index b1aa2767a0..fd690ce326 100644 --- a/examples/ioat/ioatfwd.c +++ b/examples/ioat/ioatfwd.c @@ -331,43 +331,36 @@ update_mac_addrs(struct rte_mbuf *m, uint32_t dest_portid) /* Perform packet copy there is a user-defined function. 8< */ static inline void -pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst) +pktmbuf_metadata_copy(const struct rte_mbuf *src, struct rte_mbuf *dst) { - /* Copy packet metadata */ - rte_memcpy(&dst->rearm_data, - &src->rearm_data, - offsetof(struct rte_mbuf, cacheline1) - - offsetof(struct rte_mbuf, rearm_data)); + dst->data_off = src->data_off; + memcpy(&dst->rx_descriptor_fields1, &src->rx_descriptor_fields1, + offsetof(struct rte_mbuf, buf_len) - + offsetof(struct rte_mbuf, rx_descriptor_fields1)); +} - /* Copy packet data */ +/* Copy packet data */ +static inline void +pktmbuf_sw_copy(struct rte_mbuf *src, struct rte_mbuf *dst) +{ rte_memcpy(rte_pktmbuf_mtod(dst, char *), rte_pktmbuf_mtod(src, char *), src->data_len); } /* >8 End of perform packet copy there is a user-defined function. */ static uint32_t -ioat_enqueue_packets(struct rte_mbuf **pkts, +ioat_enqueue_packets(struct rte_mbuf *pkts[], struct rte_mbuf *pkts_copy[], uint32_t nb_rx, uint16_t dev_id) { int ret; uint32_t i; - struct rte_mbuf *pkts_copy[MAX_PKT_BURST]; - - const uint64_t addr_offset = RTE_PTR_DIFF(pkts[0]->buf_addr, - &pkts[0]->rearm_data); - - ret = rte_mempool_get_bulk(ioat_pktmbuf_pool, - (void *)pkts_copy, nb_rx); - - if (unlikely(ret < 0)) - rte_exit(EXIT_FAILURE, "Unable to allocate memory.\n"); for (i = 0; i < nb_rx; i++) { /* Perform data copy */ ret = rte_ioat_enqueue_copy(dev_id, - pkts[i]->buf_iova - addr_offset, - pkts_copy[i]->buf_iova - addr_offset, - rte_pktmbuf_data_len(pkts[i]) + addr_offset, + rte_pktmbuf_iova(pkts[i]), + rte_pktmbuf_iova(pkts_copy[i]), + rte_pktmbuf_data_len(pkts[i]), (uintptr_t)pkts[i], (uintptr_t)pkts_copy[i]); @@ -376,20 +369,50 @@ ioat_enqueue_packets(struct rte_mbuf **pkts, } ret = i; - /* Free any not enqueued packets. */ - rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts[i], nb_rx - i); - rte_mempool_put_bulk(ioat_pktmbuf_pool, (void *)&pkts_copy[i], - nb_rx - i); - return ret; } +static inline uint32_t +ioat_enqueue(struct rte_mbuf *pkts[], struct rte_mbuf *pkts_copy[], + uint32_t num, uint16_t dev_id) +{ + uint32_t n; + + n = ioat_enqueue_packets(pkts, pkts_copy, num, dev_id); + if (n > 0) + rte_ioat_perform_ops(dev_id); + + return n; +} + +static inline uint32_t +ioat_dequeue(struct rte_mbuf *src[], struct rte_mbuf *dst[], uint32_t num, + uint16_t dev_id) +{ + int32_t rc; + /* Dequeue the mbufs from IOAT device. Since all memory + * is DPDK pinned memory and therefore all addresses should + * be valid, we don't check for copy errors + */ + rc = rte_ioat_completed_ops(dev_id, num, NULL, NULL, + (void *)src, (void *)dst); + if (rc < 0) { + RTE_LOG(CRIT, IOAT, + "rte_ioat_completed_ops(%hu) failedi, error: %d\n", + dev_id, rte_errno); + rc = 0; + } + return rc; +} + /* Receive packets on one port and enqueue to IOAT rawdev or rte_ring. 8< */ static void ioat_rx_port(struct rxtx_port_config *rx_config) { + int32_t ret; uint32_t nb_rx, nb_enq, i, j; struct rte_mbuf *pkts_burst[MAX_PKT_BURST]; + struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST]; for (i = 0; i < rx_config->nb_queues; i++) { @@ -401,40 +424,54 @@ ioat_rx_port(struct rxtx_port_config *rx_config) port_statistics.rx[rx_config->rxtx_port] += nb_rx; + ret = rte_mempool_get_bulk(ioat_pktmbuf_pool, + (void *)pkts_burst_copy, nb_rx); + + if (unlikely(ret < 0)) + rte_exit(EXIT_FAILURE, + "Unable to allocate memory.\n"); + + for (j = 0; j < nb_rx; j++) + pktmbuf_metadata_copy(pkts_burst[j], + pkts_burst_copy[j]); + if (copy_mode == COPY_MODE_IOAT_NUM) { - /* Perform packet hardware copy */ - nb_enq = ioat_enqueue_packets(pkts_burst, + + /* enqueue packets for hardware copy */ + nb_enq = ioat_enqueue(pkts_burst, pkts_burst_copy, nb_rx, rx_config->ioat_ids[i]); - if (nb_enq > 0) - rte_ioat_perform_ops(rx_config->ioat_ids[i]); - } else { - /* Perform packet software copy, free source packets */ - int ret; - struct rte_mbuf *pkts_burst_copy[MAX_PKT_BURST]; - ret = rte_mempool_get_bulk(ioat_pktmbuf_pool, - (void *)pkts_burst_copy, nb_rx); + /* free any not enqueued packets. */ + rte_mempool_put_bulk(ioat_pktmbuf_pool, + (void *)&pkts_burst[nb_enq], + nb_rx - nb_enq); + rte_mempool_put_bulk(ioat_pktmbuf_pool, + (void *)&pkts_burst_copy[nb_enq], + nb_rx - nb_enq); - if (unlikely(ret < 0)) - rte_exit(EXIT_FAILURE, - "Unable to allocate memory.\n"); + port_statistics.copy_dropped[rx_config->rxtx_port] += + (nb_rx - nb_enq); + /* get completed copies */ + nb_rx = ioat_dequeue(pkts_burst, pkts_burst_copy, + MAX_PKT_BURST, rx_config->ioat_ids[i]); + } else { + /* Perform packet software copy, free source packets */ for (j = 0; j < nb_rx; j++) pktmbuf_sw_copy(pkts_burst[j], pkts_burst_copy[j]); + } - rte_mempool_put_bulk(ioat_pktmbuf_pool, - (void *)pkts_burst, nb_rx); + rte_mempool_put_bulk(ioat_pktmbuf_pool, + (void *)pkts_burst, nb_rx); - nb_enq = rte_ring_enqueue_burst( - rx_config->rx_to_tx_ring, - (void *)pkts_burst_copy, nb_rx, NULL); + nb_enq = rte_ring_enqueue_burst(rx_config->rx_to_tx_ring, + (void *)pkts_burst_copy, nb_rx, NULL); - /* Free any not enqueued packets. */ - rte_mempool_put_bulk(ioat_pktmbuf_pool, - (void *)&pkts_burst_copy[nb_enq], - nb_rx - nb_enq); - } + /* Free any not enqueued packets. */ + rte_mempool_put_bulk(ioat_pktmbuf_pool, + (void *)&pkts_burst_copy[nb_enq], + nb_rx - nb_enq); port_statistics.copy_dropped[rx_config->rxtx_port] += (nb_rx - nb_enq); @@ -446,51 +483,33 @@ ioat_rx_port(struct rxtx_port_config *rx_config) static void ioat_tx_port(struct rxtx_port_config *tx_config) { - uint32_t i, j, nb_dq = 0; - struct rte_mbuf *mbufs_src[MAX_PKT_BURST]; - struct rte_mbuf *mbufs_dst[MAX_PKT_BURST]; + uint32_t i, j, nb_dq, nb_tx; + struct rte_mbuf *mbufs[MAX_PKT_BURST]; for (i = 0; i < tx_config->nb_queues; i++) { - if (copy_mode == COPY_MODE_IOAT_NUM) { - /* Dequeue the mbufs from IOAT device. Since all memory - * is DPDK pinned memory and therefore all addresses should - * be valid, we don't check for copy errors - */ - nb_dq = rte_ioat_completed_ops( - tx_config->ioat_ids[i], MAX_PKT_BURST, NULL, NULL, - (void *)mbufs_src, (void *)mbufs_dst); - } else { - /* Dequeue the mbufs from rx_to_tx_ring. */ - nb_dq = rte_ring_dequeue_burst( - tx_config->rx_to_tx_ring, (void *)mbufs_dst, - MAX_PKT_BURST, NULL); - } - - if ((int32_t) nb_dq <= 0) - return; - if (copy_mode == COPY_MODE_IOAT_NUM) - rte_mempool_put_bulk(ioat_pktmbuf_pool, - (void *)mbufs_src, nb_dq); + /* Dequeue the mbufs from rx_to_tx_ring. */ + nb_dq = rte_ring_dequeue_burst(tx_config->rx_to_tx_ring, + (void *)mbufs, MAX_PKT_BURST, NULL); + if (nb_dq == 0) + continue; /* Update macs if enabled */ if (mac_updating) { for (j = 0; j < nb_dq; j++) - update_mac_addrs(mbufs_dst[j], + update_mac_addrs(mbufs[j], tx_config->rxtx_port); } - const uint16_t nb_tx = rte_eth_tx_burst( - tx_config->rxtx_port, 0, - (void *)mbufs_dst, nb_dq); + nb_tx = rte_eth_tx_burst(tx_config->rxtx_port, 0, + (void *)mbufs, nb_dq); port_statistics.tx[tx_config->rxtx_port] += nb_tx; /* Free any unsent packets. */ if (unlikely(nb_tx < nb_dq)) rte_mempool_put_bulk(ioat_pktmbuf_pool, - (void *)&mbufs_dst[nb_tx], - nb_dq - nb_tx); + (void *)&mbufs[nb_tx], nb_dq - nb_tx); } } /* >8 End of transmitting packets from IOAT. */ @@ -852,9 +871,6 @@ port_init(uint16_t portid, struct rte_mempool *mbuf_pool, uint16_t nb_queues) local_port_conf.rx_adv_conf.rss_conf.rss_hf &= dev_info.flow_type_rss_offloads; - if (dev_info.tx_offload_capa & RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE) - local_port_conf.txmode.offloads |= - RTE_ETH_TX_OFFLOAD_MBUF_FAST_FREE; ret = rte_eth_dev_configure(portid, nb_queues, 1, &local_port_conf); if (ret < 0) rte_exit(EXIT_FAILURE, "Cannot configure device:" @@ -973,7 +989,8 @@ main(int argc, char **argv) /* Allocates mempool to hold the mbufs. 8< */ nb_mbufs = RTE_MAX(nb_ports * (nb_queues * (nb_rxd + nb_txd + - 4 * MAX_PKT_BURST) + rte_lcore_count() * MEMPOOL_CACHE_SIZE), + 4 * MAX_PKT_BURST + ring_size) + ring_size + + rte_lcore_count() * MEMPOOL_CACHE_SIZE), MIN_POOL_SIZE); /* Create the mbuf pool */ @@ -1005,8 +1022,8 @@ main(int argc, char **argv) if (copy_mode == COPY_MODE_IOAT_NUM) assign_rawdevs(); - else /* copy_mode == COPY_MODE_SW_NUM */ - assign_rings(); + + assign_rings(); /* >8 End of assigning each port resources. */ start_forwarding_cores();