From patchwork Tue Aug 13 13:37:48 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ori Kam
X-Patchwork-Id: 57663
X-Patchwork-Delegate: ferruh.yigit@amd.com
Return-Path:
X-Original-To: patchwork@dpdk.org
Delivered-To: patchwork@dpdk.org
Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 055F11B94A; Tue, 13 Aug 2019 15:38:05 +0200 (CEST)
Received: from mellanox.co.il (mail-il-dmz.mellanox.com [193.47.165.129]) by dpdk.org (Postfix) with ESMTP id 8BBC7378B for ; Tue, 13 Aug 2019 15:38:03 +0200 (CEST)
Received: from Internal Mail-Server by MTLPINE1 (envelope-from orika@mellanox.com) with ESMTPS (AES256-SHA encrypted); 13 Aug 2019 16:37:58 +0300
Received: from pegasus04.mtr.labs.mlnx. (pegasus04.mtr.labs.mlnx [10.210.16.126]) by labmailer.mlnx (8.13.8/8.13.8) with ESMTP id x7DDbwfv025624; Tue, 13 Aug 2019 16:37:58 +0300
From: Ori Kam
To: thomas@monjalon.net, ferruh.yigit@intel.com, arybchenko@solarflare.com, shahafs@mellanox.com, viacheslavo@mellanox.com, alexr@mellanox.com
Cc: dev@dpdk.org, orika@mellanox.com
Date: Tue, 13 Aug 2019 13:37:48 +0000
Message-Id: <1565703468-55617-1-git-send-email-orika@mellanox.com>
X-Mailer: git-send-email 1.8.3.1
Subject: [dpdk-dev] [RFC] ethdev: support hairpin queue
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

This RFC replaces RFC [1].

The hairpin feature (an alternative name could be "forward") acts as a
"bump on the wire": a packet received from the wire can be modified using
offloaded actions and then sent back to the wire without application
intervention, which saves CPU cycles.

Hairpin is the inverse of loopback, in which the application sends a
packet and then receives it again without the packet ever being sent to
the wire.

The hairpin can be used by a number of different NVF, for example a load
balancer, a gateway and so on.

As can be seen from the description above, a hairpin is basically an Rx
queue connected to a Tx queue.

During the design phase I considered two ways to implement this feature:
the first is adding a new rte flow action, and the second is creating a
special kind of queue.

The advantages of the queue approach:
1. More control for the application, for example over the queue depth
   (the memory size that should be used).
2. Enables QoS. QoS is normally a parameter of a queue, so with this
   approach it will be easy to integrate with such a system.
3. Native integration with the rte flow API. Setting the target
   queue/RSS of a flow to a hairpin queue is enough to route the traffic
   to the hairpin queue.
4. Enables queue offloading.

Each hairpin Rxq can be connected to one Txq or to a number of Txqs, which
may belong to different ports, assuming the PMD supports it. The same goes
the other way: each hairpin Txq can be connected to one or more Rxqs.
This is the reason that both the Txq setup and the Rxq setup take the
hairpin configuration structure.

From the PMD perspective the number of Rxqs/Txqs is the total of standard
queues plus hairpin queues.

To configure a hairpin queue the user should call
rte_eth_rx_hairpin_queue_setup / rte_eth_tx_hairpin_queue_setup instead
of the normal queue setup functions.

The hairpin queues are not part of the normal RSS functions.

To use the queues the user simply creates a flow whose RSS/queue actions
point to hairpin queues.

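As an illustration of the setup calls described above, here is a minimal
sketch of binding one Rx queue to one Tx queue on the same port. It is not
part of the patch. The structure and the two prototypes are copied from
this RFC; everything else is an assumption made for the example: the
function bodies are stubs that only validate their arguments so the sketch
builds without DPDK, and stub_check() and setup_hairpin_pair() are
hypothetical helper names.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Copied from this RFC: the peers of one hairpin queue. */
struct rte_eth_hairpin_conf {
	uint16_t peer_n;       /* number of peer ports and queues */
	uint16_t (*ports)[];   /* peer ports */
	uint16_t (*queues)[];  /* peer queues */
};

/* Left opaque here; the sketch always passes NULL to request defaults. */
struct rte_eth_rxconf;
struct rte_eth_txconf;

/* ASSUMPTION: hypothetical argument check shared by both stubs. */
static int
stub_check(const struct rte_eth_hairpin_conf *hc, uint16_t nb_desc)
{
	if (hc == NULL || hc->peer_n == 0 || nb_desc == 0 ||
	    hc->ports == NULL || hc->queues == NULL)
		return -EINVAL;
	return 0;
}

/* ASSUMPTION: stub body; a real one would come from ethdev and the PMD. */
int
rte_eth_rx_hairpin_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
	uint16_t nb_rx_desc, unsigned int socket_id,
	const struct rte_eth_rxconf *rx_conf,
	const struct rte_eth_hairpin_conf *hairpin_conf)
{
	(void)port_id; (void)rx_queue_id; (void)socket_id; (void)rx_conf;
	return stub_check(hairpin_conf, nb_rx_desc);
}

/* ASSUMPTION: stub body, same caveat as above. */
int
rte_eth_tx_hairpin_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
	uint16_t nb_tx_desc, unsigned int socket_id,
	const struct rte_eth_txconf *tx_conf,
	const struct rte_eth_hairpin_conf *hairpin_conf)
{
	(void)port_id; (void)tx_queue_id; (void)socket_id; (void)tx_conf;
	return stub_check(hairpin_conf, nb_tx_desc);
}

/*
 * Hypothetical helper: bind Rx queue rxq to Tx queue txq on one port.
 * Each side names the other as its single peer, which is why both setup
 * calls receive a hairpin configuration.
 */
int
setup_hairpin_pair(uint16_t port_id, uint16_t rxq, uint16_t txq,
	uint16_t nb_desc)
{
	uint16_t peer_port[1] = { port_id };
	uint16_t rx_peer_queue[1] = { txq };  /* the Rxq feeds this Txq */
	uint16_t tx_peer_queue[1] = { rxq };  /* the Txq is fed by this Rxq */
	struct rte_eth_hairpin_conf rx_hp = {
		.peer_n = 1, .ports = &peer_port, .queues = &rx_peer_queue,
	};
	struct rte_eth_hairpin_conf tx_hp = {
		.peer_n = 1, .ports = &peer_port, .queues = &tx_peer_queue,
	};
	int ret;

	/* Socket 0 is an arbitrary choice made for this sketch. */
	ret = rte_eth_rx_hairpin_queue_setup(port_id, rxq, nb_desc, 0,
					     NULL, &rx_hp);
	if (ret != 0)
		return ret;
	return rte_eth_tx_hairpin_queue_setup(port_id, txq, nb_desc, 0,
					      NULL, &tx_hp);
}
```

A caller would invoke something like setup_hairpin_pair(0, 4, 4, 512)
after rte_eth_dev_configure() and then create an rte flow rule whose
queue/RSS action targets Rx queue 4. The queue index and descriptor count
here are arbitrary example values.
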
[1] http://inbox.dpdk.org/dev/AM4PR05MB3425E55B721A4090FCBE7D80DB1E0@AM4PR05MB3425.eurprd05.prod.outlook.com/

Signed-off-by: Ori Kam
---
 lib/librte_ethdev/rte_ethdev.h | 124 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 124 insertions(+)

diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
index dc6596b..fb54162 100644
--- a/lib/librte_ethdev/rte_ethdev.h
+++ b/lib/librte_ethdev/rte_ethdev.h
@@ -804,6 +804,15 @@ struct rte_eth_txconf {
 };
 
 /**
+ * A structure used to configure hairpin binding.
+ */
+struct rte_eth_hairpin_conf {
+	uint16_t peer_n; /**< The number of peer ports and queues. */
+	uint16_t (*ports)[]; /**< The peer ports. */
+	uint16_t (*queues)[]; /**< The peer queues. */
+};
+
+/**
  * A structure contains information about HW descriptor ring limitations.
  */
 struct rte_eth_desc_lim {
@@ -1013,6 +1022,7 @@ struct rte_eth_conf {
 #define DEV_RX_OFFLOAD_KEEP_CRC		0x00010000
 #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
 #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM  0x00040000
+#define DEV_RX_OFFLOAD_HAIRPIN		0x00080000
 
 #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
 				 DEV_RX_OFFLOAD_UDP_CKSUM | \
@@ -1075,6 +1085,7 @@ struct rte_eth_conf {
  * Application must set PKT_TX_METADATA and mbuf metadata field.
  */
 #define DEV_TX_OFFLOAD_MATCH_METADATA   0x00200000
+#define DEV_TX_OFFLOAD_HAIRPIN		0x00400000
 
 #define RTE_ETH_DEV_CAPA_RUNTIME_RX_QUEUE_SETUP 0x00000001
 /**< Device supports Rx queue setup after device started*/
@@ -1769,6 +1780,56 @@ int rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
 		struct rte_mempool *mb_pool);
 
 /**
+ * Allocate and set up a hairpin receive queue for an Ethernet device.
+ *
+ * The function sets up the selected queue to be used in hairpin.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param rx_queue_id
+ *   The index of the receive queue to set up.
+ *   The value must be in the range [0, nb_rx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param nb_rx_desc
+ *   The number of receive descriptors to allocate for the receive ring.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in case of NUMA.
+ *   The value can be *SOCKET_ID_ANY* if there is no NUMA constraint for
+ *   the DMA memory allocated for the receive descriptors of the ring.
+ * @param rx_conf
+ *   The pointer to the configuration data to be used for the receive queue.
+ *   NULL value is allowed, in which case default RX configuration
+ *   will be used.
+ *   The *rx_conf* structure contains an *rx_thresh* structure with the values
+ *   of the Prefetch, Host, and Write-Back threshold registers of the receive
+ *   ring.
+ *   In addition it contains the hardware offloads features to activate using
+ *   the DEV_RX_OFFLOAD_* flags.
+ *   If an offloading set in rx_conf->offloads
+ *   hasn't been set in the input argument eth_conf->rxmode.offloads
+ *   to rte_eth_dev_configure(), it is a new added offloading, it must be
+ *   per-queue type and it is enabled for the queue.
+ *   No need to repeat any bit in rx_conf->offloads which has already been
+ *   enabled in rte_eth_dev_configure() at port level. An offloading enabled
+ *   at port level can't be disabled at queue level.
+ * @param hairpin_conf
+ *   The pointer to the hairpin binding configuration.
+ * @return
+ *   - 0: Success, receive queue correctly set up.
+ *   - -EINVAL: The size of network buffers which can be allocated from the
+ *      memory pool does not fit the various buffer sizes allowed by the
+ *      device controller.
+ *   - -ENOMEM: Unable to allocate the receive ring descriptors or to
+ *      allocate network memory buffers from the memory pool when
+ *      initializing receive descriptors.
+ */
+int rte_eth_rx_hairpin_queue_setup
+	(uint16_t port_id, uint16_t rx_queue_id,
+	 uint16_t nb_rx_desc, unsigned int socket_id,
+	 const struct rte_eth_rxconf *rx_conf,
+	 const struct rte_eth_hairpin_conf *hairpin_conf);
+
+/**
  * Allocate and set up a transmit queue for an Ethernet device.
  *
  * @param port_id
@@ -1821,6 +1882,69 @@ int rte_eth_tx_queue_setup(uint16_t port_id, uint16_t tx_queue_id,
 		const struct rte_eth_txconf *tx_conf);
 
 /**
+ * Allocate and set up a transmit hairpin queue for an Ethernet device.
+ *
+ * @param port_id
+ *   The port identifier of the Ethernet device.
+ * @param tx_queue_id
+ *   The index of the transmit queue to set up.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @param nb_tx_desc
+ *   The number of transmit descriptors to allocate for the transmit ring.
+ * @param socket_id
+ *   The *socket_id* argument is the socket identifier in case of NUMA.
+ *   Its value can be *SOCKET_ID_ANY* if there is no NUMA constraint for
+ *   the DMA memory allocated for the transmit descriptors of the ring.
+ * @param tx_conf
+ *   The pointer to the configuration data to be used for the transmit queue.
+ *   NULL value is allowed, in which case default TX configuration
+ *   will be used.
+ *   The *tx_conf* structure contains the following data:
+ *   - The *tx_thresh* structure with the values of the Prefetch, Host, and
+ *     Write-Back threshold registers of the transmit ring.
+ *     When setting Write-Back threshold to the value greater than zero,
+ *     *tx_rs_thresh* value should be explicitly set to one.
+ *   - The *tx_free_thresh* value indicates the [minimum] number of network
+ *     buffers that must be pending in the transmit ring to trigger their
+ *     [implicit] freeing by the driver transmit function.
+ *   - The *tx_rs_thresh* value indicates the [minimum] number of transmit
+ *     descriptors that must be pending in the transmit ring before setting the
+ *     RS bit on a descriptor by the driver transmit function.
+ *     The *tx_rs_thresh* value should be less than or equal to
+ *     *tx_free_thresh* value, and both of them should be less than
+ *     *nb_tx_desc* - 3.
+ *   - The *txq_flags* member contains flags to pass to the TX queue setup
+ *     function to configure the behavior of the TX queue. This should be set
+ *     to 0 if no special configuration is required.
+ *     This API is obsolete and will be deprecated. Applications
+ *     should set it to ETH_TXQ_FLAGS_IGNORE and use
+ *     the offloads field below.
+ *   - The *offloads* member contains Tx offloads to be enabled.
+ *     If an offloading set in tx_conf->offloads
+ *     hasn't been set in the input argument eth_conf->txmode.offloads
+ *     to rte_eth_dev_configure(), it is a new added offloading, it must be
+ *     per-queue type and it is enabled for the queue.
+ *     No need to repeat any bit in tx_conf->offloads which has already been
+ *     enabled in rte_eth_dev_configure() at port level. An offloading enabled
+ *     at port level can't be disabled at queue level.
+ *
+ *     Note that setting *tx_free_thresh* or *tx_rs_thresh* value to 0 forces
+ *     the transmit function to use default values.
+ * @param hairpin_conf
+ *   The hairpin binding configuration.
+ *
+ * @return
+ *   - 0: Success, the transmit queue is correctly set up.
+ *   - -ENOMEM: Unable to allocate the transmit ring descriptors.
+ */
+int rte_eth_tx_hairpin_queue_setup
+	(uint16_t port_id, uint16_t tx_queue_id,
+	 uint16_t nb_tx_desc, unsigned int socket_id,
+	 const struct rte_eth_txconf *tx_conf,
+	 const struct rte_eth_hairpin_conf *hairpin_conf);
+
+/**
  * Return the NUMA socket to which an Ethernet device is connected
  *
  * @param port_id