Message ID | 20220113102718.3167282-1-jerinj@marvell.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Ferruh Yigit |
Headers | show |
Series | [v2,1/2] ethdev: support queue-based priority flow control | expand |
Context | Check | Description |
---|---|---|
ci/checkpatch | success | coding style OK |
On 1/13/2022 10:27 AM, jerinj@marvell.com wrote: > From: Jerin Jacob <jerinj@marvell.com> > > Based on device support and use-case need, there are two different ways > to enable PFC. The first case is the port level PFC configuration, in > this case, rte_eth_dev_priority_flow_ctrl_set() API shall be used to > configure the PFC, and PFC frames will be generated using based on VLAN > TC value. > > The second case is the queue level PFC configuration, in this > case, Any packet field content can be used to steer the packet to the > specific queue using rte_flow or RSS and then use > rte_eth_dev_priority_flow_ctrl_queue_set() to set the TC mapping on each > queue. Based on congestion selected on the specific queue, configured TC > shall be used to generate PFC frames. > > Operation of these modes are mutually exclusive, when driver sets > non zero value for rte_eth_dev_info::pfc_queue_tc_max, > application must use queue level PFC configuration via > rte_eth_dev_priority_flow_ctrl_queue_set() API instead of port level > PFC configuration via rte_eth_dev_priority_flow_ctrl_set() API to > realize PFC configuration. > > This patch enables the configuration for second case a.k.a queue > based PFC also updates rte_eth_dev_priority_flow_ctrl_set() > implmentaion to adheher to rte_eth_dev_info::pfc_queue_tc_max > handling. > > Also updated libabigail.abignore to ignore the update > to reserved fields in rte_eth_dev_info. > > Signed-off-by: Jerin Jacob <jerinj@marvell.com> > Signed-off-by: Sunil Kumar Kori <skori@marvell.com> > --- > > A driver implemtion based on this API is at > https://patches.dpdk.org/project/dpdk/patch/20220111081831.881374-1-skori@marvell.com/ > > RFC..v1 > - Added queue based PFC config API instead port based > > v1..v2 > - Updated libabigail.ignore as > has_data_member_inserted_between = {offset_of(reserved_64s), end} > - Updated doxygen comments of rte_eth_pfc_queue_conf > > > devtools/libabigail.abignore | 5 ++ > doc/guides/nics/features.rst | 5 +- > doc/guides/rel_notes/release_22_03.rst | 3 + > lib/ethdev/ethdev_driver.h | 6 +- > lib/ethdev/rte_ethdev.c | 109 +++++++++++++++++++++++++ > lib/ethdev/rte_ethdev.h | 83 ++++++++++++++++++- > lib/ethdev/version.map | 3 + > 7 files changed, 208 insertions(+), 6 deletions(-) > > diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore > index 4b676f317d..3bdecaaef0 100644 > --- a/devtools/libabigail.abignore > +++ b/devtools/libabigail.abignore > @@ -11,3 +11,8 @@ > ; Ignore generated PMD information strings > [suppress_variable] > name_regexp = _pmd_info$ > + > +;Ignore fields inserted in place of reserved fields of rte_eth_dev_info > +[suppress_type] > + name = rte_eth_dev_info > + has_data_member_inserted_between = {offset_of(reserved_64s), end} > diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst > index 27be2d2576..277a784f4e 100644 > --- a/doc/guides/nics/features.rst > +++ b/doc/guides/nics/features.rst > @@ -379,9 +379,10 @@ Flow control > Supports configuring link flow control. > > * **[implements] eth_dev_ops**: ``flow_ctrl_get``, ``flow_ctrl_set``, > - ``priority_flow_ctrl_set``. > + ``priority_flow_ctrl_set``, ``priority_flow_ctrl_queue_set``. > * **[related] API**: ``rte_eth_dev_flow_ctrl_get()``, ``rte_eth_dev_flow_ctrl_set()``, > - ``rte_eth_dev_priority_flow_ctrl_set()``. > + ``rte_eth_dev_priority_flow_ctrl_set()``, ``rte_eth_dev_priority_flow_ctrl_queue_set()``. > +* **[provides] rte_eth_dev_info**: ``pfc_queue_tc_max``. > > > .. _nic_features_rate_limitation: > diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst > index 6d99d1eaa9..b75c0356e6 100644 > --- a/doc/guides/rel_notes/release_22_03.rst > +++ b/doc/guides/rel_notes/release_22_03.rst > @@ -55,6 +55,9 @@ New Features > Also, make sure to start the actual text at the margin. > ======================================================= > > +* **Added an API to enable queue based priority flow ctrl(PFC).** > + > + A new API, ``rte_eth_dev_priority_flow_ctrl_queue_set()``, was added. > > Removed Items > ------------- > diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h > index d95605a355..e0bbfe89d7 100644 > --- a/lib/ethdev/ethdev_driver.h > +++ b/lib/ethdev/ethdev_driver.h > @@ -532,6 +532,9 @@ typedef int (*flow_ctrl_set_t)(struct rte_eth_dev *dev, > /** @internal Setup priority flow control parameter on an Ethernet device. */ > typedef int (*priority_flow_ctrl_set_t)(struct rte_eth_dev *dev, > struct rte_eth_pfc_conf *pfc_conf); > +/** @internal Queue setup for priority flow control parameter on an Ethernet device. */ > +typedef int (*priority_flow_ctrl_queue_set_t)(struct rte_eth_dev *dev, > + struct rte_eth_pfc_queue_conf *pfc_queue_conf); > > /** @internal Update RSS redirection table on an Ethernet device. */ > typedef int (*reta_update_t)(struct rte_eth_dev *dev, > @@ -1080,7 +1083,8 @@ struct eth_dev_ops { > flow_ctrl_set_t flow_ctrl_set; /**< Setup flow control */ > /** Setup priority flow control */ > priority_flow_ctrl_set_t priority_flow_ctrl_set; > - > + /** Priority flow control queue setup */ > + priority_flow_ctrl_queue_set_t priority_flow_ctrl_queue_set; > /** Set Unicast Table Array */ > eth_uc_hash_table_set_t uc_hash_table_set; > /** Set Unicast hash bitmap */ > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c > index a1d475a292..6def057720 100644 > --- a/lib/ethdev/rte_ethdev.c > +++ b/lib/ethdev/rte_ethdev.c > @@ -3998,7 +3998,9 @@ int > rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, > struct rte_eth_pfc_conf *pfc_conf) > { > + struct rte_eth_dev_info dev_info; > struct rte_eth_dev *dev; > + int ret; > > RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); > dev = &rte_eth_devices[port_id]; > @@ -4010,6 +4012,17 @@ rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, > return -EINVAL; > } > > + ret = rte_eth_dev_info_get(port_id, &dev_info); > + if (ret != 0) > + return ret; > + > + if (dev_info.pfc_queue_tc_max != 0) { > + RTE_ETHDEV_LOG(ERR, > + "Ethdev port %u driver does not support port level PFC config\n", > + port_id); > + return -ENOTSUP; > + } > + > if (pfc_conf->priority > (RTE_ETH_DCB_NUM_USER_PRIORITIES - 1)) { > RTE_ETHDEV_LOG(ERR, "Invalid priority, only 0-7 allowed\n"); > return -EINVAL; > @@ -4022,6 +4035,102 @@ rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, > return -ENOTSUP; > } > > +static inline int > +validate_rx_pause_config(struct rte_eth_dev_info *dev_info, > + struct rte_eth_pfc_queue_conf *pfc_queue_conf) > +{ > + if ((pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) || > + (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { > + if (pfc_queue_conf->rx_pause.tx_qid >= dev_info->nb_tx_queues) { > + RTE_ETHDEV_LOG(ERR, "Tx queue not in range for Rx pause" > + " (requested: %d configured: %d)\n", > + pfc_queue_conf->rx_pause.tx_qid, > + dev_info->nb_tx_queues); > + return -EINVAL; > + } > + > + if (pfc_queue_conf->rx_pause.tc >= dev_info->pfc_queue_tc_max) { > + RTE_ETHDEV_LOG(ERR, "TC not in range for Rx pause" > + " (requested: %d max: %d)\n", > + pfc_queue_conf->rx_pause.tc, > + dev_info->pfc_queue_tc_max); > + return -EINVAL; > + } > + } > + > + return 0; > +} > + > +static inline int > +validate_tx_pause_config(struct rte_eth_dev_info *dev_info, > + struct rte_eth_pfc_queue_conf *pfc_queue_conf) > +{ > + if ((pfc_queue_conf->mode == RTE_ETH_FC_TX_PAUSE) || > + (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { > + if (pfc_queue_conf->tx_pause.rx_qid >= dev_info->nb_rx_queues) { > + RTE_ETHDEV_LOG(ERR, "Rx queue not in range for Tx pause" > + "(requested: %d configured: %d)\n", > + pfc_queue_conf->tx_pause.rx_qid, > + dev_info->nb_rx_queues); > + return -EINVAL; > + } > + > + if (pfc_queue_conf->tx_pause.tc >= dev_info->pfc_queue_tc_max) { > + RTE_ETHDEV_LOG(ERR, "TC not in range for Tx pause" > + "(requested: %d max: %d)\n", > + pfc_queue_conf->tx_pause.tc, > + dev_info->pfc_queue_tc_max); > + return -EINVAL; > + } > + } > + > + return 0; > +} > + > +int > +rte_eth_dev_priority_flow_ctrl_queue_set( > + uint16_t port_id, struct rte_eth_pfc_queue_conf *pfc_queue_conf) > +{ > + struct rte_eth_dev_info dev_info; > + struct rte_eth_dev *dev; > + int ret; > + > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); > + dev = &rte_eth_devices[port_id]; > + > + if (pfc_queue_conf == NULL) { > + RTE_ETHDEV_LOG(ERR, "PFC parameters are NULL for port (%u)\n", > + port_id); > + return -EINVAL; > + } > + > + ret = rte_eth_dev_info_get(port_id, &dev_info); > + if (ret != 0) > + return ret; > + > + if (dev_info.pfc_queue_tc_max == 0) { > + RTE_ETHDEV_LOG(ERR, > + "Ethdev port %u does not support PFC TC values\n", > + port_id); > + return -ENOTSUP; > + } > + > + ret = validate_rx_pause_config(&dev_info, pfc_queue_conf); > + if (ret != 0) > + return ret; > + > + ret = validate_tx_pause_config(&dev_info, pfc_queue_conf); > + if (ret != 0) > + return ret; > + > + > + if (*dev->dev_ops->priority_flow_ctrl_queue_set) > + return eth_err(port_id, > + (*dev->dev_ops->priority_flow_ctrl_queue_set)( > + dev, pfc_queue_conf)); > + return -ENOTSUP; > +} > + > static int > eth_check_reta_mask(struct rte_eth_rss_reta_entry64 *reta_conf, > uint16_t reta_size) > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > index fa299c8ad7..33e43f6e1d 100644 > --- a/lib/ethdev/rte_ethdev.h > +++ b/lib/ethdev/rte_ethdev.h > @@ -1395,6 +1395,28 @@ struct rte_eth_pfc_conf { > uint8_t priority; /**< VLAN User Priority. */ > }; > > +/** > + * @warning > + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice > + * > + * A structure used to configure Ethernet priority flow control parameter for > + * ethdev queues. > + */ > +struct rte_eth_pfc_queue_conf { > + enum rte_eth_fc_mode mode; /**< Link flow control mode */ > + > + struct { > + uint16_t tx_qid; /**< Tx queue ID */ > + uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ > + } rx_pause; /* Valid when (mode == FC_RX_PAUSE || mode == FC_FULL) */ > + > + struct { > + uint16_t pause_time; /**< Pause quota in the Pause frame */ > + uint16_t rx_qid; /**< Rx queue ID */ > + uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ > + } tx_pause; /* Valid when (mode == FC_TX_PAUSE || mode == FC_FULL) */ > +}; > + > /** > * Tunnel type for device-specific classifier configuration. > * @see rte_eth_udp_tunnel > @@ -1841,8 +1863,30 @@ struct rte_eth_dev_info { > * embedded managed interconnect/switch. > */ > struct rte_eth_switch_info switch_info; > - > - uint64_t reserved_64s[2]; /**< Reserved for future fields */ > + /** > + * Maximum supported traffic class as per PFC (802.1Qbb) specification. > + * > + * Based on device support and use-case need, there are two different > + * ways to enable PFC. The first case is the port level PFC > + * configuration, in this case, rte_eth_dev_priority_flow_ctrl_set() > + * API shall be used to configure the PFC, and PFC frames will be > + * generated using based on VLAN TC value. > + * The second case is the queue level PFC configuration, in this case, > + * Any packet field content can be used to steer the packet to the > + * specific queue using rte_flow or RSS and then use > + * rte_eth_dev_priority_flow_ctrl_queue_set() to set the TC mapping > + * on each queue. Based on congestion selected on the specific queue, > + * configured TC shall be used to generate PFC frames. > + * > + * When set to non zero value, application must use queue level > + * PFC configuration via rte_eth_dev_priority_flow_ctrl_queue_set() API > + * instead of port level PFC configuration via > + * rte_eth_dev_priority_flow_ctrl_set() API to realize > + * PFC configuration. > + */ > + uint8_t pfc_queue_tc_max; 'rte_eth_dev_info_get()' is one of the APIs that anyone using ethdev needs to use. Instead of expanding it with less used features, what do you think to have a specific API to get the 'pfc_queue_tc_max'? It also can be used by application to detect queue based PFC is supported by driver or not. Assume API is 'rte_eth_dev_priority_flow_ctrl_get()', if it returns '-ENOTSUP' application can know that PMD doesn't support queue based PFC. > + uint8_t reserved_8s[7]; > + uint64_t reserved_64s[1]; /**< Reserved for future fields */ > void *reserved_ptrs[2]; /**< Reserved for future fields */ > }; > > @@ -4109,6 +4153,9 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id, > * Configure the Ethernet priority flow control under DCB environment > * for Ethernet device. > * > + * @see struct rte_eth_dev_info::pfc_queue_tc_max priority > + * flow control usage models. > + * > * @param port_id > * The port identifier of the Ethernet device. > * @param pfc_conf > @@ -4119,10 +4166,40 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id, > * - (-ENODEV) if *port_id* invalid. > * - (-EINVAL) if bad parameter > * - (-EIO) if flow control setup failure or device is removed. > + * > */ > int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, > - struct rte_eth_pfc_conf *pfc_conf); > + struct rte_eth_pfc_conf *pfc_conf); Above syntax changes are not needed. DPDK coding convention is using tabs (mostly two) for multi line function decleration/definition, please be consistant with usage, this patch has multiple variations. > > +/** > + * @warning > + * @b EXPERIMENTAL: this API may change without prior notice. > + * > + * Configure the Ethernet priority flow control for a given queue > + * for Ethernet device. > + * > + * @see struct rte_eth_dev_info::pfc_queue_tc_max priority flow control > + * usage models. > + * > + * @note When an ethdev port switches to PFC mode, the unconfigured Doesit mean queue based PFC mode? > + * queues shall be configured by the driver with default values such as > + * lower priority value for TC etc. > + * I assume there is no way for application to know what the defaults values are, also not sure if application interested in this. > + * @param port_id > + * The port identifier of the Ethernet device. > + * @param pfc_queue_conf > + * The pointer to the structure of the priority flow control parameters > + * for the queue. > + * @return > + * - (0) if successful. > + * - (-ENOTSUP) if hardware doesn't support priority flow control mode. > + * - (-ENODEV) if *port_id* invalid. > + * - (-EINVAL) if bad parameter > + * - (-EIO) if flow control setup queue failure > + */ > +__rte_experimental > +int rte_eth_dev_priority_flow_ctrl_queue_set(uint16_t port_id, > + struct rte_eth_pfc_queue_conf *pfc_queue_conf); I wonder if Rx/Tx queue id should be API arguments, to be consistent with some other APIs, and will it help application that configures queues in a loop. But I can see 'rx_pause' or 'tx_pause' (in config) can be valid or not based on the 'mode', so I understand to have queue ids in the struct. No strong opinion.
On Tue, Jan 25, 2022 at 11:05 PM Ferruh Yigit <ferruh.yigit@intel.com> wrote: > > On 1/13/2022 10:27 AM, jerinj@marvell.com wrote: > > From: Jerin Jacob <jerinj@marvell.com> > > > > Based on device support and use-case need, there are two different ways > > to enable PFC. The first case is the port level PFC configuration, in > > this case, rte_eth_dev_priority_flow_ctrl_set() API shall be used to > > configure the PFC, and PFC frames will be generated using based on VLAN > > TC value. > > > > The second case is the queue level PFC configuration, in this > > case, Any packet field content can be used to steer the packet to the > > specific queue using rte_flow or RSS and then use > > rte_eth_dev_priority_flow_ctrl_queue_set() to set the TC mapping on each > > queue. Based on congestion selected on the specific queue, configured TC > > shall be used to generate PFC frames. > > > Operation of these modes are mutually exclusive, when driver sets > > non zero value for rte_eth_dev_info::pfc_queue_tc_max, > > application must use queue level PFC configuration via > > rte_eth_dev_priority_flow_ctrl_queue_set() API instead of port level > > PFC configuration via rte_eth_dev_priority_flow_ctrl_set() API to > > realize PFC configuration. > > > > This patch enables the configuration for second case a.k.a queue > > based PFC also updates rte_eth_dev_priority_flow_ctrl_set() > > implmentaion to adheher to rte_eth_dev_info::pfc_queue_tc_max > > handling. > > > > Also updated libabigail.abignore to ignore the update > > to reserved fields in rte_eth_dev_info. > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com> > > Signed-off-by: Sunil Kumar Kori <skori@marvell.com> > > --- > > > > A driver implemtion based on this API is at > > https://patches.dpdk.org/project/dpdk/patch/20220111081831.881374-1-skori@marvell.com/ > > > > RFC..v1 > > - Added queue based PFC config API instead port based > > > > v1..v2 > > + > > /** > > * Tunnel type for device-specific classifier configuration. > > * @see rte_eth_udp_tunnel > > @@ -1841,8 +1863,30 @@ struct rte_eth_dev_info { > > * embedded managed interconnect/switch. > > */ > > struct rte_eth_switch_info switch_info; > > - > > - uint64_t reserved_64s[2]; /**< Reserved for future fields */ > > + /** > > + * Maximum supported traffic class as per PFC (802.1Qbb) specification. > > + * > > + * Based on device support and use-case need, there are two different > > + * ways to enable PFC. The first case is the port level PFC > > + * configuration, in this case, rte_eth_dev_priority_flow_ctrl_set() > > + * API shall be used to configure the PFC, and PFC frames will be > > + * generated using based on VLAN TC value. > > + * The second case is the queue level PFC configuration, in this case, > > + * Any packet field content can be used to steer the packet to the > > + * specific queue using rte_flow or RSS and then use > > + * rte_eth_dev_priority_flow_ctrl_queue_set() to set the TC mapping > > + * on each queue. Based on congestion selected on the specific queue, > > + * configured TC shall be used to generate PFC frames. > > + * > > + * When set to non zero value, application must use queue level > > + * PFC configuration via rte_eth_dev_priority_flow_ctrl_queue_set() API > > + * instead of port level PFC configuration via > > + * rte_eth_dev_priority_flow_ctrl_set() API to realize > > + * PFC configuration. > > + */ > > + uint8_t pfc_queue_tc_max; > > > 'rte_eth_dev_info_get()' is one of the APIs that anyone using ethdev needs to use. > > Instead of expanding it with less used features, what do you think to have a > specific API to get the 'pfc_queue_tc_max'? OK. rte_eth_dev_info_get() was general theme used in ethdev API. Fine for new API specifically for this. > > It also can be used by application to detect queue based PFC is supported by > driver or not. OK. > Assume API is 'rte_eth_dev_priority_flow_ctrl_get()', if it returns '-ENOTSUP' > application can know that PMD doesn't support queue based PFC. OK. > > > > + uint8_t reserved_8s[7]; > > + uint64_t reserved_64s[1]; /**< Reserved for future fields */ > > void *reserved_ptrs[2]; /**< Reserved for future fields */ > > }; > > > > @@ -4109,6 +4153,9 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id, > > * Configure the Ethernet priority flow control under DCB environment > > * for Ethernet device. > > * > > + * @see struct rte_eth_dev_info::pfc_queue_tc_max priority > > + * flow control usage models. > > + * > > * @param port_id > > * The port identifier of the Ethernet device. > > * @param pfc_conf > > @@ -4119,10 +4166,40 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id, > > * - (-ENODEV) if *port_id* invalid. > > * - (-EINVAL) if bad parameter > > * - (-EIO) if flow control setup failure or device is removed. > > + * > > */ > > int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, > > - struct rte_eth_pfc_conf *pfc_conf); > > + struct rte_eth_pfc_conf *pfc_conf); > > Above syntax changes are not needed. > > DPDK coding convention is using tabs (mostly two) for multi line function > decleration/definition, please be consistant with usage, this patch has > multiple variations. Ack. > > > > > +/** > > + * @warning > > + * @b EXPERIMENTAL: this API may change without prior notice. > > + * > > + * Configure the Ethernet priority flow control for a given queue > > + * for Ethernet device. > > + * > > + * @see struct rte_eth_dev_info::pfc_queue_tc_max priority flow control > > + * usage models. > > + * > > + * @note When an ethdev port switches to PFC mode, the unconfigured > > Doesit mean queue based PFC mode? Yes. I will change to PFC mode -> queue-based PFC mode. > > > + * queues shall be configured by the driver with default values such as > > + * lower priority value for TC etc. > > + * > > I assume there is no way for application to know what the defaults values are, > also not sure if application interested in this. Yes. I don't think the application cares. But added in the doc to clarify the state. > > > + * @param port_id > > + * The port identifier of the Ethernet device. > > + * @param pfc_queue_conf > > + * The pointer to the structure of the priority flow control parameters > > + * for the queue. > > + * @return > > + * - (0) if successful. > > + * - (-ENOTSUP) if hardware doesn't support priority flow control mode. > > + * - (-ENODEV) if *port_id* invalid. > > + * - (-EINVAL) if bad parameter > > + * - (-EIO) if flow control setup queue failure > > + */ > > +__rte_experimental > > +int rte_eth_dev_priority_flow_ctrl_queue_set(uint16_t port_id, > > + struct rte_eth_pfc_queue_conf *pfc_queue_conf); > > I wonder if Rx/Tx queue id should be API arguments, to be consistent > with some other APIs, and will it help application that configures queues > in a loop. > But I can see 'rx_pause' or 'tx_pause' (in config) can be valid or not > based on the 'mode', so I understand to have queue ids in the struct. > No strong opinion. Yes. I keep it as is.
diff --git a/devtools/libabigail.abignore b/devtools/libabigail.abignore index 4b676f317d..3bdecaaef0 100644 --- a/devtools/libabigail.abignore +++ b/devtools/libabigail.abignore @@ -11,3 +11,8 @@ ; Ignore generated PMD information strings [suppress_variable] name_regexp = _pmd_info$ + +;Ignore fields inserted in place of reserved fields of rte_eth_dev_info +[suppress_type] + name = rte_eth_dev_info + has_data_member_inserted_between = {offset_of(reserved_64s), end} diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst index 27be2d2576..277a784f4e 100644 --- a/doc/guides/nics/features.rst +++ b/doc/guides/nics/features.rst @@ -379,9 +379,10 @@ Flow control Supports configuring link flow control. * **[implements] eth_dev_ops**: ``flow_ctrl_get``, ``flow_ctrl_set``, - ``priority_flow_ctrl_set``. + ``priority_flow_ctrl_set``, ``priority_flow_ctrl_queue_set``. * **[related] API**: ``rte_eth_dev_flow_ctrl_get()``, ``rte_eth_dev_flow_ctrl_set()``, - ``rte_eth_dev_priority_flow_ctrl_set()``. + ``rte_eth_dev_priority_flow_ctrl_set()``, ``rte_eth_dev_priority_flow_ctrl_queue_set()``. +* **[provides] rte_eth_dev_info**: ``pfc_queue_tc_max``. .. _nic_features_rate_limitation: diff --git a/doc/guides/rel_notes/release_22_03.rst b/doc/guides/rel_notes/release_22_03.rst index 6d99d1eaa9..b75c0356e6 100644 --- a/doc/guides/rel_notes/release_22_03.rst +++ b/doc/guides/rel_notes/release_22_03.rst @@ -55,6 +55,9 @@ New Features Also, make sure to start the actual text at the margin. ======================================================= +* **Added an API to enable queue based priority flow ctrl(PFC).** + + A new API, ``rte_eth_dev_priority_flow_ctrl_queue_set()``, was added. Removed Items ------------- diff --git a/lib/ethdev/ethdev_driver.h b/lib/ethdev/ethdev_driver.h index d95605a355..e0bbfe89d7 100644 --- a/lib/ethdev/ethdev_driver.h +++ b/lib/ethdev/ethdev_driver.h @@ -532,6 +532,9 @@ typedef int (*flow_ctrl_set_t)(struct rte_eth_dev *dev, /** @internal Setup priority flow control parameter on an Ethernet device. */ typedef int (*priority_flow_ctrl_set_t)(struct rte_eth_dev *dev, struct rte_eth_pfc_conf *pfc_conf); +/** @internal Queue setup for priority flow control parameter on an Ethernet device. */ +typedef int (*priority_flow_ctrl_queue_set_t)(struct rte_eth_dev *dev, + struct rte_eth_pfc_queue_conf *pfc_queue_conf); /** @internal Update RSS redirection table on an Ethernet device. */ typedef int (*reta_update_t)(struct rte_eth_dev *dev, @@ -1080,7 +1083,8 @@ struct eth_dev_ops { flow_ctrl_set_t flow_ctrl_set; /**< Setup flow control */ /** Setup priority flow control */ priority_flow_ctrl_set_t priority_flow_ctrl_set; - + /** Priority flow control queue setup */ + priority_flow_ctrl_queue_set_t priority_flow_ctrl_queue_set; /** Set Unicast Table Array */ eth_uc_hash_table_set_t uc_hash_table_set; /** Set Unicast hash bitmap */ diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c index a1d475a292..6def057720 100644 --- a/lib/ethdev/rte_ethdev.c +++ b/lib/ethdev/rte_ethdev.c @@ -3998,7 +3998,9 @@ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, struct rte_eth_pfc_conf *pfc_conf) { + struct rte_eth_dev_info dev_info; struct rte_eth_dev *dev; + int ret; RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); dev = &rte_eth_devices[port_id]; @@ -4010,6 +4012,17 @@ rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, return -EINVAL; } + ret = rte_eth_dev_info_get(port_id, &dev_info); + if (ret != 0) + return ret; + + if (dev_info.pfc_queue_tc_max != 0) { + RTE_ETHDEV_LOG(ERR, + "Ethdev port %u driver does not support port level PFC config\n", + port_id); + return -ENOTSUP; + } + if (pfc_conf->priority > (RTE_ETH_DCB_NUM_USER_PRIORITIES - 1)) { RTE_ETHDEV_LOG(ERR, "Invalid priority, only 0-7 allowed\n"); return -EINVAL; @@ -4022,6 +4035,102 @@ rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, return -ENOTSUP; } +static inline int +validate_rx_pause_config(struct rte_eth_dev_info *dev_info, + struct rte_eth_pfc_queue_conf *pfc_queue_conf) +{ + if ((pfc_queue_conf->mode == RTE_ETH_FC_RX_PAUSE) || + (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { + if (pfc_queue_conf->rx_pause.tx_qid >= dev_info->nb_tx_queues) { + RTE_ETHDEV_LOG(ERR, "Tx queue not in range for Rx pause" + " (requested: %d configured: %d)\n", + pfc_queue_conf->rx_pause.tx_qid, + dev_info->nb_tx_queues); + return -EINVAL; + } + + if (pfc_queue_conf->rx_pause.tc >= dev_info->pfc_queue_tc_max) { + RTE_ETHDEV_LOG(ERR, "TC not in range for Rx pause" + " (requested: %d max: %d)\n", + pfc_queue_conf->rx_pause.tc, + dev_info->pfc_queue_tc_max); + return -EINVAL; + } + } + + return 0; +} + +static inline int +validate_tx_pause_config(struct rte_eth_dev_info *dev_info, + struct rte_eth_pfc_queue_conf *pfc_queue_conf) +{ + if ((pfc_queue_conf->mode == RTE_ETH_FC_TX_PAUSE) || + (pfc_queue_conf->mode == RTE_ETH_FC_FULL)) { + if (pfc_queue_conf->tx_pause.rx_qid >= dev_info->nb_rx_queues) { + RTE_ETHDEV_LOG(ERR, "Rx queue not in range for Tx pause" + "(requested: %d configured: %d)\n", + pfc_queue_conf->tx_pause.rx_qid, + dev_info->nb_rx_queues); + return -EINVAL; + } + + if (pfc_queue_conf->tx_pause.tc >= dev_info->pfc_queue_tc_max) { + RTE_ETHDEV_LOG(ERR, "TC not in range for Tx pause" + "(requested: %d max: %d)\n", + pfc_queue_conf->tx_pause.tc, + dev_info->pfc_queue_tc_max); + return -EINVAL; + } + } + + return 0; +} + +int +rte_eth_dev_priority_flow_ctrl_queue_set( + uint16_t port_id, struct rte_eth_pfc_queue_conf *pfc_queue_conf) +{ + struct rte_eth_dev_info dev_info; + struct rte_eth_dev *dev; + int ret; + + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -ENODEV); + dev = &rte_eth_devices[port_id]; + + if (pfc_queue_conf == NULL) { + RTE_ETHDEV_LOG(ERR, "PFC parameters are NULL for port (%u)\n", + port_id); + return -EINVAL; + } + + ret = rte_eth_dev_info_get(port_id, &dev_info); + if (ret != 0) + return ret; + + if (dev_info.pfc_queue_tc_max == 0) { + RTE_ETHDEV_LOG(ERR, + "Ethdev port %u does not support PFC TC values\n", + port_id); + return -ENOTSUP; + } + + ret = validate_rx_pause_config(&dev_info, pfc_queue_conf); + if (ret != 0) + return ret; + + ret = validate_tx_pause_config(&dev_info, pfc_queue_conf); + if (ret != 0) + return ret; + + + if (*dev->dev_ops->priority_flow_ctrl_queue_set) + return eth_err(port_id, + (*dev->dev_ops->priority_flow_ctrl_queue_set)( + dev, pfc_queue_conf)); + return -ENOTSUP; +} + static int eth_check_reta_mask(struct rte_eth_rss_reta_entry64 *reta_conf, uint16_t reta_size) diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h index fa299c8ad7..33e43f6e1d 100644 --- a/lib/ethdev/rte_ethdev.h +++ b/lib/ethdev/rte_ethdev.h @@ -1395,6 +1395,28 @@ struct rte_eth_pfc_conf { uint8_t priority; /**< VLAN User Priority. */ }; +/** + * @warning + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice + * + * A structure used to configure Ethernet priority flow control parameter for + * ethdev queues. + */ +struct rte_eth_pfc_queue_conf { + enum rte_eth_fc_mode mode; /**< Link flow control mode */ + + struct { + uint16_t tx_qid; /**< Tx queue ID */ + uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ + } rx_pause; /* Valid when (mode == FC_RX_PAUSE || mode == FC_FULL) */ + + struct { + uint16_t pause_time; /**< Pause quota in the Pause frame */ + uint16_t rx_qid; /**< Rx queue ID */ + uint8_t tc; /**< Traffic class as per PFC (802.1Qbb) spec */ + } tx_pause; /* Valid when (mode == FC_TX_PAUSE || mode == FC_FULL) */ +}; + /** * Tunnel type for device-specific classifier configuration. * @see rte_eth_udp_tunnel @@ -1841,8 +1863,30 @@ struct rte_eth_dev_info { * embedded managed interconnect/switch. */ struct rte_eth_switch_info switch_info; - - uint64_t reserved_64s[2]; /**< Reserved for future fields */ + /** + * Maximum supported traffic class as per PFC (802.1Qbb) specification. + * + * Based on device support and use-case need, there are two different + * ways to enable PFC. The first case is the port level PFC + * configuration, in this case, rte_eth_dev_priority_flow_ctrl_set() + * API shall be used to configure the PFC, and PFC frames will be + * generated using based on VLAN TC value. + * The second case is the queue level PFC configuration, in this case, + * Any packet field content can be used to steer the packet to the + * specific queue using rte_flow or RSS and then use + * rte_eth_dev_priority_flow_ctrl_queue_set() to set the TC mapping + * on each queue. Based on congestion selected on the specific queue, + * configured TC shall be used to generate PFC frames. + * + * When set to non zero value, application must use queue level + * PFC configuration via rte_eth_dev_priority_flow_ctrl_queue_set() API + * instead of port level PFC configuration via + * rte_eth_dev_priority_flow_ctrl_set() API to realize + * PFC configuration. + */ + uint8_t pfc_queue_tc_max; + uint8_t reserved_8s[7]; + uint64_t reserved_64s[1]; /**< Reserved for future fields */ void *reserved_ptrs[2]; /**< Reserved for future fields */ }; @@ -4109,6 +4153,9 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id, * Configure the Ethernet priority flow control under DCB environment * for Ethernet device. * + * @see struct rte_eth_dev_info::pfc_queue_tc_max priority + * flow control usage models. + * * @param port_id * The port identifier of the Ethernet device. * @param pfc_conf @@ -4119,10 +4166,40 @@ int rte_eth_dev_flow_ctrl_set(uint16_t port_id, * - (-ENODEV) if *port_id* invalid. * - (-EINVAL) if bad parameter * - (-EIO) if flow control setup failure or device is removed. + * */ int rte_eth_dev_priority_flow_ctrl_set(uint16_t port_id, - struct rte_eth_pfc_conf *pfc_conf); + struct rte_eth_pfc_conf *pfc_conf); +/** + * @warning + * @b EXPERIMENTAL: this API may change without prior notice. + * + * Configure the Ethernet priority flow control for a given queue + * for Ethernet device. + * + * @see struct rte_eth_dev_info::pfc_queue_tc_max priority flow control + * usage models. + * + * @note When an ethdev port switches to PFC mode, the unconfigured + * queues shall be configured by the driver with default values such as + * lower priority value for TC etc. + * + * @param port_id + * The port identifier of the Ethernet device. + * @param pfc_queue_conf + * The pointer to the structure of the priority flow control parameters + * for the queue. + * @return + * - (0) if successful. + * - (-ENOTSUP) if hardware doesn't support priority flow control mode. + * - (-ENODEV) if *port_id* invalid. + * - (-EINVAL) if bad parameter + * - (-EIO) if flow control setup queue failure + */ +__rte_experimental +int rte_eth_dev_priority_flow_ctrl_queue_set(uint16_t port_id, + struct rte_eth_pfc_queue_conf *pfc_queue_conf); /** * Add a MAC address to the set used for filtering incoming packets. * diff --git a/lib/ethdev/version.map b/lib/ethdev/version.map index c2fb0669a4..8f361ec15a 100644 --- a/lib/ethdev/version.map +++ b/lib/ethdev/version.map @@ -256,6 +256,9 @@ EXPERIMENTAL { rte_flow_flex_item_create; rte_flow_flex_item_release; rte_flow_pick_transfer_proxy; + + # added in 22.03 + rte_eth_dev_priority_flow_ctrl_queue_set; }; INTERNAL {