[RFC] ethdev: support Tx queue free descriptor query

Message ID 20231219172948.3909749-1-jerinj@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [RFC] ethdev: support Tx queue free descriptor query |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS

Commit Message

Jerin Jacob Kollanukkaran Dec. 19, 2023, 5:29 p.m. UTC
From: Jerin Jacob <jerinj@marvell.com>

Introduce a new API to retrieve the number of available free descriptors
in a Tx queue. Applications can leverage this API in the fast path to
inspect the Tx queue occupancy and take appropriate actions based on the
available free descriptors.

A notable use case could be implementing Random Early Discard (RED)
in software based on Tx queue occupancy.

Signed-off-by: Jerin Jacob <jerinj@marvell.com>
---
 doc/guides/nics/features.rst         | 10 ++++
 doc/guides/nics/features/default.ini |  1 +
 lib/ethdev/ethdev_trace_points.c     |  3 ++
 lib/ethdev/rte_ethdev.h              | 78 ++++++++++++++++++++++++++++
 lib/ethdev/rte_ethdev_core.h         |  7 ++-
 lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
 6 files changed, 106 insertions(+), 1 deletion(-)
  

Comments

Cristian Dumitrescu Jan. 4, 2024, 1:16 p.m. UTC | #1
> -----Original Message-----
> From: jerinj@marvell.com <jerinj@marvell.com>
> Sent: Tuesday, December 19, 2023 5:30 PM
> To: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Ferruh Yigit
> <ferruh.yigit@amd.com>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Cc: ferruh.yigit@xilinx.com; ajit.khaparde@broadcom.com;
> aboyer@pensando.io; Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; chas3@att.com; chenbo.xia@intel.com; Loftus,
> Ciara <ciara.loftus@intel.com>; dsinghrawat@marvell.com; Czeck, Ed
> <ed.czeck@atomicrules.com>; evgenys@amazon.com; grive@u256.net;
> g.singh@nxp.com; zhouguoyang@huawei.com; Wang, Haiyue
> <haiyue.wang@intel.com>; hkalra@marvell.com; heinrich.kuhn@corigine.com;
> hemant.agrawal@nxp.com; hyonkim@cisco.com; igorch@amazon.com;
> irusskikh@marvell.com; jgrajcia@cisco.com; Singh, Jasvinder
> <jasvinder.singh@intel.com>; jianwang@trustnetic.com;
> jiawenwu@trustnetic.com; Wu, Jingjing <jingjing.wu@intel.com>;
> johndale@cisco.com; john.miller@atomicrules.com; linville@tuxdriver.com;
> Wiles, Keith <keith.wiles@intel.com>; kirankumark@marvell.com;
> oulijun@huawei.com; lironh@marvell.com; longli@microsoft.com;
> mw@semihalf.com; spinler@cesnet.cz; matan@nvidia.com; Peters, Matt
> <matt.peters@windriver.com>; maxime.coquelin@redhat.com;
> mk@semihalf.com; humin29@huawei.com; pnalla@marvell.com;
> ndabilpuram@marvell.com; Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> <qi.z.zhang@intel.com>; radhac@marvell.com; rahul.lakkireddy@chelsio.com;
> rmody@marvell.com; Xu, Rosen <rosen.xu@intel.com>;
> sachin.saxena@oss.nxp.com; skoteshwar@marvell.com; shshaikh@marvell.com;
> shaibran@amazon.com; Siegel, Shepard <shepard.siegel@atomicrules.com>;
> asomalap@amd.com; somnath.kotur@broadcom.com;
> sthemmin@microsoft.com; Webster, Steven <steven.webster@windriver.com>;
> skori@marvell.com; mtetsuyah@gmail.com; vburru@marvell.com;
> viacheslavo@nvidia.com; Wang, Xiao W <xiao.w.wang@intel.com>;
> cloud.wangxiaoyun@huawei.com; yisen.zhuang@huawei.com; Wang, Yong
> <yongwang@vmware.com>; xuanziyang2@huawei.com; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>; Jerin Jacob <jerinj@marvell.com>
> Subject: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> 
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introduce a new API to retrieve the number of available free descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---
>  doc/guides/nics/features.rst         | 10 ++++
>  doc/guides/nics/features/default.ini |  1 +
>  lib/ethdev/ethdev_trace_points.c     |  3 ++
>  lib/ethdev/rte_ethdev.h              | 78 ++++++++++++++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h         |  7 ++-
>  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
>  6 files changed, 106 insertions(+), 1 deletion(-)
> 
> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
> index f7d9980849..9d6655473a 100644
> --- a/doc/guides/nics/features.rst
> +++ b/doc/guides/nics/features.rst
> @@ -962,6 +962,16 @@ management (see :doc:`../prog_guide/power_man` for
> more details).
> 
>  * **[implements] eth_dev_ops**: ``get_monitor_addr``
> 
> +.. _nic_features_tx_queue_free_desc_query:
> +
> +Tx queue free descriptor query
> +------------------------------
> +
> +Supports to get the number of free descriptors in a Tx queue.
> +
> +* **[implements] eth_dev_ops**: ``tx_queue_free_desc_get``.
> +* **[related] API**: ``rte_eth_tx_queue_free_desc_get()``.
> +
>  .. _nic_features_other:
> 
>  Other dev ops not represented by a Feature
> diff --git a/doc/guides/nics/features/default.ini
> b/doc/guides/nics/features/default.ini
> index 806cb033ff..b30002b1c1 100644
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
>  Timesync             =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx free descriptor query =
>  Basic stats          =
>  Extended stats       =
>  Stats per queue      =
> diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
> index 91f71d868b..346f37f2e4 100644
> --- a/lib/ethdev/ethdev_trace_points.c
> +++ b/lib/ethdev/ethdev_trace_points.c
> @@ -481,6 +481,9 @@
> RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
>  RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
>  	lib.ethdev.map_aggr_tx_affinity)
> 
> +RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_free_desc_get,
> +	lib.ethdev.tx_queue_free_desc_get)
> +
>  RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
>  	lib.ethdev.flow.copy)
> 
> diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
> index 77331ce652..033fcb8c9b 100644
> --- a/lib/ethdev/rte_ethdev.h
> +++ b/lib/ethdev/rte_ethdev.h
> @@ -6802,6 +6802,84 @@ rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t
> rx_queue_id,
>  __rte_experimental
>  int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t
> *ptypes, int num);
> 
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
> + *
> + * Get the number of free descriptors in a Tx queue.
> + *
> + * This function retrieves the number of available free descriptors in a
> + * transmit queue. Applications can use this API in the fast path to inspect
> + * Tx queue occupancy and take appropriate actions based on the available
> + * free descriptors. An example action could be implementing the
> + * Random Early Discard (RED).
> + *
> + * If there are no packets in the Tx queue, the function returns the value
> + * of `nb_tx_desc` provided during the initialization of the Tx queue using
> + * rte_eth_tx_queue_setup(), signifying that all descriptors are free.
> + *
> + * @param port_id
> + *   The port identifier of the device.
> + * @param tx_queue_id
> + *   The index of the transmit queue.
> + *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
> + *   to rte_eth_dev_configure().
> + * @return
> + *   - (<= UINT16_MAX) Number of free descriptors in a Tx queue
> + *   - (> UINT16_MAX) if error. Enabled only when RTE_ETHDEV_DEBUG_TX is
> enabled
> + *
> + * @note This function is designed for fast-path use.
> + *
> + */
> +__rte_experimental
> +static inline uint32_t
> +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)
> +{
> +	struct rte_eth_fp_ops *fops;
> +	uint32_t rc;
> +	void *qd;
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	rc = UINT32_MAX;
> +	if (port_id >= RTE_MAX_ETHPORTS || tx_queue_id >=
> RTE_MAX_QUEUES_PER_PORT) {
> +		RTE_ETHDEV_LOG(ERR, "Invalid port_id=%u or
> tx_queue_id=%u\n",
> +				port_id, tx_queue_id);
> +
> +		rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id,
> rc);
> +		return rc;
> +	}
> +#endif
> +
> +	/* Fetch pointer to Tx queue data */
> +	fops = &rte_eth_fp_ops[port_id];
> +	qd = fops->txq.data[tx_queue_id];
> +
> +#ifdef RTE_ETHDEV_DEBUG_TX
> +	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> +
> +	if (qd == NULL) {
> +		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u for
> port_id=%u\n",
> +				tx_queue_id, port_id);
> +
> +		rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id,
> rc);
> +		return rc;
> +	}
> +
> +	if (fops->tx_queue_free_desc_get == NULL) {
> +		RTE_ETHDEV_LOG(ERR, "tx_queue_free_desc_get callback not
> implementedd Tx queue_id=%u for port_id=%u\n",
> +				tx_queue_id, port_id);
> +
> +		rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id,
> rc);
> +		return rc;
> +	}
> +#endif
> +	rc = fops->tx_queue_free_desc_get(qd);
> +
> +	rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id, rc);
> +
> +	return rc;
> +}
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
> index 4bfaf79c6c..5b7ee66ee7 100644
> --- a/lib/ethdev/rte_ethdev_core.h
> +++ b/lib/ethdev/rte_ethdev_core.h
> @@ -60,6 +60,9 @@ typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void
> *txq,
>  /** @internal Refill Rx descriptors with the recycling mbufs */
>  typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
> 
> +/** @internal Get the number of free descriptors count of a Tx queue */
> +typedef uint16_t (*eth_tx_queue_free_desc_get_t)(void *txq);
> +
>  /**
>   * @internal
>   * Structure used to hold opaque pointers to internal ethdev Rx/Tx
> @@ -116,7 +119,9 @@ struct rte_eth_fp_ops {
>  	eth_tx_descriptor_status_t tx_descriptor_status;
>  	/** Copy used mbufs from Tx mbuf ring into Rx. */
>  	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
> -	uintptr_t reserved2[2];
> +	/** Get the number of free descriptors count of a Tx queue. */
> +	eth_tx_queue_free_desc_get_t tx_queue_free_desc_get;
> +	uintptr_t reserved2[1];
>  	/**@}*/
> 
>  } __rte_cache_aligned;
> diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
> index 186271c9ff..2c57b39bd2 100644
> --- a/lib/ethdev/rte_ethdev_trace_fp.h
> +++ b/lib/ethdev/rte_ethdev_trace_fp.h
> @@ -73,6 +73,14 @@ RTE_TRACE_POINT_FP(
>  	rte_trace_point_emit_u64(count);
>  )
> 
> +RTE_TRACE_POINT_FP(
> +	rte_eth_trace_tx_queue_free_desc_get,
> +	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id,
> uint32_t nb_free_desc),
> +	rte_trace_point_emit_u16(port_id);
> +	rte_trace_point_emit_u16(tx_queue_id);
> +	rte_trace_point_emit_u32(nb_free_desc);
> +)
> +
>  #ifdef __cplusplus
>  }
>  #endif
> --
> 2.43.0


Hi Jerin,

I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not for RX queues as well?

Regards,
Cristian
  
Jerin Jacob Jan. 4, 2024, 1:35 p.m. UTC | #2
On Thu, Jan 4, 2024 at 6:46 PM Dumitrescu, Cristian
<cristian.dumitrescu@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: jerinj@marvell.com <jerinj@marvell.com>
> > Sent: Tuesday, December 19, 2023 5:30 PM
> > To: dev@dpdk.org; Thomas Monjalon <thomas@monjalon.net>; Ferruh Yigit
> > <ferruh.yigit@amd.com>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> > Cc: ferruh.yigit@xilinx.com; ajit.khaparde@broadcom.com;
> > aboyer@pensando.io; Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce
> > <bruce.richardson@intel.com>; chas3@att.com; chenbo.xia@intel.com; Loftus,
> > Ciara <ciara.loftus@intel.com>; dsinghrawat@marvell.com; Czeck, Ed
> > <ed.czeck@atomicrules.com>; evgenys@amazon.com; grive@u256.net;
> > g.singh@nxp.com; zhouguoyang@huawei.com; Wang, Haiyue
> > <haiyue.wang@intel.com>; hkalra@marvell.com; heinrich.kuhn@corigine.com;
> > hemant.agrawal@nxp.com; hyonkim@cisco.com; igorch@amazon.com;
> > irusskikh@marvell.com; jgrajcia@cisco.com; Singh, Jasvinder
> > <jasvinder.singh@intel.com>; jianwang@trustnetic.com;
> > jiawenwu@trustnetic.com; Wu, Jingjing <jingjing.wu@intel.com>;
> > johndale@cisco.com; john.miller@atomicrules.com; linville@tuxdriver.com;
> > Wiles, Keith <keith.wiles@intel.com>; kirankumark@marvell.com;
> > oulijun@huawei.com; lironh@marvell.com; longli@microsoft.com;
> > mw@semihalf.com; spinler@cesnet.cz; matan@nvidia.com; Peters, Matt
> > <matt.peters@windriver.com>; maxime.coquelin@redhat.com;
> > mk@semihalf.com; humin29@huawei.com; pnalla@marvell.com;
> > ndabilpuram@marvell.com; Yang, Qiming <qiming.yang@intel.com>; Zhang, Qi Z
> > <qi.z.zhang@intel.com>; radhac@marvell.com; rahul.lakkireddy@chelsio.com;
> > rmody@marvell.com; Xu, Rosen <rosen.xu@intel.com>;
> > sachin.saxena@oss.nxp.com; skoteshwar@marvell.com; shshaikh@marvell.com;
> > shaibran@amazon.com; Siegel, Shepard <shepard.siegel@atomicrules.com>;
> > asomalap@amd.com; somnath.kotur@broadcom.com;
> > sthemmin@microsoft.com; Webster, Steven <steven.webster@windriver.com>;
> > skori@marvell.com; mtetsuyah@gmail.com; vburru@marvell.com;
> > viacheslavo@nvidia.com; Wang, Xiao W <xiao.w.wang@intel.com>;
> > cloud.wangxiaoyun@huawei.com; yisen.zhuang@huawei.com; Wang, Yong
> > <yongwang@vmware.com>; xuanziyang2@huawei.com; Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com>; Jerin Jacob <jerinj@marvell.com>
> > Subject: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> >
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of available free descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > ---
> >  doc/guides/nics/features.rst         | 10 ++++
> >  doc/guides/nics/features/default.ini |  1 +
> >  lib/ethdev/ethdev_trace_points.c     |  3 ++
> >  lib/ethdev/rte_ethdev.h              | 78 ++++++++++++++++++++++++++++
> >  lib/ethdev/rte_ethdev_core.h         |  7 ++-
> >  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
> >  6 files changed, 106 insertions(+), 1 deletion(-)
>
> Hi Jerin,

Hi Cristian,

>
> I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not for RX queues as well?

I see no harm in adding for Rx as well. I think, it is better to have
separate API for each instead of adding argument as it is fast path
API.
If so, we could add a new API when there is any PMD implementation or
need for this.

>
> Regards,
> Cristian
  
Konstantin Ananyev Jan. 4, 2024, 2:21 p.m. UTC | #3
> > > Introduce a new API to retrieve the number of available free descriptors
> > > in a Tx queue. Applications can leverage this API in the fast path to
> > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > available free descriptors.
> > >
> > > A notable use case could be implementing Random Early Discard (RED)
> > > in software based on Tx queue occupancy.
> > >
> > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > ---
> > >  doc/guides/nics/features.rst         | 10 ++++
> > >  doc/guides/nics/features/default.ini |  1 +
> > >  lib/ethdev/ethdev_trace_points.c     |  3 ++
> > >  lib/ethdev/rte_ethdev.h              | 78 ++++++++++++++++++++++++++++
> > >  lib/ethdev/rte_ethdev_core.h         |  7 ++-
> > >  lib/ethdev/rte_ethdev_trace_fp.h     |  8 +++
> > >  6 files changed, 106 insertions(+), 1 deletion(-)
> >
> > Hi Jerin,
> 
> Hi Cristian,
> 
> >
> > I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not for RX
> queues as well?
> 
> I see no harm in adding for Rx as well. I think, it is better to have
> separate API for each instead of adding argument as it is fast path
> API.
> If so, we could add a new API when there is any PMD implementation or
> need for this.

I think for RX we already have similar one:
/** @internal Get number of used descriptors on a receive queue. */
typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
  
Thomas Monjalon Jan. 4, 2024, 6:29 p.m. UTC | #4
04/01/2024 15:21, Konstantin Ananyev:
> 
> > > > Introduce a new API to retrieve the number of available free descriptors
> > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > available free descriptors.
> > > >
> > > > A notable use case could be implementing Random Early Discard (RED)
> > > > in software based on Tx queue occupancy.
> > > >
> > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > >
> > > I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not for RX
> > queues as well?
> > 
> > I see no harm in adding for Rx as well. I think, it is better to have
> > separate API for each instead of adding argument as it is fast path
> > API.
> > If so, we could add a new API when there is any PMD implementation or
> > need for this.
> 
> I think for RX we already have similar one:
> /** @internal Get number of used descriptors on a receive queue. */
> typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);

rte_eth_rx_queue_count() gives the number of Rx used descriptors
rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
rte_eth_tx_descriptor_status() gives the status of one Tx descriptor

This patch is adding a function to get Tx available descriptors,
rte_eth_tx_queue_free_desc_get().
I can see a symmetry with rte_eth_rx_queue_count().
For consistency I would rename it to rte_eth_tx_queue_free_count().

Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?
  
Thomas Monjalon Jan. 4, 2024, 9:17 p.m. UTC | #5
19/12/2023 18:29, jerinj@marvell.com:
> --- a/doc/guides/nics/features/default.ini
> +++ b/doc/guides/nics/features/default.ini
> @@ -59,6 +59,7 @@ Packet type parsing  =
> 
>  Timesync             =
>  Rx descriptor status =
>  Tx descriptor status =
> +Tx free descriptor query =

I think we can drop "query" here.


> +__rte_experimental
> +static inline uint32_t
> +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)

For consistency with rte_eth_rx_queue_count(),
I propose the name rte_eth_tx_queue_free_count().
  
Jerin Jacob Jan. 5, 2024, 9:54 a.m. UTC | #6
On Fri, Jan 5, 2024 at 4:04 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 19/12/2023 18:29, jerinj@marvell.com:
> > --- a/doc/guides/nics/features/default.ini
> > +++ b/doc/guides/nics/features/default.ini
> > @@ -59,6 +59,7 @@ Packet type parsing  =
> >
> >  Timesync             =
> >  Rx descriptor status =
> >  Tx descriptor status =
> > +Tx free descriptor query =
>
> I think we can drop "query" here.

How about "Tx queue free count" then?

>
>
> > +__rte_experimental
> > +static inline uint32_t
> > +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)
>
> For consistency with rte_eth_rx_queue_count(),
> I propose the name rte_eth_tx_queue_free_count().

Make sense. I will change it in next version.


>
>
>
  
Jerin Jacob Jan. 5, 2024, 9:57 a.m. UTC | #7
On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 04/01/2024 15:21, Konstantin Ananyev:
> >
> > > > > Introduce a new API to retrieve the number of available free descriptors
> > > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > > available free descriptors.
> > > > >
> > > > > A notable use case could be implementing Random Early Discard (RED)
> > > > > in software based on Tx queue occupancy.
> > > > >
> > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > >
> > > > I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not for RX
> > > queues as well?
> > >
> > > I see no harm in adding for Rx as well. I think, it is better to have
> > > separate API for each instead of adding argument as it is fast path
> > > API.
> > > If so, we could add a new API when there is any PMD implementation or
> > > need for this.
> >
> > I think for RX we already have similar one:
> > /** @internal Get number of used descriptors on a receive queue. */
> > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
>
> rte_eth_rx_queue_count() gives the number of Rx used descriptors
> rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
> rte_eth_tx_descriptor_status() gives the status of one Tx descriptor
>
> This patch is adding a function to get Tx available descriptors,
> rte_eth_tx_queue_free_desc_get().
> I can see a symmetry with rte_eth_rx_queue_count().
> For consistency I would rename it to rte_eth_tx_queue_free_count().
>
> Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?

IMO, rte_eth_rx_queue_free_count() is enough as
used count =  total desc number(configured via nb_tx_desc with
rte_eth_tx_queue_setup())  - free count

>
>
  
Thomas Monjalon Jan. 5, 2024, 10:02 a.m. UTC | #8
05/01/2024 10:54, Jerin Jacob:
> On Fri, Jan 5, 2024 at 4:04 AM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 19/12/2023 18:29, jerinj@marvell.com:
> > > --- a/doc/guides/nics/features/default.ini
> > > +++ b/doc/guides/nics/features/default.ini
> > > @@ -59,6 +59,7 @@ Packet type parsing  =
> > >
> > >  Timesync             =
> > >  Rx descriptor status =
> > >  Tx descriptor status =
> > > +Tx free descriptor query =
> >
> > I think we can drop "query" here.
> 
> How about "Tx queue free count" then?

No strong opinion. What others think?


> > > +__rte_experimental
> > > +static inline uint32_t
> > > +rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)
> >
> > For consistency with rte_eth_rx_queue_count(),
> > I propose the name rte_eth_tx_queue_free_count().
> 
> Make sense. I will change it in next version.
  
Thomas Monjalon Jan. 5, 2024, 10:03 a.m. UTC | #9
05/01/2024 10:57, Jerin Jacob:
> On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 04/01/2024 15:21, Konstantin Ananyev:
> > >
> > > > > > Introduce a new API to retrieve the number of available free descriptors
> > > > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > > > available free descriptors.
> > > > > >
> > > > > > A notable use case could be implementing Random Early Discard (RED)
> > > > > > in software based on Tx queue occupancy.
> > > > > >
> > > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > > >
> > > > > I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not for RX
> > > > queues as well?
> > > >
> > > > I see no harm in adding for Rx as well. I think, it is better to have
> > > > separate API for each instead of adding argument as it is fast path
> > > > API.
> > > > If so, we could add a new API when there is any PMD implementation or
> > > > need for this.
> > >
> > > I think for RX we already have similar one:
> > > /** @internal Get number of used descriptors on a receive queue. */
> > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> >
> > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
> > rte_eth_tx_descriptor_status() gives the status of one Tx descriptor
> >
> > This patch is adding a function to get Tx available descriptors,
> > rte_eth_tx_queue_free_desc_get().
> > I can see a symmetry with rte_eth_rx_queue_count().
> > For consistency I would rename it to rte_eth_tx_queue_free_count().
> >
> > Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?
> 
> IMO, rte_eth_rx_queue_free_count() is enough as
> used count =  total desc number(configured via nb_tx_desc with
> rte_eth_tx_queue_setup())  - free count

I'm fine with that.
  
Konstantin Ananyev Jan. 5, 2024, 11:12 a.m. UTC | #10
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, January 5, 2024 10:04 AM
> To: Jerin Jacob <jerinjacobk@gmail.com>
> Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Konstantin Ananyev <konstantin.ananyev@huawei.com>;
> jerinj@marvell.com; dev@dpdk.org; Ferruh Yigit <ferruh.yigit@amd.com>; Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>;
> ferruh.yigit@xilinx.com; ajit.khaparde@broadcom.com; aboyer@pensando.io; Xing, Beilei <beilei.xing@intel.com>; Richardson, Bruce
> <bruce.richardson@intel.com>; chas3@att.com; chenbo.xia@intel.com; Loftus, Ciara <ciara.loftus@intel.com>;
> dsinghrawat@marvell.com; Czeck, Ed <ed.czeck@atomicrules.com>; evgenys@amazon.com; grive@u256.net; g.singh@nxp.com;
> zhouguoyang@huawei.com; Wang, Haiyue <haiyue.wang@intel.com>; hkalra@marvell.com; heinrich.kuhn@corigine.com;
> hemant.agrawal@nxp.com; hyonkim@cisco.com; igorch@amazon.com; irusskikh@marvell.com; jgrajcia@cisco.com; Singh, Jasvinder
> <jasvinder.singh@intel.com>; jianwang@trustnetic.com; jiawenwu@trustnetic.com; Wu, Jingjing <jingjing.wu@intel.com>;
> johndale@cisco.com; john.miller@atomicrules.com; linville@tuxdriver.com; Wiles, Keith <keith.wiles@intel.com>;
> kirankumark@marvell.com; oulijun@huawei.com; lironh@marvell.com; longli@microsoft.com; mw@semihalf.com;
> spinler@cesnet.cz; matan@nvidia.com; Peters, Matt <matt.peters@windriver.com>; maxime.coquelin@redhat.com;
> mk@semihalf.com; humin (Q) <humin29@huawei.com>; pnalla@marvell.com; ndabilpuram@marvell.com; Yang, Qiming
> <qiming.yang@intel.com>; Zhang, Qi Z <qi.z.zhang@intel.com>; radhac@marvell.com; rahul.lakkireddy@chelsio.com;
> rmody@marvell.com; Xu, Rosen <rosen.xu@intel.com>; sachin.saxena@oss.nxp.com; skoteshwar@marvell.com;
> shshaikh@marvell.com; shaibran@amazon.com; Siegel, Shepard <shepard.siegel@atomicrules.com>; asomalap@amd.com;
> somnath.kotur@broadcom.com; sthemmin@microsoft.com; Webster, Steven <steven.webster@windriver.com>;
> skori@marvell.com; mtetsuyah@gmail.com; vburru@marvell.com; viacheslavo@nvidia.com; Wang, Xiao W
> <xiao.w.wang@intel.com>; Wangxiaoyun (Cloud) <cloud.wangxiaoyun@huawei.com>; Zhuangyuzeng (Yisen)
> <yisen.zhuang@huawei.com>; Wang, Yong <yongwang@vmware.com>; Xuanziyang (William) <william.xuanziyang@huawei.com>
> Subject: Re: [dpdk-dev] [RFC] ethdev: support Tx queue free descriptor query
> 
> 05/01/2024 10:57, Jerin Jacob:
> > On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 04/01/2024 15:21, Konstantin Ananyev:
> > > >
> > > > > > > Introduce a new API to retrieve the number of available free descriptors
> > > > > > > in a Tx queue. Applications can leverage this API in the fast path to
> > > > > > > inspect the Tx queue occupancy and take appropriate actions based on the
> > > > > > > available free descriptors.
> > > > > > >
> > > > > > > A notable use case could be implementing Random Early Discard (RED)
> > > > > > > in software based on Tx queue occupancy.
> > > > > > >
> > > > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > > > >
> > > > > > I think having an API to get the number of free descriptors per queue is a good idea. Why have it only for TX queues and not
> for RX
> > > > > queues as well?
> > > > >
> > > > > I see no harm in adding for Rx as well. I think, it is better to have
> > > > > separate API for each instead of adding argument as it is fast path
> > > > > API.
> > > > > If so, we could add a new API when there is any PMD implementation or
> > > > > need for this.
> > > >
> > > > I think for RX we already have similar one:
> > > > /** @internal Get number of used descriptors on a receive queue. */
> > > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> > >
> > > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > > rte_eth_rx_descriptor_status() gives the status of one Rx descriptor
> > > rte_eth_tx_descriptor_status() gives the status of one Tx descriptor
> > >
> > > This patch is adding a function to get Tx available descriptors,
> > > rte_eth_tx_queue_free_desc_get().
> > > I can see a symmetry with rte_eth_rx_queue_count().
> > > For consistency I would rename it to rte_eth_tx_queue_free_count().
> > >
> > > Should we add rte_eth_tx_queue_count() and rte_eth_rx_queue_free_count()?
> >
> > IMO, rte_eth_rx_queue_free_count() is enough as
> > used count =  total desc number(configured via nb_tx_desc with
> > rte_eth_tx_queue_setup())  - free count
> 
> I'm fine with that.
> 

Yep, agree.
If we ever need  rte_eth_rx_queue_free_count() and rte_eth_tx_queue_used_count(),
it could be done via slow-path as Jerin outlined above, no need to waste entries in fp_ops
for that.
  
Bruce Richardson Jan. 8, 2024, 10:54 a.m. UTC | #11
On Tue, Dec 19, 2023 at 10:59:48PM +0530, jerinj@marvell.com wrote:
> From: Jerin Jacob <jerinj@marvell.com>
> 
> Introduce a new API to retrieve the number of available free descriptors
> in a Tx queue. Applications can leverage this API in the fast path to
> inspect the Tx queue occupancy and take appropriate actions based on the
> available free descriptors.
> 
> A notable use case could be implementing Random Early Discard (RED)
> in software based on Tx queue occupancy.
> 
> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> ---

Hi Jerin,

while I don't strongly object to this patch, I wonder if it encourages
sub-optimal code implementations. To determine the number of free
descriptors in a ring, the driver in many cases will need to do a scan of
the descriptor ring to see how many are free/used. However, I suspect that
in most cases we will see something like the following being done:

    count = rte_eth_rx_free_count();
    if (count > X) {
        /* Do something */
    }

For instances like this, scanning the entire ring is wasteful of resources.
Instead it would be better to just check the descriptor at position X
explicitly. Going to the trouble of checking the exact descriptor count is
unnecessary in this case.

Out of interest, are you aware of a case where an app would need to know
exactly the number of free descriptors, and where the result would not be
just compared to one or more threshold values? Do we see cases where it
would be used in a computation, perhaps?

/Bruce
  
Morten Brørup Jan. 8, 2024, 8:54 p.m. UTC | #12
> From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> Sent: Friday, 5 January 2024 12.13
> 
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Friday, January 5, 2024 10:04 AM
> >
> > 05/01/2024 10:57, Jerin Jacob:
> > > On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon
> <thomas@monjalon.net> wrote:
> > > >
> > > > 04/01/2024 15:21, Konstantin Ananyev:
> > > > >
> > > > > > > > Introduce a new API to retrieve the number of available
> free descriptors
> > > > > > > > in a Tx queue. Applications can leverage this API in the
> fast path to
> > > > > > > > inspect the Tx queue occupancy and take appropriate
> actions based on the
> > > > > > > > available free descriptors.
> > > > > > > >
> > > > > > > > A notable use case could be implementing Random Early
> Discard (RED)
> > > > > > > > in software based on Tx queue occupancy.
> > > > > > > >
> > > > > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > > > > >
> > > > > > > I think having an API to get the number of free descriptors
> per queue is a good idea. Why have it only for TX queues and not
> > for RX
> > > > > > queues as well?
> > > > > >
> > > > > > I see no harm in adding for Rx as well. I think, it is better
> to have
> > > > > > separate API for each instead of adding argument as it is
> fast path
> > > > > > API.
> > > > > > If so, we could add a new API when there is any PMD
> implementation or
> > > > > > need for this.
> > > > >
> > > > > I think for RX we already have similar one:
> > > > > /** @internal Get number of used descriptors on a receive
> queue. */
> > > > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> > > >
> > > > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > > > rte_eth_rx_descriptor_status() gives the status of one Rx
> descriptor
> > > > rte_eth_tx_descriptor_status() gives the status of one Tx
> descriptor
> > > >
> > > > This patch is adding a function to get Tx available descriptors,
> > > > rte_eth_tx_queue_free_desc_get().
> > > > I can see a symmetry with rte_eth_rx_queue_count().
> > > > For consistency I would rename it to
> rte_eth_tx_queue_free_count().

For consistency with rte_eth_rx_queue_count() regarding both naming and behavior of the API, I would prefer:

rte_eth_tx_queue_count(), returning the number of used descriptors.

> > > >
> > > > Should we add rte_eth_tx_queue_count() and
> rte_eth_rx_queue_free_count()?
> > >
> > > IMO, rte_eth_rx_queue_free_count() is enough as
> > > used count =  total desc number(configured via nb_tx_desc with
> > > rte_eth_tx_queue_setup())  - free count
> >
> > I'm fine with that.
> >
> 
> Yep, agree.
> If we ever need  rte_eth_rx_queue_free_count() and
> rte_eth_tx_queue_used_count(),
> it could be done via slow-path as Jerin outlined above, no need to
> waste entries in fp_ops
> for that.

Yes, rte_eth_rx/tx_queue_avail_count() could be added in the ethdev library, without driver callbacks, but simply getting data from configured descriptor ring sizes and rte_eth_rx/tx_queue_count().

PS: For naming conventions, I sought inspiration from the mempool library. Also, I don't see any need to use "descs" as part of the names of the proposed functions.

The driver callback can be either "free count" or "used count", whichever is easier for the drivers to implement, or (preferably) whichever is likelier to be faster to execute in the most common case. I would assume that the TX descriptor "used count" is relatively low most of the time.
  
Morten Brørup Jan. 8, 2024, 9:15 p.m. UTC | #13
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Monday, 8 January 2024 11.54
> 
> On Tue, Dec 19, 2023 at 10:59:48PM +0530, jerinj@marvell.com wrote:
> > From: Jerin Jacob <jerinj@marvell.com>
> >
> > Introduce a new API to retrieve the number of available free
> descriptors
> > in a Tx queue. Applications can leverage this API in the fast path to
> > inspect the Tx queue occupancy and take appropriate actions based on
> the
> > available free descriptors.
> >
> > A notable use case could be implementing Random Early Discard (RED)
> > in software based on Tx queue occupancy.
> >
> > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > ---
> 
> Hi Jerin,
> 
> while I don't strongly object to this patch, I wonder if it encourages
> sub-optimal code implementations. To determine the number of free
> descriptors in a ring, the driver in many cases will need to do a scan
> of
> the descriptor ring to see how many are free/used. However, I suspect
> that
> in most cases we will see something like the following being done:
> 
>     count = rte_eth_rx_free_count();

Typo: rte_eth_rx_free_count() -> rte_eth_tx_free_count()

>     if (count > X) {
>         /* Do something */
>     }
> 
> For instances like this, scanning the entire ring is wasteful of
> resources.
> Instead it would be better to just check the descriptor at position X
> explicitly. Going to the trouble of checking the exact descriptor count
> is
> unnecessary in this case.

Yes, it introduces such a risk.
All DPDK examples simply call tx_burst() without checking free space first, so I think the probability (of the simple case) is low.
And the probability for the case comparing to X could be mitigated by referring to rte_eth_tx_descriptor_status() in the function description.

> 
> Out of interest, are you aware of a case where an app would need to
> know
> exactly the number of free descriptors, and where the result would not
> be
> just compared to one or more threshold values? Do we see cases where it
> would be used in a computation, perhaps?

Yes: RED. When exceeding the minimum threshold, the probability of marking/dropping a packet increases linearly with the queue's fill level.
E.g.:
0-900 packets in queue: don't drop,
901-1000 packets in queue: probability of dropping the packet is 1-100 % (e.g. 980 packets in queue = 80 % drop probability).
  
Bruce Richardson Jan. 9, 2024, 8:47 a.m. UTC | #14
On Mon, Jan 08, 2024 at 10:15:40PM +0100, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Monday, 8 January 2024 11.54
> > 
> > On Tue, Dec 19, 2023 at 10:59:48PM +0530, jerinj@marvell.com wrote:
> > > From: Jerin Jacob <jerinj@marvell.com>
> > >
> > > Introduce a new API to retrieve the number of available free
> > descriptors
> > > in a Tx queue. Applications can leverage this API in the fast path to
> > > inspect the Tx queue occupancy and take appropriate actions based on
> > the
> > > available free descriptors.
> > >
> > > A notable use case could be implementing Random Early Discard (RED)
> > > in software based on Tx queue occupancy.
> > >
> > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > ---
> > 
> > Hi Jerin,
> > 
> > while I don't strongly object to this patch, I wonder if it encourages
> > sub-optimal code implementations. To determine the number of free
> > descriptors in a ring, the driver in many cases will need to do a scan
> > of
> > the descriptor ring to see how many are free/used. However, I suspect
> > that
> > in most cases we will see something like the following being done:
> > 
> >     count = rte_eth_rx_free_count();
> 
> Typo: rte_eth_rx_free_count() -> rte_eth_tx_free_count()
> 
> >     if (count > X) {
> >         /* Do something */
> >     }
> > 
> > For instances like this, scanning the entire ring is wasteful of
> > resources.
> > Instead it would be better to just check the descriptor at position X
> > explicitly. Going to the trouble of checking the exact descriptor count
> > is
> > unnecessary in this case.
> 
> Yes, it introduces such a risk.
> All DPDK examples simply call tx_burst() without checking free space first, so I think the probability (of the simple case) is low.
> And the probability for the case comparing to X could be mitigated by referring to rte_eth_tx_descriptor_status() in the function description.
> 
> > 
> > Out of interest, are you aware of a case where an app would need to
> > know
> > exactly the number of free descriptors, and where the result would not
> > be
> > just compared to one or more threshold values? Do we see cases where it
> > would be used in a computation, perhaps?
> 
> Yes: RED. When exceeding the minimum threshold, the probability of marking/dropping a packet increases linearly with the queue's fill level.
> E.g.:
> 0-900 packets in queue: don't drop,
> 901-1000 packets in queue: probability of dropping the packet is 1-100 % (e.g. 980 packets in queue = 80 % drop probability).
> 
Ok, thanks for the info. All good with me so.

/Bruce
  
Jerin Jacob Jan. 9, 2024, 2:45 p.m. UTC | #15
On Tue, Jan 9, 2024 at 2:24 AM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Konstantin Ananyev [mailto:konstantin.ananyev@huawei.com]
> > Sent: Friday, 5 January 2024 12.13
> >
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > Sent: Friday, January 5, 2024 10:04 AM
> > >
> > > 05/01/2024 10:57, Jerin Jacob:
> > > > On Thu, Jan 4, 2024 at 11:59 PM Thomas Monjalon
> > <thomas@monjalon.net> wrote:
> > > > >
> > > > > 04/01/2024 15:21, Konstantin Ananyev:
> > > > > >
> > > > > > > > > Introduce a new API to retrieve the number of available
> > free descriptors
> > > > > > > > > in a Tx queue. Applications can leverage this API in the
> > fast path to
> > > > > > > > > inspect the Tx queue occupancy and take appropriate
> > actions based on the
> > > > > > > > > available free descriptors.
> > > > > > > > >
> > > > > > > > > A notable use case could be implementing Random Early
> > Discard (RED)
> > > > > > > > > in software based on Tx queue occupancy.
> > > > > > > > >
> > > > > > > > > Signed-off-by: Jerin Jacob <jerinj@marvell.com>
> > > > > > > >
> > > > > > > > I think having an API to get the number of free descriptors
> > per queue is a good idea. Why have it only for TX queues and not
> > > for RX
> > > > > > > queues as well?
> > > > > > >
> > > > > > > I see no harm in adding for Rx as well. I think, it is better
> > to have
> > > > > > > separate API for each instead of adding argument as it is
> > fast path
> > > > > > > API.
> > > > > > > If so, we could add a new API when there is any PMD
> > implementation or
> > > > > > > need for this.
> > > > > >
> > > > > > I think for RX we already have similar one:
> > > > > > /** @internal Get number of used descriptors on a receive
> > queue. */
> > > > > > typedef uint32_t (*eth_rx_queue_count_t)(void *rxq);
> > > > >
> > > > > rte_eth_rx_queue_count() gives the number of Rx used descriptors
> > > > > rte_eth_rx_descriptor_status() gives the status of one Rx
> > descriptor
> > > > > rte_eth_tx_descriptor_status() gives the status of one Tx
> > descriptor
> > > > >
> > > > > This patch is adding a function to get Tx available descriptors,
> > > > > rte_eth_tx_queue_free_desc_get().
> > > > > I can see a symmetry with rte_eth_rx_queue_count().
> > > > > For consistency I would rename it to
> > rte_eth_tx_queue_free_count().
>
> For consistency with rte_eth_rx_queue_count() regarding both naming and behavior of the API, I would prefer:
>
> rte_eth_tx_queue_count(), returning the number of used descriptors.

Ack. I will change to "used" version.

>
> > > > >
> > > > > Should we add rte_eth_tx_queue_count() and
> > rte_eth_rx_queue_free_count()?
> > > >
> > > > IMO, rte_eth_rx_queue_free_count() is enough as
> > > > used count =  total desc number(configured via nb_tx_desc with
> > > > rte_eth_tx_queue_setup())  - free count
> > >
> > > I'm fine with that.
> > >
> >
> > Yep, agree.
> > If we ever need  rte_eth_rx_queue_free_count() and
> > rte_eth_tx_queue_used_count(),
> > it could be done via slow-path as Jerin outlined above, no need to
> > waste entries in fp_ops
> > for that.
>
> Yes, rte_eth_rx/tx_queue_avail_count() could be added in the ethdev library, without driver callbacks, but simply getting data from configured descriptor ring sizes and rte_eth_rx/tx_queue_count().
>
> PS: For naming conventions, I sought inspiration from the mempool library. Also, I don't see any need to use "descs" as part of the names of the proposed functions.
>
> The driver callback can be either "free count" or "used count", whichever is easier for the drivers to implement, or (preferably) whichever is likelier to be faster to execute in the most common case. I would assume that the TX descriptor "used count" is relatively low most of the time.

I will change driver callback to "used count" to have better synergy
with public ethdev API.


>
  
Ferruh Yigit Jan. 12, 2024, 10:56 a.m. UTC | #16
On 1/8/2024 9:15 PM, Morten Brørup wrote:
>> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
>> Sent: Monday, 8 January 2024 11.54
>>
>> On Tue, Dec 19, 2023 at 10:59:48PM +0530, jerinj@marvell.com wrote:
>>> From: Jerin Jacob <jerinj@marvell.com>
>>>
>>> Introduce a new API to retrieve the number of available free
>> descriptors
>>> in a Tx queue. Applications can leverage this API in the fast path to
>>> inspect the Tx queue occupancy and take appropriate actions based on
>> the
>>> available free descriptors.
>>>
>>> A notable use case could be implementing Random Early Discard (RED)
>>> in software based on Tx queue occupancy.
>>>
>>> Signed-off-by: Jerin Jacob <jerinj@marvell.com>
>>> ---
>>
>> Hi Jerin,
>>
>> while I don't strongly object to this patch, I wonder if it encourages
>> sub-optimal code implementations. To determine the number of free
>> descriptors in a ring, the driver in many cases will need to do a scan
>> of
>> the descriptor ring to see how many are free/used. However, I suspect
>> that
>> in most cases we will see something like the following being done:
>>
>>     count = rte_eth_rx_free_count();
> 
> Typo: rte_eth_rx_free_count() -> rte_eth_tx_free_count()
> 
>>     if (count > X) {
>>         /* Do something */
>>     }
>>
>> For instances like this, scanning the entire ring is wasteful of
>> resources.
>> Instead it would be better to just check the descriptor at position X
>> explicitly. Going to the trouble of checking the exact descriptor count
>> is
>> unnecessary in this case.
> 
> Yes, it introduces such a risk.
> All DPDK examples simply call tx_burst() without checking free space first, so I think the probability (of the simple case) is low.
> And the probability for the case comparing to X could be mitigated by referring to rte_eth_tx_descriptor_status() in the function description.
> 

Agree on the risk,
perhaps adding some documentation to the API helps, we can highlight
intended usecase is QoS implementations, like RED etc..

>>
>> Out of interest, are you aware of a case where an app would need to
>> know
>> exactly the number of free descriptors, and where the result would not
>> be
>> just compared to one or more threshold values? Do we see cases where it
>> would be used in a computation, perhaps?
> 
> Yes: RED. When exceeding the minimum threshold, the probability of marking/dropping a packet increases linearly with the queue's fill level.
> E.g.:
> 0-900 packets in queue: don't drop,
> 901-1000 packets in queue: probability of dropping the packet is 1-100 % (e.g. 980 packets in queue = 80 % drop probability).
> 
>
  

Patch

diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
index f7d9980849..9d6655473a 100644
--- a/doc/guides/nics/features.rst
+++ b/doc/guides/nics/features.rst
@@ -962,6 +962,16 @@  management (see :doc:`../prog_guide/power_man` for more details).
 
 * **[implements] eth_dev_ops**: ``get_monitor_addr``
 
+.. _nic_features_tx_queue_free_desc_query:
+
+Tx queue free descriptor query
+------------------------------
+
+Supports to get the number of free descriptors in a Tx queue.
+
+* **[implements] eth_dev_ops**: ``tx_queue_free_desc_get``.
+* **[related] API**: ``rte_eth_tx_queue_free_desc_get()``.
+
 .. _nic_features_other:
 
 Other dev ops not represented by a Feature
diff --git a/doc/guides/nics/features/default.ini b/doc/guides/nics/features/default.ini
index 806cb033ff..b30002b1c1 100644
--- a/doc/guides/nics/features/default.ini
+++ b/doc/guides/nics/features/default.ini
@@ -59,6 +59,7 @@  Packet type parsing  =
 Timesync             =
 Rx descriptor status =
 Tx descriptor status =
+Tx free descriptor query =
 Basic stats          =
 Extended stats       =
 Stats per queue      =
diff --git a/lib/ethdev/ethdev_trace_points.c b/lib/ethdev/ethdev_trace_points.c
index 91f71d868b..346f37f2e4 100644
--- a/lib/ethdev/ethdev_trace_points.c
+++ b/lib/ethdev/ethdev_trace_points.c
@@ -481,6 +481,9 @@  RTE_TRACE_POINT_REGISTER(rte_eth_trace_count_aggr_ports,
 RTE_TRACE_POINT_REGISTER(rte_eth_trace_map_aggr_tx_affinity,
 	lib.ethdev.map_aggr_tx_affinity)
 
+RTE_TRACE_POINT_REGISTER(rte_eth_trace_tx_queue_free_desc_get,
+	lib.ethdev.tx_queue_free_desc_get)
+
 RTE_TRACE_POINT_REGISTER(rte_flow_trace_copy,
 	lib.ethdev.flow.copy)
 
diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h
index 77331ce652..033fcb8c9b 100644
--- a/lib/ethdev/rte_ethdev.h
+++ b/lib/ethdev/rte_ethdev.h
@@ -6802,6 +6802,84 @@  rte_eth_recycle_mbufs(uint16_t rx_port_id, uint16_t rx_queue_id,
 __rte_experimental
 int rte_eth_buffer_split_get_supported_hdr_ptypes(uint16_t port_id, uint32_t *ptypes, int num);
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this API may change, or be removed, without prior notice
+ *
+ * Get the number of free descriptors in a Tx queue.
+ *
+ * This function retrieves the number of available free descriptors in a
+ * transmit queue. Applications can use this API in the fast path to inspect
+ * Tx queue occupancy and take appropriate actions based on the available
+ * free descriptors. An example action could be implementing the
+ * Random Early Discard (RED).
+ *
+ * If there are no packets in the Tx queue, the function returns the value
+ * of `nb_tx_desc` provided during the initialization of the Tx queue using
+ * rte_eth_tx_queue_setup(), signifying that all descriptors are free.
+ *
+ * @param port_id
+ *   The port identifier of the device.
+ * @param tx_queue_id
+ *   The index of the transmit queue.
+ *   The value must be in the range [0, nb_tx_queue - 1] previously supplied
+ *   to rte_eth_dev_configure().
+ * @return
+ *   - (<= UINT16_MAX) Number of free descriptors in a Tx queue
+ *   - (> UINT16_MAX) if error. Enabled only when RTE_ETHDEV_DEBUG_TX is enabled
+ *
+ * @note This function is designed for fast-path use.
+ *
+ */
+__rte_experimental
+static inline uint32_t
+rte_eth_tx_queue_free_desc_get(uint16_t port_id, uint16_t tx_queue_id)
+{
+	struct rte_eth_fp_ops *fops;
+	uint32_t rc;
+	void *qd;
+
+#ifdef RTE_ETHDEV_DEBUG_TX
+	rc = UINT32_MAX;
+	if (port_id >= RTE_MAX_ETHPORTS || tx_queue_id >= RTE_MAX_QUEUES_PER_PORT) {
+		RTE_ETHDEV_LOG(ERR, "Invalid port_id=%u or tx_queue_id=%u\n",
+				port_id, tx_queue_id);
+
+		rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id, rc);
+		return rc;
+	}
+#endif
+
+	/* Fetch pointer to Tx queue data */
+	fops = &rte_eth_fp_ops[port_id];
+	qd = fops->txq.data[tx_queue_id];
+
+#ifdef RTE_ETHDEV_DEBUG_TX
+	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
+
+	if (qd == NULL) {
+		RTE_ETHDEV_LOG(ERR, "Invalid Tx queue_id=%u for port_id=%u\n",
+				tx_queue_id, port_id);
+
+		rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id, rc);
+		return rc;
+	}
+
+	if (fops->tx_queue_free_desc_get == NULL) {
+		RTE_ETHDEV_LOG(ERR, "tx_queue_free_desc_get callback not implementedd Tx queue_id=%u for port_id=%u\n",
+				tx_queue_id, port_id);
+
+		rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id, rc);
+		return rc;
+	}
+#endif
+	rc = fops->tx_queue_free_desc_get(qd);
+
+	rte_eth_trace_tx_queue_free_desc_get(port_id, tx_queue_id, rc);
+
+	return rc;
+}
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
index 4bfaf79c6c..5b7ee66ee7 100644
--- a/lib/ethdev/rte_ethdev_core.h
+++ b/lib/ethdev/rte_ethdev_core.h
@@ -60,6 +60,9 @@  typedef uint16_t (*eth_recycle_tx_mbufs_reuse_t)(void *txq,
 /** @internal Refill Rx descriptors with the recycling mbufs */
 typedef void (*eth_recycle_rx_descriptors_refill_t)(void *rxq, uint16_t nb);
 
+/** @internal Get the number of free descriptors count of a Tx queue */
+typedef uint16_t (*eth_tx_queue_free_desc_get_t)(void *txq);
+
 /**
  * @internal
  * Structure used to hold opaque pointers to internal ethdev Rx/Tx
@@ -116,7 +119,9 @@  struct rte_eth_fp_ops {
 	eth_tx_descriptor_status_t tx_descriptor_status;
 	/** Copy used mbufs from Tx mbuf ring into Rx. */
 	eth_recycle_tx_mbufs_reuse_t recycle_tx_mbufs_reuse;
-	uintptr_t reserved2[2];
+	/** Get the number of free descriptors count of a Tx queue. */
+	eth_tx_queue_free_desc_get_t tx_queue_free_desc_get;
+	uintptr_t reserved2[1];
 	/**@}*/
 
 } __rte_cache_aligned;
diff --git a/lib/ethdev/rte_ethdev_trace_fp.h b/lib/ethdev/rte_ethdev_trace_fp.h
index 186271c9ff..2c57b39bd2 100644
--- a/lib/ethdev/rte_ethdev_trace_fp.h
+++ b/lib/ethdev/rte_ethdev_trace_fp.h
@@ -73,6 +73,14 @@  RTE_TRACE_POINT_FP(
 	rte_trace_point_emit_u64(count);
 )
 
+RTE_TRACE_POINT_FP(
+	rte_eth_trace_tx_queue_free_desc_get,
+	RTE_TRACE_POINT_ARGS(uint16_t port_id, uint16_t tx_queue_id, uint32_t nb_free_desc),
+	rte_trace_point_emit_u16(port_id);
+	rte_trace_point_emit_u16(tx_queue_id);
+	rte_trace_point_emit_u32(nb_free_desc);
+)
+
 #ifdef __cplusplus
 }
 #endif