[v2,06/16] net/dpaa2: support multiple txqs en-queue for ordered

Message ID 20211227161645.24359-7-nipun.gupta@nxp.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Series: features and fixes on NXP eth devices

Checks

Context         Check     Description
ci/checkpatch   warning   coding style issues

Commit Message

Nipun Gupta Dec. 27, 2021, 4:16 p.m. UTC
  From: Jun Yang <jun.yang@nxp.com>

Support Tx enqueue in ordered queue mode, where the queue ID
for each event may be different.

Signed-off-by: Jun Yang <jun.yang@nxp.com>
---
 drivers/event/dpaa2/dpaa2_eventdev.c |  12 ++-
 drivers/net/dpaa2/dpaa2_ethdev.h     |   4 +
 drivers/net/dpaa2/dpaa2_rxtx.c       | 142 +++++++++++++++++++++++++++
 drivers/net/dpaa2/version.map        |   1 +
 4 files changed, 155 insertions(+), 4 deletions(-)
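
For context, the reworked adapter enqueue path boils down to the following condensed sketch of the dpaa2_eventdev.c hunk in the diff below (illustrative only: error handling is omitted, the fixed array size of 32 mirrors the submitted patch, and the driver-internal ethdev_driver.h header is assumed to be available so that rte_eth_devices[] can be dereferenced, as it is in the PMD):

#include <ethdev_driver.h>
#include <rte_eventdev.h>
#include <rte_event_eth_tx_adapter.h>
#include <rte_mbuf.h>

/* Provided by the dpaa2 net PMD; declared in dpaa2_ethdev.h by this series. */
uint16_t dpaa2_dev_tx_multi_txq_ordered(void **queue,
		struct rte_mbuf **bufs, uint16_t nb_pkts);

static uint16_t
txa_enqueue_sketch(struct rte_event ev[], uint16_t nb_events)
{
	void *txq[32];
	struct rte_mbuf *m[32];
	uint16_t qid, i;

	for (i = 0; i < nb_events; i++) {
		/* ev[i].mbuf is already a struct rte_mbuf * */
		m[i] = ev[i].mbuf;
		/* Each event may target a different Tx queue of its port. */
		qid = rte_event_eth_tx_adapter_txq_get(m[i]);
		txq[i] = rte_eth_devices[m[i]->port].data->tx_queues[qid];
	}

	/* One call hands all frames to the PMD, each to its own queue,
	 * instead of one rte_eth_tx_burst() call per event.
	 */
	return dpaa2_dev_tx_multi_txq_ordered(txq, m, nb_events);
}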
  

Comments

Stephen Hemminger Dec. 27, 2021, 6:01 p.m. UTC | #1
On Mon, 27 Dec 2021 21:46:35 +0530
nipun.gupta@nxp.com wrote:

> @@ -1003,16 +1003,20 @@ dpaa2_eventdev_txa_enqueue(void *port,
>  			   struct rte_event ev[],
>  			   uint16_t nb_events)
>  {
> -	struct rte_mbuf *m = (struct rte_mbuf *)ev[0].mbuf;
> +	void *txq[32];
> +	struct rte_mbuf *m[32];

You are assuming nb_events <= 32.
Why not size the array based on nb_events?

>  	uint8_t qid, i;
>  
>  	RTE_SET_USED(port);
>  
>  	for (i = 0; i < nb_events; i++) {
> -		qid = rte_event_eth_tx_adapter_txq_get(m);
> -		rte_eth_tx_burst(m->port, qid, &m, 1);
> +		m[i] = (struct rte_mbuf *)ev[i].mbuf;

Why the cast? It is already the right type.

> +		qid = rte_event_eth_tx_adapter_txq_get(m[i]);
> +		txq[i] = rte_eth_devices[m[i]->port].data->tx_queues[qid];
  
Nipun Gupta Jan. 3, 2022, 5:47 a.m. UTC | #2
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: 27 December 2021 23:32
> To: Nipun Gupta <nipun.gupta@nxp.com>
> Cc: dev@dpdk.org; thomas@monjalon.net; ferruh.yigit@intel.com; Hemant
> Agrawal <hemant.agrawal@nxp.com>; Jun Yang <jun.yang@nxp.com>
> Subject: Re: [PATCH v2 06/16] net/dpaa2: support multiple txqs en-queue for
> ordered
> 
> On Mon, 27 Dec 2021 21:46:35 +0530
> nipun.gupta@nxp.com wrote:
> 
> > @@ -1003,16 +1003,20 @@ dpaa2_eventdev_txa_enqueue(void *port,
> >  			   struct rte_event ev[],
> >  			   uint16_t nb_events)
> >  {
> > -	struct rte_mbuf *m = (struct rte_mbuf *)ev[0].mbuf;
> > +	void *txq[32];
> > +	struct rte_mbuf *m[32];
> 
> You are assuming nb_events <= 32.
> Why not size the array based on nb_events?

Agree. Actually I will use DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH here.

> 
> >  	uint8_t qid, i;
> >
> >  	RTE_SET_USED(port);
> >
> >  	for (i = 0; i < nb_events; i++) {
> > -		qid = rte_event_eth_tx_adapter_txq_get(m);
> > -		rte_eth_tx_burst(m->port, qid, &m, 1);
> > +		m[i] = (struct rte_mbuf *)ev[i].mbuf;
> 
> Why the cast? It is already the right type.

Will remove the cast.

Thanks,
Nipun

> 
> > +		qid = rte_event_eth_tx_adapter_txq_get(m[i]);
> > +		txq[i] = rte_eth_devices[m[i]->port].data->tx_queues[qid];
  
Nipun Gupta Jan. 3, 2022, 8:39 a.m. UTC | #3
> -----Original Message-----
> From: Nipun Gupta
> Sent: 03 January 2022 11:17
> To: Stephen Hemminger <stephen@networkplumber.org>
> Cc: dev@dpdk.org; thomas@monjalon.net; ferruh.yigit@intel.com; Hemant
> Agrawal <hemant.agrawal@nxp.com>; Jun Yang <jun.yang@nxp.com>
> Subject: RE: [PATCH v2 06/16] net/dpaa2: support multiple txqs en-queue for
> ordered
> 
> 
> 
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: 27 December 2021 23:32
> > To: Nipun Gupta <nipun.gupta@nxp.com>
> > Cc: dev@dpdk.org; thomas@monjalon.net; ferruh.yigit@intel.com; Hemant
> > Agrawal <hemant.agrawal@nxp.com>; Jun Yang <jun.yang@nxp.com>
> > Subject: Re: [PATCH v2 06/16] net/dpaa2: support multiple txqs en-queue for
> > ordered
> >
> > On Mon, 27 Dec 2021 21:46:35 +0530
> > nipun.gupta@nxp.com wrote:
> >
> > > @@ -1003,16 +1003,20 @@ dpaa2_eventdev_txa_enqueue(void *port,
> > >  			   struct rte_event ev[],
> > >  			   uint16_t nb_events)
> > >  {
> > > -	struct rte_mbuf *m = (struct rte_mbuf *)ev[0].mbuf;
> > > +	void *txq[32];
> > > +	struct rte_mbuf *m[32];
> >
> > You are assuming nb_events <= 32.
> > Why not size the array based on nb_events?
> 
> Agree. Actually I will use DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH here.
> 
> >
> > >  	uint8_t qid, i;
> > >
> > >  	RTE_SET_USED(port);
> > >
> > >  	for (i = 0; i < nb_events; i++) {
> > > -		qid = rte_event_eth_tx_adapter_txq_get(m);
> > > -		rte_eth_tx_burst(m->port, qid, &m, 1);
> > > +		m[i] = (struct rte_mbuf *)ev[i].mbuf;
> >
> > Why the cast? It is already the right type.
> 
> Will remove the cast.

mbuf is of void * type in the event structure, so it seems better to cast here.

> 
> Thanks,
> Nipun
> 
> >
> > > +		qid = rte_event_eth_tx_adapter_txq_get(m[i]);
> > > +		txq[i] = rte_eth_devices[m[i]->port].data->tx_queues[qid];
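
Taken together, the rework agreed in this thread could look roughly as follows (an illustrative sketch only, not the actual follow-up revision; it reuses the headers and the dpaa2_dev_tx_multi_txq_ordered() prototype from the sketch after the commit message, and assumes DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH from dpaa2_eventdev.h as the chunk size):

static uint16_t
txa_enqueue_chunked_sketch(struct rte_event ev[], uint16_t nb_events)
{
	void *txq[DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH];
	struct rte_mbuf *m[DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH];
	uint16_t done = 0, sent, qid, i, n;

	while (done < nb_events) {
		/* Do not assume nb_events fits the on-stack arrays; process
		 * at most DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH events per pass.
		 */
		n = RTE_MIN((uint16_t)(nb_events - done),
			    (uint16_t)DPAA2_EVENT_MAX_PORT_ENQUEUE_DEPTH);
		for (i = 0; i < n; i++) {
			m[i] = ev[done + i].mbuf;
			qid = rte_event_eth_tx_adapter_txq_get(m[i]);
			txq[i] = rte_eth_devices[m[i]->port].data->tx_queues[qid];
		}
		sent = dpaa2_dev_tx_multi_txq_ordered(txq, m, n);
		done += sent;
		if (sent < n)	/* congestion: report what was accepted */
			break;
	}

	return done;
}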
  

Patch

diff --git a/drivers/event/dpaa2/dpaa2_eventdev.c b/drivers/event/dpaa2/dpaa2_eventdev.c
index 4d94c315d2..f3d8a7e4f1 100644
--- a/drivers/event/dpaa2/dpaa2_eventdev.c
+++ b/drivers/event/dpaa2/dpaa2_eventdev.c
@@ -1,5 +1,5 @@ 
 /* SPDX-License-Identifier: BSD-3-Clause
- * Copyright 2017,2019 NXP
+ * Copyright 2017,2019-2021 NXP
  */
 
 #include <assert.h>
@@ -1003,16 +1003,20 @@  dpaa2_eventdev_txa_enqueue(void *port,
 			   struct rte_event ev[],
 			   uint16_t nb_events)
 {
-	struct rte_mbuf *m = (struct rte_mbuf *)ev[0].mbuf;
+	void *txq[32];
+	struct rte_mbuf *m[32];
 	uint8_t qid, i;
 
 	RTE_SET_USED(port);
 
 	for (i = 0; i < nb_events; i++) {
-		qid = rte_event_eth_tx_adapter_txq_get(m);
-		rte_eth_tx_burst(m->port, qid, &m, 1);
+		m[i] = (struct rte_mbuf *)ev[i].mbuf;
+		qid = rte_event_eth_tx_adapter_txq_get(m[i]);
+		txq[i] = rte_eth_devices[m[i]->port].data->tx_queues[qid];
 	}
 
+	dpaa2_dev_tx_multi_txq_ordered(txq, m, nb_events);
+
 	return nb_events;
 }
 
diff --git a/drivers/net/dpaa2/dpaa2_ethdev.h b/drivers/net/dpaa2/dpaa2_ethdev.h
index c21571e63d..e001a7e49d 100644
--- a/drivers/net/dpaa2/dpaa2_ethdev.h
+++ b/drivers/net/dpaa2/dpaa2_ethdev.h
@@ -241,6 +241,10 @@  void dpaa2_dev_process_ordered_event(struct qbman_swp *swp,
 uint16_t dpaa2_dev_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts);
 uint16_t dpaa2_dev_tx_ordered(void *queue, struct rte_mbuf **bufs,
 			      uint16_t nb_pkts);
+__rte_internal
+uint16_t dpaa2_dev_tx_multi_txq_ordered(void **queue,
+		struct rte_mbuf **bufs, uint16_t nb_pkts);
+
 uint16_t dummy_dev_tx(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts);
 void dpaa2_dev_free_eqresp_buf(uint16_t eqresp_ci);
 void dpaa2_flow_clean(struct rte_eth_dev *dev);
diff --git a/drivers/net/dpaa2/dpaa2_rxtx.c b/drivers/net/dpaa2/dpaa2_rxtx.c
index ee3ed1b152..1096b1cf1d 100644
--- a/drivers/net/dpaa2/dpaa2_rxtx.c
+++ b/drivers/net/dpaa2/dpaa2_rxtx.c
@@ -1468,6 +1468,148 @@  dpaa2_set_enqueue_descriptor(struct dpaa2_queue *dpaa2_q,
 	*dpaa2_seqn(m) = DPAA2_INVALID_MBUF_SEQN;
 }
 
+uint16_t
+dpaa2_dev_tx_multi_txq_ordered(void **queue,
+		struct rte_mbuf **bufs, uint16_t nb_pkts)
+{
+	/* Function to transmit the frames to multiple queues respectively.*/
+	uint32_t loop, retry_count;
+	int32_t ret;
+	struct qbman_fd fd_arr[MAX_TX_RING_SLOTS];
+	uint32_t frames_to_send;
+	struct rte_mempool *mp;
+	struct qbman_eq_desc eqdesc[MAX_TX_RING_SLOTS];
+	struct dpaa2_queue *dpaa2_q[MAX_TX_RING_SLOTS];
+	struct qbman_swp *swp;
+	uint16_t bpid;
+	struct rte_mbuf *mi;
+	struct rte_eth_dev_data *eth_data;
+	struct dpaa2_dev_priv *priv;
+	struct dpaa2_queue *order_sendq;
+
+	if (unlikely(!DPAA2_PER_LCORE_DPIO)) {
+		ret = dpaa2_affine_qbman_swp();
+		if (ret) {
+			DPAA2_PMD_ERR(
+				"Failed to allocate IO portal, tid: %d\n",
+				rte_gettid());
+			return 0;
+		}
+	}
+	swp = DPAA2_PER_LCORE_PORTAL;
+
+	for (loop = 0; loop < nb_pkts; loop++) {
+		dpaa2_q[loop] = (struct dpaa2_queue *)queue[loop];
+		eth_data = dpaa2_q[loop]->eth_data;
+		priv = eth_data->dev_private;
+		qbman_eq_desc_clear(&eqdesc[loop]);
+		if (*dpaa2_seqn(*bufs) && priv->en_ordered) {
+			order_sendq = (struct dpaa2_queue *)priv->tx_vq[0];
+			dpaa2_set_enqueue_descriptor(order_sendq,
+							     (*bufs),
+							     &eqdesc[loop]);
+		} else {
+			qbman_eq_desc_set_no_orp(&eqdesc[loop],
+							 DPAA2_EQ_RESP_ERR_FQ);
+			qbman_eq_desc_set_fq(&eqdesc[loop],
+						     dpaa2_q[loop]->fqid);
+		}
+
+		retry_count = 0;
+		while (qbman_result_SCN_state(dpaa2_q[loop]->cscn)) {
+			retry_count++;
+			/* Retry for some time before giving up */
+			if (retry_count > CONG_RETRY_COUNT)
+				goto send_frames;
+		}
+
+		if (likely(RTE_MBUF_DIRECT(*bufs))) {
+			mp = (*bufs)->pool;
+			/* Check the basic scenario and set
+			 * the FD appropriately here itself.
+			 */
+			if (likely(mp && mp->ops_index ==
+				priv->bp_list->dpaa2_ops_index &&
+				(*bufs)->nb_segs == 1 &&
+				rte_mbuf_refcnt_read((*bufs)) == 1)) {
+				if (unlikely((*bufs)->ol_flags
+					& RTE_MBUF_F_TX_VLAN)) {
+					ret = rte_vlan_insert(bufs);
+					if (ret)
+						goto send_frames;
+				}
+				DPAA2_MBUF_TO_CONTIG_FD((*bufs),
+					&fd_arr[loop],
+					mempool_to_bpid(mp));
+				bufs++;
+				dpaa2_q[loop]++;
+				continue;
+			}
+		} else {
+			mi = rte_mbuf_from_indirect(*bufs);
+			mp = mi->pool;
+		}
+		/* Not a hw_pkt pool allocated frame */
+		if (unlikely(!mp || !priv->bp_list)) {
+			DPAA2_PMD_ERR("Err: No buffer pool attached");
+			goto send_frames;
+		}
+
+		if (mp->ops_index != priv->bp_list->dpaa2_ops_index) {
+			DPAA2_PMD_WARN("Non DPAA2 buffer pool");
+			/* alloc should be from the default buffer pool
+			 * attached to this interface
+			 */
+			bpid = priv->bp_list->buf_pool.bpid;
+
+			if (unlikely((*bufs)->nb_segs > 1)) {
+				DPAA2_PMD_ERR(
+					"S/G not supp for non hw offload buffer");
+				goto send_frames;
+			}
+			if (eth_copy_mbuf_to_fd(*bufs,
+						&fd_arr[loop], bpid)) {
+				goto send_frames;
+			}
+			/* free the original packet */
+			rte_pktmbuf_free(*bufs);
+		} else {
+			bpid = mempool_to_bpid(mp);
+			if (unlikely((*bufs)->nb_segs > 1)) {
+				if (eth_mbuf_to_sg_fd(*bufs,
+						      &fd_arr[loop],
+						      mp,
+						      bpid))
+					goto send_frames;
+			} else {
+				eth_mbuf_to_fd(*bufs,
+					       &fd_arr[loop], bpid);
+			}
+		}
+
+		bufs++;
+		dpaa2_q[loop]++;
+	}
+
+send_frames:
+	frames_to_send = loop;
+	loop = 0;
+	while (loop < frames_to_send) {
+		ret = qbman_swp_enqueue_multiple_desc(swp, &eqdesc[loop],
+				&fd_arr[loop],
+				frames_to_send - loop);
+		if (likely(ret > 0)) {
+			loop += ret;
+		} else {
+			retry_count++;
+			if (retry_count > DPAA2_MAX_TX_RETRY_COUNT)
+				break;
+		}
+	}
+
+	return loop;
+}
+
 /* Callback to handle sending ordered packets through WRIOP based interface */
 uint16_t
 dpaa2_dev_tx_ordered(void *queue, struct rte_mbuf **bufs, uint16_t nb_pkts)
diff --git a/drivers/net/dpaa2/version.map b/drivers/net/dpaa2/version.map
index 2fe61f3442..cc82b8579d 100644
--- a/drivers/net/dpaa2/version.map
+++ b/drivers/net/dpaa2/version.map
@@ -21,6 +21,7 @@  EXPERIMENTAL {
 INTERNAL {
 	global:
 
+	dpaa2_dev_tx_multi_txq_ordered;
 	dpaa2_eth_eventq_attach;
 	dpaa2_eth_eventq_detach;