net/tap: allow more than 4 queues

Message ID: 20240229175620.122949-1-stephen@networkplumber.org (mailing list archive)
State: Changes Requested, archived
Delegated to: Ferruh Yigit
Series: net/tap: allow more than 4 queues

Checks

Context                         Check     Description
ci/checkpatch                   success   coding style OK
ci/loongarch-compilation        success   Compilation OK
ci/loongarch-unit-testing       success   Unit Testing PASS
ci/Intel-compilation            success   Compilation OK
ci/intel-Testing                success   Testing PASS
ci/intel-Functional             success   Functional PASS
ci/github-robot: build          success   github build: passed
ci/iol-intel-Performance        success   Performance Testing PASS
ci/iol-mellanox-Performance     success   Performance Testing PASS
ci/iol-abi-testing              success   Testing PASS
ci/iol-compile-amd64-testing    success   Testing PASS
ci/iol-unit-arm64-testing       success   Testing PASS
ci/iol-compile-arm64-testing    success   Testing PASS
ci/iol-unit-amd64-testing       success   Testing PASS
ci/iol-intel-Functional         success   Functional Testing PASS
ci/iol-broadcom-Performance     success   Performance Testing PASS
ci/iol-broadcom-Functional      success   Functional Testing PASS
ci/iol-sample-apps-testing      success   Testing PASS

Commit Message

Stephen Hemminger Feb. 29, 2024, 5:56 p.m. UTC
The tap device needs to exchange file descriptors for tx and rx.
But the EAL MP layer has a limit of 8 file descriptors per message.
The ideal resolution would be to increase the number of file
descriptors allowed for rte_mp_sendmsg(), but this would break
the ABI. Work around the constraint by breaking the request into
multiple messages.

Do not hide errors about MP message failures.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
 drivers/net/tap/rte_eth_tap.c | 40 +++++++++++++++++++++++++++++------
 1 file changed, 33 insertions(+), 7 deletions(-)
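
The chunking idea described in the commit message can be shown in isolation. The following is a minimal stand-alone sketch, not the driver code: MAX_FDS_PER_MSG stands in for RTE_MP_MAX_FD_NUM (8, per the commit message), and send_chunk_fn, send_fds_chunked() and count_chunk() are hypothetical names introduced here, with the callback standing in for rte_mp_sendmsg().

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for RTE_MP_MAX_FD_NUM; the commit message gives the limit as 8. */
#define MAX_FDS_PER_MSG 8

/* Hypothetical callback standing in for rte_mp_sendmsg(); < 0 means error. */
typedef int (*send_chunk_fn)(const int *fds, size_t num_fds, void *ctx);

/*
 * Send nb_fds descriptors in messages of at most MAX_FDS_PER_MSG each.
 * Returns the number of messages sent, or the callback's negative error.
 */
static int
send_fds_chunked(const int *fds, size_t nb_fds, send_chunk_fn send_chunk,
		 void *ctx)
{
	size_t off = 0;
	int msgs = 0;

	assert(send_chunk != NULL);

	while (off < nb_fds) {
		size_t n = nb_fds - off;
		int err;

		if (n > MAX_FDS_PER_MSG)
			n = MAX_FDS_PER_MSG;

		err = send_chunk(&fds[off], n, ctx);
		if (err < 0)
			return err; /* propagate the failure to the caller */

		off += n;
		msgs++;
	}

	return msgs;
}

/* Example callback: adds each chunk's size to a running total in ctx. */
static int
count_chunk(const int *fds, size_t num_fds, void *ctx)
{
	(void)fds;
	*(size_t *)ctx += num_fds;
	return 0;
}
```

With 20 descriptors this sends three messages (8, 8 and 4). Splitting on the sender side only helps if the receiver knows where each chunk belongs, which this sketch does not address.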
  

Comments

Ferruh Yigit March 6, 2024, 4:14 p.m. UTC | #1
On 2/29/2024 5:56 PM, Stephen Hemminger wrote:
> The tap device needs to exchange file descriptors for tx and rx.
> But the EAL MP layer has a limit of 8 file descriptors per message.
> The ideal resolution would be to increase the number of file
> descriptors allowed for rte_mp_sendmsg(), but this would break
> the ABI. Work around the constraint by breaking the request into
> multiple messages.
> 
> Do not hide errors about MP message failures.
> 
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
>  drivers/net/tap/rte_eth_tap.c | 40 +++++++++++++++++++++++++++++------
>  1 file changed, 33 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index 69d9da695bed..df18c328f498 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -863,21 +863,44 @@ tap_mp_req_on_rxtx(struct rte_eth_dev *dev)
>  		msg.fds[fd_iterator++] = process_private->txq_fds[i];
>  		msg.num_fds++;
>  		request_param->txq_count++;
> +
> +		/* Need to break request into chunks */
> +		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
> +			err = rte_mp_sendmsg(&msg);
> +			if (err < 0)
> +				goto fail;
> +
> +			fd_iterator = 0;
> +			msg.num_fds = 0;
> +			request_param->txq_count = 0;
> +		}
>  	}
>  	for (i = 0; i < dev->data->nb_rx_queues; i++) {
>  		msg.fds[fd_iterator++] = process_private->rxq_fds[i];
>  		msg.num_fds++;
>  		request_param->rxq_count++;
> +
> +		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
> +			err = rte_mp_sendmsg(&msg);
> +			if (err < 0)
> +				goto fail;
> +
> +			fd_iterator = 0;
> +			msg.num_fds = 0;
> +			request_param->rxq_count = 0;
> +		}
>  	}

Hi Stephen,

Were you able to verify this with more than 4 queues?

As far as I can see, in the secondary counterpart of
'rte_mp_sendmsg()', the index in the secondary starts from 0 each time,
so subsequent calls overwrite the fds in the secondary.
In practice, still only 4 queues work.
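
The overwrite described above can be modelled in a few lines. This is an illustrative sketch only: the driver's actual secondary handler is not shown in this thread, and struct queue_fds, store_from_zero(), store_at_offset() and MAX_QUEUES are hypothetical names and values. It contrasts a receiver that always stores from index 0 with one that is told the first queue index of each chunk.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 16 /* hypothetical queue limit for this sketch */

/* Hypothetical receiver-side state: one fd slot per queue. */
struct queue_fds {
	int fds[MAX_QUEUES];
};

/* Behaviour as described: every message is stored starting at index 0. */
static void
store_from_zero(struct queue_fds *q, const int *msg_fds, size_t n)
{
	assert(q != NULL);
	for (size_t i = 0; i < n && i < MAX_QUEUES; i++)
		q->fds[i] = msg_fds[i];
}

/*
 * One possible alternative (an assumption, not the driver's method): the
 * sender also passes the first queue index covered by this message.
 * Returns 0 on success, -1 if the chunk would not fit.
 */
static int
store_at_offset(struct queue_fds *q, size_t first, const int *msg_fds,
		size_t n)
{
	assert(q != NULL);
	if (first > MAX_QUEUES || n > MAX_QUEUES - first)
		return -1;
	for (size_t i = 0; i < n; i++)
		q->fds[first + i] = msg_fds[i];
	return 0;
}
```

Feeding two chunks through store_from_zero() leaves the second chunk in slots 0 and up, and the slots past the first chunk are never filled. store_at_offset() keeps both chunks, at the cost of carrying an extra index in each message.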
  
Stephen Hemminger March 6, 2024, 8:21 p.m. UTC | #2
On Wed, 6 Mar 2024 16:14:51 +0000
Ferruh Yigit <ferruh.yigit@amd.com> wrote:

> On 2/29/2024 5:56 PM, Stephen Hemminger wrote:
> > The tap device needs to exchange file descriptors for tx and rx.
> > But the EAL MP layer has a limit of 8 file descriptors per message.
> > The ideal resolution would be to increase the number of file
> > descriptors allowed for rte_mp_sendmsg(), but this would break
> > the ABI. Work around the constraint by breaking the request into
> > multiple messages.
> > 
> > Do not hide errors about MP message failures.
> > 
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> >  drivers/net/tap/rte_eth_tap.c | 40 +++++++++++++++++++++++++++++------
> >  1 file changed, 33 insertions(+), 7 deletions(-)
> > 
> > diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> > index 69d9da695bed..df18c328f498 100644
> > --- a/drivers/net/tap/rte_eth_tap.c
> > +++ b/drivers/net/tap/rte_eth_tap.c
> > @@ -863,21 +863,44 @@ tap_mp_req_on_rxtx(struct rte_eth_dev *dev)
> >  		msg.fds[fd_iterator++] = process_private->txq_fds[i];
> >  		msg.num_fds++;
> >  		request_param->txq_count++;
> > +
> > +		/* Need to break request into chunks */
> > +		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
> > +			err = rte_mp_sendmsg(&msg);
> > +			if (err < 0)
> > +				goto fail;
> > +
> > +			fd_iterator = 0;
> > +			msg.num_fds = 0;
> > +			request_param->txq_count = 0;
> > +		}
> >  	}
> >  	for (i = 0; i < dev->data->nb_rx_queues; i++) {
> >  		msg.fds[fd_iterator++] = process_private->rxq_fds[i];
> >  		msg.num_fds++;
> >  		request_param->rxq_count++;
> > +
> > +		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
> > +			err = rte_mp_sendmsg(&msg);
> > +			if (err < 0)
> > +				goto fail;
> > +
> > +			fd_iterator = 0;
> > +			msg.num_fds = 0;
> > +			request_param->rxq_count = 0;
> > +		}
> >  	}  
> 
> Hi Stephen,
> 
> Were you able to verify this with more than 4 queues?
> 
> As far as I can see, in the secondary counterpart of
> 'rte_mp_sendmsg()', the index in the secondary starts from 0 each time,
> so subsequent calls overwrite the fds in the secondary.
> In practice, still only 4 queues work.

I got 4 queues set up, but it looks like they are trash in the secondary.
It is probably best to revert this and fix it by bumping RTE_MP_MAX_FD_NUM.
That is the better fix, but it does take some ABI handling.
  
Ferruh Yigit March 7, 2024, 10:25 a.m. UTC | #3
On 3/6/2024 8:21 PM, Stephen Hemminger wrote:
> On Wed, 6 Mar 2024 16:14:51 +0000
> Ferruh Yigit <ferruh.yigit@amd.com> wrote:
> 
>> On 2/29/2024 5:56 PM, Stephen Hemminger wrote:
>>> The tap device needs to exchange file descriptors for tx and rx.
>>> But the EAL MP layer has a limit of 8 file descriptors per message.
>>> The ideal resolution would be to increase the number of file
>>> descriptors allowed for rte_mp_sendmsg(), but this would break
>>> the ABI. Work around the constraint by breaking the request into
>>> multiple messages.
>>>
>>> Do not hide errors about MP message failures.
>>>
>>> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
>>> ---
>>>  drivers/net/tap/rte_eth_tap.c | 40 +++++++++++++++++++++++++++++------
>>>  1 file changed, 33 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
>>> index 69d9da695bed..df18c328f498 100644
>>> --- a/drivers/net/tap/rte_eth_tap.c
>>> +++ b/drivers/net/tap/rte_eth_tap.c
>>> @@ -863,21 +863,44 @@ tap_mp_req_on_rxtx(struct rte_eth_dev *dev)
>>>  		msg.fds[fd_iterator++] = process_private->txq_fds[i];
>>>  		msg.num_fds++;
>>>  		request_param->txq_count++;
>>> +
>>> +		/* Need to break request into chunks */
>>> +		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
>>> +			err = rte_mp_sendmsg(&msg);
>>> +			if (err < 0)
>>> +				goto fail;
>>> +
>>> +			fd_iterator = 0;
>>> +			msg.num_fds = 0;
>>> +			request_param->txq_count = 0;
>>> +		}
>>>  	}
>>>  	for (i = 0; i < dev->data->nb_rx_queues; i++) {
>>>  		msg.fds[fd_iterator++] = process_private->rxq_fds[i];
>>>  		msg.num_fds++;
>>>  		request_param->rxq_count++;
>>> +
>>> +		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
>>> +			err = rte_mp_sendmsg(&msg);
>>> +			if (err < 0)
>>> +				goto fail;
>>> +
>>> +			fd_iterator = 0;
>>> +			msg.num_fds = 0;
>>> +			request_param->rxq_count = 0;
>>> +		}
>>>  	}  
>>
>> Hi Stephen,
>>
>> Were you able to verify this with more than 4 queues?
>>
>> As far as I can see, in the secondary counterpart of
>> 'rte_mp_sendmsg()', the index in the secondary starts from 0 each time,
>> so subsequent calls overwrite the fds in the secondary.
>> In practice, still only 4 queues work.
> 
> I got 4 queues set up, but it looks like they are trash in the secondary.
> It is probably best to revert this and fix it by bumping RTE_MP_MAX_FD_NUM.
> That is the better fix, but it does take some ABI handling.
>

We can increase RTE_MP_MAX_FD_NUM, but there will still be a limit.

Isn't it possible to update the 'rte_mp_sendmsg()' handling to support
multiple 'rte_mp_sendmsg()' calls in this patch?

We also need to check whether the fds size is less than
'RTE_PMD_TAP_MAX_QUEUES' once multiple 'rte_mp_sendmsg()' calls are
supported.
  
Stephen Hemminger March 7, 2024, 4:53 p.m. UTC | #4
On Thu, 7 Mar 2024 10:25:48 +0000
Ferruh Yigit <ferruh.yigit@amd.com> wrote:

> > I got 4 queues set up, but it looks like they are trash in the secondary.
> > It is probably best to revert this and fix it by bumping RTE_MP_MAX_FD_NUM.
> > That is the better fix, but it does take some ABI handling.
> >  
> 
> We can increase RTE_MP_MAX_FD_NUM, but there will still be a limit.
> 
> Isn't it possible to update the 'rte_mp_sendmsg()' handling to support
> multiple 'rte_mp_sendmsg()' calls in this patch?
> 
> We also need to check whether the fds size is less than
> 'RTE_PMD_TAP_MAX_QUEUES' once multiple 'rte_mp_sendmsg()' calls are
> supported.

The kernel allows up to 253 fds to be passed in a single message.
For tap, that would limit it to 126 queues, because tap dups the
fds for rx and tx, though that could be fixed.

Tap should also have a static assert tying its max queue count to
this limit.

Increasing RTE_MP_MAX_FD_NUM would also fix similar issues in the af_xdp
PMD, and in af_packet when it gets MP support.
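
The arithmetic and the suggested static assert can be sketched as follows. This is a stand-alone illustration under stated assumptions: KERNEL_MAX_PASSED_FDS and FDS_PER_QUEUE encode the figures quoted above, while EXAMPLE_MAX_QUEUES and queues_for_fd_budget() are hypothetical and do not come from the driver.

```c
#include <assert.h>

/* Figures quoted in the message above. */
#define KERNEL_MAX_PASSED_FDS 253 /* fds the kernel lets one message carry */
#define FDS_PER_QUEUE 2           /* tap uses one fd for rx and one for tx */

/* Hypothetical compile-time queue limit for this sketch only. */
#define EXAMPLE_MAX_QUEUES 16

/* Queues that fit in one message: 253 / 2 rounds down to 126. */
#define MAX_QUEUES_BY_FD_LIMIT (KERNEL_MAX_PASSED_FDS / FDS_PER_QUEUE)

/* Compile-time checks: the build fails if the limits ever disagree. */
_Static_assert(MAX_QUEUES_BY_FD_LIMIT == 126,
	       "253 fds at 2 per queue allow 126 queues");
_Static_assert(EXAMPLE_MAX_QUEUES <= MAX_QUEUES_BY_FD_LIMIT,
	       "queue limit exceeds what one message can carry");

/* Same calculation at run time, for any per-message fd budget. */
static int
queues_for_fd_budget(int fd_budget)
{
	assert(fd_budget >= 0);
	return fd_budget / FDS_PER_QUEUE;
}
```

The same calculation with the current budget of 8 fds per message gives the 4-queue limit named in the patch title.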
  

Patch

diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
index 69d9da695bed..df18c328f498 100644
--- a/drivers/net/tap/rte_eth_tap.c
+++ b/drivers/net/tap/rte_eth_tap.c
@@ -863,21 +863,44 @@  tap_mp_req_on_rxtx(struct rte_eth_dev *dev)
 		msg.fds[fd_iterator++] = process_private->txq_fds[i];
 		msg.num_fds++;
 		request_param->txq_count++;
+
+		/* Need to break request into chunks */
+		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
+			err = rte_mp_sendmsg(&msg);
+			if (err < 0)
+				goto fail;
+
+			fd_iterator = 0;
+			msg.num_fds = 0;
+			request_param->txq_count = 0;
+		}
 	}
 	for (i = 0; i < dev->data->nb_rx_queues; i++) {
 		msg.fds[fd_iterator++] = process_private->rxq_fds[i];
 		msg.num_fds++;
 		request_param->rxq_count++;
+
+		if (fd_iterator >= RTE_MP_MAX_FD_NUM) {
+			err = rte_mp_sendmsg(&msg);
+			if (err < 0)
+				goto fail;
+
+			fd_iterator = 0;
+			msg.num_fds = 0;
+			request_param->rxq_count = 0;
+		}
 	}
 
-	err = rte_mp_sendmsg(&msg);
-	if (err < 0) {
-		TAP_LOG(ERR, "Failed to send start req to secondary %d",
-			rte_errno);
-		return -1;
+	if (msg.num_fds > 0) {
+		err = rte_mp_sendmsg(&msg);
+		if (err < 0)
+			goto fail;
 	}
 
 	return 0;
+fail:
+	TAP_LOG(ERR, "Failed to send start req to secondary %d", rte_errno);
+	return err;
 }
 
 static int
@@ -885,8 +908,11 @@  tap_dev_start(struct rte_eth_dev *dev)
 {
 	int err, i;
 
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
-		tap_mp_req_on_rxtx(dev);
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		err = tap_mp_req_on_rxtx(dev);
+		if (err)
+			return err;
+	}
 
 	err = tap_intr_handle_set(dev, 1);
 	if (err)