[v4] app/testpmd: fix primary process not polling all queues
Commit Message
Here's how the problem arises.
step1: Start the app.
    dpdk-testpmd -a 0000:35:00.0 -l 0-3 -- -i --rxq=10 --txq=10
step2: Run the following commands and send traffic. As expected,
queue 7 does not send or receive packets, and the other queues do.
    port 0 rxq 7 stop
    port 0 txq 7 stop
    set fwd mac
    start
step3: Run the following commands and send traffic. All queues
are expected to send and receive packets normally, but queue 7
does not.
    stop
    port stop all
    port start all
    start
    show port xstats all
In fact, for queue 7 only rx_q7_packets is non-zero, which means
the driver has enabled queue 7 but software does not use it to
receive and forward packets. Checking the queue state with
'show rxq info 0 7' and 'show txq info 0 7' shows that queue 7
is started, like the other queues.
    Rx queue state: started
    Tx queue state: started
Queue 7 is started but cannot forward. That's the problem.
Each stream has a read-only "disabled" field that controls whether
the stream is used for forwarding. This field depends on the
testpmd local queue state; see
commit 3c4426db54fc ("app/testpmd: do not poll stopped queues").
The DPDK framework also maintains the ethdev queue state reported
by drivers, which indicates the real state of the queues.
Commands such as 'port X rxq|txq start|stop' update both kinds of
queue state, but they take effect for only one stop-start round.
In the next stop-start round, the earlier operations no longer
apply. However, only the ethdev queue state is updated at that
point, so the testpmd and ethdev state information diverge, which
causes unexpected side effects such as the problem above.
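The divergence described above can be sketched with a small self-contained C model. This is an illustrative toy, not testpmd code: `toy_port`, `toy_queue_stop`, `toy_port_restart`, `toy_stream_disabled` and `toy_update_queue_state` are hypothetical names, and the port restart is assumed to start every queue (deferred start is ignored).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NB_QUEUES 10

enum toy_queue_state { TOY_QUEUE_STOPPED = 0, TOY_QUEUE_STARTED = 1 };

/* Toy port: one state array owned by the "driver", one kept by the app. */
struct toy_port {
	enum toy_queue_state ethdev_state[TOY_NB_QUEUES]; /* reported by driver */
	enum toy_queue_state local_state[TOY_NB_QUEUES];  /* app's own copy */
};

/* Start with every queue started in both views. */
static void toy_port_init(struct toy_port *p)
{
	for (uint16_t q = 0; q < TOY_NB_QUEUES; q++) {
		p->ethdev_state[q] = TOY_QUEUE_STARTED;
		p->local_state[q] = TOY_QUEUE_STARTED;
	}
}

/* Models 'port X rxq N stop': both copies are updated. */
static void toy_queue_stop(struct toy_port *p, uint16_t q)
{
	assert(q < TOY_NB_QUEUES);
	p->ethdev_state[q] = TOY_QUEUE_STOPPED;
	p->local_state[q] = TOY_QUEUE_STOPPED;
}

/* Models 'port stop' + 'port start': only the driver view is refreshed. */
static void toy_port_restart(struct toy_port *p)
{
	for (uint16_t q = 0; q < TOY_NB_QUEUES; q++)
		p->ethdev_state[q] = TOY_QUEUE_STARTED;
}

/* A stream is skipped when the app's local copy says "stopped". */
static bool toy_stream_disabled(const struct toy_port *p, uint16_t q)
{
	assert(q < TOY_NB_QUEUES);
	return p->local_state[q] == TOY_QUEUE_STOPPED;
}

/* Resync step: copy the driver view into the app's local copy. */
static void toy_update_queue_state(struct toy_port *p)
{
	for (uint16_t q = 0; q < TOY_NB_QUEUES; q++)
		p->local_state[q] = p->ethdev_state[q];
}
```

In this model, stopping queue 7 and then restarting the port leaves `toy_stream_disabled(&p, 7)` true even though the driver-side state is started; calling `toy_update_queue_state` before forwarding removes the mismatch.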
There was a similar problem for the secondary process; see
commit 5028f207a4fa ("app/testpmd: fix secondary process packet
forwarding").
This patch applies that workaround to the primary process, with
some differences. Not all PMDs implement rte_eth_rx_queue_info_get and
rte_eth_tx_queue_info_get, but they may still support deferred_start
in the primary process. To avoid breaking their behavior, the original
testpmd local queue state is retained for those PMDs.
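The rule described above (use the driver-reported state when it is available, otherwise fall back differently per process type) can be summarized as a pure function. This is a hedged sketch under stated assumptions, not the patch itself: `toy_resolve_state` and its parameters are hypothetical, `rc` stands in for the return code of a queue info query, and keeping the local state on other errors is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum toy_state { TOY_STOPPED = 0, TOY_STARTED = 1 };

/*
 * Decide which queue state the application should keep.
 *   rc         - hypothetical result of a queue info query
 *                (0 on success, negative errno on failure)
 *   reported   - state reported by the driver (valid only when rc == 0)
 *   local      - state the application currently remembers
 *   is_primary - true for the primary process
 */
static enum toy_state
toy_resolve_state(int rc, enum toy_state reported, enum toy_state local,
		  bool is_primary)
{
	assert(rc <= 0); /* return codes are 0 or a negative errno */

	if (rc == 0)
		return reported; /* driver state is authoritative */

	if (rc == -ENOTSUP) {
		/* Primary: keep the local state so deferred start still works. */
		if (is_primary)
			return local;
		/* Secondary: treat the queue as started so it can forward. */
		return TOY_STARTED;
	}

	/* Any other error: leave the local state untouched (assumption). */
	return local;
}
```

Expressing the decision as a function of four inputs keeps the primary and secondary fallbacks side by side, which makes the difference between them easy to see.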
Fixes: 3c4426db54fc ("app/testpmd: do not poll stopped queues")
Cc: stable@dpdk.org
Signed-off-by: Jie Hai <haijie1@huawei.com>
---
v1->v2:
1. Fix misspelled word 'deferred'.
2. Fix incorrect format of reference to commits.
v2->v3:
1. Fix incorrect format of reference to commits.
v3->v4:
1. Remove deferred_start change.
2. Modify commit log.
---
app/test-pmd/testpmd.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
Comments
On 6/9/2023 10:03 AM, Jie Hai wrote:
> [...]
Patch looks good to me, but since it has potential side effects,
can someone from the test team verify the following before we continue:
a) Secondary testpmd
b) Deferred Queue
Thanks,
Ferruh
On 2023/6/9 19:10, Ferruh Yigit wrote:
> [...]
Hi Ferruh,
I tested them with the hns3 driver. The results are the same before and
after the patch is applied. The results are as follows:
case1: Secondary testpmd
Action: The secondary testpmd stops a queue and the primary testpmd starts the queue.
Result: The queue can forward in both processes.
case2: Deferred queue
Action: Set a queue with deferred_start on for a primary process.
Result: The queue cannot forward until deferred_start is off.
Thanks,
Jie Hai
On 6/20/2023 11:07 AM, Jie Hai wrote:
> [...]
Thanks Jie, I will continue to process the patch.
On 6/20/2023 11:07 AM, Jie Hai wrote:
> [...]
Acked-by: Ferruh Yigit <ferruh.yigit@amd.com>
Applied to dpdk-next-net/main, thanks.
> -----Original Message-----
> From: Jie Hai <haijie1@huawei.com>
> Sent: Friday, June 9, 2023 12:04 PM
> To: Aman Singh <aman.deep.singh@intel.com>; Yuying Zhang
> <yuying.zhang@intel.com>; Anatoly Burakov <anatoly.burakov@intel.com>;
> Matan Azrad <matan@nvidia.com>; Dmitry Kozlyuk
> <dmitry.kozliuk@gmail.com>
> Cc: dev@dpdk.org; liudongdong3@huawei.com; shiyangx.he@intel.com;
> ferruh.yigit@amd.com
> Subject: [PATCH v4] app/testpmd: fix primary process not polling all queues
>
> [...]
Hi Jie,
With this patch, I see the error below when starting a representor port after reattaching it. Is it expected?
$ sudo ./build/app/dpdk-testpmd -n 4 -a 0000:08:00.0,dv_esw_en=1,representor=vf0-1 -a auxiliary: -a 00:00.0 --iova-mode="va" -- -i
[..]
testpmd> port stop all
testpmd> port close 0
testpmd> device detach 0000:08:00.0
testpmd> port attach 0000:08:00.0,dv_esw_en=1,representor=0-1
testpmd> port start 1
Configuring Port 1 (socket 0)
Port 1: FA:9E:D8:5F:D7:D8
Invalid Rx queue_id=0
testpmd: Failed to get rx queue info
Invalid Tx queue_id=0
testpmd: Failed to get tx queue info
Regards,
Ali
On 2023/6/23 0:40, Ali Alnubani wrote:
> [...]
Hi Ali,
Thanks for your feedback.
When update_queue_state is called, the state of all queues on all ports
is updated.
The number of queues is nb_rxq|nb_txq, which is stored locally by the
testpmd process.
All ports in the same process share the same nb_rxq|nb_txq.
After the detach and attach, the number of queues of port 0 is 0.
It changes only when the port is reconfigured by testpmd,
which happens when port 0 is started.
If we start port 1 first, update_queue_state tries to update the state
of nb_rxq|nb_txq queues on port 0, which is invalid because that port
has zero queues.
Without this patch, the same problem occurs in the multi-process
scenario when the secondary process detaches and attaches the port and
then starts it.
I will submit a patch to fix this problem: when a port starts, update
the queue state based on the number of queues reported by the driver.
Thanks,
Jie Hai
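The failure mode and the proposed direction described in this reply can be illustrated with a toy C model. It is a sketch under assumptions, not the follow-up patch: `toy_dev`, `toy_rx_queue_info_get`, `toy_update_with_global_count` and `toy_update_with_driver_count` are hypothetical names, and the lookup simply rejects any queue id at or beyond the per-port count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy device: only the queue count reported by the "driver". */
struct toy_dev {
	uint16_t nb_rx_queues;
};

/* Stand-in for a queue info query: fails for out-of-range queue ids. */
static int
toy_rx_queue_info_get(const struct toy_dev *dev, uint16_t queue_id)
{
	return queue_id < dev->nb_rx_queues ? 0 : -EINVAL;
}

/* Walk every port with one process-wide queue count; count failed lookups. */
static unsigned int
toy_update_with_global_count(const struct toy_dev *devs, unsigned int nb_ports,
			     uint16_t global_nb_rxq)
{
	unsigned int failures = 0;

	assert(devs != NULL);
	for (unsigned int p = 0; p < nb_ports; p++)
		for (uint16_t q = 0; q < global_nb_rxq; q++)
			if (toy_rx_queue_info_get(&devs[p], q) != 0)
				failures++;
	return failures;
}

/* Walk every port with its own reported queue count; count failed lookups. */
static unsigned int
toy_update_with_driver_count(const struct toy_dev *devs, unsigned int nb_ports)
{
	unsigned int failures = 0;

	assert(devs != NULL);
	for (unsigned int p = 0; p < nb_ports; p++)
		for (uint16_t q = 0; q < devs[p].nb_rx_queues; q++)
			if (toy_rx_queue_info_get(&devs[p], q) != 0)
				failures++;
	return failures;
}
```

With a reattached port 0 that has zero queues, a started port 1 that has one queue, and a process-wide count of 1, the first walk hits one invalid lookup on port 0, while the per-port walk never asks for a queue the port does not have.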
On 6/26/2023 10:30 AM, Jie Hai wrote:
> [...]
Hi Ali,
How big a blocker is this issue? Should the fix be part of -rc2?
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Tuesday, June 27, 2023 2:05 PM
> To: Jie Hai <haijie1@huawei.com>; Ali Alnubani <alialnu@nvidia.com>; Aman
> Singh <aman.deep.singh@intel.com>; Yuying Zhang
> <yuying.zhang@intel.com>; Anatoly Burakov <anatoly.burakov@intel.com>;
> Matan Azrad <matan@nvidia.com>; Dmitry Kozlyuk
> <dmitry.kozliuk@gmail.com>
> Cc: dev@dpdk.org; liudongdong3@huawei.com; shiyangx.he@intel.com;
> Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon
> (EXTERNAL) <thomas@monjalon.net>
> Subject: Re: [PATCH v4] app/testpmd: fix primary process not polling all
> queues
>
> [...]
Hi Ferruh,
I missed your email, sorry about that.
Jie already sent a patch, and it resolved the issue for me.
Thanks,
Ali
@@ -2424,6 +2424,13 @@ update_rx_queue_state(uint16_t port_id, uint16_t queue_id)
ports[port_id].rxq[queue_id].state =
rx_qinfo.queue_state;
} else if (rc == -ENOTSUP) {
+ /*
+ * Do not change the rxq state for primary process
+ * to ensure that the PMDs do not implement
+ * rte_eth_rx_queue_info_get can forward as before.
+ */
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ return;
/*
* Set the rxq state to RTE_ETH_QUEUE_STATE_STARTED
* to ensure that the PMDs do not implement
@@ -2449,6 +2456,13 @@ update_tx_queue_state(uint16_t port_id, uint16_t queue_id)
ports[port_id].txq[queue_id].state =
tx_qinfo.queue_state;
} else if (rc == -ENOTSUP) {
+ /*
+ * Do not change the txq state for primary process
+ * to ensure that the PMDs do not implement
+ * rte_eth_tx_queue_info_get can forward as before.
+ */
+ if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+ return;
/*
* Set the txq state to RTE_ETH_QUEUE_STATE_STARTED
* to ensure that the PMDs do not implement
@@ -2516,8 +2530,7 @@ start_packet_forwarding(int with_tx_first)
return;
if (stream_init != NULL) {
- if (rte_eal_process_type() == RTE_PROC_SECONDARY)
- update_queue_state();
+ update_queue_state();
for (i = 0; i < cur_fwd_config.nb_fwd_streams; i++)
stream_init(fwd_streams[i]);
}
@@ -3280,8 +3293,7 @@ start_port(portid_t pid)
pl[cfg_pi++] = pi;
}
- if (rte_eal_process_type() == RTE_PROC_SECONDARY)
- update_queue_state();
+ update_queue_state();
if (at_least_one_port_successfully_started && !no_link_check)
check_all_ports_link_status(RTE_PORT_ALL);