[v2] net/pcap: fix timeout of stopping device

Message ID 20220906080511.46088-1-yidingx.zhou@intel.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Headers
Series [v2] net/pcap: fix timeout of stopping device |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot: build success github build: passed
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS

Commit Message

Yiding Zhou Sept. 6, 2022, 8:05 a.m. UTC
  The pcap file will be synchronized to the disk when stopping the device.
It takes a long time if the file is large that would cause the
'detach sync request' timeout when the device is closed under multi-process
scenario.

This commit fixes the issue by using alarm handler to release dumper.

Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
Cc: stable@dpdk.org

Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>

---
v2: use alarm handler to release dumper
---
 drivers/net/pcap/pcap_ethdev.c | 22 +++++++++++++++++++++-
 1 file changed, 21 insertions(+), 1 deletion(-)
  

Comments

Stephen Hemminger Sept. 6, 2022, 2:57 p.m. UTC | #1
On Tue,  6 Sep 2022 16:05:11 +0800
Yiding Zhou <yidingx.zhou@intel.com> wrote:

> The pcap file will be synchronized to the disk when stopping the device.
> It takes a long time if the file is large that would cause the
> 'detach sync request' timeout when the device is closed under multi-process
> scenario.
> 
> This commit fixes the issue by using alarm handler to release dumper.
> 
> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>


I think you need to redesign the handshake if this the case.
Forcing 30 second delay at the end of all uses of pcap is not acceptable.
  
Yiding Zhou Sept. 6, 2022, 4:21 p.m. UTC | #2
On Tue,  6 Sep 2022 16:05:11 +0800
Yiding Zhou <yidingx.zhou@intel.com> wrote:

> The pcap file will be synchronized to the disk when stopping the device.
> It takes a long time if the file is large that would cause the 'detach 
> sync request' timeout when the device is closed under multi-process 
> scenario.
> 
> This commit fixes the issue by using alarm handler to release dumper.
> 
> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> Cc: stable@dpdk.org
> 
> Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>


I think you need to redesign the handshake if this the case.
Forcing 30 second delay at the end of all uses of pcap is not acceptable.

Thanks for your comments.
According to my test, the time required to sync a 100G pcap file is about 20s, 
so I set a delay of 30s. I also tried to use AIO, but some issue in the multi-process scenario.
And I also consider io_uring,  it is only supported in linux-5.x, we need to consider compatibility,. 
Maybe better way is to do more work to redesign the handshake.
  
Yiding Zhou Sept. 21, 2022, 7:14 a.m. UTC | #3
> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Tuesday, September 6, 2022 10:58 PM
> To: Zhou, YidingX <yidingx.zhou@intel.com>
> Cc: dev@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>; Burakov, Anatoly
> <anatoly.burakov@intel.com>; He, Xingguang <xingguang.he@intel.com>;
> stable@dpdk.org
> Subject: Re: [PATCH v2] net/pcap: fix timeout of stopping device
> 
> On Tue,  6 Sep 2022 16:05:11 +0800
> Yiding Zhou <yidingx.zhou@intel.com> wrote:
> 
> > The pcap file will be synchronized to the disk when stopping the device.
> > It takes a long time if the file is large that would cause the 'detach
> > sync request' timeout when the device is closed under multi-process
> > scenario.
> >
> > This commit fixes the issue by using alarm handler to release dumper.
> >
> > Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>
> 
> 
> I think you need to redesign the handshake if this the case.
> Forcing 30 second delay at the end of all uses of pcap is not acceptable.

@Zhang, Qi Z Do we need to redesign the handshake to fix this?
  
Ferruh Yigit Oct. 3, 2022, 3 p.m. UTC | #4
On 9/21/2022 8:14 AM, Zhou, YidingX wrote:
> 
> 
>> -----Original Message-----
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Sent: Tuesday, September 6, 2022 10:58 PM
>> To: Zhou, YidingX <yidingx.zhou@intel.com>
>> Cc: dev@dpdk.org; Zhang, Qi Z <qi.z.zhang@intel.com>; Burakov, Anatoly
>> <anatoly.burakov@intel.com>; He, Xingguang <xingguang.he@intel.com>;
>> stable@dpdk.org
>> Subject: Re: [PATCH v2] net/pcap: fix timeout of stopping device
>>
>> On Tue,  6 Sep 2022 16:05:11 +0800
>> Yiding Zhou <yidingx.zhou@intel.com> wrote:
>>
>>> The pcap file will be synchronized to the disk when stopping the device.
>>> It takes a long time if the file is large that would cause the 'detach
>>> sync request' timeout when the device is closed under multi-process
>>> scenario.
>>>
>>> This commit fixes the issue by using alarm handler to release dumper.
>>>
>>> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Yiding Zhou <yidingx.zhou@intel.com>
>>
>>
>> I think you need to redesign the handshake if this the case.
>> Forcing 30 second delay at the end of all uses of pcap is not acceptable.
> 
> @Zhang, Qi Z Do we need to redesign the handshake to fix this?

Hi Yiding,

Your approach works, but Stephen's comment is to not add this delay by 
default, because this delay is not always required.

Can you please provide more details on multi-process communication and 
call trace, to help us think about a solution to address this issue in a 
more generic way (not just for pcap but for any case device close takes 
more than multi-process timeout)?
  
Yiding Zhou Nov. 22, 2022, 9:25 a.m. UTC | #5
> -----Original Message-----
> From: Zhou, YidingX <yidingx.zhou@intel.com>
> Sent: Wednesday, September 21, 2022 3:15 PM
> To: Stephen Hemminger <stephen@networkplumber.org>; Zhang, Qi Z
> <qi.z.zhang@intel.com>
> Cc: dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>; He,
> Xingguang <xingguang.he@intel.com>; stable@dpdk.org
> Subject: RE: [PATCH v2] net/pcap: fix timeout of stopping device
> 
> 
> 
> > -----Original Message-----
> > From: Stephen Hemminger <mailto:stephen@networkplumber.org>
> > Sent: Tuesday, September 6, 2022 10:58 PM
> > To: Zhou, YidingX <mailto:yidingx.zhou@intel.com>
> > Cc: mailto:dev@dpdk.org; Zhang, Qi Z <mailto:qi.z.zhang@intel.com>; Burakov, Anatoly
> > <mailto:anatoly.burakov@intel.com>; He, Xingguang <mailto:xingguang.he@intel.com>;
> > mailto:stable@dpdk.org
> > Subject: Re: [PATCH v2] net/pcap: fix timeout of stopping device
> >
> > On Tue,  6 Sep 2022 16:05:11 +0800
> > Yiding Zhou <mailto:yidingx.zhou@intel.com> wrote:
> >
> > > The pcap file will be synchronized to the disk when stopping the device.
> > > It takes a long time if the file is large that would cause the
> > > 'detach sync request' timeout when the device is closed under
> > > multi-process scenario.
> > >
> > > This commit fixes the issue by using alarm handler to release dumper.
> > >
> > > Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> > > Cc: mailto:stable@dpdk.org
> > >
> > > Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>
> >
> >
> > I think you need to redesign the handshake if this the case.
> > Forcing 30 second delay at the end of all uses of pcap is not acceptable.
> 
> @Zhang, Qi Z Do we need to redesign the handshake to fix this?

Hi, Ferruh
Sorry for the late reply.
I did not receive your email on Oct 6, I got your comments from patchwork.

"Can you please provide more details on multi-process communication and 
call trace, to help us think about a solution to address this issue in a 
more generic way (not just for pcap but for any case device close takes 
more than multi-process timeout)?"

I try to explain this issue with a sequence diagram, hope it can be displayed correctly in the mail.

       thread                                 intr thread           intr thread             thread
    of secondary                       of secondary          of primary          of primary
             |                                              |                         |                          |
             |                                              |                         |                          |
rte_eal_hotplug_remove
rte_dev_remove
eal_dev_hotplug_request_to_primary
rte_mp_request_sync ------------------------------------------------------->|
                                                                                                                    |
                                                                                              handle_secondary_request
                                                                                         |<-----------------|
                                                                                         |
                                                                   __handle_secondary_request
                                                          eal_dev_hotplug_request_to_secondary
           |<------------------------------------- rte_mp_request_sync
           |
handle_primary_request--------->|
                                                           |
                            __handle_primary_request
                               local_dev_remove(this will take long time)
                                            rte_mp_reply -------------------------------->|                              
                                                                                         |
                                                                             local_dev_remove
          |<------------------------------------------------- rte_mp_reply

The marked 'local_dev_remove()' in the secondary process will perform a pcap file synchronization operation.
When the pcap file is too large, it will take a lot of time (according to my test 100G takes 20+ seconds).
This caused the processing of hot_plug message to time out.
  
Stephen Hemminger Nov. 22, 2022, 5:28 p.m. UTC | #6
On Tue, 22 Nov 2022 09:25:33 +0000
"Zhou, YidingX" <yidingx.zhou@intel.com> wrote:

> > -----Original Message-----
> > From: Zhou, YidingX <yidingx.zhou@intel.com>
> > Sent: Wednesday, September 21, 2022 3:15 PM
> > To: Stephen Hemminger <stephen@networkplumber.org>; Zhang, Qi Z
> > <qi.z.zhang@intel.com>
> > Cc: dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>; He,
> > Xingguang <xingguang.he@intel.com>; stable@dpdk.org
> > Subject: RE: [PATCH v2] net/pcap: fix timeout of stopping device
> > 
> > 
> >   
> > > -----Original Message-----
> > > From: Stephen Hemminger <mailto:stephen@networkplumber.org>
> > > Sent: Tuesday, September 6, 2022 10:58 PM
> > > To: Zhou, YidingX <mailto:yidingx.zhou@intel.com>
> > > Cc: mailto:dev@dpdk.org; Zhang, Qi Z <mailto:qi.z.zhang@intel.com>; Burakov, Anatoly
> > > <mailto:anatoly.burakov@intel.com>; He, Xingguang <mailto:xingguang.he@intel.com>;
> > > mailto:stable@dpdk.org
> > > Subject: Re: [PATCH v2] net/pcap: fix timeout of stopping device
> > >
> > > On Tue,  6 Sep 2022 16:05:11 +0800
> > > Yiding Zhou <mailto:yidingx.zhou@intel.com> wrote:
> > >  
> > > > The pcap file will be synchronized to the disk when stopping the device.
> > > > It takes a long time if the file is large that would cause the
> > > > 'detach sync request' timeout when the device is closed under
> > > > multi-process scenario.
> > > >
> > > > This commit fixes the issue by using alarm handler to release dumper.
> > > >
> > > > Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> > > > Cc: mailto:stable@dpdk.org
> > > >
> > > > Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>  
> > >
> > >
> > > I think you need to redesign the handshake if this the case.
> > > Forcing 30 second delay at the end of all uses of pcap is not acceptable.  
> > 
> > @Zhang, Qi Z Do we need to redesign the handshake to fix this?  
> 
> Hi, Ferruh
> Sorry for the late reply.
> I did not receive your email on Oct 6, I got your comments from patchwork.
> 
> "Can you please provide more details on multi-process communication and 
> call trace, to help us think about a solution to address this issue in a 
> more generic way (not just for pcap but for any case device close takes 
> more than multi-process timeout)?"
> 
> I try to explain this issue with a sequence diagram, hope it can be displayed correctly in the mail.
> 
>        thread                                 intr thread           intr thread             thread
>     of secondary                       of secondary          of primary          of primary
>              |                                              |                         |                          |
>              |                                              |                         |                          |
> rte_eal_hotplug_remove
> rte_dev_remove
> eal_dev_hotplug_request_to_primary
> rte_mp_request_sync ------------------------------------------------------->|
>                                                                                                                     |
>                                                                                               handle_secondary_request
>                                                                                          |<-----------------|
>                                                                                          |
>                                                                    __handle_secondary_request
>                                                           eal_dev_hotplug_request_to_secondary
>            |<------------------------------------- rte_mp_request_sync
>            |
> handle_primary_request--------->|
>                                                            |
>                             __handle_primary_request
>                                local_dev_remove(this will take long time)
>                                             rte_mp_reply -------------------------------->|                              
>                                                                                          |
>                                                                              local_dev_remove
>           |<------------------------------------------------- rte_mp_reply
> 
> The marked 'local_dev_remove()' in the secondary process will perform a pcap file synchronization operation.
> When the pcap file is too large, it will take a lot of time (according to my test 100G takes 20+ seconds).
> This caused the processing of hot_plug message to time out.


Part of the problem maybe a hidden file sync in some library.
Normally, closing a file should be fast even with lots of outstanding data.
The actual write done by OS will continue from file cache.

I wonder if doing some kind of fadvise call might help see POSIX_FADV_SEQUENTIAL or POSIX_FADV_DONTNEED
  
Ferruh Yigit Nov. 29, 2022, 2:11 p.m. UTC | #7
On 11/22/2022 9:25 AM, Zhou, YidingX wrote:
> 
> 
>> -----Original Message-----
>> From: Zhou, YidingX <yidingx.zhou@intel.com>
>> Sent: Wednesday, September 21, 2022 3:15 PM
>> To: Stephen Hemminger <stephen@networkplumber.org>; Zhang, Qi Z
>> <qi.z.zhang@intel.com>
>> Cc: dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>; He,
>> Xingguang <xingguang.he@intel.com>; stable@dpdk.org
>> Subject: RE: [PATCH v2] net/pcap: fix timeout of stopping device
>>
>>
>>
>>> -----Original Message-----
>>> From: Stephen Hemminger <mailto:stephen@networkplumber.org>
>>> Sent: Tuesday, September 6, 2022 10:58 PM
>>> To: Zhou, YidingX <mailto:yidingx.zhou@intel.com>
>>> Cc: mailto:dev@dpdk.org; Zhang, Qi Z <mailto:qi.z.zhang@intel.com>; Burakov, Anatoly
>>> <mailto:anatoly.burakov@intel.com>; He, Xingguang <mailto:xingguang.he@intel.com>;
>>> mailto:stable@dpdk.org
>>> Subject: Re: [PATCH v2] net/pcap: fix timeout of stopping device
>>>
>>> On Tue,  6 Sep 2022 16:05:11 +0800
>>> Yiding Zhou <mailto:yidingx.zhou@intel.com> wrote:
>>>
>>>> The pcap file will be synchronized to the disk when stopping the device.
>>>> It takes a long time if the file is large that would cause the
>>>> 'detach sync request' timeout when the device is closed under
>>>> multi-process scenario.
>>>>
>>>> This commit fixes the issue by using alarm handler to release dumper.
>>>>
>>>> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
>>>> Cc: mailto:stable@dpdk.org
>>>>
>>>> Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>
>>>
>>>
>>> I think you need to redesign the handshake if this the case.
>>> Forcing 30 second delay at the end of all uses of pcap is not acceptable.
>>
>> @Zhang, Qi Z Do we need to redesign the handshake to fix this?
> 
> Hi, Ferruh
> Sorry for the late reply.
> I did not receive your email on Oct 6, I got your comments from patchwork.
> 
> "Can you please provide more details on multi-process communication and 
> call trace, to help us think about a solution to address this issue in a 
> more generic way (not just for pcap but for any case device close takes 
> more than multi-process timeout)?"
> 
> I try to explain this issue with a sequence diagram, hope it can be displayed correctly in the mail.
> 
>        thread                                 intr thread           intr thread             thread
>     of secondary                       of secondary          of primary          of primary
>              |                                              |                         |                          |
>              |                                              |                         |                          |
> rte_eal_hotplug_remove
> rte_dev_remove
> eal_dev_hotplug_request_to_primary
> rte_mp_request_sync ------------------------------------------------------->|
>                                                                                                                     |
>                                                                                               handle_secondary_request
>                                                                                          |<-----------------|
>                                                                                          |
>                                                                    __handle_secondary_request
>                                                           eal_dev_hotplug_request_to_secondary
>            |<------------------------------------- rte_mp_request_sync
>            |
> handle_primary_request--------->|
>                                                            |
>                             __handle_primary_request
>                                local_dev_remove(this will take long time)
>                                             rte_mp_reply -------------------------------->|                              
>                                                                                          |
>                                                                              local_dev_remove
>           |<------------------------------------------------- rte_mp_reply
> 
> The marked 'local_dev_remove()' in the secondary process will perform a pcap file synchronization operation.
> When the pcap file is too large, it will take a lot of time (according to my test 100G takes 20+ seconds).
> This caused the processing of hot_plug message to time out.

Hi Yiding,

Thanks for the information,

Right now all MP operations timeout is hardcoded in the code and it is 5
seconds.
Do you think does it work to have an API to set custom timeout,
something like `rte_mp_timeout_set()`, and call this from pdump?

This gives a generic solution for similar cases, not just for pcap.
But my concern is if this is too much multi-process related internal
detail to update, @Anatoly may comment on this.
  
Yiding Zhou Dec. 2, 2022, 10:13 a.m. UTC | #8
> >>> On Tue,  6 Sep 2022 16:05:11 +0800
> >>> Yiding Zhou <mailto:yidingx.zhou@intel.com> wrote:
> >>>
> >>>> The pcap file will be synchronized to the disk when stopping the device.
> >>>> It takes a long time if the file is large that would cause the
> >>>> 'detach sync request' timeout when the device is closed under
> >>>> multi-process scenario.
> >>>>
> >>>> This commit fixes the issue by using alarm handler to release dumper.
> >>>>
> >>>> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> >>>> Cc: mailto:stable@dpdk.org
> >>>>
> >>>> Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>
> >>>
> >>>
> >>> I think you need to redesign the handshake if this the case.
> >>> Forcing 30 second delay at the end of all uses of pcap is not acceptable.
> >>
> >> @Zhang, Qi Z Do we need to redesign the handshake to fix this?
> >
> > Hi, Ferruh
> > Sorry for the late reply.
> > I did not receive your email on Oct 6, I got your comments from patchwork.
> >
> > "Can you please provide more details on multi-process communication
> > and call trace, to help us think about a solution to address this
> > issue in a more generic way (not just for pcap but for any case device
> > close takes more than multi-process timeout)?"
> >
> > I try to explain this issue with a sequence diagram, hope it can be displayed
> correctly in the mail.
> >
> >        thread                                 intr thread           intr thread             thread
> >     of secondary                       of secondary          of primary          of primary
> >              |                                              |                         |                          |
> >              |                                              |                         |                          |
> > rte_eal_hotplug_remove
> > rte_dev_remove
> > eal_dev_hotplug_request_to_primary
> > rte_mp_request_sync ------------------------------------------------------->|
> >                                                                                                                     |
> >
> handle_secondary_request
> >                                                                                          |<-----------------|
> >                                                                                          |
> >                                                                    __handle_secondary_request
> >                                                           eal_dev_hotplug_request_to_secondary
> >            |<------------------------------------- rte_mp_request_sync
> >            |
> > handle_primary_request--------->|
> >                                                            |
> >                             __handle_primary_request
> >                                local_dev_remove(this will take long time)
> >                                             rte_mp_reply -------------------------------->|
> >                                                                                          |
> >                                                                              local_dev_remove
> >           |<-------------------------------------------------
> > rte_mp_reply
> >
> > The marked 'local_dev_remove()' in the secondary process will perform a
> pcap file synchronization operation.
> > When the pcap file is too large, it will take a lot of time (according to my test
> 100G takes 20+ seconds).
> > This caused the processing of hot_plug message to time out.
> 
> Hi Yiding,
> 
> Thanks for the information,
> 
> Right now all MP operations timeout is hardcoded in the code and it is 5
> seconds.
> Do you think does it work to have an API to set custom timeout, something like
> `rte_mp_timeout_set()`, and call this from pdump?
> 
> This gives a generic solution for similar cases, not just for pcap.
> But my concern is if this is too much multi-process related internal detail to
> update, @Anatoly may comment on this.

Hi, Ferruh
For pdump case only, I think the timeout is affected by pcap's size and other system components, such as the type of FS, system memory size.
It may be difficult to predict the specific time value for setting.
  
Yiding Zhou Dec. 2, 2022, 10:22 a.m. UTC | #9
> > > > On Tue,  6 Sep 2022 16:05:11 +0800 Yiding Zhou
> > > > <mailto:yidingx.zhou@intel.com> wrote:
> > > >
> > > > > The pcap file will be synchronized to the disk when stopping the device.
> > > > > It takes a long time if the file is large that would cause the
> > > > > 'detach sync request' timeout when the device is closed under
> > > > > multi-process scenario.
> > > > >
> > > > > This commit fixes the issue by using alarm handler to release dumper.
> > > > >
> > > > > Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process
> > > > > private")
> > > > > Cc: mailto:stable@dpdk.org
> > > > >
> > > > > Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>
> > > >
> > > >
> > > > I think you need to redesign the handshake if this the case.
> > > > Forcing 30 second delay at the end of all uses of pcap is not acceptable.
> > >
> > > @Zhang, Qi Z Do we need to redesign the handshake to fix this?
> >
> > Hi, Ferruh
> > Sorry for the late reply.
> > I did not receive your email on Oct 6, I got your comments from patchwork.
> >
> > "Can you please provide more details on multi-process communication
> > and call trace, to help us think about a solution to address this
> > issue in a more generic way (not just for pcap but for any case device
> > close takes more than multi-process timeout)?"
> >
> > I try to explain this issue with a sequence diagram, hope it can be displayed
> correctly in the mail.
> >
> >        thread                                 intr thread           intr thread             thread
> >     of secondary                       of secondary          of primary          of primary
> >              |                                              |                         |                          |
> >              |                                              |                         |                          |
> > rte_eal_hotplug_remove
> > rte_dev_remove
> > eal_dev_hotplug_request_to_primary
> > rte_mp_request_sync ------------------------------------------------------->|
> >                                                                                                                     |
> >
> handle_secondary_request
> >                                                                                          |<-----------------|
> >                                                                                          |
> >                                                                    __handle_secondary_request
> >                                                           eal_dev_hotplug_request_to_secondary
> >            |<------------------------------------- rte_mp_request_sync
> >            |
> > handle_primary_request--------->|
> >                                                            |
> >                             __handle_primary_request
> >                                local_dev_remove(this will take long time)
> >                                             rte_mp_reply -------------------------------->|
> >                                                                                          |
> >                                                                              local_dev_remove
> >           |<-------------------------------------------------
> > rte_mp_reply
> >
> > The marked 'local_dev_remove()' in the secondary process will perform a
> pcap file synchronization operation.
> > When the pcap file is too large, it will take a lot of time (according to my test
> 100G takes 20+ seconds).
> > This caused the processing of hot_plug message to time out.
> 
> 
> Part of the problem maybe a hidden file sync in some library.
> Normally, closing a file should be fast even with lots of outstanding data.
> The actual write done by OS will continue from file cache.
> 
> I wonder if doing some kind of fadvise call might help see
> POSIX_FADV_SEQUENTIAL or POSIX_FADV_DONTNEED

Thanks for your advice, I will try it and give you feedback.
  
Ferruh Yigit Dec. 2, 2022, 11:19 a.m. UTC | #10
On 12/2/2022 10:13 AM, Zhou, YidingX wrote:
> 
> 
>>>>> On Tue,  6 Sep 2022 16:05:11 +0800
>>>>> Yiding Zhou <mailto:yidingx.zhou@intel.com> wrote:
>>>>>
>>>>>> The pcap file will be synchronized to the disk when stopping the device.
>>>>>> It takes a long time if the file is large that would cause the
>>>>>> 'detach sync request' timeout when the device is closed under
>>>>>> multi-process scenario.
>>>>>>
>>>>>> This commit fixes the issue by using alarm handler to release dumper.
>>>>>>
>>>>>> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
>>>>>> Cc: mailto:stable@dpdk.org
>>>>>>
>>>>>> Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>
>>>>>
>>>>>
>>>>> I think you need to redesign the handshake if this the case.
>>>>> Forcing 30 second delay at the end of all uses of pcap is not acceptable.
>>>>
>>>> @Zhang, Qi Z Do we need to redesign the handshake to fix this?
>>>
>>> Hi, Ferruh
>>> Sorry for the late reply.
>>> I did not receive your email on Oct 6, I got your comments from patchwork.
>>>
>>> "Can you please provide more details on multi-process communication
>>> and call trace, to help us think about a solution to address this
>>> issue in a more generic way (not just for pcap but for any case device
>>> close takes more than multi-process timeout)?"
>>>
>>> I try to explain this issue with a sequence diagram, hope it can be displayed
>> correctly in the mail.
>>>
>>>        thread                                 intr thread           intr thread             thread
>>>     of secondary                       of secondary          of primary          of primary
>>>              |                                              |                         |                          |
>>>              |                                              |                         |                          |
>>> rte_eal_hotplug_remove
>>> rte_dev_remove
>>> eal_dev_hotplug_request_to_primary
>>> rte_mp_request_sync ------------------------------------------------------->|
>>>                                                                                                                     |
>>>
>> handle_secondary_request
>>>                                                                                          |<-----------------|
>>>                                                                                          |
>>>                                                                    __handle_secondary_request
>>>                                                           eal_dev_hotplug_request_to_secondary
>>>            |<------------------------------------- rte_mp_request_sync
>>>            |
>>> handle_primary_request--------->|
>>>                                                            |
>>>                             __handle_primary_request
>>>                                local_dev_remove(this will take long time)
>>>                                             rte_mp_reply -------------------------------->|
>>>                                                                                          |
>>>                                                                              local_dev_remove
>>>           |<-------------------------------------------------
>>> rte_mp_reply
>>>
>>> The marked 'local_dev_remove()' in the secondary process will perform a
>> pcap file synchronization operation.
>>> When the pcap file is too large, it will take a lot of time (according to my test
>> 100G takes 20+ seconds).
>>> This caused the processing of hot_plug message to time out.
>>
>> Hi Yiding,
>>
>> Thanks for the information,
>>
>> Right now all MP operations timeout is hardcoded in the code and it is 5
>> seconds.
>> Do you think does it work to have an API to set custom timeout, something like
>> `rte_mp_timeout_set()`, and call this from pdump?
>>
>> This gives a generic solution for similar cases, not just for pcap.
>> But my concern is if this is too much multi-process related internal detail to
>> update, @Anatoly may comment on this.
> 
> Hi, Ferruh
> For pdump case only, I think the timeout is affected by pcap's size and other system components, such as the type of FS, system memory size.
> It may be difficult to predict the specific time value for setting.

It doesn't have to be specific.

Point here is to have a multi process API to set timeout, instead of put
a hardcoded timeout in pcap PMD.
  
Yiding Zhou Dec. 5, 2022, 1:58 a.m. UTC | #11
> >>>>> On Tue,  6 Sep 2022 16:05:11 +0800 Yiding Zhou
> >>>>> <mailto:yidingx.zhou@intel.com> wrote:
> >>>>>
> >>>>>> The pcap file will be synchronized to the disk when stopping the device.
> >>>>>> It takes a long time if the file is large that would cause the
> >>>>>> 'detach sync request' timeout when the device is closed under
> >>>>>> multi-process scenario.
> >>>>>>
> >>>>>> This commit fixes the issue by using alarm handler to release dumper.
> >>>>>>
> >>>>>> Fixes: 0ecfb6c04d54 ("net/pcap: move handler to process private")
> >>>>>> Cc: mailto:stable@dpdk.org
> >>>>>>
> >>>>>> Signed-off-by: Yiding Zhou <mailto:yidingx.zhou@intel.com>
> >>>>>
> >>>>>
> >>>>> I think you need to redesign the handshake if this the case.
> >>>>> Forcing 30 second delay at the end of all uses of pcap is not acceptable.
> >>>>
> >>>> @Zhang, Qi Z Do we need to redesign the handshake to fix this?
> >>>
> >>> Hi, Ferruh
> >>> Sorry for the late reply.
> >>> I did not receive your email on Oct 6, I got your comments from patchwork.
> >>>
> >>> "Can you please provide more details on multi-process communication
> >>> and call trace, to help us think about a solution to address this
> >>> issue in a more generic way (not just for pcap but for any case
> >>> device close takes more than multi-process timeout)?"
> >>>
> >>> I try to explain this issue with a sequence diagram, hope it can be
> >>> displayed
> >> correctly in the mail.
> >>>
> >>>        thread                                 intr thread           intr thread             thread
> >>>     of secondary                       of secondary          of primary          of
> primary
> >>>              |                                              |                         |                          |
> >>>              |                                              |                         |                          |
> >>> rte_eal_hotplug_remove
> >>> rte_dev_remove
> >>> eal_dev_hotplug_request_to_primary
> >>> rte_mp_request_sync ------------------------------------------------------->|
> >>>
> >>> |
> >>>
> >> handle_secondary_request
> >>>                                                                                          |<-----------------|
> >>>                                                                                          |
> >>>                                                                    __handle_secondary_request
> >>>
> eal_dev_hotplug_request_to_secondary
> >>>            |<------------------------------------- rte_mp_request_sync
> >>>            |
> >>> handle_primary_request--------->|
> >>>                                                            |
> >>>                             __handle_primary_request
> >>>                                local_dev_remove(this will take long time)
> >>>                                             rte_mp_reply -------------------------------->|
> >>>                                                                                          |
> >>>                                                                              local_dev_remove
> >>>           |<-------------------------------------------------
> >>> rte_mp_reply
> >>>
> >>> The marked 'local_dev_remove()' in the secondary process will
> >>> perform a
> >> pcap file synchronization operation.
> >>> When the pcap file is too large, it will take a lot of time
> >>> (according to my test
> >> 100G takes 20+ seconds).
> >>> This caused the processing of hot_plug message to time out.
> >>
> >> Hi Yiding,
> >>
> >> Thanks for the information,
> >>
> >> Right now all MP operations timeout is hardcoded in the code and it
> >> is 5 seconds.
> >> Do you think does it work to have an API to set custom timeout,
> >> something like `rte_mp_timeout_set()`, and call this from pdump?
> >>
> >> This gives a generic solution for similar cases, not just for pcap.
> >> But my concern is if this is too much multi-process related internal
> >> detail to update, @Anatoly may comment on this.
> >
> > Hi, Ferruh
> > For pdump case only, I think the timeout is affected by pcap's size and other
> system components, such as the type of FS, system memory size.
> > It may be difficult to predict the specific time value for setting.
> 
> It doesn't have to be specific.
> 
> Point here is to have a multi process API to set timeout, instead of put a
> hardcoded timeout in pcap PMD.

OK, I understood.
  

Patch

diff --git a/drivers/net/pcap/pcap_ethdev.c b/drivers/net/pcap/pcap_ethdev.c
index ec29fd6bc5..5c643a0277 100644
--- a/drivers/net/pcap/pcap_ethdev.c
+++ b/drivers/net/pcap/pcap_ethdev.c
@@ -17,6 +17,7 @@ 
 #include <rte_mbuf_dyn.h>
 #include <rte_bus_vdev.h>
 #include <rte_os_shim.h>
+#include <rte_alarm.h>
 
 #include "pcap_osdep.h"
 
@@ -664,6 +665,25 @@  eth_dev_start(struct rte_eth_dev *dev)
 	return 0;
 }
 
+static void eth_pcap_dumper_release(void *arg)
+{
+	pcap_dump_close((pcap_dumper_t *)arg);
+}
+
+static void
+eth_pcap_dumper_close(pcap_dumper_t *dumper)
+{
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		/*
+		 * Delay 30 seconds before releasing dumper to wait for file sync
+		 * to complete to avoid blocking alarm thread in PRIMARY process
+		 */
+		rte_eal_alarm_set(30000000, eth_pcap_dumper_release, dumper);
+	} else {
+		rte_eal_alarm_set(1, eth_pcap_dumper_release, dumper);
+	}
+}
+
 /*
  * This function gets called when the current port gets stopped.
  * Is the only place for us to close all the tx streams dumpers.
@@ -689,7 +709,7 @@  eth_dev_stop(struct rte_eth_dev *dev)
 
 	for (i = 0; i < dev->data->nb_tx_queues; i++) {
 		if (pp->tx_dumper[i] != NULL) {
-			pcap_dump_close(pp->tx_dumper[i]);
+			eth_pcap_dumper_close(pp->tx_dumper[i]);
 			pp->tx_dumper[i] = NULL;
 		}