[v2] Fixes: ethdev: secondary process change shared memory

Message ID 20200113050312.1506-1-fangtonghao@sangfor.com.cn (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series [v2] Fixes: ethdev: secondary process change shared memory |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-nxp-Performance success Performance Testing PASS
ci/travis-robot warning Travis build: failed
ci/Intel-compilation fail Compilation issues

Commit Message

方统浩50450 Jan. 13, 2020, 5:03 a.m. UTC
  Secondary process calls “rte_eth_dev_pci_allocate”
function and enters rte_eth_copy_pci_info function
when initializing.Then it sets the value of struct
"rte_eth_dev_data.dev_flags" to zero and reset it,
but this struct is shared by primary process and
secondary process.To fix this bug,by adding an
if-statement to forbid the secondaryprocess changing
the above-mentioned value.

Signed-off-by: Fang TongHao <fangtonghao@sangfor.com.cn>
---
 lib/librte_ethdev/rte_ethdev_pci.h | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)
  

Comments

Ferruh Yigit Jan. 14, 2020, 2:45 p.m. UTC | #1
On 1/13/2020 5:03 AM, Fang TongHao wrote:
> Secondary process calls “rte_eth_dev_pci_allocate”
> function and enters rte_eth_copy_pci_info function
> when initializing.Then it sets the value of struct
> "rte_eth_dev_data.dev_flags" to zero and reset it,
> but this struct is shared by primary process and
> secondary process.To fix this bug,by adding an
> if-statement to forbid the secondaryprocess changing
> the above-mentioned value.

Hi Fang,

Thanks for the fix, I agree with the problem statement, but not sure if this
should be handled in the helper function or in the place where the function is
called. Helper function is simple on what it does, do we need to put the primary
process logic in it.

Can you please give more details of the bug you have encounter, is it seen by a
specific PMD?

Thanks,
ferruh

> 
> Signed-off-by: Fang TongHao <fangtonghao@sangfor.com.cn>
> ---
>  lib/librte_ethdev/rte_ethdev_pci.h | 18 ++++++++++--------
>  1 file changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/lib/librte_ethdev/rte_ethdev_pci.h b/lib/librte_ethdev/rte_ethdev_pci.h
> index ccdbb46ec..e7dae0545 100644
> --- a/lib/librte_ethdev/rte_ethdev_pci.h
> +++ b/lib/librte_ethdev/rte_ethdev_pci.h
> @@ -60,14 +60,16 @@ rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev,
>  
>  	eth_dev->intr_handle = &pci_dev->intr_handle;
>  
> -	eth_dev->data->dev_flags = 0;
> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
> -
> -	eth_dev->data->kdrv = pci_dev->kdrv;
> -	eth_dev->data->numa_node = pci_dev->device.numa_node;
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		eth_dev->data->dev_flags = 0;
> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
> +
> +		eth_dev->data->kdrv = pci_dev->kdrv;
> +		eth_dev->data->numa_node = pci_dev->device.numa_node;
> +	}
>  }
>  
>  static inline int
>
  
方统浩50450 Jan. 15, 2020, 6:49 a.m. UTC | #2
Hi Ferruh, thanks for your message.


We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
secondary process will change the shared memory when initializing.Secondary process calls
"rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
(rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
process.


Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
function should simple on what it does.I have two ways to fix this problem, one is add an if-statement

in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
I think the second way is simple and lower risk.


Please forgive me because my poor english....



发件人:Ferruh Yigit <ferruh.yigit@intel.com>
发送日期:2020-01-14 22:45:33
收件人:Fang TongHao <fangtonghao@sangfor.com.cn>,thomas@monjalon.net,arybchenko@solarflare.com
抄送人:dev@dpdk.org,stable@dpdk.org,jia.guo@intel.com,cunming.liang@intel.com,qi.z.zhang@intel.com,jungle845943968@outlook.com
主题:Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory>On 1/13/2020 5:03 AM, Fang TongHao wrote:
>> Secondary process calls “rte_eth_dev_pci_allocate”
>> function and enters rte_eth_copy_pci_info function
>> when initializing.Then it sets the value of struct
>> "rte_eth_dev_data.dev_flags" to zero and reset it,
>> but this struct is shared by primary process and
>> secondary process.To fix this bug,by adding an
>> if-statement to forbid the secondaryprocess changing
>> the above-mentioned value.
>
>Hi Fang,
>
>Thanks for the fix, I agree with the problem statement, but not sure if this
>should be handled in the helper function or in the place where the function is
>called. Helper function is simple on what it does, do we need to put the primary
>process logic in it.
>
>Can you please give more details of the bug you have encounter, is it seen by a
>specific PMD?
>
>Thanks,
>ferruh
>
>> 
>> Signed-off-by: Fang TongHao <fangtonghao@sangfor.com.cn>
>> ---
>>  lib/librte_ethdev/rte_ethdev_pci.h | 18 ++++++++++--------
>>  1 file changed, 10 insertions(+), 8 deletions(-)
>> 
>> diff --git a/lib/librte_ethdev/rte_ethdev_pci.h b/lib/librte_ethdev/rte_ethdev_pci.h
>> index ccdbb46ec..e7dae0545 100644
>> --- a/lib/librte_ethdev/rte_ethdev_pci.h
>> +++ b/lib/librte_ethdev/rte_ethdev_pci.h
>> @@ -60,14 +60,16 @@ rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev,
>>  
>>  	eth_dev->intr_handle = &pci_dev->intr_handle;
>>  
>> -	eth_dev->data->dev_flags = 0;
>> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
>> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
>> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
>> -
>> -	eth_dev->data->kdrv = pci_dev->kdrv;
>> -	eth_dev->data->numa_node = pci_dev->device.numa_node;
>> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>> +		eth_dev->data->dev_flags = 0;
>> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
>> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
>> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
>> +
>> +		eth_dev->data->kdrv = pci_dev->kdrv;
>> +		eth_dev->data->numa_node = pci_dev->device.numa_node;
>> +	}
>>  }
>>  
>>  static inline int
>> 
>
  
Ferruh Yigit Jan. 15, 2020, 6:35 p.m. UTC | #3
On 1/15/2020 6:49 AM, 方统浩50450 wrote:
> Hi Ferruh, thanks for your message.
> 
> 
> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
> secondary process will change the shared memory when initializing.Secondary process calls
> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
> process.

I agree this is the problem.
In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
but the generic code is faulty.

And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.

> 
> 
> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
> 
> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
> I think the second way is simple and lower risk.

Yes these are the two options.

I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
BUT my concern was adding decision making to simple/leaf function and make it
harder to debug/use, instead of giving what primary/secondary process should
call decision in higher level.

But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
secondary process, like mlx4 or szedata2, and most probably this is not their
intention.
And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
function may have side affect of 'eth_dev->intr_handle' not set in secondary.

With above considerations I am OK to your proposal to cover all cases, Thomas,
Andrew, any concern?

@Fang, only can you please make a new version to update the
'rte_eth_copy_pci_info' function comment to document shared data is not updated
for the secondary process?

Thanks,
ferruh

> 
> 
> Please forgive me because my poor english....
> 
> 
> 
> 发件人:Ferruh Yigit <ferruh.yigit@intel.com>
> 发送日期:2020-01-14 22:45:33
> 收件人:Fang TongHao <fangtonghao@sangfor.com.cn>,thomas@monjalon.net,arybchenko@solarflare.com
> 抄送人:dev@dpdk.org,stable@dpdk.org,jia.guo@intel.com,cunming.liang@intel.com,qi.z.zhang@intel.com,jungle845943968@outlook.com
> 主题:Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory>On 1/13/2020 5:03 AM, Fang TongHao wrote:
>>> Secondary process calls “rte_eth_dev_pci_allocate”
>>> function and enters rte_eth_copy_pci_info function
>>> when initializing.Then it sets the value of struct
>>> "rte_eth_dev_data.dev_flags" to zero and reset it,
>>> but this struct is shared by primary process and
>>> secondary process.To fix this bug,by adding an
>>> if-statement to forbid the secondaryprocess changing
>>> the above-mentioned value.
>>
>> Hi Fang,
>>
>> Thanks for the fix, I agree with the problem statement, but not sure if this
>> should be handled in the helper function or in the place where the function is
>> called. Helper function is simple on what it does, do we need to put the primary
>> process logic in it.
>>
>> Can you please give more details of the bug you have encounter, is it seen by a
>> specific PMD?
>>
>> Thanks,
>> ferruh
>>
>>>
>>> Signed-off-by: Fang TongHao <fangtonghao@sangfor.com.cn>
>>> ---
>>>  lib/librte_ethdev/rte_ethdev_pci.h | 18 ++++++++++--------
>>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/lib/librte_ethdev/rte_ethdev_pci.h b/lib/librte_ethdev/rte_ethdev_pci.h
>>> index ccdbb46ec..e7dae0545 100644
>>> --- a/lib/librte_ethdev/rte_ethdev_pci.h
>>> +++ b/lib/librte_ethdev/rte_ethdev_pci.h
>>> @@ -60,14 +60,16 @@ rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev,
>>>  
>>>  	eth_dev->intr_handle = &pci_dev->intr_handle;
>>>  
>>> -	eth_dev->data->dev_flags = 0;
>>> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
>>> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>>> -	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
>>> -		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
>>> -
>>> -	eth_dev->data->kdrv = pci_dev->kdrv;
>>> -	eth_dev->data->numa_node = pci_dev->device.numa_node;
>>> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
>>> +		eth_dev->data->dev_flags = 0;
>>> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
>>> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
>>> +		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
>>> +			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
>>> +
>>> +		eth_dev->data->kdrv = pci_dev->kdrv;
>>> +		eth_dev->data->numa_node = pci_dev->device.numa_node;
>>> +	}
>>>  }
>>>  
>>>  static inline int
>>>
>>
> 
> 
> 
>
  
Thomas Monjalon Jan. 15, 2020, 8:43 p.m. UTC | #4
15/01/2020 19:35, Ferruh Yigit:
> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
> > Hi Ferruh, thanks for your message.
> > 
> > 
> > We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
> > support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
> > secondary process will change the shared memory when initializing.Secondary process calls
> > "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
> > (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
> > Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
> > is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
> > the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
> > detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
> > the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
> > process.
> 
> I agree this is the problem.
> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
> but the generic code is faulty.
> 
> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
> 
> > Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
> > enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
> > function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
> > 
> > in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
> > another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
> > shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
> > I think the second way is simple and lower risk.
> 
> Yes these are the two options.
> 
> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
> BUT my concern was adding decision making to simple/leaf function and make it
> harder to debug/use, instead of giving what primary/secondary process should
> call decision in higher level.
> 
> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
> secondary process, like mlx4 or szedata2, and most probably this is not their
> intention.
> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
> 
> With above considerations I am OK to your proposal to cover all cases, Thomas,
> Andrew, any concern?

Do you mean drivers need to be fixed?
  
Andrew Rybchenko Jan. 16, 2020, 7:43 a.m. UTC | #5
On 1/15/20 11:43 PM, Thomas Monjalon wrote:
> 15/01/2020 19:35, Ferruh Yigit:
>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>> Hi Ferruh, thanks for your message.
>>>
>>>
>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>> secondary process will change the shared memory when initializing.Secondary process calls
>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>> process.

Hold on, just for my understanding. As far as I can see
RTE_ETH_DEV_DETACHABLE was removed in 17.11. Does it
change something in above description?

>> I agree this is the problem.
>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>>
>> but the generic code is faulty.
>>
>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.

Yes, as I understand RTE_ETH_DEV_CLOSE_REMOVE,
RTE_ETH_DEV_BONDED_SLAVE, RTE_ETH_DEV_REPRESENTOR and
RTE_ETH_DEV_NOLIVE_MAC_ADDR may be lost because of
reinit (if not restored in other branches). Bad anyway.

>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>
>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>> I think the second way is simple and lower risk.
>>
>> Yes these are the two options.
>>
>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>> BUT my concern was adding decision making to simple/leaf function and make it
>> harder to debug/use, instead of giving what primary/secondary process should
>> call decision in higher level.
>>
>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>> secondary process, like mlx4 or szedata2, and most probably this is not their
>> intention.
>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>
>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>> Andrew, any concern?

I would put if condition in rte_eth_copy_pci_info().
It is the function which writes shared space from
secondary process when it should not be done and it
should be fixed there.

> Do you mean drivers need to be fixed?

I'm not sure that I fully understand it. Since copy function
cares about intr_handle copying I'm afraid that it is not
100% correct to skip it in secondary process completely as
many drivers do right now. Basically it makes eth_dev structure
in secondary process inconsistent. However, it looks like
most of these drivers simply obtain handle from pci_dev
directly and it explains why they are not affected.
There are exceptions which are potentially bugs, e.g.
drivers/net/ice/ice_ethdev.c: ice_interrupt_handler at the end.

I think that it would be better if intr_handle is always
correct in eth_dev (both primary and secondary cases) and
drivers use it instead of the same from pci_dev.
  
Ferruh Yigit Jan. 16, 2020, 9 a.m. UTC | #6
On 1/15/2020 8:43 PM, Thomas Monjalon wrote:
> 15/01/2020 19:35, Ferruh Yigit:
>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>> Hi Ferruh, thanks for your message.
>>>
>>>
>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>> secondary process will change the shared memory when initializing.Secondary process calls
>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>> process.
>>
>> I agree this is the problem.
>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>> but the generic code is faulty.
>>
>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
>>
>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>
>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>> I think the second way is simple and lower risk.
>>
>> Yes these are the two options.
>>
>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>> BUT my concern was adding decision making to simple/leaf function and make it
>> harder to debug/use, instead of giving what primary/secondary process should
>> call decision in higher level.
>>
>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>> secondary process, like mlx4 or szedata2, and most probably this is not their
>> intention.
>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>
>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>> Andrew, any concern?
> 
> Do you mean drivers need to be fixed?
> 

either it or 'rte_eth_copy_pci_info'.

Right now 'rte_eth_copy_pci_info' updates the shared memory, calling it in
secondary overwrites the memory set by primary.

Options Fang mentioned:
1) Don't call 'rte_eth_copy_pci_info' from secondary process path, this requires
fixing 'rte_eth_dev_pci_allocate', 'eth_dev_pci_specific_init' and possibly some
drivers.

2) Add a check inside the 'rte_eth_copy_pci_info' to prevent updating shared
memory if it is secondary process.

Fang's patch does (2), and I am OK with it as well after latest findings.
  
Ferruh Yigit Jan. 16, 2020, 9:04 a.m. UTC | #7
On 1/16/2020 7:43 AM, Andrew Rybchenko wrote:
> On 1/15/20 11:43 PM, Thomas Monjalon wrote:
>> 15/01/2020 19:35, Ferruh Yigit:
>>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>>> Hi Ferruh, thanks for your message.
>>>>
>>>>
>>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>>> secondary process will change the shared memory when initializing.Secondary process calls
>>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>>> process.
> 
> Hold on, just for my understanding. As far as I can see
> RTE_ETH_DEV_DETACHABLE was removed in 17.11. Does it
> change something in above description?

Overall secondary overwrites primary values, I think we should fix it
independent from the flags involved.

> 
>>> I agree this is the problem.
>>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>>>
>>> but the generic code is faulty.
>>>
>>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
> 
> Yes, as I understand RTE_ETH_DEV_CLOSE_REMOVE,
> RTE_ETH_DEV_BONDED_SLAVE, RTE_ETH_DEV_REPRESENTOR and
> RTE_ETH_DEV_NOLIVE_MAC_ADDR may be lost because of
> reinit (if not restored in other branches). Bad anyway.
> 
>>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>>
>>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>>> I think the second way is simple and lower risk.
>>>
>>> Yes these are the two options.
>>>
>>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>>> BUT my concern was adding decision making to simple/leaf function and make it
>>> harder to debug/use, instead of giving what primary/secondary process should
>>> call decision in higher level.
>>>
>>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>>> secondary process, like mlx4 or szedata2, and most probably this is not their
>>> intention.
>>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>>
>>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>>> Andrew, any concern?
> 
> I would put if condition in rte_eth_copy_pci_info().
> It is the function which writes shared space from
> secondary process when it should not be done and it
> should be fixed there.

OK

> 
>> Do you mean drivers need to be fixed?
> 
> I'm not sure that I fully understand it. Since copy function
> cares about intr_handle copying I'm afraid that it is not
> 100% correct to skip it in secondary process completely as
> many drivers do right now. Basically it makes eth_dev structure
> in secondary process inconsistent. However, it looks like
> most of these drivers simply obtain handle from pci_dev
> directly and it explains why they are not affected.
> There are exceptions which are potentially bugs, e.g.
> drivers/net/ice/ice_ethdev.c: ice_interrupt_handler at the end.
> 
> I think that it would be better if intr_handle is always
> correct in eth_dev (both primary and secondary cases) and
> drivers use it instead of the same from pci_dev.
> 

OK

So this suggest going on with Fang's patch. I only requested an additional note
in function comment related to this secondary check.
  
方统浩50450 Jan. 16, 2020, 11:35 a.m. UTC | #8
>@Fang, only can you please make a new version to update the
>'rte_eth_copy_pci_info' function comment to document shared data is not updated
>for the secondary process?
>So this suggest going on with Fang's patch. I only requested an additional note
>in function comment related to this secondary check.
@Ferruh Yigit
Should I update a new version patch of "rte_eth_copy_pci_info" function and explain
wthether the regular functioning of secondary process is affected or not?
I cant figure out what you need me to do.




发件人:Ferruh Yigit <ferruh.yigit@intel.com>
发送日期:2020-01-16 17:04:09
收件人:Andrew Rybchenko <arybchenko@solarflare.com>,Thomas Monjalon <thomas@monjalon.net>
抄送人:"方统浩50450" <fangtonghao@sangfor.com.cn>,dev@dpdk.org,stable@dpdk.org,jia.guo@intel.com,cunming.liang@intel.com,qi.z.zhang@intel.com,jungle845943968@outlook.com,Jerin Jacob Kollanukkaran <jerinj@marvell.com>
主题:Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory>On 1/16/2020 7:43 AM, Andrew Rybchenko wrote:
>> On 1/15/20 11:43 PM, Thomas Monjalon wrote:
>>> 15/01/2020 19:35, Ferruh Yigit:
>>>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>>>> Hi Ferruh, thanks for your message.
>>>>>
>>>>>
>>>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>>>> secondary process will change the shared memory when initializing.Secondary process calls
>>>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>>>> process.
>> 
>> Hold on, just for my understanding. As far as I can see
>> RTE_ETH_DEV_DETACHABLE was removed in 17.11. Does it
>> change something in above description?
>
>Overall secondary overwrites primary values, I think we should fix it
>independent from the flags involved.
>
>> 
>>>> I agree this is the problem.
>>>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>>>>
>>>> but the generic code is faulty.
>>>>
>>>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
>> 
>> Yes, as I understand RTE_ETH_DEV_CLOSE_REMOVE,
>> RTE_ETH_DEV_BONDED_SLAVE, RTE_ETH_DEV_REPRESENTOR and
>> RTE_ETH_DEV_NOLIVE_MAC_ADDR may be lost because of
>> reinit (if not restored in other branches). Bad anyway.
>> 
>>>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>>>
>>>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>>>> I think the second way is simple and lower risk.
>>>>
>>>> Yes these are the two options.
>>>>
>>>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>>>> BUT my concern was adding decision making to simple/leaf function and make it
>>>> harder to debug/use, instead of giving what primary/secondary process should
>>>> call decision in higher level.
>>>>
>>>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>>>> secondary process, like mlx4 or szedata2, and most probably this is not their
>>>> intention.
>>>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>>>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>>>
>>>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>>>> Andrew, any concern?
>> 
>> I would put if condition in rte_eth_copy_pci_info().
>> It is the function which writes shared space from
>> secondary process when it should not be done and it
>> should be fixed there.
>
>OK
>
>> 
>>> Do you mean drivers need to be fixed?
>> 
>> I'm not sure that I fully understand it. Since copy function
>> cares about intr_handle copying I'm afraid that it is not
>> 100% correct to skip it in secondary process completely as
>> many drivers do right now. Basically it makes eth_dev structure
>> in secondary process inconsistent. However, it looks like
>> most of these drivers simply obtain handle from pci_dev
>> directly and it explains why they are not affected.
>> There are exceptions which are potentially bugs, e.g.
>> drivers/net/ice/ice_ethdev.c: ice_interrupt_handler at the end.
>> 
>> I think that it would be better if intr_handle is always
>> correct in eth_dev (both primary and secondary cases) and
>> drivers use it instead of the same from pci_dev.
>> 
>
>OK
>
>So this suggest going on with Fang's patch. I only requested an additional note
>in function comment related to this secondary check.
  
Ferruh Yigit Jan. 16, 2020, 12:18 p.m. UTC | #9
On 1/16/2020 11:35 AM, 方统浩50450 wrote:
> 
> 
>>@Fang, only can you please make a new version to update the
>>'rte_eth_copy_pci_info' function comment to document shared data is not updated
>>for the secondary process?
> 
>>So this suggest going on with Fang's patch. I only requested an additional note
>>in function comment related to this secondary check.
> 
> @Ferruh Yigit
> Should I update a new version patch of "rte_eth_copy_pci_info" function and explain
> wthether the regular functioning of secondary process is affected or not?
> I cant figure out what you need me to do.

Hi Fang,

Yes can you please send a new version of your patch.
In new version, additionally update the 'rte_eth_copy_pci_info()' function
comment to document that function updates 'eth_dev->data' only for primary process.

Thanks,
ferruh

> 
> 
> 发件人:Ferruh Yigit <ferruh.yigit@intel.com>
> 发送日期:2020-01-16 17:04:09
> 收件人:Andrew Rybchenko <arybchenko@solarflare.com>,Thomas Monjalon <thomas@monjalon.net>
> 抄送人:"方统浩50450" <fangtonghao@sangfor.com.cn>,dev@dpdk.org,stable@dpdk.org,jia.guo@intel.com,cunming.liang@intel.com,qi.z.zhang@intel.com,jungle845943968@outlook.com,Jerin Jacob Kollanukkaran <jerinj@marvell.com>
> 主题:Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory>On 1/16/2020 7:43 AM, Andrew Rybchenko wrote:
>>> On 1/15/20 11:43 PM, Thomas Monjalon wrote:
>>>> 15/01/2020 19:35, Ferruh Yigit:
>>>>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>>>>> Hi Ferruh, thanks for your message.
>>>>>>
>>>>>>
>>>>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>>>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>>>>> secondary process will change the shared memory when initializing.Secondary process calls
>>>>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>>>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>>>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>>>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>>>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>>>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>>>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>>>>> process.
>>> 
>>> Hold on, just for my understanding. As far as I can see
>>> RTE_ETH_DEV_DETACHABLE was removed in 17.11. Does it
>>> change something in above description?
>>
>>Overall secondary overwrites primary values, I think we should fix it
>>independent from the flags involved.
>>
>>> 
>>>>> I agree this is the problem.
>>>>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>>>>>
>>>>> but the generic code is faulty.
>>>>>
>>>>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
>>> 
>>> Yes, as I understand RTE_ETH_DEV_CLOSE_REMOVE,
>>> RTE_ETH_DEV_BONDED_SLAVE, RTE_ETH_DEV_REPRESENTOR and
>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR may be lost because of
>>> reinit (if not restored in other branches). Bad anyway.
>>> 
>>>>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>>>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>>>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>>>>
>>>>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>>>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>>>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>>>>> I think the second way is simple and lower risk.
>>>>>
>>>>> Yes these are the two options.
>>>>>
>>>>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>>>>> BUT my concern was adding decision making to simple/leaf function and make it
>>>>> harder to debug/use, instead of giving what primary/secondary process should
>>>>> call decision in higher level.
>>>>>
>>>>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>>>>> secondary process, like mlx4 or szedata2, and most probably this is not their
>>>>> intention.
>>>>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>>>>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>>>>
>>>>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>>>>> Andrew, any concern?
>>> 
>>> I would put if condition in rte_eth_copy_pci_info().
>>> It is the function which writes shared space from
>>> secondary process when it should not be done and it
>>> should be fixed there.
>>
>>OK
>>
>>> 
>>>> Do you mean drivers need to be fixed?
>>> 
>>> I'm not sure that I fully understand it. Since copy function
>>> cares about intr_handle copying I'm afraid that it is not
>>> 100% correct to skip it in secondary process completely as
>>> many drivers do right now. Basically it makes eth_dev structure
>>> in secondary process inconsistent. However, it looks like
>>> most of these drivers simply obtain handle from pci_dev
>>> directly and it explains why they are not affected.
>>> There are exceptions which are potentially bugs, e.g.
>>> drivers/net/ice/ice_ethdev.c: ice_interrupt_handler at the end.
>>> 
>>> I think that it would be better if intr_handle is always
>>> correct in eth_dev (both primary and secondary cases) and
>>> drivers use it instead of the same from pci_dev.
>>> 
>>
>>OK
>>
>>So this suggest going on with Fang's patch. I only requested an additional note
>>in function comment related to this secondary check.
> 
>
  
方统浩50450 Jan. 17, 2020, 2:11 a.m. UTC | #10
ok, I send a new version of my patch and rewrite commit log again.
you can check my patch in https://patches.dpdk.org/patch/64819/






发件人:Ferruh Yigit <ferruh.yigit@intel.com>
发送日期:2020-01-16 20:18:18
收件人:"方统浩50450" <fangtonghao@sangfor.com.cn>
抄送人:Andrew Rybchenko <arybchenko@solarflare.com>,Thomas Monjalon <thomas@monjalon.net>,dev@dpdk.org,stable@dpdk.org,jia.guo@intel.com,cunming.liang@intel.com,qi.z.zhang@intel.com,jungle845943968@outlook.com,Jerin Jacob Kollanukkaran <jerinj@marvell.com>
主题:Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory>On 1/16/2020 11:35 AM, 方统浩50450 wrote:
>> 
>> 
>>>@Fang, only can you please make a new version to update the
>>>'rte_eth_copy_pci_info' function comment to document shared data is not updated
>>>for the secondary process?
>> 
>>>So this suggest going on with Fang's patch. I only requested an additional note
>>>in function comment related to this secondary check.
>> 
>> @Ferruh Yigit
>> Should I update a new version patch of "rte_eth_copy_pci_info" function and explain
>> wthether the regular functioning of secondary process is affected or not?
>> I cant figure out what you need me to do.
>
>Hi Fang,
>
>Yes can you please send a new version of your patch.
>In new version, additionally update the 'rte_eth_copy_pci_info()' function
>comment to document that function updates 'eth_dev->data' only for primary process.
>
>Thanks,
>ferruh
>
>> 
>> 
>> 发件人:Ferruh Yigit <ferruh.yigit@intel.com>
>> 发送日期:2020-01-16 17:04:09
>> 收件人:Andrew Rybchenko <arybchenko@solarflare.com>,Thomas Monjalon <thomas@monjalon.net>
>> 抄送人:"方统浩50450" <fangtonghao@sangfor.com.cn>,dev@dpdk.org,stable@dpdk.org,jia.guo@intel.com,cunming.liang@intel.com,qi.z.zhang@intel.com,jungle845943968@outlook.com,Jerin Jacob Kollanukkaran <jerinj@marvell.com>
>> 主题:Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory>On 1/16/2020 7:43 AM, Andrew Rybchenko wrote:
>>>> On 1/15/20 11:43 PM, Thomas Monjalon wrote:
>>>>> 15/01/2020 19:35, Ferruh Yigit:
>>>>>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>>>>>> Hi Ferruh, thanks for your message.
>>>>>>>
>>>>>>>
>>>>>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>>>>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>>>>>> secondary process will change the shared memory when initializing.Secondary process calls
>>>>>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>>>>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>>>>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>>>>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>>>>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>>>>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>>>>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>>>>>> process.
>>>> 
>>>> Hold on, just for my understanding. As far as I can see
>>>> RTE_ETH_DEV_DETACHABLE was removed in 17.11. Does it
>>>> change something in above description?
>>>
>>>Overall secondary overwrites primary values, I think we should fix it
>>>independent from the flags involved.
>>>
>>>> 
>>>>>> I agree this is the problem.
>>>>>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>>>>>>
>>>>>> but the generic code is faulty.
>>>>>>
>>>>>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
>>>> 
>>>> Yes, as I understand RTE_ETH_DEV_CLOSE_REMOVE,
>>>> RTE_ETH_DEV_BONDED_SLAVE, RTE_ETH_DEV_REPRESENTOR and
>>>> RTE_ETH_DEV_NOLIVE_MAC_ADDR may be lost because of
>>>> reinit (if not restored in other branches). Bad anyway.
>>>> 
>>>>>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>>>>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>>>>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>>>>>
>>>>>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>>>>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>>>>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>>>>>> I think the second way is simple and lower risk.
>>>>>>
>>>>>> Yes these are the two options.
>>>>>>
>>>>>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>>>>>> BUT my concern was adding decision making to simple/leaf function and make it
>>>>>> harder to debug/use, instead of giving what primary/secondary process should
>>>>>> call decision in higher level.
>>>>>>
>>>>>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>>>>>> secondary process, like mlx4 or szedata2, and most probably this is not their
>>>>>> intention.
>>>>>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>>>>>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>>>>>
>>>>>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>>>>>> Andrew, any concern?
>>>> 
>>>> I would put if condition in rte_eth_copy_pci_info().
>>>> It is the function which writes shared space from
>>>> secondary process when it should not be done and it
>>>> should be fixed there.
>>>
>>>OK
>>>
>>>> 
>>>>> Do you mean drivers need to be fixed?
>>>> 
>>>> I'm not sure that I fully understand it. Since copy function
>>>> cares about intr_handle copying I'm afraid that it is not
>>>> 100% correct to skip it in secondary process completely as
>>>> many drivers do right now. Basically it makes eth_dev structure
>>>> in secondary process inconsistent. However, it looks like
>>>> most of these drivers simply obtain handle from pci_dev
>>>> directly and it explains why they are not affected.
>>>> There are exceptions which are potentially bugs, e.g.
>>>> drivers/net/ice/ice_ethdev.c: ice_interrupt_handler at the end.
>>>> 
>>>> I think that it would be better if intr_handle is always
>>>> correct in eth_dev (both primary and secondary cases) and
>>>> drivers use it instead of the same from pci_dev.
>>>> 
>>>
>>>OK
>>>
>>>So this suggest going on with Fang's patch. I only requested an additional note
>>>in function comment related to this secondary check.
>> 
>> 
>
  

Patch

diff --git a/lib/librte_ethdev/rte_ethdev_pci.h b/lib/librte_ethdev/rte_ethdev_pci.h
index ccdbb46ec..e7dae0545 100644
--- a/lib/librte_ethdev/rte_ethdev_pci.h
+++ b/lib/librte_ethdev/rte_ethdev_pci.h
@@ -60,14 +60,16 @@  rte_eth_copy_pci_info(struct rte_eth_dev *eth_dev,
 
 	eth_dev->intr_handle = &pci_dev->intr_handle;
 
-	eth_dev->data->dev_flags = 0;
-	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
-		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
-	if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
-		eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
-
-	eth_dev->data->kdrv = pci_dev->kdrv;
-	eth_dev->data->numa_node = pci_dev->device.numa_node;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		eth_dev->data->dev_flags = 0;
+		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC)
+			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_LSC;
+		if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_RMV)
+			eth_dev->data->dev_flags |= RTE_ETH_DEV_INTR_RMV;
+
+		eth_dev->data->kdrv = pci_dev->kdrv;
+		eth_dev->data->numa_node = pci_dev->device.numa_node;
+	}
 }
 
 static inline int