[3/3] net/vdev: fix insert vdev core dump
Checks
Commit Message
Inserting a vdev device when the device arguments are already stored
in devargs_list, the rte_devargs_insert function replaces the supplied
new devargs with the found devargs and frees the new devargs. As a
result, the use of free devargs results in a core dump.
This patch fixes the issue by using valid devargs.
Fixes: f3a1188cee4a ("devargs: make device representation generic")
Cc: stable@dpdk.org
Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
---
drivers/bus/vdev/vdev.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
> -----Original Message-----
> From: Mingjin Ye <mingjinx.ye@intel.com>
> Sent: Thursday, March 14, 2024 5:37 PM
> To: dev@dpdk.org
> Cc: Ye, MingjinX <mingjinx.ye@intel.com>; stable@dpdk.org
> Subject: [PATCH 3/3] net/vdev: fix insert vdev core dump
>
> Inserting a vdev device when the device arguments are already stored in
> devargs_list, the rte_devargs_insert function replaces the supplied new
> devargs with the found devargs and frees the new devargs. As a result, the use
> of free devargs results in a core dump.
>
> This patch fixes the issue by using valid devargs.
>
> Fixes: f3a1188cee4a ("devargs: make device representation generic")
> Cc: stable@dpdk.org
>
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> drivers/bus/vdev/vdev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Tested-by: Yu Jiang <yux.jiang@intel.com>
Best regards,
Yu Jiang
14/03/2024 10:36, Mingjin Ye:
> Inserting a vdev device when the device arguments are already stored
> in devargs_list, the rte_devargs_insert function replaces the supplied
> new devargs with the found devargs and frees the new devargs. As a
> result, the use of free devargs results in a core dump.
>
> This patch fixes the issue by using valid devargs.
>
> Fixes: f3a1188cee4a ("devargs: make device representation generic")
> Cc: stable@dpdk.org
>
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> ---
> dev->device.bus = &rte_vdev_bus;
> dev->device.numa_node = SOCKET_ID_ANY;
> - dev->device.name = devargs->name;
>
> if (find_vdev(name)) {
> /*
> @@ -300,6 +299,7 @@ insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
>
> rte_devargs_insert(&devargs);
> dev->device.devargs = devargs;
> + dev->device.name = devargs->name;
> TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
How setting the name later can have an impact here?
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Thursday, June 20, 2024 4:16 AM
> To: Ye, MingjinX <mingjinx.ye@intel.com>
> Cc: dev@dpdk.org; stable@dpdk.org; Marchand, David
> <david.marchand@redhat.com>; stephen@networkplumber.org;
> Richardson, Bruce <bruce.richardson@intel.com>
> Subject: Re: [PATCH 3/3] net/vdev: fix insert vdev core dump
>
> 14/03/2024 10:36, Mingjin Ye:
> > Inserting a vdev device when the device arguments are already stored
> > in devargs_list, the rte_devargs_insert function replaces the supplied
> > new devargs with the found devargs and frees the new devargs. As a
> > result, the use of free devargs results in a core dump.
> >
> > This patch fixes the issue by using valid devargs.
> >
> > Fixes: f3a1188cee4a ("devargs: make device representation generic")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
> > ---
> > dev->device.bus = &rte_vdev_bus;
> > dev->device.numa_node = SOCKET_ID_ANY;
> > - dev->device.name = devargs->name;
> >
> > if (find_vdev(name)) {
> > /*
> > @@ -300,6 +299,7 @@ insert_vdev(const char *name, const char *args,
> > struct rte_vdev_device **p_dev)
> >
> > rte_devargs_insert(&devargs);
> > dev->device.devargs = devargs;
> > + dev->device.name = devargs->name;
> > TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>
> How setting the name later can have an impact here?
>
/**
* Insert an rte_devargs in the global list.
*
* @param da
* The devargs structure to insert.
* If a devargs for the same device is already inserted,
* it will be updated and returned. It means *da pointer can change.
*
* @return
* - 0 on success
* - Negative on error.
*/
int
rte_devargs_insert(struct rte_devargs **da);
Info: dpdk/lib/eal/include/rte_devargs.h
On 3/14/2024 10:36 AM, Mingjin Ye wrote:
> Inserting a vdev device when the device arguments are already stored
> in devargs_list, the rte_devargs_insert function replaces the supplied
> new devargs with the found devargs and frees the new devargs. As a
> result, the use of free devargs results in a core dump.
>
> This patch fixes the issue by using valid devargs.
>
> Fixes: f3a1188cee4a ("devargs: make device representation generic")
> Cc: stable@dpdk.org
>
> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
I am not too familiar with how devargs works, so I have a trivial question.
I understand the point of this patch, and it is roughly as follows:
1) we enter insert_vdev and allocated new devargs structure (and copy
the `name` into devargs)
2) we set dev->device.name to devargs->name
3) we insert devargs into the list using rte_devargs_insert
4) if devargs list already had devargs with the same name, our devargs
is destroyed and replaced with devargs that is in the list
5) because of this, dev->device.name becomes invalid as that specific
devargs has been freed - it points to name from the old devargs
We do need to store devargs->name in dev->device.name, and we need to do
so after calling rte_devargs_insert to avoid keeping reference to memory
that was freed. So, provisionally,
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
My question is, under what circumstances would there be duplicate
entries in devargs list? I assume there is an answer to that question
because rte_devargs_insert() expects this case, so this is just for my
understanding - I can't seem to figure out how we would arrive at a
situation where we have duplicate devargs.
> ---
> drivers/bus/vdev/vdev.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c
> index 1a6cc7d12d..53fc4a6e21 100644
> --- a/drivers/bus/vdev/vdev.c
> +++ b/drivers/bus/vdev/vdev.c
> @@ -286,7 +286,6 @@ insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
>
> dev->device.bus = &rte_vdev_bus;
> dev->device.numa_node = SOCKET_ID_ANY;
> - dev->device.name = devargs->name;
>
> if (find_vdev(name)) {
> /*
> @@ -300,6 +299,7 @@ insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
>
> rte_devargs_insert(&devargs);
> dev->device.devargs = devargs;
> + dev->device.name = devargs->name;
> TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
>
> if (p_dev)
> -----Original Message-----
> From: Burakov, Anatoly <anatoly.burakov@intel.com>
> Sent: Friday, July 12, 2024 12:10 AM
> To: Ye, MingjinX <mingjinx.ye@intel.com>; dev@dpdk.org
> Cc: stable@dpdk.org
> Subject: Re: [PATCH 3/3] net/vdev: fix insert vdev core dump
>
> On 3/14/2024 10:36 AM, Mingjin Ye wrote:
> > Inserting a vdev device when the device arguments are already stored
> > in devargs_list, the rte_devargs_insert function replaces the supplied
> > new devargs with the found devargs and frees the new devargs. As a
> > result, the use of free devargs results in a core dump.
> >
> > This patch fixes the issue by using valid devargs.
> >
> > Fixes: f3a1188cee4a ("devargs: make device representation generic")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
>
> I am not too familiar with how devargs works, so I have a trivial question.
>
> I understand the point of this patch, and it is roughly as follows:
>
> 1) we enter insert_vdev and allocated new devargs structure (and copy the
> `name` into devargs)
> 2) we set dev->device.name to devargs->name
> 3) we insert devargs into the list using rte_devargs_insert
> 4) if devargs list already had devargs with the same name, our devargs is
> destroyed and replaced with devargs that is in the list
> 5) because of this, dev->device.name becomes invalid as that specific
> devargs has been freed - it points to name from the old devargs
>
> We do need to store devargs->name in dev->device.name, and we need to
> do so after calling rte_devargs_insert to avoid keeping reference to memory
> that was freed. So, provisionally,
>
> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
>
> My question is, under what circumstances would there be duplicate entries
> in devargs list?
In a multi-process circumstances, the primary and secondary processes both have "vdev" startup
Parameters. And their parameters have the same name. This causes multiple devargs objects with the same name
to be created in the secondary process, the 1st one constructed from the startup parameter and
the 2nd one constructed due to the VDEV_SCAN_ONE message received.
Path 1:
eal_parse_args->eal_parse_common_option: OPT_VDEV_NUM
eal_option_device_parse->rte_devargs_add
Path 2:
vdev_action: vdev_scan_one ->insert_vdev(alloc_devargs)
> I assume there is an answer to that question because
> rte_devargs_insert() expects this case, so this is just for my understanding - I
> can't seem to figure out how we would arrive at a situation where we have
> duplicate devargs.
>
> > ---
> > drivers/bus/vdev/vdev.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/bus/vdev/vdev.c b/drivers/bus/vdev/vdev.c index
> > 1a6cc7d12d..53fc4a6e21 100644
> > --- a/drivers/bus/vdev/vdev.c
> > +++ b/drivers/bus/vdev/vdev.c
> > @@ -286,7 +286,6 @@ insert_vdev(const char *name, const char *args,
> > struct rte_vdev_device **p_dev)
> >
> > dev->device.bus = &rte_vdev_bus;
> > dev->device.numa_node = SOCKET_ID_ANY;
> > - dev->device.name = devargs->name;
> >
> > if (find_vdev(name)) {
> > /*
> > @@ -300,6 +299,7 @@ insert_vdev(const char *name, const char *args,
> > struct rte_vdev_device **p_dev)
> >
> > rte_devargs_insert(&devargs);
> > dev->device.devargs = devargs;
> > + dev->device.name = devargs->name;
> > TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
> >
> > if (p_dev)
>
> --
> Thanks,
> Anatoly
On 7/12/2024 4:18 AM, Ye, MingjinX wrote:
>
>
>> -----Original Message-----
>> From: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Sent: Friday, July 12, 2024 12:10 AM
>> To: Ye, MingjinX <mingjinx.ye@intel.com>; dev@dpdk.org
>> Cc: stable@dpdk.org
>> Subject: Re: [PATCH 3/3] net/vdev: fix insert vdev core dump
>>
>> On 3/14/2024 10:36 AM, Mingjin Ye wrote:
>>> Inserting a vdev device when the device arguments are already stored
>>> in devargs_list, the rte_devargs_insert function replaces the supplied
>>> new devargs with the found devargs and frees the new devargs. As a
>>> result, the use of free devargs results in a core dump.
>>>
>>> This patch fixes the issue by using valid devargs.
>>>
>>> Fixes: f3a1188cee4a ("devargs: make device representation generic")
>>> Cc: stable@dpdk.org
>>>
>>> Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
>>
>> I am not too familiar with how devargs works, so I have a trivial question.
>>
>> I understand the point of this patch, and it is roughly as follows:
>>
>> 1) we enter insert_vdev and allocated new devargs structure (and copy the
>> `name` into devargs)
>> 2) we set dev->device.name to devargs->name
>> 3) we insert devargs into the list using rte_devargs_insert
>> 4) if devargs list already had devargs with the same name, our devargs is
>> destroyed and replaced with devargs that is in the list
>> 5) because of this, dev->device.name becomes invalid as that specific
>> devargs has been freed - it points to name from the old devargs
>>
>> We do need to store devargs->name in dev->device.name, and we need to
>> do so after calling rte_devargs_insert to avoid keeping reference to memory
>> that was freed. So, provisionally,
>>
>> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>
>> My question is, under what circumstances would there be duplicate entries
>> in devargs list?
>
> In a multi-process circumstances, the primary and secondary processes both have "vdev" startup
> Parameters. And their parameters have the same name. This causes multiple devargs objects with the same name
> to be created in the secondary process, the 1st one constructed from the startup parameter and
> the 2nd one constructed due to the VDEV_SCAN_ONE message received.
>
> Path 1:
> eal_parse_args->eal_parse_common_option: OPT_VDEV_NUM
> eal_option_device_parse->rte_devargs_add
>
> Path 2:
> vdev_action: vdev_scan_one ->insert_vdev(alloc_devargs)
So, technically, this is only a problem because we're specifying --vdev
at secondary process startup even though we could've just relied on vdev
multiprocess hotplug to provide us with a device? Or is there more to it?
In any case, I think this patch is OK. I think the Fixes: tag is
pointing to an incorrect commit. I've walked through the commit history,
and it seems that this wasn't a bug initially, but became a bug when
vdev multiprocess hotplug was added, so I think the correct Fixes: tag
should be as follows:
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
So, on account of the above, I suggest rewording the commit message to
something like the following:
In secondary processes, insert_vdev() may be called multiple times on
the same device due to a combination of multiprocess hotplug for vdev
bus, and EAL argument adding the same vdev. In this circumstance, when
rte_devargs_insert() is called, the devargs->name reference becomes
invalid because rte_devargs_insert() will destroy the just-allocated
devargs and will replace the pointer with one from the devargs list. As
a result, the reference to devargs->name stored in dev->device.name
becomes invalid. Fix the issue by not setting device name until after
rte_devargs_insert() was called.
Fixes: cdb068f031c6 ("bus/vdev: scan by multi-process channel")
Cc: stable@dpdk.org
Signed-off-by: Mingjin Ye <mingjinx.ye@intel.com>
Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
@@ -286,7 +286,6 @@ insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
dev->device.bus = &rte_vdev_bus;
dev->device.numa_node = SOCKET_ID_ANY;
- dev->device.name = devargs->name;
if (find_vdev(name)) {
/*
@@ -300,6 +299,7 @@ insert_vdev(const char *name, const char *args, struct rte_vdev_device **p_dev)
rte_devargs_insert(&devargs);
dev->device.devargs = devargs;
+ dev->device.name = devargs->name;
TAILQ_INSERT_TAIL(&vdev_device_list, dev, next);
if (p_dev)