[8/8] ipc: fix net/mlx5 memleak

Message ID 20190417144436.24216-1-herakliusz.lipiec@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series [1/8] ipc: fix rte_mp_request_sync memleak |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Herakliusz Lipiec April 17, 2019, 2:44 p.m. UTC
  When sending multiple requests, rte_mp_request_sync
can succeed sending a few of those requests, but then
fail on a later one and in the end return with rc=-1.
The upper layers - e.g. device hotplug - currently
handles this case as if no messages were sent and no
memory for response buffers was allocated, which is
not true. Fixed by always freeing reply message buffers.

Fixes: 9a8ab29b84d3 ("net/mlx5: replace IPC socket with EAL API")
Fixes: c18cf501a7af ("net/mlx5: enable secondary process to register DMA memory")
Cc: yskoh@mellanox.com
Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
---
 drivers/net/mlx5/mlx5_mp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Yongseok Koh April 22, 2019, 5:51 p.m. UTC | #1
> On Apr 17, 2019, at 7:44 AM, Herakliusz Lipiec <herakliusz.lipiec@intel.com> wrote:
> 
> When sending multiple requests, rte_mp_request_sync
> can succeed sending a few of those requests, but then
> fail on a later one and in the end return with rc=-1.
> The upper layers - e.g. device hotplug - currently
> handles this case as if no messages were sent and no
> memory for response buffers was allocated, which is
> not true. Fixed by always freeing reply message buffers.
> 
> Fixes: 9a8ab29b84d3 ("net/mlx5: replace IPC socket with EAL API")
> Fixes: c18cf501a7af ("net/mlx5: enable secondary process to register DMA memory")
> Cc: yskoh@mellanox.com
> Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
> ---
> drivers/net/mlx5/mlx5_mp.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
> index cea74adb6..c9915b1d5 100644
> --- a/drivers/net/mlx5/mlx5_mp.c
> +++ b/drivers/net/mlx5/mlx5_mp.c
> @@ -258,6 +258,7 @@ mlx5_mp_req_mr_create(struct rte_eth_dev *dev, uintptr_t addr)
> 	if (ret) {
> 		DRV_LOG(ERR, "port %u request to primary process failed",
> 			dev->data->port_id);
> +		free(mp_rep.msgs);
> 		return -rte_errno;
> 	}
> 	assert(mp_rep.nb_received == 1);
> @@ -295,7 +296,8 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
> 	if (ret) {
> 		DRV_LOG(ERR, "port %u request to primary process failed",
> 			dev->data->port_id);
> -		return -rte_errno;
> +		ret = -rte_errno;
> +		goto exit;

These two requests will be made by a secondary process targeting to the primary.
Then, there's only one request in this case and we don't need to take care of that.
Right?

Same comment for mlx4.

Thanks,
Yongseok

> 	}
> 	assert(mp_rep.nb_received == 1);
> 	mp_res = &mp_rep.msgs[0];
> -- 
> 2.17.2
>
  
Burakov, Anatoly April 23, 2019, 8:09 a.m. UTC | #2
On 22-Apr-19 6:51 PM, Yongseok Koh wrote:
> 
>> On Apr 17, 2019, at 7:44 AM, Herakliusz Lipiec <herakliusz.lipiec@intel.com> wrote:
>>
>> When sending multiple requests, rte_mp_request_sync
>> can succeed sending a few of those requests, but then
>> fail on a later one and in the end return with rc=-1.
>> The upper layers - e.g. device hotplug - currently
>> handles this case as if no messages were sent and no
>> memory for response buffers was allocated, which is
>> not true. Fixed by always freeing reply message buffers.
>>
>> Fixes: 9a8ab29b84d3 ("net/mlx5: replace IPC socket with EAL API")
>> Fixes: c18cf501a7af ("net/mlx5: enable secondary process to register DMA memory")
>> Cc: yskoh@mellanox.com
>> Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
>> ---
>> drivers/net/mlx5/mlx5_mp.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
>> index cea74adb6..c9915b1d5 100644
>> --- a/drivers/net/mlx5/mlx5_mp.c
>> +++ b/drivers/net/mlx5/mlx5_mp.c
>> @@ -258,6 +258,7 @@ mlx5_mp_req_mr_create(struct rte_eth_dev *dev, uintptr_t addr)
>> 	if (ret) {
>> 		DRV_LOG(ERR, "port %u request to primary process failed",
>> 			dev->data->port_id);
>> +		free(mp_rep.msgs);
>> 		return -rte_errno;
>> 	}
>> 	assert(mp_rep.nb_received == 1);
>> @@ -295,7 +296,8 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
>> 	if (ret) {
>> 		DRV_LOG(ERR, "port %u request to primary process failed",
>> 			dev->data->port_id);
>> -		return -rte_errno;
>> +		ret = -rte_errno;
>> +		goto exit;
> 
> These two requests will be made by a secondary process targeting to the primary.
> Then, there's only one request in this case and we don't need to take care of that.
> Right?
> 
> Same comment for mlx4.

Hi Yongseok,

mp_rep.msgs is potentially allocated regardless of whether you're in 
primary or secondary, and whether the call to mp_request_sync succeeded 
or failed. Hence, need to free in all cases.

See this patch: http://patches.dpdk.org/patch/52868/
and this bug: https://bugs.dpdk.org/show_bug.cgi?id=228

> 
> Thanks,
> Yongseok
> 
>> 	}
>> 	assert(mp_rep.nb_received == 1);
>> 	mp_res = &mp_rep.msgs[0];
>> -- 
>> 2.17.2
>>
> 
>
  
Yongseok Koh April 23, 2019, 8:13 p.m. UTC | #3
> On Apr 23, 2019, at 1:09 AM, Burakov, Anatoly <anatoly.burakov@intel.com> wrote:
> 
> On 22-Apr-19 6:51 PM, Yongseok Koh wrote:
>>> On Apr 17, 2019, at 7:44 AM, Herakliusz Lipiec <herakliusz.lipiec@intel.com> wrote:
>>> 
>>> When sending multiple requests, rte_mp_request_sync
>>> can succeed sending a few of those requests, but then
>>> fail on a later one and in the end return with rc=-1.
>>> The upper layers - e.g. device hotplug - currently
>>> handles this case as if no messages were sent and no
>>> memory for response buffers was allocated, which is
>>> not true. Fixed by always freeing reply message buffers.
>>> 
>>> Fixes: 9a8ab29b84d3 ("net/mlx5: replace IPC socket with EAL API")
>>> Fixes: c18cf501a7af ("net/mlx5: enable secondary process to register DMA memory")
>>> Cc: yskoh@mellanox.com
>>> Signed-off-by: Herakliusz Lipiec <herakliusz.lipiec@intel.com>
>>> ---
>>> drivers/net/mlx5/mlx5_mp.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
>>> index cea74adb6..c9915b1d5 100644
>>> --- a/drivers/net/mlx5/mlx5_mp.c
>>> +++ b/drivers/net/mlx5/mlx5_mp.c
>>> @@ -258,6 +258,7 @@ mlx5_mp_req_mr_create(struct rte_eth_dev *dev, uintptr_t addr)
>>> 	if (ret) {
>>> 		DRV_LOG(ERR, "port %u request to primary process failed",
>>> 			dev->data->port_id);
>>> +		free(mp_rep.msgs);
>>> 		return -rte_errno;
>>> 	}
>>> 	assert(mp_rep.nb_received == 1);
>>> @@ -295,7 +296,8 @@ mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
>>> 	if (ret) {
>>> 		DRV_LOG(ERR, "port %u request to primary process failed",
>>> 			dev->data->port_id);
>>> -		return -rte_errno;
>>> +		ret = -rte_errno;
>>> +		goto exit;
>> These two requests will be made by a secondary process targeting to the primary.
>> Then, there's only one request in this case and we don't need to take care of that.
>> Right?
>> Same comment for mlx4.
> 
> Hi Yongseok,
> 
> mp_rep.msgs is potentially allocated regardless of whether you're in primary or secondary, and whether the call to mp_request_sync succeeded or failed. Hence, need to free in all cases.

Then, it looks fine to me.

> 
> See this patch: https://eur03.safelinks.protection.outlook.com/?url=http%3A%2F%2Fpatches.dpdk.org%2Fpatch%2F52868%2F&amp;data=02%7C01%7Cyskoh%40mellanox.com%7C007b61ef9d964dc79e7108d6c7c2f9d8%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636916037564345993&amp;sdata=O%2BoG%2F8P8cXwKS%2FDfZyMiG3CiIDpeXe3dPMJgVilzFWY%3D&amp;reserved=0
> and this bug: https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbugs.dpdk.org%2Fshow_bug.cgi%3Fid%3D228&amp;data=02%7C01%7Cyskoh%40mellanox.com%7C007b61ef9d964dc79e7108d6c7c2f9d8%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636916037564345993&amp;sdata=jLA5wLqT%2BfW3p79rg2SVEZ16GS37dgqdF4NwmiRU%2B7A%3D&amp;reserved=0
> 
>> Thanks,
>> Yongseok
>>> 	}
>>> 	assert(mp_rep.nb_received == 1);
>>> 	mp_res = &mp_rep.msgs[0];
>>> -- 
>>> 2.17.2
>>> 
> 
> 
> -- 
> Thanks,
> Anatoly
  

Patch

diff --git a/drivers/net/mlx5/mlx5_mp.c b/drivers/net/mlx5/mlx5_mp.c
index cea74adb6..c9915b1d5 100644
--- a/drivers/net/mlx5/mlx5_mp.c
+++ b/drivers/net/mlx5/mlx5_mp.c
@@ -258,6 +258,7 @@  mlx5_mp_req_mr_create(struct rte_eth_dev *dev, uintptr_t addr)
 	if (ret) {
 		DRV_LOG(ERR, "port %u request to primary process failed",
 			dev->data->port_id);
+		free(mp_rep.msgs);
 		return -rte_errno;
 	}
 	assert(mp_rep.nb_received == 1);
@@ -295,7 +296,8 @@  mlx5_mp_req_verbs_cmd_fd(struct rte_eth_dev *dev)
 	if (ret) {
 		DRV_LOG(ERR, "port %u request to primary process failed",
 			dev->data->port_id);
-		return -rte_errno;
+		ret = -rte_errno;
+		goto exit;
 	}
 	assert(mp_rep.nb_received == 1);
 	mp_res = &mp_rep.msgs[0];