eal: fix device hotplug

Message ID 20211102075259.3392-1-david.marchand@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Headers
Series eal: fix device hotplug |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot: build success github build: passed
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS

Commit Message

David Marchand Nov. 2, 2021, 7:52 a.m. UTC
  The device event interrupt handler was always freed.

Bugzilla ID: 845
Fixes: c2bd9367e18f ("lib: remove direct access to interrupt handle")

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
 lib/eal/linux/eal_dev.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)
  

Comments

Yu Jiang Nov. 2, 2021, 9:35 a.m. UTC | #1
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, November 2, 2021 3:53 PM
> To: dev@dpdk.org
> Cc: Jiang, YuX <yux.jiang@intel.com>; Harman Kalra <hkalra@marvell.com>
> Subject: [PATCH] eal: fix device hotplug
> 
> The device event interrupt handler was always freed.
> 
> Bugzilla ID: 845
> Fixes: c2bd9367e18f ("lib: remove direct access to interrupt handle")
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>  lib/eal/linux/eal_dev.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/eal/linux/eal_dev.c b/lib/eal/linux/eal_dev.c index
> 06820a3666..925cdba553 100644
> --- a/lib/eal/linux/eal_dev.c
> +++ b/lib/eal/linux/eal_dev.c
> @@ -317,10 +317,12 @@ rte_dev_event_monitor_start(void)
>  		goto exit;
>  	}
> 
> -	if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
> +	ret = rte_intr_type_set(intr_handle,
> RTE_INTR_HANDLE_DEV_EVENT);
> +	if (ret)
>  		goto exit;
> 
> -	if (rte_intr_fd_set(intr_handle, -1))
> +	ret = rte_intr_fd_set(intr_handle, -1);
> +	if (ret)
>  		goto exit;
> 
>  	ret = dev_uev_socket_fd_create();
> @@ -339,7 +341,10 @@ rte_dev_event_monitor_start(void)
>  	monitor_refcount++;
> 
>  exit:
> -	rte_intr_instance_free(intr_handle);
> +	if (ret) {
> +		rte_intr_instance_free(intr_handle);
> +		intr_handle = NULL;
> +	}
>  	rte_rwlock_write_unlock(&monitor_lock);
>  	return ret;
>  }
> @@ -370,6 +375,7 @@ rte_dev_event_monitor_stop(void)
> 
>  	close(rte_intr_fd_get(intr_handle));
>  	rte_intr_instance_free(intr_handle);
> +	intr_handle = NULL;
> 
>  	monitor_refcount--;
> 
> --
> 2.23.0
Hi David,

The patch https://patchwork.dpdk.org/project/dpdk/patch/20211102075259.3392-1-david.marchand@redhat.com/   
failed to verify. 

after executing "device_del dev1" in the qemu window, the testpmd window hangs to death and output as following:
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
EAL: Cannot find bus for device (0000:00:05.0)
  
Maxime Coquelin Nov. 4, 2021, 2:11 p.m. UTC | #2
On 11/2/21 08:52, David Marchand wrote:
> The device event interrupt handler was always freed.
> 
> Bugzilla ID: 845
> Fixes: c2bd9367e18f ("lib: remove direct access to interrupt handle")
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
>   lib/eal/linux/eal_dev.c | 12 +++++++++---
>   1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/lib/eal/linux/eal_dev.c b/lib/eal/linux/eal_dev.c
> index 06820a3666..925cdba553 100644
> --- a/lib/eal/linux/eal_dev.c
> +++ b/lib/eal/linux/eal_dev.c
> @@ -317,10 +317,12 @@ rte_dev_event_monitor_start(void)
>   		goto exit;
>   	}
>   
> -	if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
> +	ret = rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT);
> +	if (ret)
>   		goto exit;
>   
> -	if (rte_intr_fd_set(intr_handle, -1))
> +	ret = rte_intr_fd_set(intr_handle, -1);
> +	if (ret)
>   		goto exit;
>   
>   	ret = dev_uev_socket_fd_create();
> @@ -339,7 +341,10 @@ rte_dev_event_monitor_start(void)
>   	monitor_refcount++;
>   
>   exit:
> -	rte_intr_instance_free(intr_handle);
> +	if (ret) {
> +		rte_intr_instance_free(intr_handle);
> +		intr_handle = NULL;
> +	}
>   	rte_rwlock_write_unlock(&monitor_lock);
>   	return ret;
>   }
> @@ -370,6 +375,7 @@ rte_dev_event_monitor_stop(void)
>   
>   	close(rte_intr_fd_get(intr_handle));
>   	rte_intr_instance_free(intr_handle);
> +	intr_handle = NULL;
>   
>   	monitor_refcount--;
>   
> 

Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Thanks,
Maxime
  
David Marchand Nov. 4, 2021, 2:15 p.m. UTC | #3
On Tue, Nov 2, 2021 at 10:35 AM Jiang, YuX <yux.jiang@intel.com> wrote:
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Tuesday, November 2, 2021 3:53 PM
> > To: dev@dpdk.org
> > Cc: Jiang, YuX <yux.jiang@intel.com>; Harman Kalra <hkalra@marvell.com>
> > Subject: [PATCH] eal: fix device hotplug
> >
> > The device event interrupt handler was always freed.
> >
> > Bugzilla ID: 845
> > Fixes: c2bd9367e18f ("lib: remove direct access to interrupt handle")
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > ---
> >  lib/eal/linux/eal_dev.c | 12 +++++++++---
> >  1 file changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/lib/eal/linux/eal_dev.c b/lib/eal/linux/eal_dev.c index
> > 06820a3666..925cdba553 100644
> > --- a/lib/eal/linux/eal_dev.c
> > +++ b/lib/eal/linux/eal_dev.c
> > @@ -317,10 +317,12 @@ rte_dev_event_monitor_start(void)
> >               goto exit;
> >       }
> >
> > -     if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
> > +     ret = rte_intr_type_set(intr_handle,
> > RTE_INTR_HANDLE_DEV_EVENT);
> > +     if (ret)
> >               goto exit;
> >
> > -     if (rte_intr_fd_set(intr_handle, -1))
> > +     ret = rte_intr_fd_set(intr_handle, -1);
> > +     if (ret)
> >               goto exit;
> >
> >       ret = dev_uev_socket_fd_create();
> > @@ -339,7 +341,10 @@ rte_dev_event_monitor_start(void)
> >       monitor_refcount++;
> >
> >  exit:
> > -     rte_intr_instance_free(intr_handle);
> > +     if (ret) {
> > +             rte_intr_instance_free(intr_handle);
> > +             intr_handle = NULL;
> > +     }
> >       rte_rwlock_write_unlock(&monitor_lock);
> >       return ret;
> >  }
> > @@ -370,6 +375,7 @@ rte_dev_event_monitor_stop(void)
> >
> >       close(rte_intr_fd_get(intr_handle));
> >       rte_intr_instance_free(intr_handle);
> > +     intr_handle = NULL;
> >
> >       monitor_refcount--;
> >
> > --
> > 2.23.0
> Hi David,
>
> The patch https://patchwork.dpdk.org/project/dpdk/patch/20211102075259.3392-1-david.marchand@redhat.com/
> failed to verify.
>
> after executing "device_del dev1" in the qemu window, the testpmd window hangs to death and output as following:
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)
> EAL: Cannot find bus for device (0000:00:05.0)

Replying on the ml, since the discussion mainly happened in bugzilla.
This first patch was not enough, another issue was caught in the pci
bus hotplug code.
  
David Marchand Nov. 4, 2021, 2:22 p.m. UTC | #4
On Thu, Nov 4, 2021 at 3:11 PM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
> On 11/2/21 08:52, David Marchand wrote:
> > The device event interrupt handler was always freed.
> >
> > Bugzilla ID: 845
> > Fixes: c2bd9367e18f ("lib: remove direct access to interrupt handle")
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
Tested-by: Yan Xia <yanx.xia@intel.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>

Applied, thanks.
  

Patch

diff --git a/lib/eal/linux/eal_dev.c b/lib/eal/linux/eal_dev.c
index 06820a3666..925cdba553 100644
--- a/lib/eal/linux/eal_dev.c
+++ b/lib/eal/linux/eal_dev.c
@@ -317,10 +317,12 @@  rte_dev_event_monitor_start(void)
 		goto exit;
 	}
 
-	if (rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT))
+	ret = rte_intr_type_set(intr_handle, RTE_INTR_HANDLE_DEV_EVENT);
+	if (ret)
 		goto exit;
 
-	if (rte_intr_fd_set(intr_handle, -1))
+	ret = rte_intr_fd_set(intr_handle, -1);
+	if (ret)
 		goto exit;
 
 	ret = dev_uev_socket_fd_create();
@@ -339,7 +341,10 @@  rte_dev_event_monitor_start(void)
 	monitor_refcount++;
 
 exit:
-	rte_intr_instance_free(intr_handle);
+	if (ret) {
+		rte_intr_instance_free(intr_handle);
+		intr_handle = NULL;
+	}
 	rte_rwlock_write_unlock(&monitor_lock);
 	return ret;
 }
@@ -370,6 +375,7 @@  rte_dev_event_monitor_stop(void)
 
 	close(rte_intr_fd_get(intr_handle));
 	rte_intr_instance_free(intr_handle);
+	intr_handle = NULL;
 
 	monitor_refcount--;