eal: remove panic on remote launch failure

Message ID 20220929105101.406724-1-bruce.richardson@intel.com (mailing list archive)
State Accepted, archived
Delegated to: David Marchand
Headers
Series eal: remove panic on remote launch failure |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK
ci/iol-mellanox-Performance success Performance Testing PASS
ci/github-robot: build success github build: passed
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/intel-Testing success Testing PASS

Commit Message

Bruce Richardson Sept. 29, 2022, 10:51 a.m. UTC
  Library functions should not cause the app to exit or panic. Replace the
existing panic call in the EAL remote launch functions with an error
code return instead.

Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 3 ---
 lib/eal/common/eal_common_launch.c   | 3 +--
 lib/eal/common/eal_thread.h          | 4 +++-
 lib/eal/include/rte_launch.h         | 1 +
 lib/eal/unix/eal_unix_thread.c       | 7 ++++---
 lib/eal/windows/eal_thread.c         | 7 ++++---
 6 files changed, 13 insertions(+), 12 deletions(-)
  

Comments

David Marchand Sept. 30, 2022, 7:17 a.m. UTC | #1
On Thu, Sep 29, 2022 at 12:51 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> Library functions should not cause the app to exit or panic. Replace the
> existing panic call in the EAL remote launch functions with an error
> code return instead.
>
> Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>

LGTM with an update of the release notes (that I will add when merging
if there is no futher comment/objection).

Reviewed-by: David Marchand <david.marchand@redhat.com>
  
David Marchand Oct. 3, 2022, 7:16 a.m. UTC | #2
On Fri, Sep 30, 2022 at 9:17 AM David Marchand
<david.marchand@redhat.com> wrote:
>
> On Thu, Sep 29, 2022 at 12:51 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > Library functions should not cause the app to exit or panic. Replace the
> > existing panic call in the EAL remote launch functions with an error
> > code return instead.
> >
> > Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> Reviewed-by: David Marchand <david.marchand@redhat.com>

There are probably other places to cleanup wrt rte_panic in EAL, but
this first cleanup is good and the rest can come later.

Added a note in the RN and applied.
Thanks Bruce.
  

Patch

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 688b8c2606..71c0d0137a 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -18,9 +18,6 @@  Deprecation Notices
   in the future. Applications can use ``devtools/cocci/func_or_ret.cocci``
   to update their code.
 
-* eal: The function ``rte_eal_remote_launch`` will return new error codes
-  after read or write error on the pipe, instead of calling ``rte_panic``.
-
 * rte_atomicNN_xxx: These APIs do not take memory order parameter. This does
   not allow for writing optimized code for all the CPU architectures supported
   in DPDK. DPDK has adopted the atomic operations from
diff --git a/lib/eal/common/eal_common_launch.c b/lib/eal/common/eal_common_launch.c
index 462c886e30..487e58c124 100644
--- a/lib/eal/common/eal_common_launch.c
+++ b/lib/eal/common/eal_common_launch.c
@@ -49,8 +49,7 @@  rte_eal_remote_launch(lcore_function_t *f, void *arg, unsigned int worker_id)
 	 */
 	__atomic_store_n(&lcore_config[worker_id].f, f, __ATOMIC_RELEASE);
 
-	eal_thread_wake_worker(worker_id);
-	rc = 0;
+	rc = eal_thread_wake_worker(worker_id);
 
 finish:
 	rte_eal_trace_thread_remote_launch(f, arg, worker_id, rc);
diff --git a/lib/eal/common/eal_thread.h b/lib/eal/common/eal_thread.h
index e0240ccc60..32bfe5c977 100644
--- a/lib/eal/common/eal_thread.h
+++ b/lib/eal/common/eal_thread.h
@@ -67,8 +67,10 @@  eal_thread_dump_current_affinity(char *str, unsigned int size);
  *
  * @param worker_id
  *   The lcore_id of a worker thread.
+ * @return
+ *   0 on success, negative errno on error
  */
-void
+int
 eal_thread_wake_worker(unsigned int worker_id);
 
 /**
diff --git a/lib/eal/include/rte_launch.h b/lib/eal/include/rte_launch.h
index 295584668e..f7bf9f6b2d 100644
--- a/lib/eal/include/rte_launch.h
+++ b/lib/eal/include/rte_launch.h
@@ -62,6 +62,7 @@  typedef int (lcore_function_t)(void *);
  * @return
  *   - 0: Success. Execution of function f started on the remote lcore.
  *   - (-EBUSY): The remote lcore is not in a WAIT state.
+ *   - (-EPIPE): Error reading or writing pipe to worker thread
  */
 int rte_eal_remote_launch(lcore_function_t *f, void *arg, unsigned worker_id);
 
diff --git a/lib/eal/unix/eal_unix_thread.c b/lib/eal/unix/eal_unix_thread.c
index 70b5ba6b98..ef6cbff0ee 100644
--- a/lib/eal/unix/eal_unix_thread.c
+++ b/lib/eal/unix/eal_unix_thread.c
@@ -9,7 +9,7 @@ 
 
 #include "eal_private.h"
 
-void
+int
 eal_thread_wake_worker(unsigned int worker_id)
 {
 	int m2w = lcore_config[worker_id].pipe_main2worker[1];
@@ -21,13 +21,14 @@  eal_thread_wake_worker(unsigned int worker_id)
 		n = write(m2w, &c, 1);
 	} while (n == 0 || (n < 0 && errno == EINTR));
 	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
+		return -EPIPE;
 
 	do {
 		n = read(w2m, &c, 1);
 	} while (n < 0 && errno == EINTR);
 	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
+		return -EPIPE;
+	return 0;
 }
 
 void
diff --git a/lib/eal/windows/eal_thread.c b/lib/eal/windows/eal_thread.c
index f3c61b4456..caffe68d19 100644
--- a/lib/eal/windows/eal_thread.c
+++ b/lib/eal/windows/eal_thread.c
@@ -16,7 +16,7 @@ 
 #include "eal_thread.h"
 #include "eal_windows.h"
 
-void
+int
 eal_thread_wake_worker(unsigned int worker_id)
 {
 	int m2w = lcore_config[worker_id].pipe_main2worker[1];
@@ -28,13 +28,14 @@  eal_thread_wake_worker(unsigned int worker_id)
 		n = _write(m2w, &c, 1);
 	} while (n == 0 || (n < 0 && errno == EINTR));
 	if (n < 0)
-		rte_panic("cannot write on configuration pipe\n");
+		return -EPIPE;
 
 	do {
 		n = _read(w2m, &c, 1);
 	} while (n < 0 && errno == EINTR);
 	if (n <= 0)
-		rte_panic("cannot read on configuration pipe\n");
+		return -EPIPE;
+	return 0;
 }
 
 void