[v4,2/4] dts: improve starting and stopping interactive shells

Message ID: 20240613181510.30135-3-jspewock@iol.unh.edu
State: Superseded
Delegated to: Thomas Monjalon
Series: Add second scatter test case

Checks

Context         Check     Description
ci/checkpatch   success   coding style OK

Commit Message

Jeremy Spewock June 13, 2024, 6:15 p.m. UTC
  From: Jeremy Spewock <jspewock@iol.unh.edu>

The InteractiveShell class currently relies on being cleaned up and
shut down at garbage collection time, but that cleanup does not verify
that the session is still running. So, if a user calls close()
themselves before garbage collection, the cleanup runs twice and the
second run throws an exception, when the desired behavior is to do
nothing since the session is already cleaned up. This is solved by using
weakref.finalize, which still calls the cleanup method at garbage
collection but also ensures that it is called only once.
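
As a minimal sketch of the call-once behavior described above (not the DTS
code; `FakeSession` and `Shell` are hypothetical names used only for
illustration), the following shows a `weakref.finalize` object running its
callback once, whether it is triggered manually or by garbage collection.
The callback here is a method of the wrapped session rather than of the
shell itself, because the Python documentation warns that a callback holding
a reference to the tracked object keeps that object from being collected.

```python
import gc
import weakref


class FakeSession:
    """Hypothetical stand-in for the underlying session; counts close calls."""

    def __init__(self) -> None:
        self.close_calls = 0

    def close(self) -> None:
        self.close_calls += 1


class Shell:
    """Toy shell whose cleanup is routed through a weakref.finalize object."""

    def __init__(self, session: FakeSession) -> None:
        # The callback refers to the session, not to self, so the shell
        # itself can still be garbage collected.
        self._finalizer = weakref.finalize(self, session.close)

    def close(self) -> None:
        # Calling a finalizer that has already run is a no-op.
        self._finalizer()


# Manual path: closing twice only cleans up once.
manual_session = FakeSession()
manual_shell = Shell(manual_session)
manual_shell.close()
manual_shell.close()
finalizer_alive_after_close = manual_shell._finalizer.alive
del manual_shell

# Garbage collection path: nobody calls close(), the finalizer does.
gc_session = FakeSession()
gc_shell = Shell(gc_session)
del gc_shell
gc.collect()  # CPython frees the shell on `del`; this covers other runtimes
```

Both sessions end up closed a single time, and the manual path leaves the
finalizer marked as no longer alive, so the later garbage collection of that
shell does nothing.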

Additionally, this fixes issues with starting a primary DPDK
application while another is still cleaning up, by retrying when
starting interactive shells. It also adds a check for attempts to send
a command to an interactive shell that is not running, in order to
produce a more descriptive error message.
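
The retry and the not-running check can be sketched roughly as follows.
This is an illustration under stated assumptions, not the patch itself:
`RetryingShell`, `ShellNotRunningError`, `_launch` and
`failures_before_success` are hypothetical names, and the start is simulated
rather than sent over SSH. Unlike the patch, which starts the application
through send_command() and therefore sets is_alive before the loop, this
sketch sets the flag only after a successful start.

```python
class ShellNotRunningError(Exception):
    """Hypothetical error type used only in this sketch."""


class RetryingShell:
    """Toy shell that retries its start a bounded number of times."""

    init_attempts = 5  # mirrors the default of 5 attempts in the patch

    def __init__(self, failures_before_success: int) -> None:
        self._failures_left = failures_before_success
        self.attempts_made = 0
        self.is_alive = False

    def _launch(self) -> None:
        """Simulate a start that times out while a previous app cleans up."""
        self.attempts_made += 1
        if self._failures_left > 0:
            self._failures_left -= 1
            raise TimeoutError("previous application is still cleaning up")

    def start(self) -> None:
        for _ in range(self.init_attempts):
            try:
                self._launch()
                break
            except TimeoutError:
                continue
        else:
            # Reached only when no attempt hit `break`.
            raise ShellNotRunningError("Failed to start application.")
        self.is_alive = True

    def send_command(self, command: str) -> str:
        if not self.is_alive:
            raise ShellNotRunningError(
                f"Cannot send command {command!r} because the shell is not running."
            )
        return f"ran {command}"


# Succeeds on the third attempt after two simulated timeouts.
shell = RetryingShell(failures_before_success=2)
shell.start()
reply = shell.send_command("show port info all")

# Never starts: every attempt times out, so commands are rejected clearly.
dead_shell = RetryingShell(failures_before_success=5)
try:
    dead_shell.start()
except ShellNotRunningError as error:
    start_error = str(error)
try:
    dead_shell.send_command("quit")
except ShellNotRunningError as error:
    send_error = str(error)
```

The `for`/`else` form keeps the failure handling in one place: the `else`
branch runs only if every attempt timed out.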

Signed-off-by: Jeremy Spewock <jspewock@iol.unh.edu>
---
 .../remote_session/interactive_shell.py       | 35 ++++++++++++++-----
 .../single_active_interactive_shell.py        | 34 ++++++++++++++++--
 dts/framework/remote_session/testpmd_shell.py |  2 +-
 3 files changed, 60 insertions(+), 11 deletions(-)
  

Comments

Juraj Linkeš June 18, 2024, 3:54 p.m. UTC | #1
> @@ -15,18 +18,34 @@ class InteractiveShell(SingleActiveInteractiveShell):

> +    def _start_application(self, get_privileged_command: Callable[[str], str] | None) -> None:
> +        """Overrides :meth:`_start_application` in the parent class.
> +
> +        Add a weakref finalize class after starting the application.
> +
> +        Args:
> +            get_privileged_command: A function (but could be any callable) that produces
> +                the version of the command with elevated privileges.
> +        """
> +        super()._start_application(get_privileged_command)
> +        self._finalizer = weakref.finalize(self, self._close)

I think we can just add the above line to start_application() to achieve 
the same thing. And we should move the docstring to the public method.

> +
>       def start_application(self) -> None:
>           """Start the application."""
>           self._start_application(self._get_privileged_command)
>
  
Jeremy Spewock June 18, 2024, 4:47 p.m. UTC | #2
On Tue, Jun 18, 2024 at 11:54 AM Juraj Linkeš
<juraj.linkes@pantheon.tech> wrote:
>
> > @@ -15,18 +18,34 @@ class InteractiveShell(SingleActiveInteractiveShell):
>
> > +    def _start_application(self, get_privileged_command: Callable[[str], str] | None) -> None:
> > +        """Overrides :meth:`_start_application` in the parent class.
> > +
> > +        Add a weakref finalize class after starting the application.
> > +
> > +        Args:
> > +            get_privileged_command: A function (but could be any callable) that produces
> > +                the version of the command with elevated privileges.
> > +        """
> > +        super()._start_application(get_privileged_command)
> > +        self._finalizer = weakref.finalize(self, self._close)
>
> I think we can just add the above line to start_application() to achieve
> the same thing. And we should move the docstring to the public method.

Sure, makes sense to me. We only need the finalizer when we start
manually anyway; there's no need to set it up when the shell is used as
a context manager. Actually, I wonder if this would throw an exception
at garbage collection time if you used an InteractiveShell as a
context manager. I think it might, because the context manager doesn't
trigger the finalizer, so it would probably try to clean up twice.

Good catch!
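
To illustrate the concern, here is a toy sketch, not the DTS classes. It
assumes that the context manager exit calls the cleanup directly instead of
going through the finalizer; `Resource`, `ManagedShell` and
`register_in_shared_start` are hypothetical names. Registering the finalizer
in the shared start path leads to a second cleanup at garbage collection,
while registering it only in the manual start path does not.

```python
import gc
import weakref


class Resource:
    """Hypothetical resource that counts how often it is released."""

    def __init__(self) -> None:
        self.cleanups = 0

    def release(self) -> None:
        self.cleanups += 1


class ManagedShell:
    """Toy shell with a manual start path and a context manager path."""

    def __init__(self, resource: Resource, register_in_shared_start: bool) -> None:
        self._resource = resource
        self._register_in_shared_start = register_in_shared_start

    def _start(self) -> None:
        # Shared start path, used by both manual start and `with`.
        if self._register_in_shared_start:
            self._finalizer = weakref.finalize(self, self._resource.release)

    def start_application(self) -> None:
        # Manual start path: the only place that needs a finalizer.
        self._start()
        if not self._register_in_shared_start:
            self._finalizer = weakref.finalize(self, self._resource.release)

    def __enter__(self) -> "ManagedShell":
        self._start()
        return self

    def __exit__(self, *exc_info: object) -> None:
        # Assumption: cleanup is called directly, bypassing any finalizer.
        self._resource.release()


def cleanups_after_with_block(register_in_shared_start: bool) -> int:
    """Use the shell as a context manager, drop it, and count cleanups."""
    resource = Resource()
    shell = ManagedShell(resource, register_in_shared_start)
    with shell:
        pass
    del shell
    gc.collect()  # CPython frees the shell on `del`; this covers other runtimes
    return resource.cleanups


shared_start_cleanups = cleanups_after_with_block(register_in_shared_start=True)
manual_only_cleanups = cleanups_after_with_block(register_in_shared_start=False)
```

In this toy the extra cleanup only increments a counter; in a real shell the
second attempt would act on a session that has already been closed.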

>
> > +
> >       def start_application(self) -> None:
> >           """Start the application."""
> >           self._start_application(self._get_privileged_command)
> >
  

Patch

diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
index 9d124b8245..5b6f5c2a41 100644
--- a/dts/framework/remote_session/interactive_shell.py
+++ b/dts/framework/remote_session/interactive_shell.py
@@ -8,6 +8,9 @@ 
 collection.
 """
 
+import weakref
+from typing import Callable, ClassVar
+
 from .single_active_interactive_shell import SingleActiveInteractiveShell
 
 
@@ -15,18 +18,34 @@  class InteractiveShell(SingleActiveInteractiveShell):
     """Adds manual start and stop functionality to interactive shells.
 
     Like its super-class, this class should not be instantiated directly and should instead be
-    extended. This class also provides an option for automated cleanup of the application through
-    the garbage collector.
+    extended. This class also provides an option for automated cleanup of the application using a
+    weakref and a finalize class. This finalize class allows for cleanup of the class at the time
+    of garbage collection and also ensures that cleanup only happens once. This way if a user
+    initiates the closing of the shell manually it is not repeated at the time of garbage
+    collection.
     """
 
+    _finalizer: weakref.finalize
+    #: Shells that do not require only one instance to be running shouldn't need more than 1
+    #: attempt to start.
+    _init_attempts: ClassVar[int] = 1
+
+    def _start_application(self, get_privileged_command: Callable[[str], str] | None) -> None:
+        """Overrides :meth:`_start_application` in the parent class.
+
+        Add a weakref finalize class after starting the application.
+
+        Args:
+            get_privileged_command: A function (but could be any callable) that produces
+                the version of the command with elevated privileges.
+        """
+        super()._start_application(get_privileged_command)
+        self._finalizer = weakref.finalize(self, self._close)
+
     def start_application(self) -> None:
         """Start the application."""
         self._start_application(self._get_privileged_command)
 
     def close(self) -> None:
-        """Properly free all resources."""
-        self._close()
-
-    def __del__(self) -> None:
-        """Make sure the session is properly closed before deleting the object."""
-        self.close()
+        """Free all resources using finalize class."""
+        self._finalizer()
diff --git a/dts/framework/remote_session/single_active_interactive_shell.py b/dts/framework/remote_session/single_active_interactive_shell.py
index 74060be8a7..282ceec483 100644
--- a/dts/framework/remote_session/single_active_interactive_shell.py
+++ b/dts/framework/remote_session/single_active_interactive_shell.py
@@ -44,6 +44,10 @@  class SingleActiveInteractiveShell(ABC):
     Interactive shells are started and stopped using a context manager. This allows for the start
     and cleanup of the application to happen at predictable times regardless of exceptions or
     interrupts.
+
+    Attributes:
+        is_alive: :data:`True` if the application has started successfully, :data:`False`
+            otherwise.
     """
 
     _interactive_session: SSHClient
@@ -55,6 +59,9 @@  class SingleActiveInteractiveShell(ABC):
     _app_args: str
     _get_privileged_command: Callable[[str], str] | None
 
+    #: The number of times to try starting the application before considering it a failure.
+    _init_attempts: ClassVar[int] = 5
+
     #: Prompt to expect at the end of output when sending a command.
     #: This is often overridden by subclasses.
     _default_prompt: ClassVar[str] = ""
@@ -71,6 +78,8 @@  class SingleActiveInteractiveShell(ABC):
     #: for DPDK on the node will be prepended to the path to the executable.
     dpdk_app: ClassVar[bool] = False
 
+    is_alive: bool = False
+
     def __init__(
         self,
         interactive_session: SSHClient,
@@ -110,17 +119,34 @@  def _start_application(self, get_privileged_command: Callable[[str], str] | None
 
         This method is often overridden by subclasses as their process for starting may look
         different. A new SSH channel is initialized for the application to run on, then the
-        application is started.
+        application is started. Initialization of the shell on the host can be retried up to
+        `self._init_attempts` - 1 times. This is done because some DPDK applications need slightly
+        more time after exiting their script to clean up EAL before others can start.
 
         Args:
             get_privileged_command: A function (but could be any callable) that produces
                 the version of the command with elevated privileges.
         """
         self._init_channel()
+        self._ssh_channel.settimeout(5)
         start_command = f"{self.path} {self._app_args}"
         if get_privileged_command is not None:
             start_command = get_privileged_command(start_command)
-        self.send_command(start_command)
+        self.is_alive = True
+        for attempt in range(self._init_attempts):
+            try:
+                self.send_command(start_command)
+                break
+            except TimeoutError:
+                self._logger.info(
+                    f"Interactive shell failed to start (attempt {attempt+1} out of "
+                    f"{self._init_attempts})"
+                )
+        else:
+            self._ssh_channel.settimeout(self._timeout)
+            self.is_alive = False  # update state on failure to start
+            raise InteractiveCommandExecutionError("Failed to start application.")
+        self._ssh_channel.settimeout(self._timeout)
 
     def send_command(self, command: str, prompt: str | None = None) -> str:
         """Send `command` and get all output before the expected ending string.
@@ -142,6 +168,10 @@  def send_command(self, command: str, prompt: str | None = None) -> str:
         Returns:
             All output in the buffer before expected string.
         """
+        if not self.is_alive:
+            raise InteractiveCommandExecutionError(
+                f"Cannot send command {command} to application because the shell is not running."
+            )
         self._logger.info(f"Sending: '{command}'")
         if prompt is None:
             prompt = self._default_prompt
diff --git a/dts/framework/remote_session/testpmd_shell.py b/dts/framework/remote_session/testpmd_shell.py
index 17561d4dae..805bb3a77d 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -230,7 +230,7 @@  def set_forward_mode(self, mode: TestPmdForwardingModes, verify: bool = True):
     def _close(self) -> None:
         """Overrides :meth:`~.interactive_shell.close`."""
         self.stop()
-        self.send_command("quit", "")
+        self.send_command("quit", "Bye...")
         return super()._close()
 
     def get_capas_rxq(