[v1,2/2] dts: improve starting and stopping interactive shells

Message ID 20240709163145.110030-3-jspewock@iol.unh.edu (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series dts: add context manager for interactive shells |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/intel-Testing success Testing PASS
ci/github-robot: build success github build: passed
ci/intel-Functional success Functional PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-sample-apps-testing success Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-compile-amd64-testing success Testing PASS
ci/iol-compile-arm64-testing success Testing PASS
ci/iol-unit-amd64-testing success Testing PASS
ci/iol-unit-arm64-testing pending Testing pending
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS

Commit Message

Jeremy Spewock July 9, 2024, 4:31 p.m. UTC
From: Jeremy Spewock <jspewock@iol.unh.edu>

The InteractiveShell class currently relies on being cleaned up and
shutdown at the time of garbage collection, but this cleanup of the class
does no verification that the session is still running prior to cleanup.
So, if a user were to call this method themselves prior to garbage
collection, it would be called twice and throw an exception when the
desired behavior is to do nothing since the session is already cleaned
up. This is solved by using a weakref and a finalize class which
achieves the same result of calling the method at garbage collection,
but also ensures that it is called exactly once.

Additionally, this fixes issues regarding starting a primary DPDK
application while another is still cleaning up via a retry when starting
interactive shells. It also adds catch for attempting to send a command
to an interactive shell that is not running to create a more descriptive
error message.

Signed-off-by: Jeremy Spewock <jspewock@iol.unh.edu>
Reviewed-by: Luca Vizzarro <luca.vizzarro@arm.com>
Reviewed-by: Patrick Robb <probb@iol.unh.edu>
---
 .../remote_session/interactive_shell.py       | 29 ++++++++----
 .../single_active_interactive_shell.py        | 46 +++++++++++++++++--
 dts/framework/remote_session/testpmd_shell.py |  2 +-
 3 files changed, 64 insertions(+), 13 deletions(-)
  

Comments

Dean Marx July 10, 2024, 1:12 p.m. UTC | #1
For the whole series:
Tested-by: Dean Marx <dmarx@iol.unh.edu>
  
Juraj Linkeš July 11, 2024, 2:53 p.m. UTC | #2
With the docs changes below,
Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech>

> diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
> index 11dc8a0643..fcaf1f6a5f 100644
> --- a/dts/framework/remote_session/interactive_shell.py
> +++ b/dts/framework/remote_session/interactive_shell.py
> @@ -9,6 +9,9 @@
>   collection.
>   """
>   
> +import weakref
> +from typing import ClassVar
> +
>   from .single_active_interactive_shell import SingleActiveInteractiveShell
>   
>   
> @@ -16,18 +19,26 @@ class InteractiveShell(SingleActiveInteractiveShell):
>       """Adds manual start and stop functionality to interactive shells.
>   
>       Like its super-class, this class should not be instantiated directly and should instead be
> -    extended. This class also provides an option for automated cleanup of the application through
> -    the garbage collector.
> +    extended. This class also provides an option for automated cleanup of the application using a
> +    weakref and a finalize class. This finalize class allows for cleanup of the class at the time
> +    of garbage collection and also ensures that cleanup only happens once. This way if a user
> +    initiates the closing of the shell manually it is not repeated at the time of garbage
> +    collection.
>       """
>   
> +    _finalizer: weakref.finalize
> +    #: Shells that do not require only one instance to be running shouldn't need more than 1
> +    #: attempt to start.

This wording is a bit confusing. This could mean that the shells could 
require multiple instances. We could try an alternative approach:
One attempt should be enough for shells which don't have to worry about 
other instances closing before starting a new one.

Or something similar.

> +    _init_attempts: ClassVar[int] = 1
> +
>       def start_application(self) -> None:
> -        """Start the application."""
> +        """Start the application.
> +
> +        After the application has started, add a weakref finalize class to manage cleanup.

I think this could be improved to:
use :class:`weakref.finalize` to manage cleanup

> +        """
>           self._start_application()
> +        self._finalizer = weakref.finalize(self, self._close)
>   
>       def close(self) -> None:
> -        """Properly free all resources."""
> -        self._close()
> -
> -    def __del__(self) -> None:
> -        """Make sure the session is properly closed before deleting the object."""
> -        self.close()
> +        """Free all resources using finalize class."""

Also here:
using :class:`weakref.finalize`

> +        self._finalizer()
  
Jeremy Spewock July 11, 2024, 3:32 p.m. UTC | #3
On Thu, Jul 11, 2024 at 10:53 AM Juraj Linkeš
<juraj.linkes@pantheon.tech> wrote:
>
> With the docs changes below,
> Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
>
> > diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
> > index 11dc8a0643..fcaf1f6a5f 100644
> > --- a/dts/framework/remote_session/interactive_shell.py
> > +++ b/dts/framework/remote_session/interactive_shell.py
> > @@ -9,6 +9,9 @@
> >   collection.
> >   """
> >
> > +import weakref
> > +from typing import ClassVar
> > +
> >   from .single_active_interactive_shell import SingleActiveInteractiveShell
> >
> >
> > @@ -16,18 +19,26 @@ class InteractiveShell(SingleActiveInteractiveShell):
> >       """Adds manual start and stop functionality to interactive shells.
> >
> >       Like its super-class, this class should not be instantiated directly and should instead be
> > -    extended. This class also provides an option for automated cleanup of the application through
> > -    the garbage collector.
> > +    extended. This class also provides an option for automated cleanup of the application using a
> > +    weakref and a finalize class. This finalize class allows for cleanup of the class at the time
> > +    of garbage collection and also ensures that cleanup only happens once. This way if a user
> > +    initiates the closing of the shell manually it is not repeated at the time of garbage
> > +    collection.
> >       """
> >
> > +    _finalizer: weakref.finalize
> > +    #: Shells that do not require only one instance to be running shouldn't need more than 1
> > +    #: attempt to start.
>
> This wording is a bit confusing. This could mean that the shells could
> require multiple instances. We could try an alternative approach:
> One attempt should be enough for shells which don't have to worry about
> other instances closing before starting a new one.
>
> Or something similar.

Good point, I'll make this change.

>
> > +    _init_attempts: ClassVar[int] = 1
> > +
> >       def start_application(self) -> None:
> > -        """Start the application."""
> > +        """Start the application.
> > +
> > +        After the application has started, add a weakref finalize class to manage cleanup.
>
> I think this could be improved to:
> use :class:`weakref.finalize` to manage cleanup

Good idea, referencing the actual class like this is more clear.

>
> > +        """
> >           self._start_application()
> > +        self._finalizer = weakref.finalize(self, self._close)
> >
> >       def close(self) -> None:
> > -        """Properly free all resources."""
> > -        self._close()
> > -
> > -    def __del__(self) -> None:
> > -        """Make sure the session is properly closed before deleting the object."""
> > -        self.close()
> > +        """Free all resources using finalize class."""
>
> Also here:
> using :class:`weakref.finalize`

Ack.

>
> > +        self._finalizer()
>
  

Patch

diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
index 11dc8a0643..fcaf1f6a5f 100644
--- a/dts/framework/remote_session/interactive_shell.py
+++ b/dts/framework/remote_session/interactive_shell.py
@@ -9,6 +9,9 @@ 
 collection.
 """
 
+import weakref
+from typing import ClassVar
+
 from .single_active_interactive_shell import SingleActiveInteractiveShell
 
 
@@ -16,18 +19,26 @@  class InteractiveShell(SingleActiveInteractiveShell):
     """Adds manual start and stop functionality to interactive shells.
 
     Like its super-class, this class should not be instantiated directly and should instead be
-    extended. This class also provides an option for automated cleanup of the application through
-    the garbage collector.
+    extended. This class also provides an option for automated cleanup of the application using a
+    weakref and a finalize class. This finalize class allows for cleanup of the class at the time
+    of garbage collection and also ensures that cleanup only happens once. This way if a user
+    initiates the closing of the shell manually it is not repeated at the time of garbage
+    collection.
     """
 
+    _finalizer: weakref.finalize
+    #: Shells that do not require only one instance to be running shouldn't need more than 1
+    #: attempt to start.
+    _init_attempts: ClassVar[int] = 1
+
     def start_application(self) -> None:
-        """Start the application."""
+        """Start the application.
+
+        After the application has started, add a weakref finalize class to manage cleanup.
+        """
         self._start_application()
+        self._finalizer = weakref.finalize(self, self._close)
 
     def close(self) -> None:
-        """Properly free all resources."""
-        self._close()
-
-    def __del__(self) -> None:
-        """Make sure the session is properly closed before deleting the object."""
-        self.close()
+        """Free all resources using finalize class."""
+        self._finalizer()
diff --git a/dts/framework/remote_session/single_active_interactive_shell.py b/dts/framework/remote_session/single_active_interactive_shell.py
index 30c55d4703..38094c0fe2 100644
--- a/dts/framework/remote_session/single_active_interactive_shell.py
+++ b/dts/framework/remote_session/single_active_interactive_shell.py
@@ -27,6 +27,7 @@ 
 from paramiko import Channel, channel  # type: ignore[import-untyped]
 from typing_extensions import Self
 
+from framework.exception import InteractiveCommandExecutionError
 from framework.logger import DTSLogger
 from framework.params import Params
 from framework.settings import SETTINGS
@@ -45,6 +46,10 @@  class SingleActiveInteractiveShell(ABC):
     Interactive shells are started and stopped using a context manager. This allows for the start
     and cleanup of the application to happen at predictable times regardless of exceptions or
     interrupts.
+
+    Attributes:
+        is_alive: :data:`True` if the application has started successfully, :data:`False`
+            otherwise.
     """
 
     _node: Node
@@ -57,6 +62,9 @@  class SingleActiveInteractiveShell(ABC):
     _privileged: bool
     _real_path: PurePath
 
+    #: The number of times to try starting the application before considering it a failure.
+    _init_attempts: ClassVar[int] = 5
+
     #: Prompt to expect at the end of output when sending a command.
     #: This is often overridden by subclasses.
     _default_prompt: ClassVar[str] = ""
@@ -69,6 +77,8 @@  class SingleActiveInteractiveShell(ABC):
     #: Path to the executable to start the interactive application.
     path: ClassVar[PurePath]
 
+    is_alive: bool = False
+
     def __init__(
         self,
         node: Node,
@@ -111,11 +121,33 @@  def _make_start_command(self) -> str:
     def _start_application(self) -> None:
         """Starts a new interactive application based on the path to the app.
 
-        This method is often overridden by subclasses as their process for
-        starting may look different.
+        This method is often overridden by subclasses as their process for starting may look
+        different. Initialization of the shell on the host can be retried up to
+        `self._init_attempts` - 1 times. This is done because some DPDK applications need slightly
+        more time after exiting their script to clean up EAL before others can start.
+
+        Raises:
+            InteractiveCommandExecutionError: If the application fails to start within the allotted
+                number of retries.
         """
         self._setup_ssh_channel()
-        self.send_command(self._make_start_command())
+        self._ssh_channel.settimeout(5)
+        start_command = self._make_start_command()
+        self.is_alive = True
+        for attempt in range(self._init_attempts):
+            try:
+                self.send_command(start_command)
+                break
+            except TimeoutError:
+                self._logger.info(
+                    f"Interactive shell failed to start (attempt {attempt+1} out of "
+                    f"{self._init_attempts})"
+                )
+        else:
+            self._ssh_channel.settimeout(self._timeout)
+            self.is_alive = False  # update state on failure to start
+            raise InteractiveCommandExecutionError("Failed to start application.")
+        self._ssh_channel.settimeout(self._timeout)
 
     def send_command(
         self, command: str, prompt: str | None = None, skip_first_line: bool = False
@@ -139,7 +171,15 @@  def send_command(
 
         Returns:
             All output in the buffer before expected string.
+
+        Raises:
+            InteractiveCommandExecutionError: If attempting to send a command to a shell that is
+                not currently running.
         """
+        if not self.is_alive:
+            raise InteractiveCommandExecutionError(
+                f"Cannot send command {command} to application because the shell is not running."
+            )
         self._logger.info(f"Sending: '{command}'")
         if prompt is None:
             prompt = self._default_prompt
diff --git a/dts/framework/remote_session/testpmd_shell.py b/dts/framework/remote_session/testpmd_shell.py
index f54a745185..eda6eb320f 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -807,5 +807,5 @@  def show_port_stats(self, port_id: int) -> TestPmdPortStats:
     def _close(self) -> None:
         """Overrides :meth:`~.interactive_shell.close`."""
         self.stop()
-        self.send_command("quit", "")
+        self.send_command("quit", "Bye...")
         return super()._close()