[v1,1/4] dts: improve starting and stopping interactive shells

Message ID 20240514201436.2496-2-jspewock@iol.unh.edu (mailing list archive)
State New
Delegated to: Thomas Monjalon
Series Add second scatter test case

Checks

Context                   Check    Description
ci/checkpatch             success  coding style OK
ci/loongarch-compilation  warning  apply patch failure
ci/iol-testing            warning  apply patch failure

Commit Message

Jeremy Spewock May 14, 2024, 8:14 p.m. UTC
  From: Jeremy Spewock <jspewock@iol.unh.edu>

The InteractiveShell class currently relies on being cleaned up and
shut down at garbage collection time, but that cleanup does not verify
that the session is still running. So, if a user were to call close()
themselves prior to garbage collection, the cleanup would run twice and
the second call would throw an exception, when the desired behavior is
to do nothing since the session is already cleaned up. This is solved by
using weakref.finalize, which achieves the same result of calling the
cleanup method at garbage collection, but also ensures that it is called
exactly once.
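
The run-once behavior described above can be illustrated with a minimal
sketch. It is not the DTS code: Shell, _cleanup and the log list are
hypothetical names used only for illustration. The sketch passes a static
function as the callback because the weakref.finalize documentation warns
that a callback holding a reference to the tracked object (for example a
bound method of it) keeps that object from being garbage collected.

```python
import gc
import weakref


class Shell:
    """Hypothetical stand-in for an interactive shell; not the DTS class."""

    def __init__(self, log: list[str]) -> None:
        # The callback gets the log, not self, so the finalizer holds
        # no strong reference to this instance.
        self._finalizer = weakref.finalize(self, Shell._cleanup, log)

    @staticmethod
    def _cleanup(log: list[str]) -> None:
        log.append("closed")

    def close(self) -> None:
        # Calling a finalizer runs its callback at most once;
        # any later call is a no-op.
        self._finalizer()


# Explicit close, called twice, then garbage collection.
log: list[str] = []
shell = Shell(log)
shell.close()
shell.close()
del shell

# No explicit close: cleanup happens when the object is collected.
other_log: list[str] = []
other = Shell(other_log)
del other
gc.collect()  # nudge for interpreters without reference counting
```

In both cases the log should end up with a single "closed" entry: the
second close() and the later collection do nothing in the first case, and
collection alone triggers the cleanup in the second.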

Additionally, this fixes issues with starting a primary DPDK
application while another is still cleaning up, by retrying when
starting interactive shells. It also adds a check for attempts to send
a command to an interactive shell that is not running, so that a more
descriptive error message is produced.
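
The retry-on-start idea can be sketched with a for/else loop, where the
else branch runs only if no attempt succeeded. This is a simplified
illustration under assumptions, not the patch itself: StartError,
start_with_retries and FlakyApp are hypothetical names, and the real
code raises InteractiveCommandExecutionError and also adjusts the SSH
channel timeout around the loop.

```python
from typing import Callable


class StartError(Exception):
    """Hypothetical stand-in for InteractiveCommandExecutionError."""


def start_with_retries(start: Callable[[], None], max_retries: int = 5) -> int:
    """Call start() until it succeeds; return the number of attempts used."""
    for attempt in range(1, max_retries + 1):
        try:
            start()
            break
        except TimeoutError:
            print(f"Failed to start, retrying... ({attempt} out of {max_retries})")
    else:
        # Reached only when the loop ended without hitting break,
        # i.e. every attempt timed out.
        raise StartError("Failed to start application.")
    return attempt


class FlakyApp:
    """Hypothetical app that times out a fixed number of times, then starts."""

    def __init__(self, failures: int) -> None:
        self.failures = failures

    def __call__(self) -> None:
        if self.failures > 0:
            self.failures -= 1
            raise TimeoutError


attempts_used = start_with_retries(FlakyApp(2))
```

Here the app times out twice and starts on the third attempt, so
attempts_used is 3. An app that times out on all five attempts would
cause StartError to be raised instead.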

Signed-off-by: Jeremy Spewock <jspewock@iol.unh.edu>
---
 .../remote_session/interactive_shell.py       | 51 ++++++++++++++++---
 dts/framework/remote_session/testpmd_shell.py |  4 +-
 2 files changed, 45 insertions(+), 10 deletions(-)
  

Comments

Luca Vizzarro May 20, 2024, 5:17 p.m. UTC | #1
Looks good to me! Thank you for your work.

Reviewed-by: Luca Vizzarro <luca.vizzarro@arm.com>
  
Patrick Robb May 22, 2024, 1:43 p.m. UTC | #2
Reviewed-by: Patrick Robb <probb@iol.unh.edu>
  

Patch

diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
index 5cfe202e15..d1a9d8a6d2 100644
--- a/dts/framework/remote_session/interactive_shell.py
+++ b/dts/framework/remote_session/interactive_shell.py
@@ -14,12 +14,14 @@ 
 environment variable configure the timeout of getting the output from command execution.
 """
 
+import weakref
 from abc import ABC
 from pathlib import PurePath
 from typing import Callable, ClassVar
 
 from paramiko import Channel, SSHClient, channel  # type: ignore[import]
 
+from framework.exception import InteractiveCommandExecutionError
 from framework.logger import DTSLogger
 from framework.settings import SETTINGS
 
@@ -32,6 +34,10 @@  class InteractiveShell(ABC):
     and collecting input until reaching a certain prompt. All interactive applications
     will use the same SSH connection, but each will create their own channel on that
     session.
+
+    Attributes:
+        is_started: :data:`True` if the application has started successfully, :data:`False`
+            otherwise.
     """
 
     _interactive_session: SSHClient
@@ -41,6 +47,7 @@  class InteractiveShell(ABC):
     _logger: DTSLogger
     _timeout: float
     _app_args: str
+    _finalizer: weakref.finalize
 
     #: Prompt to expect at the end of output when sending a command.
     #: This is often overridden by subclasses.
@@ -58,6 +65,8 @@  class InteractiveShell(ABC):
     #: for DPDK on the node will be prepended to the path to the executable.
     dpdk_app: ClassVar[bool] = False
 
+    is_started: bool = False
+
     def __init__(
         self,
         interactive_session: SSHClient,
@@ -93,17 +102,39 @@  def __init__(
     def _start_application(self, get_privileged_command: Callable[[str], str] | None) -> None:
         """Starts a new interactive application based on the path to the app.
 
-        This method is often overridden by subclasses as their process for
-        starting may look different.
+        This method is often overridden by subclasses as their process for starting may look
+        different. Initialization of the shell on the host can be retried up to 5 times. This is
+        done because some DPDK applications need slightly more time after exiting their script to
+        clean up EAL before others can start.
+
+        When the application is started we also bind a class for finalization to this instance of
+        the shell to ensure proper cleanup of the application.
 
         Args:
             get_privileged_command: A function (but could be any callable) that produces
                 the version of the command with elevated privileges.
         """
+        self._finalizer = weakref.finalize(self, self._close)
+        max_retries = 5
+        self._ssh_channel.settimeout(1)
         start_command = f"{self.path} {self._app_args}"
         if get_privileged_command is not None:
             start_command = get_privileged_command(start_command)
-        self.send_command(start_command)
+        self.is_started = True
+        for retry in range(max_retries):
+            try:
+                self.send_command(start_command)
+                break
+            except TimeoutError:
+                self._logger.info(
+                    "Interactive shell failed to start, retrying... "
+                    f"({retry+1} out of {max_retries})"
+                )
+        else:
+            self._ssh_channel.settimeout(self._timeout)
+            self.is_started = False  # update state on failure to start
+            raise InteractiveCommandExecutionError("Failed to start application.")
+        self._ssh_channel.settimeout(self._timeout)
 
     def send_command(self, command: str, prompt: str | None = None) -> str:
         """Send `command` and get all output before the expected ending string.
@@ -125,6 +156,10 @@  def send_command(self, command: str, prompt: str | None = None) -> str:
         Returns:
             All output in the buffer before expected string.
         """
+        if not self.is_started:
+            raise InteractiveCommandExecutionError(
+                f"Cannot send command {command} to application because the shell is not running."
+            )
         self._logger.info(f"Sending: '{command}'")
         if prompt is None:
             prompt = self._default_prompt
@@ -140,11 +175,11 @@  def send_command(self, command: str, prompt: str | None = None) -> str:
         self._logger.debug(f"Got output: {out}")
         return out
 
-    def close(self) -> None:
-        """Properly free all resources."""
+    def _close(self) -> None:
+        self.is_started = False
         self._stdin.close()
         self._ssh_channel.close()
 
-    def __del__(self) -> None:
-        """Make sure the session is properly closed before deleting the object."""
-        self.close()
+    def close(self) -> None:
+        """Properly free all resources."""
+        self._finalizer()
diff --git a/dts/framework/remote_session/testpmd_shell.py b/dts/framework/remote_session/testpmd_shell.py
index f6783af621..cb4642bf3d 100644
--- a/dts/framework/remote_session/testpmd_shell.py
+++ b/dts/framework/remote_session/testpmd_shell.py
@@ -227,10 +227,10 @@  def set_forward_mode(self, mode: TestPmdForwardingModes, verify: bool = True):
                 f"Test pmd failed to set fwd mode to {mode.value}"
             )
 
-    def close(self) -> None:
+    def _close(self) -> None:
         """Overrides :meth:`~.interactive_shell.close`."""
         self.send_command("quit", "")
-        return super().close()
+        return super()._close()
 
     def get_capas_rxq(
         self, supported_capabilities: MutableSet, unsupported_capabilities: MutableSet