[v1,1/2] dts: Improve output gathering in interactive shells

Message ID 20240312172558.11844-2-jspewock@iol.unh.edu (mailing list archive)
State New
Delegated to: Thomas Monjalon
Headers
Series Improve interactive shell output gathering |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Jeremy Spewock March 12, 2024, 5:25 p.m. UTC
  From: Jeremy Spewock <jspewock@iol.unh.edu>

The current implementation of consuming output from interactive shells
relies on being able to find an expected prompt somewhere within the
output buffer after sending the command. This is useful in situations
where the prompt does not appear in the output itself, but in some
practical cases (such as the starting of an XML-RPC server for scapy)
the prompt exists in one of the commands sent to the shell and this can
cause the command to exit early and creates a race condition between the
server starting and the first command being sent to the server.

This patch addresses this problem by searching for a line that strictly
ends with the provided prompt, rather than one that simply contains it,
so that the detection that a command is finished is more consistent. It
also adds a catch to detect when a command times out before finding the
prompt so that the exception can be wrapped into a more explicit one and
display the output that it did manage to gather before timing out.

Bugzilla ID: 1359
Fixes: 88489c0501af ("dts: add smoke tests")

Signed-off-by: Jeremy Spewock <jspewock@iol.unh.edu>
---
 dts/framework/exception.py                    |  7 +++++
 .../remote_session/interactive_shell.py       | 26 +++++++++++++------
 2 files changed, 25 insertions(+), 8 deletions(-)
  

Comments

Juraj Linkeš April 3, 2024, 9 a.m. UTC | #1
On Tue, Mar 12, 2024 at 6:26 PM <jspewock@iol.unh.edu> wrote:
>
> From: Jeremy Spewock <jspewock@iol.unh.edu>
>
> The current implementation of consuming output from interactive shells
> relies on being able to find an expected prompt somewhere within the
> output buffer after sending the command. This is useful in situations
> where the prompt does not appear in the output itself, but in some
> practical cases (such as the starting of an XML-RPC server for scapy)
> the prompt exists in one of the commands sent to the shell and this can
> cause the command to exit early and creates a race condition between the
> server starting and the first command being sent to the server.
>
> This patch addresses this problem by searching for a line that strictly
> ends with the provided prompt, rather than one that simply contains it,
> so that the detection that a command is finished is more consistent. It
> also adds a catch to detect when a command times out before finding the
> prompt so that the exception can be wrapped into a more explicit one and
> display the output that it did manage to gather before timing out.
>

This could still cause problems if the prompt appears at the end of a
line in the output. Maybe we don't need to worry about that until we
actually hit that problem. In any case, I'd like to test this, so
please rebase the patch. :-)

> Bugzilla ID: 1359
> Fixes: 88489c0501af ("dts: add smoke tests")
>
> Signed-off-by: Jeremy Spewock <jspewock@iol.unh.edu>
> ---
>  dts/framework/exception.py                    |  7 +++++
>  .../remote_session/interactive_shell.py       | 26 +++++++++++++------
>  2 files changed, 25 insertions(+), 8 deletions(-)
>
> diff --git a/dts/framework/exception.py b/dts/framework/exception.py
> index 658eee2c38..cce1e0231a 100644
> --- a/dts/framework/exception.py
> +++ b/dts/framework/exception.py
> @@ -146,6 +146,13 @@ def __str__(self) -> str:
>          return f"Command {self.command} returned a non-zero exit code: {self._command_return_code}"
>
>
> +class InteractiveCommandExecutionError(DTSError):
> +    """An unsuccessful execution of a remote command in an interactive environment."""
> +
> +    #:
> +    severity: ClassVar[ErrorSeverity] = ErrorSeverity.REMOTE_CMD_EXEC_ERR
> +
> +
>  class RemoteDirectoryExistsError(DTSError):
>      """A directory that exists on a remote node."""
>
> diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
> index 5cfe202e15..2bcfdcb3c7 100644
> --- a/dts/framework/remote_session/interactive_shell.py
> +++ b/dts/framework/remote_session/interactive_shell.py
> @@ -20,6 +20,7 @@
>
>  from paramiko import Channel, SSHClient, channel  # type: ignore[import]
>
> +from framework.exception import InteractiveCommandExecutionError
>  from framework.logger import DTSLogger
>  from framework.settings import SETTINGS
>
> @@ -124,6 +125,10 @@ def send_command(self, command: str, prompt: str | None = None) -> str:
>
>          Returns:
>              All output in the buffer before expected string.
> +
> +        Raises:
> +            InteractiveCommandExecutionError: If command was sent but prompt could not be found in
> +                the output.
>          """
>          self._logger.info(f"Sending: '{command}'")
>          if prompt is None:
> @@ -131,14 +136,19 @@ def send_command(self, command: str, prompt: str | None = None) -> str:
>          self._stdin.write(f"{command}{self._command_extra_chars}\n")
>          self._stdin.flush()
>          out: str = ""
> -        for line in self._stdout:
> -            out += line
> -            if prompt in line and not line.rstrip().endswith(
> -                command.rstrip()
> -            ):  # ignore line that sent command
> -                break
> -        self._logger.debug(f"Got output: {out}")
> -        return out
> +        try:
> +            for line in self._stdout:
> +                out += line
> +                if line.rstrip().endswith(prompt):
> +                    break
> +        except TimeoutError:

I like this addition, but maybe we could do better. In the regular SSH
session, we're expected to raise:
* SSHSessionDeadError if the session is not alive,
* SSHTimeoutError if the command execution times out.

Can we do that here as well?

> +            raise InteractiveCommandExecutionError(

We could just reuse the SSHTimeoutError exception. Is there a reason
to distinguish between interactive and non-interactive timeouts?

> +                f"Failed to find the prompt ({prompt}) at the end of a line in the output from the"
> +                f" command ({command}). Got:\n{out}"
> +            )
> +        else:
> +            self._logger.debug(f"Got output: {out}")
> +            return out
>
>      def close(self) -> None:
>          """Properly free all resources."""
> --
> 2.43.2
>
  
Jeremy Spewock April 8, 2024, 4:20 p.m. UTC | #2
On Wed, Apr 3, 2024 at 5:00 AM Juraj Linkeš <juraj.linkes@pantheon.tech> wrote:
>
> On Tue, Mar 12, 2024 at 6:26 PM <jspewock@iol.unh.edu> wrote:
> >
> > From: Jeremy Spewock <jspewock@iol.unh.edu>
> >
> > The current implementation of consuming output from interactive shells
> > relies on being able to find an expected prompt somewhere within the
> > output buffer after sending the command. This is useful in situations
> > where the prompt does not appear in the output itself, but in some
> > practical cases (such as the starting of an XML-RPC server for scapy)
> > the prompt exists in one of the commands sent to the shell and this can
> > cause the command to exit early and creates a race condition between the
> > server starting and the first command being sent to the server.
> >
> > This patch addresses this problem by searching for a line that strictly
> > ends with the provided prompt, rather than one that simply contains it,
> > so that the detection that a command is finished is more consistent. It
> > also adds a catch to detect when a command times out before finding the
> > prompt so that the exception can be wrapped into a more explicit one and
> > display the output that it did manage to gather before timing out.
> >
>
> This could still cause problems if the prompt appears at the end of a
> line in the output. Maybe we don't need to worry about that until we
> actually hit that problem. In any case, I'd like to test this, so
> please rebase the patch. :-)

I will rebase and send out a v2, but that is a good point. it would be
just as easy to instead check to see that the prompt is the only thing
on the line, so we could do that instead, which do you think is
better? I'm sort of indifferent, I can see how verifying that the line
only contains the prompt would be useful in cases like it appears in a
docstring or something similar (and it's nice to be more explicit in
general), and I think that leaving it as the end of the line could
potentially save some verbosity if the line you are looking for is
long so you can just detect the end of it. Another problem that I
could think of with long lines potentially is if, somehow, you were
looking for a prompt that the pseudo-terminal split across a few lines
and it split your prompt, but I'm not sure this is really relevant to
us at all since we only really expect prompts that are usually short.

>
<snip>
> > +        try:
> > +            for line in self._stdout:
> > +                out += line
> > +                if line.rstrip().endswith(prompt):
> > +                    break
> > +        except TimeoutError:
>
> I like this addition, but maybe we could do better. In the regular SSH
> session, we're expected to raise:
> * SSHSessionDeadError if the session is not alive,
> * SSHTimeoutError if the command execution times out.
>
> Can we do that here as well?

This is a good point, I don't see why we couldn't and thinking about
it I like the idea of making this more consistent, good catch.

>
> > +            raise InteractiveCommandExecutionError(
>
> We could just reuse the SSHTimeoutError exception. Is there a reason
> to distinguish between interactive and non-interactive timeouts?

I wanted to add a distinction between interactive and non-interactive
just because in general the way we log messages when sending commands
to interactive shells looks pretty close to what it looks like when
you do the same for non-interactive, so if there was an error I
thought it might be more helpful in the logs to know which session you
were sending the command to when an error was encountered. Maybe,
however, a better approach to this would just be keep the exception
types the same but change the messages.

>
<snip>
> > 2.43.2
> >
  
Juraj Linkeš April 10, 2024, 10:20 a.m. UTC | #3
On Mon, Apr 8, 2024 at 6:20 PM Jeremy Spewock <jspewock@iol.unh.edu> wrote:
>
> On Wed, Apr 3, 2024 at 5:00 AM Juraj Linkeš <juraj.linkes@pantheon.tech> wrote:
> >
> > On Tue, Mar 12, 2024 at 6:26 PM <jspewock@iol.unh.edu> wrote:
> > >
> > > From: Jeremy Spewock <jspewock@iol.unh.edu>
> > >
> > > The current implementation of consuming output from interactive shells
> > > relies on being able to find an expected prompt somewhere within the
> > > output buffer after sending the command. This is useful in situations
> > > where the prompt does not appear in the output itself, but in some
> > > practical cases (such as the starting of an XML-RPC server for scapy)
> > > the prompt exists in one of the commands sent to the shell and this can
> > > cause the command to exit early and creates a race condition between the
> > > server starting and the first command being sent to the server.
> > >
> > > This patch addresses this problem by searching for a line that strictly
> > > ends with the provided prompt, rather than one that simply contains it,
> > > so that the detection that a command is finished is more consistent. It
> > > also adds a catch to detect when a command times out before finding the
> > > prompt so that the exception can be wrapped into a more explicit one and
> > > display the output that it did manage to gather before timing out.
> > >
> >
> > This could still cause problems if the prompt appears at the end of a
> > line in the output. Maybe we don't need to worry about that until we
> > actually hit that problem. In any case, I'd like to test this, so
> > please rebase the patch. :-)
>
> I will rebase and send out a v2, but that is a good point. it would be
> just as easy to instead check to see that the prompt is the only thing
> on the line, so we could do that instead, which do you think is
> better?

Would this work? I'm thinking we may need to send another extra
newline character to put the solitary prompt into the buffer if the
command output doesn't contain a newline.

> I'm sort of indifferent, I can see how verifying that the line
> only contains the prompt would be useful in cases like it appears in a
> docstring or something similar (and it's nice to be more explicit in
> general), and I think that leaving it as the end of the line could
> potentially save some verbosity if the line you are looking for is
> long so you can just detect the end of it. Another problem that I
> could think of with long lines potentially is if, somehow, you were
> looking for a prompt that the pseudo-terminal split across a few lines
> and it split your prompt, but I'm not sure this is really relevant to
> us at all since we only really expect prompts that are usually short.
>

If it works with checking just the end of the line let's leave it like
this (because of the possibility above). I think there shouldn't be
any prompts without something after them in docstrings.

> >
> <snip>
> > > +        try:
> > > +            for line in self._stdout:
> > > +                out += line
> > > +                if line.rstrip().endswith(prompt):
> > > +                    break
> > > +        except TimeoutError:
> >
> > I like this addition, but maybe we could do better. In the regular SSH
> > session, we're expected to raise:
> > * SSHSessionDeadError if the session is not alive,
> > * SSHTimeoutError if the command execution times out.
> >
> > Can we do that here as well?
>
> This is a good point, I don't see why we couldn't and thinking about
> it I like the idea of making this more consistent, good catch.
>
> >
> > > +            raise InteractiveCommandExecutionError(
> >
> > We could just reuse the SSHTimeoutError exception. Is there a reason
> > to distinguish between interactive and non-interactive timeouts?
>
> I wanted to add a distinction between interactive and non-interactive
> just because in general the way we log messages when sending commands
> to interactive shells looks pretty close to what it looks like when
> you do the same for non-interactive, so if there was an error I
> thought it might be more helpful in the logs to know which session you
> were sending the command to when an error was encountered. Maybe,
> however, a better approach to this would just be keep the exception
> types the same but change the messages.
>

There is probably some value to distinguish between them. We could
just mirror the non-interactive exceptions and keep the messages the
same.

> >
> <snip>
> > > 2.43.2
> > >
  

Patch

diff --git a/dts/framework/exception.py b/dts/framework/exception.py
index 658eee2c38..cce1e0231a 100644
--- a/dts/framework/exception.py
+++ b/dts/framework/exception.py
@@ -146,6 +146,13 @@  def __str__(self) -> str:
         return f"Command {self.command} returned a non-zero exit code: {self._command_return_code}"
 
 
+class InteractiveCommandExecutionError(DTSError):
+    """An unsuccessful execution of a remote command in an interactive environment."""
+
+    #:
+    severity: ClassVar[ErrorSeverity] = ErrorSeverity.REMOTE_CMD_EXEC_ERR
+
+
 class RemoteDirectoryExistsError(DTSError):
     """A directory that exists on a remote node."""
 
diff --git a/dts/framework/remote_session/interactive_shell.py b/dts/framework/remote_session/interactive_shell.py
index 5cfe202e15..2bcfdcb3c7 100644
--- a/dts/framework/remote_session/interactive_shell.py
+++ b/dts/framework/remote_session/interactive_shell.py
@@ -20,6 +20,7 @@ 
 
 from paramiko import Channel, SSHClient, channel  # type: ignore[import]
 
+from framework.exception import InteractiveCommandExecutionError
 from framework.logger import DTSLogger
 from framework.settings import SETTINGS
 
@@ -124,6 +125,10 @@  def send_command(self, command: str, prompt: str | None = None) -> str:
 
         Returns:
             All output in the buffer before expected string.
+
+        Raises:
+            InteractiveCommandExecutionError: If command was sent but prompt could not be found in
+                the output.
         """
         self._logger.info(f"Sending: '{command}'")
         if prompt is None:
@@ -131,14 +136,19 @@  def send_command(self, command: str, prompt: str | None = None) -> str:
         self._stdin.write(f"{command}{self._command_extra_chars}\n")
         self._stdin.flush()
         out: str = ""
-        for line in self._stdout:
-            out += line
-            if prompt in line and not line.rstrip().endswith(
-                command.rstrip()
-            ):  # ignore line that sent command
-                break
-        self._logger.debug(f"Got output: {out}")
-        return out
+        try:
+            for line in self._stdout:
+                out += line
+                if line.rstrip().endswith(prompt):
+                    break
+        except TimeoutError:
+            raise InteractiveCommandExecutionError(
+                f"Failed to find the prompt ({prompt}) at the end of a line in the output from the"
+                f" command ({command}). Got:\n{out}"
+            )
+        else:
+            self._logger.debug(f"Got output: {out}")
+            return out
 
     def close(self) -> None:
         """Properly free all resources."""