[v4,00/10] dts: add hello world testcase

Message ID: 20230213152846.284191-1-juraj.linkes@pantheon.tech

Juraj Linkeš Feb. 13, 2023, 3:28 p.m. UTC
Add code needed to run the HelloWorld testcase which just runs the hello
world dpdk application.

The patchset currently heavily refactors this original DTS code needed
to run the testcase:
* The whole architecture has been redone into a more sensible class
  hierarchy
* DPDK build on the System under Test
* DPDK eal args construction, app running and shutting down
* Optional SUT hugepage memory configuration
* Test runner
* Test results
* TestSuite class
* Test runner parts interfacing with TestSuite
* The HelloWorld testsuite itself

The code is divided into sub-packages, some of which are divided
further.

The patch may need to be divided into smaller chunks. If so, proposals
on where exactly to split it would be very helpful.

v3:
Finished refactoring everything in this patch, with test suite and test
results being the last parts.
Also changed the directory structure. It's now simplified and the
imports look much better.
I've also made many minor changes such as renaming variables here and
there.

v4:
Made hugepage config optional; users may now specify it in the main
config file.
Removed HelloWorld test plan and incorporated parts of it into the test
suite python file.
Updated documentation.

Juraj Linkeš (10):
  dts: add node and os abstractions
  dts: add ssh command verification
  dts: add dpdk build on sut
  dts: add dpdk execution handling
  dts: add node memory setup
  dts: add test suite module
  dts: add hello world testsuite
  dts: add test suite config and runner
  dts: add test results module
  doc: update DTS setup and test suite cookbook

 doc/guides/tools/dts.rst                      | 145 +++++++-
 dts/conf.yaml                                 |  22 +-
 dts/framework/config/__init__.py              | 130 ++++++-
 dts/framework/config/conf_yaml_schema.json    | 172 +++++++++-
 dts/framework/dts.py                          | 185 ++++++++--
 dts/framework/exception.py                    | 100 +++++-
 dts/framework/logger.py                       |  24 +-
 dts/framework/remote_session/__init__.py      |  30 +-
 dts/framework/remote_session/linux_session.py | 107 ++++++
 dts/framework/remote_session/os_session.py    | 175 ++++++++++
 dts/framework/remote_session/posix_session.py | 221 ++++++++++++
 .../remote_session/remote/__init__.py         |  16 +
 .../remote_session/remote/remote_session.py   | 155 +++++++++
 .../{ => remote}/ssh_session.py               |  91 ++++-
 .../remote_session/remote_session.py          |  95 ------
 dts/framework/settings.py                     |  81 ++++-
 dts/framework/test_result.py                  | 316 ++++++++++++++++++
 dts/framework/test_suite.py                   | 254 ++++++++++++++
 dts/framework/testbed_model/__init__.py       |  20 +-
 dts/framework/testbed_model/dpdk.py           |  78 +++++
 dts/framework/testbed_model/hw/__init__.py    |  27 ++
 dts/framework/testbed_model/hw/cpu.py         | 253 ++++++++++++++
 .../testbed_model/hw/virtual_device.py        |  16 +
 dts/framework/testbed_model/node.py           | 162 +++++++--
 dts/framework/testbed_model/sut_node.py       | 261 +++++++++++++++
 dts/framework/utils.py                        |  39 ++-
 dts/tests/TestSuite_hello_world.py            |  64 ++++
 27 files changed, 3030 insertions(+), 209 deletions(-)
 create mode 100644 dts/framework/remote_session/linux_session.py
 create mode 100644 dts/framework/remote_session/os_session.py
 create mode 100644 dts/framework/remote_session/posix_session.py
 create mode 100644 dts/framework/remote_session/remote/__init__.py
 create mode 100644 dts/framework/remote_session/remote/remote_session.py
 rename dts/framework/remote_session/{ => remote}/ssh_session.py (65%)
 delete mode 100644 dts/framework/remote_session/remote_session.py
 create mode 100644 dts/framework/test_result.py
 create mode 100644 dts/framework/test_suite.py
 create mode 100644 dts/framework/testbed_model/dpdk.py
 create mode 100644 dts/framework/testbed_model/hw/__init__.py
 create mode 100644 dts/framework/testbed_model/hw/cpu.py
 create mode 100644 dts/framework/testbed_model/hw/virtual_device.py
 create mode 100644 dts/framework/testbed_model/sut_node.py
 create mode 100644 dts/tests/TestSuite_hello_world.py
  

Comments

Bruce Richardson Feb. 17, 2023, 5:26 p.m. UTC | #1
On Mon, Feb 13, 2023 at 04:28:36PM +0100, Juraj Linkeš wrote:
> <snip>
> 
Hi,

just trying this out by reading the docs and trying to follow along. Couple
of high-level comments thus far without getting into the patches:

* In the "configuring DTS" section, I think it would be good to:
   - say that the config file should be named conf.yaml by default. It's in
     the next section, but I think it should be called out earlier.
   - say that there is a template conf.yaml file in the dts directory
     already. On my first reading I actually thought that the sample config
     file was dts/framework/config/conf_yaml_schema.json (and I was going
     to comment on the name being weird! :-)). Only when I opened it did I
     realise my mistake. Therefore, downplay the schema, and put more
     emphasis on where to find the simple conf example to start with.
   - if hugepage config is now optional, as you say above, remove that from
     the sample and the docs.

* The code thus far seems to imply that you are always going to use root.
  When I configured it to log on to bruce@localhost, it timed out waiting
  for a prompt, I believe because it was looking for "#" which is the
  default only for root prompts.

* When running as root, things progressed further but I hit an error when
  DTS was trying to get the CPU config. No idea what is happening here,
  because running the same commands manually over ssh seemed to work fine.
  Below is the error. Any hints as to what is the problem appreciated.

Thanks,
/Bruce

$ ./main.py --tarball ~/Downloads/dpdk-22.11.1.tar.xz -v Y
2023/02/17 16:59:57 - SUT 1 - INFO - Connecting to root@localhost.
2023/02/17 16:59:58 - SUT 1 - INFO - Connection to root@localhost successful.
2023/02/17 16:59:58 - SUT 1 - INFO - Getting CPU information.
2023/02/17 16:59:58 - SUT 1 - INFO - Sending: 'lscpu -p=CPU,CORE,SOCKET,NODE|grep -v \#'
2023/02/17 16:59:59 - dts_runner - ERROR - Connection to node NodeConfiguration(name='SUT 1', hostname='localhost', user='root', password=None, arch=<Architecture.x86_64: 'x86_64'>, os=<OS.linux: 'linux'>, lcores='3,4', use_first_core=False, memory_channels=8, hugepages=HugepageConfiguration(amount=256, force_first_numa=False)) failed.
Traceback (most recent call last):
  File "/home/bruce/dpdk.org/dts/framework/dts.py", line 41, in run_all
    sut_node = SutNode(execution.system_under_test)
  File "/home/bruce/dpdk.org/dts/framework/testbed_model/sut_node.py", line 39, in __init__
    super(SutNode, self).__init__(node_config)
  File "/home/bruce/dpdk.org/dts/framework/testbed_model/node.py", line 47, in __init__
    self._get_remote_cpus()
  File "/home/bruce/dpdk.org/dts/framework/testbed_model/node.py", line 155, in _get_remote_cpus
    self.lcores = self.main_session.get_remote_cpus(self.config.use_first_core)
  File "/home/bruce/dpdk.org/dts/framework/remote_session/linux_session.py", line 18, in get_remote_cpus
    cpu_info = self.remote_session.send_command(
  File "/home/bruce/dpdk.org/dts/framework/remote_session/remote/remote_session.py", line 103, in send_command
    result = self._send_command(command, timeout, env)
  File "/home/bruce/dpdk.org/dts/framework/remote_session/remote/ssh_session.py", line 172, in _send_command
    return_code = int(self._send_command_get_output("echo $?", timeout, None))
ValueError: invalid literal for int() with base 10: '\x1b[?2004l\r\r\n0'
2023/02/17 16:59:59 - dts_runner - DEBUG - Summary of errors:
2023/02/17 16:59:59 - dts_runner - DEBUG - ValueError("invalid literal for int() with base 10: '\\x1b[?2004l\\r\\r\\n0'")
2023/02/17 16:59:59 - dts_runner - INFO - DTS execution has ended.
  
Juraj Linkeš Feb. 20, 2023, 10:13 a.m. UTC | #2
Thanks for the comments, Bruce.

On Fri, Feb 17, 2023 at 6:26 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Feb 13, 2023 at 04:28:36PM +0100, Juraj Linkeš wrote:
> > <snip>
> >
> Hi,
>
> just trying this out by reading the docs and trying to follow along. Couple
> of high-level comments thus far without getting into the patches:
>
> * In the "configuring DTS" section, I think it would be good to:
>    - say that the config file should be named conf.yaml by default. It's in
>      the next section, but I think it should be called out earlier.
>    - say that there is a template conf.yaml file in the dts directory
>      already. On my first reading I actually thought that the sample config
>      file was dts/framework/config/conf_yaml_schema.json (and I was going
>      to comment on the name being weird! :-)). Only when I opened it did I
>      realise my mistake. Therefore, downplay the schema, and put more
>      emphasis on where to find the simple conf example to start with.

Good points, I'll rewrite that a bit.

>    - if hugepage config is now optional, as you say above, remove that from
>      the sample and the docs.
>

The optional part is just that users may choose between DTS either
configuring the hugepages or not, but hugepages still must be
configured (if not by DTS, then beforehand). I'll document this a bit
more, but I'd like to leave it in the sample config with a note saying
it's optional.
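
To illustrate the intended semantics, here is a minimal sketch. It is not the actual DTS config code: the field names are taken from the NodeConfiguration repr in the log above, the conf.yaml keys are assumed to match them, and from_dict is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HugepageConfiguration:
    amount: int
    force_first_numa: bool


@dataclass(frozen=True)
class NodeConfiguration:
    name: str
    hostname: str
    user: str
    # None means DTS does not touch hugepages; they must already be
    # configured on the node before the run.
    hugepages: Optional[HugepageConfiguration] = None

    @classmethod
    def from_dict(cls, d: dict) -> "NodeConfiguration":
        # Hypothetical helper: build the config from one parsed node entry.
        hp = d.get("hugepages")
        hugepages = None
        if hp is not None:
            hugepages = HugepageConfiguration(
                amount=hp["amount"],
                force_first_numa=hp.get("force_first_numa", False),
            )
        return cls(
            name=d["name"],
            hostname=d["hostname"],
            user=d["user"],
            hugepages=hugepages,
        )


# Entry with a hugepages section: DTS would configure them.
with_hp = NodeConfiguration.from_dict(
    {"name": "SUT 1", "hostname": "localhost", "user": "root",
     "hugepages": {"amount": 256, "force_first_numa": False}}
)
# Entry without one: hugepages are expected to be set up beforehand.
without_hp = NodeConfiguration.from_dict(
    {"name": "SUT 1", "hostname": "localhost", "user": "root"}
)
```

Leaving the section out is what "optional" means here; it does not remove the requirement that hugepages exist on the SUT.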

> * The code thus far seems to imply that you are always going to use root.
>   When I configured it to log on to bruce@localhost, it timed out waiting
>   for a prompt, I believe because it was looking for "#" which is the
>   default only for root prompts.
>

True, I'll add this to docs. This will not be a requirement in the
future though - we want to do passwordless sudo.
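
For reference, the timeout comes from waiting for a literal "#". A sketch of prompt matching that would accept both root and regular-user prompts could look like the following. This is not what the patch does, PROMPT_RE and at_prompt are hypothetical names, and it assumes the default prompt endings ("#" for root, "$" for other users).

```python
import re

# Assumption: the prompt ends in "#" (root) or "$" (regular user),
# optionally followed by whitespace. A customised PS1 needs another pattern.
PROMPT_RE = re.compile(r"[#$]\s*$")


def at_prompt(buffer: str) -> bool:
    """Return True if the received output ends with a shell prompt."""
    return PROMPT_RE.search(buffer) is not None


root_ok = at_prompt("root@localhost:~# ")
user_ok = at_prompt("bruce@localhost:~$ ")
still_running = at_prompt("Compiling C object lib/librte_eal.a.p/eal.c.o")
```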

> * When running as root, things progressed further but I hit an error when
>   DTS was trying to get the CPU config. No idea what is happening here,
>   because running the same commands manually over ssh seemed to work fine.
>   Below is the error. Any hints as to what is the problem appreciated.
>

I remember running into the same issue as well. I think it's related
to the bracketed paste feature of some terminal emulators:
https://askubuntu.com/questions/662222/why-bracketed-paste-mode-is-enabled-sporadically-in-my-terminal-screen
Please try disabling it and see whether that helps.
I haven't gone to great lengths to harden this part of the SSH
implementation, as we'll be moving from pexpect to Fabric after this
patch (Fabric uses a mature Python SSH implementation instead of
expect).
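
The "\x1b[?2004l" at the start of the failing string is the control sequence that switches bracketed paste mode off, so int() sees it ahead of the actual return code. As a rough sketch only (parse_return_code and CSI_RE are hypothetical names, not part of the patch), the parsing could be made tolerant of such sequences like this:

```python
import re

# Matches CSI escape sequences such as "\x1b[?2004l":
# ESC [, then parameter bytes, then intermediate bytes, then a final byte.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def parse_return_code(raw: str) -> int:
    """Parse the output of `echo $?`, ignoring terminal control sequences."""
    cleaned = CSI_RE.sub("", raw).strip()
    return int(cleaned)


# The exact string from the traceback above.
code = parse_return_code("\x1b[?2004l\r\r\n0")
```

This only hides the symptom in one place; disabling bracketed paste in the session avoids the sequences being emitted at all.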

> <snip>
  
Bruce Richardson Feb. 20, 2023, 11:56 a.m. UTC | #3
On Mon, Feb 20, 2023 at 11:13:45AM +0100, Juraj Linkeš wrote:
> <snip>
> 
> I remember running into the same issue as well. I think it's related
> to the bracketed paste feature of some terminal emulators:
> https://askubuntu.com/questions/662222/why-bracketed-paste-mode-is-enabled-sporadically-in-my-terminal-screen
> Please try disabling it and see whether that helps.
> I haven't gone to great lengths to harden this part of SSH
> implementation as we'll be moving to Fabric (from pexpect) after this
> patch (which uses a mature Python SSH implementation instead of
> expect).
> 
Ok, thanks for the explanation, I'll try that out.

/Bruce
  
Bruce Richardson Feb. 22, 2023, 4:39 p.m. UTC | #4
On Mon, Feb 20, 2023 at 11:13:45AM +0100, Juraj Linkeš wrote:
> <snip>
> 
> > * When running as root, things progressed further but I hit an error when
> >   DTS was trying to get the CPU config. No idea what is happening here,
> >   because running the same commands manually over ssh seemed to work fine.
> >   Below is the error. Any hints as to what is the problem appreciated.
> >
> 
> I remember running into the same issue as well. I think it's related
> to the bracketed paste feature of some terminal emulators:
> https://askubuntu.com/questions/662222/why-bracketed-paste-mode-is-enabled-sporadically-in-my-terminal-screen
> Please try disabling it and see whether that helps.
> I haven't gone to great lengths to harden this part of SSH
> implementation as we'll be moving to Fabric (from pexpect) after this
> patch (which uses a mature Python SSH implementation instead of
> expect).
> 

Adding things to my environment, e.g. to bashrc, didn't seem to work for
me, but the following change fixed this particular error. Might be worth
including in the code to avoid others hitting the same issue?

index d0863d8791..936d5f4642 100644
--- a/dts/framework/remote_session/remote/ssh_session.py
+++ b/dts/framework/remote_session/remote/ssh_session.py
@@ -68,6 +68,7 @@ def _connect(self) -> None:
 
             self.send_expect("stty -echo", "#")
             self.send_expect("stty columns 1000", "#")
+            self.send_expect("bind 'set enable-bracketed-paste off'", "#")
         except Exception as e:
             self._logger.error(RED(str(e)))
             if getattr(self, "port", None):

Unfortunately, things still aren't running correctly for me. The code gets
copied over and builds, and then the first hello-world test case runs ok.
However, things don't work after that - something seems wrong with the
lcore detection or filtering logic on my system.

  File "/home/bruce/dpdk.org/dts/framework/testbed_model/hw/cpu.py", line 206, in _filter_cores
    raise ValueError(
ValueError: The amount of logical cores per core to use (1) exceeds the actual amount present. Is hyperthreading enabled?

To the suggestion on hyperthreading, I then checked, and yes, I have HT
enabled on the system. Any suggestions what is wrong?

BTW: suggest the following changes to the error message:
* s/amount/number/ - as cores are countable.
* "Is hyperthreading enabled?" -> "This test requires SMT/hyperthreading be
enabled". By asking if it's enabled, you don't make it clear whether it
should be enabled or not. Since I had it enabled, the question implied to
me that it should be disabled. It's only on reading the code I see the comment
that it is meant to be enabled.

/Bruce
  
Juraj Linkeš Feb. 23, 2023, 8:27 a.m. UTC | #5
On Wed, Feb 22, 2023 at 5:43 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Feb 20, 2023 at 11:13:45AM +0100, Juraj Linkeš wrote:
> <snip>
>
> Adding things to my environment, e.g. bashrc didn't seem to work for me,
> but the following change fixed this particular error. Might be worth
> including in the code to avoid others hitting an issue?

I didn't really want to modify the code that's about to be replaced,
but this is a small and benign change, so I don't mind.

>
> index d0863d8791..936d5f4642 100644
> --- a/dts/framework/remote_session/remote/ssh_session.py
> +++ b/dts/framework/remote_session/remote/ssh_session.py
> @@ -68,6 +68,7 @@ def _connect(self) -> None:
>
>              self.send_expect("stty -echo", "#")
>              self.send_expect("stty columns 1000", "#")
> +            self.send_expect("bind 'set enable-bracketed-paste off'", "#")
>          except Exception as e:
>              self._logger.error(RED(str(e)))
>              if getattr(self, "port", None):
>
> Unfortunately, things still aren't running correctly for me. The code gets
> copied over and builds, and then the first hello-world test case runs ok.
> However, things don't work after that - something seems wrong with the
> lcore detection or filtering logic on my system.
>
>   File "/home/bruce/dpdk.org/dts/framework/testbed_model/hw/cpu.py", line 206, in _filter_cores
>     raise ValueError(
> ValueError: The amount of logical cores per core to use (1) exceeds the actual amount present. Is hyperthreading enabled?
>
> To the suggestion on hyperthreading, I then checked, and yes, I have HT
> enabled on the system. Any suggestions what is wrong?

Interesting. The first test case runs hello world on all cores
specified in conf.yaml (or all system cores if lcores is empty).
The second one tries to run it on just one core and, interestingly,
that fails. It's definitely related to hyperthreading, which I've
tested a bit (or I thought so), but apparently missed something.

Looking at the code, there's something wrong with how the number of
lcores per core requested by the filter (in this case, the test case
supplies the filter) is checked against the lcores on the system (with
hyperthreading, more than 1 lcore per core could be present).

I'll try to fix it and send v5 right away. If the fix doesn't work, we
could look at what "lscpu -p=CPU,CORE,SOCKET,NODE | grep -v \#" returns
on your system. That output is also captured in dts/output/suite.log. The
lcore config in conf.yaml could also be relevant, but I assume you didn't
change that. We could also check the test case output, which is in
dts/output/suite.log as well.
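
To show what the check is meant to do, here is a small sketch. It is not the code in cpu.py: SAMPLE is made-up lscpu output for one socket with two hyperthreaded cores, and lcores_per_core and check_request are hypothetical names.

```python
from collections import defaultdict

# Hypothetical output of `lscpu -p=CPU,CORE,SOCKET,NODE | grep -v \#`
# for one socket with two cores and hyperthreading enabled.
SAMPLE = """0,0,0,0
1,1,0,0
2,0,0,0
3,1,0,0"""


def lcores_per_core(lscpu_output: str) -> dict:
    """Map (socket, core) to the sorted list of lcore ids on that core."""
    cores = defaultdict(list)
    for line in lscpu_output.splitlines():
        if not line.strip():
            continue
        # Only the first three columns are needed; NODE may be empty.
        cpu, core, socket = (int(x) for x in line.split(",")[:3])
        cores[(socket, core)].append(cpu)
    return {key: sorted(val) for key, val in cores.items()}


def check_request(lscpu_output: str, requested_per_core: int) -> None:
    """Raise only if more lcores per core are requested than a core has."""
    available = min(len(v) for v in lcores_per_core(lscpu_output).values())
    if requested_per_core > available:
        raise ValueError(
            f"The number of logical cores per core to use "
            f"({requested_per_core}) exceeds the number present ({available})."
        )


topology = lcores_per_core(SAMPLE)
check_request(SAMPLE, 1)  # one lcore per core is always satisfiable here
check_request(SAMPLE, 2)  # needs hyperthreading, which SAMPLE has
```

With this shape of check, requesting 1 lcore per core can never fail on a system with hyperthreading enabled, which is the behaviour the second test case expects.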

>
> BTW: suggest the following changes to the error message:
> * s/amount/number/ - as cores are countable.

Thanks. I've used it inappropriately in a number of places.

> * "Is hyperthreading enabled?" -> "This test requires SMT/hyperthreading be
> enabled". By asking if it's enabled, you don't make it clear whether it
> should be enabled or not. Since I had it enabled, the question implied to
> me that it should be disabled. It's only on reading the code I see the comment
> that it is meant to be enabled.

I see where the confusion is. The question is just a mere suggestion
as to where the problem could be, but the logic in code is faulty,
leading to this unclear error message. I'll fix the logic and probably
modify the message so it makes more sense.

>
> /Bruce
  
Bruce Richardson Feb. 23, 2023, 9:17 a.m. UTC | #6
On Thu, Feb 23, 2023 at 09:27:05AM +0100, Juraj Linkeš wrote:
> <snip>
> 
> I'll try to fix it and send v5 right away. If the fix doesn't work, we
> could look at what "lscpu -p=CPU,CORE,SOCKET,NODE | grep -v \#" returns
> on your system. That output is also captured in dts/output/suite.log. The
> lcore config in conf.yaml could also be relevant, but I assume you didn't
> change that. We could also check the test case output, which is in
> dts/output/suite.log as well.
> 
> <snip>
> 

Thanks, if you do a new version I'm happy enough to retest today.

/Bruce