[v1] dts: add time delay to async sniffer callback function

Message ID 20241030170808.29452-1-npratte@iol.unh.edu (mailing list archive)
State Accepted, archived
Delegated to: Patrick Robb
Series [v1] dts: add time delay to async sniffer callback function

Checks

Context                       Check    Description
ci/checkpatch                 success  coding style OK
ci/loongarch-compilation      success  Compilation OK
ci/loongarch-unit-testing     success  Unit Testing PASS
ci/Intel-compilation          success  Compilation OK
ci/intel-Testing              success  Testing PASS
ci/github-robot: build        success  github build: passed
ci/intel-Functional           success  Functional PASS
ci/iol-mellanox-Performance   success  Performance Testing PASS
ci/iol-marvell-Functional     success  Functional Testing PASS
ci/iol-intel-Functional       success  Functional Testing PASS
ci/iol-unit-arm64-testing     success  Testing PASS
ci/iol-broadcom-Performance   success  Performance Testing PASS
ci/iol-compile-arm64-testing  success  Testing PASS
ci/iol-unit-amd64-testing     success  Testing PASS
ci/iol-compile-amd64-testing  success  Testing PASS
ci/iol-sample-apps-testing    success  Testing PASS

Commit Message

Nicholas Pratte Oct. 30, 2024, 5:08 p.m. UTC
On i40e NICs, the async sniffer fails to capture sent packets because the
callback function sends them before the NIC is ready to start capturing.

There could be a multitude of reasons why this happens on these NICs, but
for the time being, inserting a one-second delay in the callback function
will suffice.

Bugzilla ID: 1573
Signed-off-by: Nicholas Pratte <npratte@iol.unh.edu>
---
 dts/framework/testbed_model/traffic_generator/scapy.py | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)
  

Comments

Paul Szczepanek Nov. 6, 2024, 8:17 p.m. UTC | #1
On 30/10/2024 17:08, Nicholas Pratte wrote:
> There exists a bug within i40e NICs in which the async sniffer does not
> catch sent packets as a result of the callback function sending packets
> too quickly before the NICs are ready to start capturing.
> 
> There could be a multitude of reasons why this happens on these NICs, but
> for the time being, inserting a one second delay in the callback function
> will suffice.

I can confirm the issue exists, but we should explore a more definitive
solution than adding a wait. Ideally, instead of relying on the callback
to send packets, we should verify readiness elsewhere in our sniffer and
send packets when ready from our framework, not as part of the Scapy
sniffer constructor.
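
A minimal sketch of that separation, using only the standard library. The
class and function names here (ReadySnifferSketch, send_when_ready) are
hypothetical and are not part of DTS or Scapy; the sketch also assumes some
trustworthy readiness signal exists, which is exactly what is in question
on these NICs.

```python
import threading


class ReadySnifferSketch:
    """Hypothetical sniffer stand-in that only reports readiness; it never sends."""

    def __init__(self) -> None:
        self.ready = threading.Event()
        self.captured: list[str] = []
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def _run(self) -> None:
        # A real sniffer would open its capture socket here before signalling.
        self.ready.set()

    def deliver(self, packet: str) -> None:
        # Simulates a packet arriving on the capture interface.
        self.captured.append(packet)


def send_when_ready(
    sniffer: ReadySnifferSketch, packets: list[str], timeout: float = 5.0
) -> int:
    """Send from the framework side, and only after the sniffer reports readiness."""
    if not sniffer.ready.wait(timeout):
        raise TimeoutError("sniffer did not become ready in time")
    for packet in packets:
        sniffer.deliver(packet)
    return len(packets)


sniffer = ReadySnifferSketch()
sniffer.start()
sent_count = send_when_ready(sniffer, ["pkt0", "pkt1"])
```

The point of the design is that a missing readiness signal becomes an
explicit timeout error rather than silently lost packets.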
  
Patrick Robb Nov. 13, 2024, 7:27 a.m. UTC | #2
On Wed, Nov 6, 2024 at 3:17 PM Paul Szczepanek <paul.szczepanek@arm.com>
wrote:

>
> On 30/10/2024 17:08, Nicholas Pratte wrote:
> > There exists a bug within i40e NICs in which the async sniffer does not
> > catch sent packets as a result of the callback function sending packets
> > too quickly before the NICs are ready to start capturing.
> >
> > There could be a multitude of reasons why this happens on these NICs, but
> > for the time being, inserting a one second delay in the callback function
> > will suffice.
>
> I can confirm the issue exists but we should explore a more definitive
> solution than adding a wait. Ideally instead of relying on the callback
> to send packets we should verify readiness elsewhere in our sniffer and
> send packets when ready in our framework and not as part of the scapy
> sniffer constructor.
>

From looking at the documentation, the standard way of verifying readiness
for the AsyncSniffer is via the started_callback arg in the AsyncSniffer
constructor. You can see some similar discussion here:
https://github.com/secdev/scapy/issues/3208

So, if this is standard, it is probably best to remain within this
framework. I have been messing with this series tonight, and although I
still can't tell why started_callback isn't being called on true sniffer
readiness, I think Nick's time.sleep calls are okay.
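
For reference, a small sketch of why the tuple form in the patch's
started_callback works. A lambda body must be a single expression, and
tuple elements are evaluated left to right, so the sleep completes before
the send begins. Here fake_sendp and make_started_callback are hypothetical
stand-ins so the snippet runs without Scapy or a NIC.

```python
import time

sent = []


def fake_sendp(packets, iface):
    """Hypothetical stand-in for Scapy's sendp(); records the call instead of sending."""
    sent.append((time.monotonic(), list(packets), iface))


def make_started_callback(packets, iface, delay=1.0):
    # Both calls are wrapped in one tuple expression so they fit in a lambda.
    # Python evaluates the elements in order: sleep first, then send.
    return lambda *args: (time.sleep(delay), fake_sendp(packets, iface))


callback = make_started_callback(["pkt0", "pkt1"], "eth0", delay=0.05)
start = time.monotonic()
callback()
```

The callback's return value (a tuple of two None values) is simply
discarded; only the ordering of the side effects matters.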

I will say, the modification of duration in this series is odd to me. It
looks like the duration arg of _shell_start_and_stop_sniffing has no default,
and no value is passed in the call coming from send_packets_and_capture.
My preference would be to provide a default for the duration arg (say, 1)
and remove the arbitrary "duration + 1" in this series.
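
Roughly what I have in mind, as a sketch only. SnifferShellSketch and its
recorded commands are hypothetical stand-ins for the real shell class, and
the sniffer variable names in the command strings are illustrative; only
the shape of the signature matters here.

```python
import time


class SnifferShellSketch:
    """Hypothetical stand-in for the Scapy shell; records commands instead of running them."""

    def __init__(self) -> None:
        self.commands: list[str] = []

    def send_command(self, command: str) -> str:
        self.commands.append(command)
        return ""

    def start_and_stop_sniffing(self, duration: float = 1.0) -> None:
        # The wait lives in the default value, so callers that pass nothing
        # still get a one-second capture window and no "+ 1" is needed.
        self.send_command("sniffer.start()")
        time.sleep(duration)
        self.send_command("packets = sniffer.stop(join=True)")


shell = SnifferShellSketch()
shell.start_and_stop_sniffing(duration=0.01)
```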

I also believe the comments about i40e should be removed. We understand
that this series is adding a delay to support sniffer readiness, but we
don't know why this behavior was originally seen on an i40e NIC, and
whether it's isolated to that driver.

Perhaps there would be a way to poll Scapy in a loop for sniffer
readiness, but I don't see how this would be better or different than the
AsyncSniffer callback arg (which is essentially the same according to the
docs). So, in my view, the best thing to do is for me to fix up the commit
per my comments (and any others anyone has) and apply this.
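
For completeness, the polling alternative would look something like the
sketch below. wait_until is a hypothetical helper, and the readiness
predicate passed to it is an assumption: it would have to be whatever check
actually reflects true sniffer readiness, which we have not identified.

```python
import time
from typing import Callable


def wait_until(
    is_ready: Callable[[], bool], timeout: float = 5.0, interval: float = 0.1
) -> bool:
    """Poll is_ready() until it returns True or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_ready():
            return True
        time.sleep(interval)
    # One last check so a result arriving right at the deadline is not missed.
    return is_ready()


# Simulated readiness that flips to True on the third poll.
polls = iter([False, False, True])
became_ready = wait_until(lambda: next(polls), timeout=1.0, interval=0.01)

# A predicate that never succeeds runs out the timeout and returns False.
never_ready = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

If the predicate only reports the same state that triggers
started_callback, this loop gains nothing over the callback, which is the
concern above.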
  
Patrick Robb Nov. 14, 2024, 6:01 a.m. UTC | #3
As best as I can tell -1 vs -2 index is not important as they are named
args, and using the -2 breaks functionality. But we can chat tomorrow if
you don't agree.

Applied the v1 + my fixup per my comments above to next-dts.
  

Patch

diff --git a/dts/framework/testbed_model/traffic_generator/scapy.py b/dts/framework/testbed_model/traffic_generator/scapy.py
index be5ae3b895..9fa9feaf47 100644
--- a/dts/framework/testbed_model/traffic_generator/scapy.py
+++ b/dts/framework/testbed_model/traffic_generator/scapy.py
@@ -188,17 +188,19 @@  def _shell_create_sniffer(
                 when set to an empty string.
         """
         self._shell_set_packet_list(packets_to_send)
+        # We need to introduce a short delay in the sniffer sendp() for i40e NICs.
+        self.send_command("import time")
         sniffer_commands = [
             f"{self._sniffer_name} = AsyncSniffer(",
             f"iface='{recv_port.logical_name}',",
             "store=True,",
             # *args is used in the arguments of the lambda since Scapy sends parameters to the
             # callback function which we do not need for our purposes.
-            "started_callback=lambda *args: sendp(",
+            "started_callback=lambda *args: (time.sleep(1), sendp(",
             (
                 # Additional indentation is added to this line only for readability of the logs.
                 f"{self._python_indentation}{self._send_packet_list_name},"
-                f" iface='{send_port.logical_name}'),"
+                f" iface='{send_port.logical_name}')),"
             ),
             ")",
         ]
@@ -223,7 +225,8 @@  def _shell_start_and_stop_sniffing(self, duration: float) -> list[Packet]:
         """
         sniffed_packets_name = "gathered_packets"
         self.send_command(f"{self._sniffer_name}.start()")
-        time.sleep(duration)
+        # Insert a one second delay to prevent timeout errors from occurring
+        time.sleep(duration + 1)
         self.send_command(f"{sniffed_packets_name} = {self._sniffer_name}.stop(join=True)")
         # An extra newline is required here due to the nature of interactive Python shells
         packet_strs = self.send_command(