[v2] app/testpmd: expand noisy neighbour forward mode support

Message ID 20230126045516.176917-1-mkp@redhat.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Series [v2] app/testpmd: expand noisy neighbour forward mode support

Checks

Context                          Check    Description
ci/checkpatch                    warning  coding style issues
ci/loongarch-compilation         success  Compilation OK
ci/loongarch-unit-testing        success  Unit Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/intel-Testing                 success  Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-testing                   success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-abi-testing               success  Testing PASS
ci/iol-aarch64-unit-testing      success  Testing PASS
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS

Commit Message

Mike Pattrick Jan. 26, 2023, 4:55 a.m. UTC
Previously the noisy neighbour vnf simulation would only operate in io
mode, forwarding packets as is. However, this limited the usefulness of
noisy neighbour simulation.

This feature has now been expanded into all forwarding modes except for
ieee1588, where it isn't relevant; and iofwd, which would otherwise be
duplicative of noisy mode.

Signed-off-by: Mike Pattrick <mkp@redhat.com>

---

v2:
 - Included header that was incorrectly excluded from v1
 - Restored calls to rte_rand()
---
 app/test-pmd/5tswap.c                 |  29 ++-----
 app/test-pmd/csumonly.c               |  36 +++-----
 app/test-pmd/flowgen.c                |  24 +-----
 app/test-pmd/icmpecho.c               |  28 +-----
 app/test-pmd/macfwd.c                 |  27 +-----
 app/test-pmd/macswap.c                |  28 ++----
 app/test-pmd/noisy_vnf.c              | 118 ++++++++++++++++----------
 app/test-pmd/noisy_vnf.h              |  48 +++++++++++
 app/test-pmd/testpmd.c                |   6 ++
 app/test-pmd/txonly.c                 |  38 ++-------
 doc/guides/testpmd_app_ug/run_app.rst |  19 +++--
 11 files changed, 172 insertions(+), 229 deletions(-)
 create mode 100644 app/test-pmd/noisy_vnf.h
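
The per-mode code removed by this patch follows one pattern: transmit a
burst, retry a bounded number of times, then account for whatever could
not be sent. The following is a minimal, self-contained sketch of that
pattern, not the patch's implementation. The names tx_fn, mock_txq,
mock_tx, tx_stats and tx_burst_retry are hypothetical; mock_tx stands in
for rte_eth_tx_burst(), and packets are plain ints, so nothing is freed
here (testpmd frees unsent mbufs with rte_pktmbuf_free()).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_eth_tx_burst(): takes up to n packets
 * and returns how many were accepted. */
typedef uint16_t (*tx_fn)(void *ctx, int *pkts, uint16_t n);

/* Hypothetical per-stream counters, mirroring tx_packets/fwd_dropped. */
struct tx_stats {
	uint64_t tx_packets;
	uint64_t fwd_dropped;
};

/* Mock TX queue that accepts at most `room` packets per call. */
struct mock_txq {
	uint16_t room;
	unsigned int calls;
};

static uint16_t
mock_tx(void *ctx, int *pkts, uint16_t n)
{
	struct mock_txq *q = ctx;

	(void)pkts;
	q->calls++;
	return n < q->room ? n : q->room;
}

/* Send nb_pkts packets, retrying up to max_retry times after the first
 * attempt. Updates the counters and returns the number of packets that
 * were not sent. */
static uint16_t
tx_burst_retry(tx_fn tx, void *ctx, int *pkts, uint16_t nb_pkts,
	       uint32_t max_retry, struct tx_stats *st)
{
	uint16_t nb_tx;
	uint16_t dropped;
	uint32_t retry = 0;

	assert(tx != NULL && st != NULL);
	if (nb_pkts == 0)
		return 0;

	nb_tx = tx(ctx, pkts, nb_pkts);
	while (nb_tx < nb_pkts && retry++ < max_retry)
		nb_tx += tx(ctx, &pkts[nb_tx], (uint16_t)(nb_pkts - nb_tx));

	dropped = (uint16_t)(nb_pkts - nb_tx);
	st->tx_packets += nb_tx;
	st->fwd_dropped += dropped;
	return dropped;
}
```

Returning the drop count lets a caller such as flowgen or txonly roll
back its own per-flow state without repeating the retry loop.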
  

Comments

Singh, Aman Deep Feb. 1, 2023, 3:19 p.m. UTC | #1
Hi Mike,

Thanks a lot for the patch.

On 1/26/2023 10:25 AM, Mike Pattrick wrote:
> Previously the noisy neighbour vnf simulation would only operate in io
> mode, forwarding packets as is. However, this limited the usefulness of
> noisy neighbour simulation.
>
> This feature has now been expanded into all forwarding modes except for
> ieee1588, where it isn't relevant; and iofwd, which would otherwise be
> duplicative of noisy mode.

Well, I would first like to know why we need noisy neighbor for all modes.
IMHO, should we add code to each mode if most users don't use it?

Secondly, can't we achieve the same behavior by running testpmd instances
in parallel on the same NUMA node, where one testpmd is in noisy mode?

> Signed-off-by: Mike Pattrick<mkp@redhat.com>
>
> ---
>
<snip>
  
Mike Pattrick Feb. 1, 2023, 7:03 p.m. UTC | #2
On Wed, Feb 1, 2023 at 10:19 AM Singh, Aman Deep
<aman.deep.singh@intel.com> wrote:
>
> Hi Mike,
>
> Thanks a lot for the patch.
>
> On 1/26/2023 10:25 AM, Mike Pattrick wrote:
>
> Previously the noisy neighbour vnf simulation would only operate in io
> mode, forwarding packets as is. However, this limited the usefulness of
> noisy neighbour simulation.
>
> This feature has now been expanded into all forwarding modes except for
> ieee1588, where it isn't relevant; and iofwd, which would otherwise be
> duplicative of noisy mode.
>
> Well I would first like to know, why we need noisy neighbor for all modes
> IMHO, do we need to add code to each mode, if most users don't use it.
>
> Secondly, can't we achieve same behavior by running testpmd instances in
> parallel on same NUMA node. Where one testpmd is in noisy mode.

I don't think the dual testpmd solution is identical; one of the
motivations for this change is to actually run the other modes with
the characteristics of the noisy mode. If we ran noisy with another
mode, that other mode would experience cache and memory contention,
but wouldn't experience queuing; and the contention wouldn't be
directly correlated with the exact packets that it forwarded, but
instead with the packets that noisy was forwarding.
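
The queuing effect described above can be illustrated with a minimal
sketch, assuming a small fixed-size FIFO of ints in place of the rte_ring
that noisy mode uses. All names here (FIFO_CAP, struct fifo, fifo_push,
fifo_pop, buffer_burst) are hypothetical. A packet leaves the FIFO only
when later packets from the same stream push it out, so its delay depends
on the traffic that the mode itself forwards.

```c
#include <assert.h>

#define FIFO_CAP 4 /* hypothetical capacity, chosen small for clarity */

struct fifo {
	int slot[FIFO_CAP];
	unsigned int head;  /* index of the oldest element */
	unsigned int count; /* number of buffered elements */
};

static unsigned int
fifo_free(const struct fifo *f)
{
	return FIFO_CAP - f->count;
}

static void
fifo_push(struct fifo *f, int pkt)
{
	assert(f->count < FIFO_CAP);
	f->slot[(f->head + f->count) % FIFO_CAP] = pkt;
	f->count++;
}

static int
fifo_pop(struct fifo *f)
{
	int pkt;

	assert(f->count > 0);
	pkt = f->slot[f->head];
	f->head = (f->head + 1) % FIFO_CAP;
	f->count--;
	return pkt;
}

/* Buffer a burst of nb_rx packets (nb_rx <= FIFO_CAP). If the FIFO lacks
 * room, the oldest packets are moved to out[] first; those are the ones
 * that would be transmitted. Returns the number of packets moved out. */
static unsigned int
buffer_burst(struct fifo *f, const int *burst, unsigned int nb_rx, int *out)
{
	unsigned int free_slots = fifo_free(f);
	unsigned int nb_out = 0;
	unsigned int i;

	assert(nb_rx <= FIFO_CAP);
	if (free_slots < nb_rx) {
		unsigned int need = nb_rx - free_slots;

		while (nb_out < need)
			out[nb_out++] = fifo_pop(f);
	}
	for (i = 0; i < nb_rx; i++)
		fifo_push(f, burst[i]);
	return nb_out;
}
```

With a second testpmd instance running noisy mode alongside, the mode
under test would see none of this buffering; it would only share cache
and memory with it.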

Would it be preferable if I changed how this worked to not impact the
other forward modes when noisy options are disabled? I could change
this to switch the value of packet_fwd when noisy options are set. I
could also just move the full implementation back into noisy_vnf.c and
add a new option to affect how it forwards.
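
A minimal sketch of the first option, assuming a hypothetical
packet_fwd_noisy member next to packet_fwd (testpmd's struct fwd_engine
has no such field today). struct engine, struct stream, fwd_plain,
fwd_noisy and select_packet_fwd are illustrative names only. The callback
is chosen once when forwarding starts, so the normal path is untouched
when no noisy options are set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stream state; counts which path handled each burst. */
struct stream {
	unsigned int plain_calls;
	unsigned int noisy_calls;
};

typedef void (*packet_fwd_t)(struct stream *s);

/* Hypothetical engine descriptor, loosely modelled on struct fwd_engine. */
struct engine {
	const char *name;
	packet_fwd_t packet_fwd;       /* normal forwarding path */
	packet_fwd_t packet_fwd_noisy; /* NULL if the mode has no noisy path */
};

static void
fwd_plain(struct stream *s)
{
	s->plain_calls++;
}

static void
fwd_noisy(struct stream *s)
{
	s->noisy_calls++;
}

/* Pick the callback once at start time. Modes without a noisy variant
 * always fall back to their normal path. */
static packet_fwd_t
select_packet_fwd(const struct engine *e, bool noisy_enabled)
{
	assert(e != NULL);
	if (noisy_enabled && e->packet_fwd_noisy != NULL)
		return e->packet_fwd_noisy;
	return e->packet_fwd;
}
```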


Thank you,
Mike

>
> Signed-off-by: Mike Pattrick <mkp@redhat.com>
>
> ---
>
> <snip>
  
Singh, Aman Deep Feb. 8, 2023, 4:57 p.m. UTC | #3
On 2/2/2023 12:33 AM, Mike Pattrick wrote:
> On Wed, Feb 1, 2023 at 10:19 AM Singh, Aman Deep
> <aman.deep.singh@intel.com> wrote:
>> Hi Mike,
>>
>> Thanks a lot for the patch.
>>
>> On 1/26/2023 10:25 AM, Mike Pattrick wrote:
>>
>> Previously the noisy neighbour vnf simulation would only operate in io
>> mode, forwarding packets as is. However, this limited the usefulness of
>> noisy neighbour simulation.
>>
>> This feature has now been expanded into all forwarding modes except for
>> ieee1588, where it isn't relevant; and iofwd, which would otherwise be
>> duplicative of noisy mode.
>>
>> Well I would first like to know, why we need noisy neighbor for all modes
>> IMHO, do we need to add code to each mode, if most users don't use it.
>>
>> Secondly, can't we achieve same behavior by running testpmd instances in
>> parallel on same NUMA node. Where one testpmd is in noisy mode.
> I don't think the dual testpmd solution is identical, one of the
> motivations for this change is to actually run the other modes with
> the characteristics of the noisy mode. If we ran noisy with another
> mode, that other mode would experience cache and memory contention,
> but wouldn't experience queuing; and the contention wouldn't be
> directly correlated with the exact packets that it forwarded, but
> instead with the packets that noisy was forwarding.
>
> Would it be preferable if I changed how this worked to not impact the
> other forward modes when noisy options are disabled? I could change
> this to switch the value of packet_fwd when noisy options are set. I
> could also just move the full implementation back into noisy_vnf.c and
> add a new option to affect how it forwards.

Yes, it would be good to have the full implementation in noisy_vnf.c only.

>
>
> Thank you,
> Mike
>
>> Signed-off-by: Mike Pattrick <mkp@redhat.com>
>>
>> ---
>>
>> <snip>
  

Patch

diff --git a/app/test-pmd/5tswap.c b/app/test-pmd/5tswap.c
index f041a5e1d5..d66f520efa 100644
--- a/app/test-pmd/5tswap.c
+++ b/app/test-pmd/5tswap.c
@@ -19,6 +19,7 @@ 
 
 #include "macswap_common.h"
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 
 static inline void
@@ -91,8 +92,6 @@  pkt_burst_5tuple_swap(struct fwd_stream *fs)
 	uint64_t ol_flags;
 	uint16_t proto;
 	uint16_t nb_rx;
-	uint16_t nb_tx;
-	uint32_t retry;
 
 	int i;
 	union {
@@ -116,7 +115,7 @@  pkt_burst_5tuple_swap(struct fwd_stream *fs)
 				 nb_pkt_per_burst);
 	inc_rx_burst_stats(fs, nb_rx);
 	if (unlikely(nb_rx == 0))
-		return;
+		goto flush;
 
 	fs->rx_packets += nb_rx;
 	txp = &ports[fs->tx_port];
@@ -162,26 +161,10 @@  pkt_burst_5tuple_swap(struct fwd_stream *fs)
 		}
 		mbuf_field_set(mb, ol_flags);
 	}
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
-	/*
-	 * Retry if necessary
-	 */
-	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
-		retry = 0;
-		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
-			rte_delay_us(burst_tx_delay_time);
-			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
-		}
-	}
-	fs->tx_packets += nb_tx;
-	inc_tx_burst_stats(fs, nb_tx);
-	if (unlikely(nb_tx < nb_rx)) {
-		fs->fwd_dropped += (nb_rx - nb_tx);
-		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
-		} while (++nb_tx < nb_rx);
-	}
+
+flush:
+	noisy_eth_tx_burst(fs, nb_rx, pkts_burst);
+
 	get_end_cycles(fs, start_tsc);
 }
 
diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 1c24598515..2a6cf07adb 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -48,6 +48,7 @@ 
 #include <rte_geneve.h>
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 #define IP_DEFTTL  64   /* from RFC 1340. */
 
@@ -847,12 +848,10 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 	uint8_t gro_enable;
 #endif
 	uint16_t nb_rx;
-	uint16_t nb_tx;
 	uint16_t nb_prep;
 	uint16_t i;
 	uint64_t rx_ol_flags, tx_ol_flags;
 	uint64_t tx_offloads;
-	uint32_t retry;
 	uint32_t rx_bad_ip_csum;
 	uint32_t rx_bad_l4_csum;
 	uint32_t rx_bad_outer_l4_csum;
@@ -867,8 +866,13 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
 				 nb_pkt_per_burst);
 	inc_rx_burst_stats(fs, nb_rx);
-	if (unlikely(nb_rx == 0))
-		return;
+	if (unlikely(nb_rx == 0)) {
+		/* May still need to flush some packets */
+		nb_prep = 0;
+		tx_pkts_burst = pkts_burst;
+		goto flush;
+	}
+
 
 	fs->rx_packets += nb_rx;
 	rx_bad_ip_csum = 0;
@@ -1173,33 +1177,13 @@  pkt_burst_checksum_forward(struct fwd_stream *fs)
 			"Preparing packet burst to transmit failed: %s\n",
 			rte_strerror(rte_errno));
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
-			nb_prep);
-
-	/*
-	 * Retry if necessary
-	 */
-	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
-		retry = 0;
-		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
-			rte_delay_us(burst_tx_delay_time);
-			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&tx_pkts_burst[nb_tx], nb_rx - nb_tx);
-		}
-	}
-	fs->tx_packets += nb_tx;
 	fs->rx_bad_ip_csum += rx_bad_ip_csum;
 	fs->rx_bad_l4_csum += rx_bad_l4_csum;
 	fs->rx_bad_outer_l4_csum += rx_bad_outer_l4_csum;
 	fs->rx_bad_outer_ip_csum += rx_bad_outer_ip_csum;
 
-	inc_tx_burst_stats(fs, nb_tx);
-	if (unlikely(nb_tx < nb_rx)) {
-		fs->fwd_dropped += (nb_rx - nb_tx);
-		do {
-			rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
-		} while (++nb_tx < nb_rx);
-	}
+flush:
+	noisy_eth_tx_burst(fs, nb_prep, tx_pkts_burst);
 
 	get_end_cycles(fs, start_tsc);
 }
diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
index fd6abc0f41..6b8fbe9f55 100644
--- a/app/test-pmd/flowgen.c
+++ b/app/test-pmd/flowgen.c
@@ -37,6 +37,7 @@ 
 #include <rte_flow.h>
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 static uint32_t cfg_ip_src	= RTE_IPV4(10, 254, 0, 0);
 static uint32_t cfg_ip_dst	= RTE_IPV4(10, 253, 0, 0);
@@ -71,12 +72,10 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 	uint16_t vlan_tci, vlan_tci_outer;
 	uint64_t ol_flags = 0;
 	uint16_t nb_rx;
-	uint16_t nb_tx;
 	uint16_t nb_dropped;
 	uint16_t nb_pkt;
 	uint16_t nb_clones = nb_pkt_flowgen_clones;
 	uint16_t i;
-	uint32_t retry;
 	uint64_t tx_offloads;
 	uint64_t start_tsc = 0;
 	int next_flow = RTE_PER_LCORE(_next_flow);
@@ -166,32 +165,13 @@  pkt_burst_flow_gen(struct fwd_stream *fs)
 			next_flow = 0;
 	}
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
-	/*
-	 * Retry if necessary
-	 */
-	if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
-		retry = 0;
-		while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
-			rte_delay_us(burst_tx_delay_time);
-			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_pkt - nb_tx);
-		}
-	}
-	fs->tx_packets += nb_tx;
+	nb_dropped = noisy_eth_tx_burst(fs, nb_pkt, pkts_burst);
 
-	inc_tx_burst_stats(fs, nb_tx);
-	nb_dropped = nb_pkt - nb_tx;
 	if (unlikely(nb_dropped > 0)) {
 		/* Back out the flow counter. */
 		next_flow -= nb_dropped;
 		while (next_flow < 0)
 			next_flow += nb_flows_flowgen;
-
-		fs->fwd_dropped += nb_dropped;
-		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
-		} while (++nb_tx < nb_pkt);
 	}
 
 	RTE_PER_LCORE(_next_flow) = next_flow;
diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
index 066f2a3ab7..b29f53eb41 100644
--- a/app/test-pmd/icmpecho.c
+++ b/app/test-pmd/icmpecho.c
@@ -33,6 +33,7 @@ 
 #include <rte_flow.h>
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 static const char *
 arp_op_name(uint16_t arp_op)
@@ -280,10 +281,8 @@  reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
 	struct rte_ipv4_hdr *ip_h;
 	struct rte_icmp_hdr *icmp_h;
 	struct rte_ether_addr eth_addr;
-	uint32_t retry;
 	uint32_t ip_addr;
 	uint16_t nb_rx;
-	uint16_t nb_tx;
 	uint16_t nb_replies;
 	uint16_t eth_type;
 	uint16_t vlan_id;
@@ -483,30 +482,7 @@  reply_to_icmp_echo_rqsts(struct fwd_stream *fs)
 
 	/* Send back ICMP echo replies, if any. */
 	if (nb_replies > 0) {
-		nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
-					 nb_replies);
-		/*
-		 * Retry if necessary
-		 */
-		if (unlikely(nb_tx < nb_replies) && fs->retry_enabled) {
-			retry = 0;
-			while (nb_tx < nb_replies &&
-					retry++ < burst_tx_retry_num) {
-				rte_delay_us(burst_tx_delay_time);
-				nb_tx += rte_eth_tx_burst(fs->tx_port,
-						fs->tx_queue,
-						&pkts_burst[nb_tx],
-						nb_replies - nb_tx);
-			}
-		}
-		fs->tx_packets += nb_tx;
-		inc_tx_burst_stats(fs, nb_tx);
-		if (unlikely(nb_tx < nb_replies)) {
-			fs->fwd_dropped += (nb_replies - nb_tx);
-			do {
-				rte_pktmbuf_free(pkts_burst[nb_tx]);
-			} while (++nb_tx < nb_replies);
-		}
+		noisy_eth_tx_burst(fs, nb_replies, pkts_burst);
 	}
 
 	get_end_cycles(fs, start_tsc);
diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
index beb220fbb4..527a47d9c4 100644
--- a/app/test-pmd/macfwd.c
+++ b/app/test-pmd/macfwd.c
@@ -35,6 +35,7 @@ 
 #include <rte_flow.h>
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 /*
  * Forwarding of packets in MAC mode.
@@ -48,9 +49,7 @@  pkt_burst_mac_forward(struct fwd_stream *fs)
 	struct rte_port  *txp;
 	struct rte_mbuf  *mb;
 	struct rte_ether_hdr *eth_hdr;
-	uint32_t retry;
 	uint16_t nb_rx;
-	uint16_t nb_tx;
 	uint16_t i;
 	uint64_t ol_flags = 0;
 	uint64_t tx_offloads;
@@ -65,7 +64,7 @@  pkt_burst_mac_forward(struct fwd_stream *fs)
 				 nb_pkt_per_burst);
 	inc_rx_burst_stats(fs, nb_rx);
 	if (unlikely(nb_rx == 0))
-		return;
+		goto flush;
 
 	fs->rx_packets += nb_rx;
 	txp = &ports[fs->tx_port];
@@ -93,27 +92,9 @@  pkt_burst_mac_forward(struct fwd_stream *fs)
 		mb->vlan_tci = txp->tx_vlan_id;
 		mb->vlan_tci_outer = txp->tx_vlan_id_outer;
 	}
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
-	/*
-	 * Retry if necessary
-	 */
-	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
-		retry = 0;
-		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
-			rte_delay_us(burst_tx_delay_time);
-			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
-		}
-	}
 
-	fs->tx_packets += nb_tx;
-	inc_tx_burst_stats(fs, nb_tx);
-	if (unlikely(nb_tx < nb_rx)) {
-		fs->fwd_dropped += (nb_rx - nb_tx);
-		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
-		} while (++nb_tx < nb_rx);
-	}
+flush:
+	noisy_eth_tx_burst(fs, nb_rx, pkts_burst);
 
 	get_end_cycles(fs, start_tsc);
 }
diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
index 4f8deb3382..9188049f92 100644
--- a/app/test-pmd/macswap.c
+++ b/app/test-pmd/macswap.c
@@ -42,6 +42,7 @@ 
 #else
 #include "macswap.h"
 #endif
+#include "noisy_vnf.h"
 
 /*
  * MAC swap forwarding mode: Swap the source and the destination Ethernet
@@ -53,8 +54,6 @@  pkt_burst_mac_swap(struct fwd_stream *fs)
 	struct rte_mbuf  *pkts_burst[MAX_PKT_BURST];
 	struct rte_port  *txp;
 	uint16_t nb_rx;
-	uint16_t nb_tx;
-	uint32_t retry;
 	uint64_t start_tsc = 0;
 
 	get_start_cycles(&start_tsc);
@@ -66,33 +65,16 @@  pkt_burst_mac_swap(struct fwd_stream *fs)
 				 nb_pkt_per_burst);
 	inc_rx_burst_stats(fs, nb_rx);
 	if (unlikely(nb_rx == 0))
-		return;
+		goto flush;
 
 	fs->rx_packets += nb_rx;
 	txp = &ports[fs->tx_port];
 
 	do_macswap(pkts_burst, nb_rx, txp);
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_rx);
-	/*
-	 * Retry if necessary
-	 */
-	if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
-		retry = 0;
-		while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
-			rte_delay_us(burst_tx_delay_time);
-			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_rx - nb_tx);
-		}
-	}
-	fs->tx_packets += nb_tx;
-	inc_tx_burst_stats(fs, nb_tx);
-	if (unlikely(nb_tx < nb_rx)) {
-		fs->fwd_dropped += (nb_rx - nb_tx);
-		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
-		} while (++nb_tx < nb_rx);
-	}
+flush:
+	noisy_eth_tx_burst(fs, nb_rx, pkts_burst);
+
 	get_end_cycles(fs, start_tsc);
 }
 
diff --git a/app/test-pmd/noisy_vnf.c b/app/test-pmd/noisy_vnf.c
index c65ec6f06a..a91f6f39d4 100644
--- a/app/test-pmd/noisy_vnf.c
+++ b/app/test-pmd/noisy_vnf.c
@@ -32,6 +32,10 @@ 
 #include <rte_malloc.h>
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
+
+#define NOISY_STRSIZE 256
+#define NOISY_RING "noisy_ring_%d\n"
 
 struct noisy_config {
 	struct rte_ring *f;
@@ -80,9 +84,6 @@  sim_memory_lookups(struct noisy_config *ncf, uint16_t nb_pkts)
 {
 	uint16_t i, j;
 
-	if (!ncf->do_sim)
-		return;
-
 	for (i = 0; i < nb_pkts; i++) {
 		for (j = 0; j < noisy_lkup_num_writes; j++)
 			do_write(ncf->vnf_mem);
@@ -94,15 +95,28 @@  sim_memory_lookups(struct noisy_config *ncf, uint16_t nb_pkts)
 }
 
 static uint16_t
-do_retry(uint16_t nb_rx, uint16_t nb_tx, struct rte_mbuf **pkts,
+do_retry(uint16_t nb_rx, struct rte_mbuf **pkts,
 	 struct fwd_stream *fs)
 {
 	uint32_t retry = 0;
+	uint16_t nb_tx = 0;
 
-	while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
-		rte_delay_us(burst_tx_delay_time);
+	do {
 		nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
 				&pkts[nb_tx], nb_rx - nb_tx);
+		if (unlikely(nb_tx < nb_rx && fs->retry_enabled))
+			rte_delay_us(burst_tx_delay_time);
+		else
+			break;
+	} while (retry++ < burst_tx_retry_num);
+
+	if (nb_tx < nb_rx && verbose_level > 0 && fs->fwd_dropped == 0) {
+		/* if the dropped packets were queued, nb_tx could be negative */
+		printf("port %d tx_queue %d - drop "
+			   "(nb_pkt:%u - nb_tx:%u)=%u packets\n",
+			   fs->tx_port, fs->tx_queue,
+			   (unsigned int) nb_rx, (unsigned int) nb_tx,
+			   (unsigned int) (nb_rx - nb_tx));
 	}
 
 	return nb_tx;
@@ -127,49 +141,49 @@  drop_pkts(struct rte_mbuf **pkts, uint16_t nb_rx, uint16_t nb_tx)
  * Depending on which commandline parameters are specified we have
  * different cases to handle:
  *
- * 1. No FIFO size was given, so we don't do buffering of incoming
+ * 1. No memory operations were specified on cmdline, send normally.
+ * 2. No FIFO size was given, so we don't do buffering of incoming
  *    packets.  This case is pretty much what iofwd does but in this case
  *    we also do simulation of memory accesses (depending on which
  *    parameters were specified for it).
- * 2. User wants do buffer packets in a FIFO and sent out overflowing
+ * 3. User wants to buffer packets in a FIFO and send out overflowing
  *    packets.
- * 3. User wants a FIFO and specifies a time in ms to flush all packets
+ * 4. User wants a FIFO and specifies a time in ms to flush all packets
  *    out of the FIFO
- * 4. Cases 2 and 3 combined
+ * 5. Cases 3 and 4 combined
  */
-static void
-pkt_burst_noisy_vnf(struct fwd_stream *fs)
+uint16_t
+noisy_eth_tx_burst(struct fwd_stream *fs, uint16_t nb_rx, struct rte_mbuf **pkts_burst)
 {
 	const uint64_t freq_khz = rte_get_timer_hz() / 1000;
 	struct noisy_config *ncf = noisy_cfg[fs->rx_port];
-	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
 	struct rte_mbuf *tmp_pkts[MAX_PKT_BURST];
 	uint16_t nb_deqd = 0;
-	uint16_t nb_rx = 0;
 	uint16_t nb_tx = 0;
+	uint16_t total_dropped = 0;
+	uint16_t dropped = 0;
 	uint16_t nb_enqd;
 	unsigned int fifo_free;
 	uint64_t delta_ms;
 	bool needs_flush = false;
 	uint64_t now;
 
-	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
-			pkts_burst, nb_pkt_per_burst);
-	inc_rx_burst_stats(fs, nb_rx);
-	if (unlikely(nb_rx == 0))
-		goto flush;
-	fs->rx_packets += nb_rx;
+	if (unlikely(nb_rx == 0)) {
+		if (ncf->do_buffering)
+			goto flush;
+		else
+			return 0;
+	}
 
 	if (!ncf->do_buffering) {
-		sim_memory_lookups(ncf, nb_rx);
-		nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-				pkts_burst, nb_rx);
-		if (unlikely(nb_tx < nb_rx) && fs->retry_enabled)
-			nb_tx += do_retry(nb_rx, nb_tx, pkts_burst, fs);
+		if (ncf->do_sim)
+			sim_memory_lookups(ncf, nb_rx);
+		nb_tx = do_retry(nb_rx, pkts_burst, fs);
 		inc_tx_burst_stats(fs, nb_tx);
 		fs->tx_packets += nb_tx;
-		fs->fwd_dropped += drop_pkts(pkts_burst, nb_rx, nb_tx);
-		return;
+		dropped = drop_pkts(pkts_burst, nb_rx, nb_tx);
+		fs->fwd_dropped += dropped;
+		return dropped;
 	}
 
 	fifo_free = rte_ring_free_count(ncf->f);
@@ -185,17 +199,17 @@  pkt_burst_noisy_vnf(struct fwd_stream *fs)
 		nb_enqd = rte_ring_enqueue_burst(ncf->f,
 				(void **) pkts_burst, nb_deqd, NULL);
 		if (nb_deqd > 0) {
-			nb_tx = rte_eth_tx_burst(fs->tx_port,
-					fs->tx_queue, tmp_pkts,
-					nb_deqd);
-			if (unlikely(nb_tx < nb_rx) && fs->retry_enabled)
-				nb_tx += do_retry(nb_rx, nb_tx, tmp_pkts, fs);
+			nb_tx = do_retry(nb_deqd, tmp_pkts, fs);
+			fs->tx_packets += nb_tx;
 			inc_tx_burst_stats(fs, nb_tx);
-			fs->fwd_dropped += drop_pkts(tmp_pkts, nb_deqd, nb_tx);
+			dropped = drop_pkts(tmp_pkts, nb_deqd, nb_tx);
+			fs->fwd_dropped += dropped;
+			total_dropped += dropped;
 		}
 	}
 
-	sim_memory_lookups(ncf, nb_enqd);
+	if (ncf->do_sim)
+		sim_memory_lookups(ncf, nb_enqd);
 
 flush:
 	if (ncf->do_flush) {
@@ -208,23 +222,21 @@  pkt_burst_noisy_vnf(struct fwd_stream *fs)
 				noisy_tx_sw_buf_flush_time > 0 && !nb_tx;
 	}
 	while (needs_flush && !rte_ring_empty(ncf->f)) {
-		unsigned int sent;
 		nb_deqd = rte_ring_dequeue_burst(ncf->f, (void **)tmp_pkts,
 				MAX_PKT_BURST, NULL);
-		sent = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					 tmp_pkts, nb_deqd);
-		if (unlikely(sent < nb_deqd) && fs->retry_enabled)
-			nb_tx += do_retry(nb_rx, nb_tx, tmp_pkts, fs);
+		nb_tx = do_retry(nb_deqd, tmp_pkts, fs);
+		fs->tx_packets += nb_tx;
 		inc_tx_burst_stats(fs, nb_tx);
-		fs->fwd_dropped += drop_pkts(tmp_pkts, nb_deqd, sent);
+		dropped = drop_pkts(tmp_pkts, nb_deqd, nb_tx);
+		fs->fwd_dropped += dropped;
+		total_dropped += dropped;
 		ncf->prev_time = rte_get_timer_cycles();
 	}
-}
 
-#define NOISY_STRSIZE 256
-#define NOISY_RING "noisy_ring_%d\n"
+	return total_dropped;
+}
 
-static void
+void
 noisy_fwd_end(portid_t pi)
 {
 	rte_ring_free(noisy_cfg[pi]->f);
@@ -232,7 +244,7 @@  noisy_fwd_end(portid_t pi)
 	rte_free(noisy_cfg[pi]);
 }
 
-static int
+int
 noisy_fwd_begin(portid_t pi)
 {
 	struct noisy_config *n;
@@ -290,10 +302,22 @@  stream_init_noisy_vnf(struct fwd_stream *fs)
 	fs->disabled = rx_stopped || tx_stopped;
 }
 
+static void
+pkt_burst_noisy_vnf(struct fwd_stream *fs)
+{
+	uint16_t nb_rx = 0;
+	struct rte_mbuf *pkts_burst[MAX_PKT_BURST];
+	nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
+			pkts_burst, nb_pkt_per_burst);
+	inc_rx_burst_stats(fs, nb_rx);
+	fs->rx_packets += nb_rx;
+	noisy_eth_tx_burst(fs, nb_rx, pkts_burst);
+}
+
 struct fwd_engine noisy_vnf_engine = {
 	.fwd_mode_name  = "noisy",
-	.port_fwd_begin = noisy_fwd_begin,
-	.port_fwd_end   = noisy_fwd_end,
+	.port_fwd_begin = NULL,
+	.port_fwd_end   = NULL,
 	.stream_init    = stream_init_noisy_vnf,
 	.packet_fwd     = pkt_burst_noisy_vnf,
 };
diff --git a/app/test-pmd/noisy_vnf.h b/app/test-pmd/noisy_vnf.h
new file mode 100644
index 0000000000..023c7e21af
--- /dev/null
+++ b/app/test-pmd/noisy_vnf.h
@@ -0,0 +1,48 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2018 Red Hat Corp.
+ */
+
+#ifndef _NOISY_VNF_H_
+#define _NOISY_VNF_H_
+
+#include <stdarg.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <errno.h>
+#include <stdint.h>
+#include <unistd.h>
+#include <inttypes.h>
+
+#include <sys/queue.h>
+#include <sys/stat.h>
+
+#include <rte_common.h>
+#include <rte_log.h>
+#include <rte_debug.h>
+#include <rte_cycles.h>
+#include <rte_memory.h>
+#include <rte_launch.h>
+#include <rte_eal.h>
+#include <rte_per_lcore.h>
+#include <rte_lcore.h>
+#include <rte_memcpy.h>
+#include <rte_mempool.h>
+#include <rte_mbuf.h>
+#include <rte_ethdev.h>
+#include <rte_flow.h>
+#include <rte_malloc.h>
+
+#include "testpmd.h"
+
+void
+noisy_fwd_end(portid_t pi);
+
+int
+noisy_fwd_begin(portid_t pi);
+
+uint16_t
+noisy_eth_tx_burst(struct fwd_stream *fs, uint16_t nb_rx, struct rte_mbuf **pkts_burst);
+
+#endif
diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 134d79a555..a0b3a55e2a 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -74,6 +74,7 @@ 
 #endif
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 #ifndef MAP_HUGETLB
 /* FreeBSD may not have MAP_HUGETLB (in fact, it probably doesn't) */
@@ -2382,6 +2383,9 @@  start_packet_forwarding(int with_tx_first)
 		for (i = 0; i < cur_fwd_config.nb_fwd_streams; i++)
 			stream_init(fwd_streams[i]);
 
+	for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++)
+		noisy_fwd_begin(fwd_ports_ids[i]);
+
 	port_fwd_begin = cur_fwd_config.fwd_eng->port_fwd_begin;
 	if (port_fwd_begin != NULL) {
 		for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++) {
@@ -2446,6 +2450,8 @@  stop_packet_forwarding(void)
 		fwd_lcores[lc_id]->stopped = 1;
 	printf("\nWaiting for lcores to finish...\n");
 	rte_eal_mp_wait_lcore();
+	for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++)
+		noisy_fwd_end(fwd_ports_ids[i]);
 	port_fwd_end = cur_fwd_config.fwd_eng->port_fwd_end;
 	if (port_fwd_end != NULL) {
 		for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++) {
diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
index 021624952d..41daeb2841 100644
--- a/app/test-pmd/txonly.c
+++ b/app/test-pmd/txonly.c
@@ -37,6 +37,7 @@ 
 #include <rte_flow.h>
 
 #include "testpmd.h"
+#include "noisy_vnf.h"
 
 struct tx_timestamp {
 	rte_be32_t signature;
@@ -331,10 +332,9 @@  pkt_burst_transmit(struct fwd_stream *fs)
 	struct rte_mbuf *pkt;
 	struct rte_mempool *mbp;
 	struct rte_ether_hdr eth_hdr;
-	uint16_t nb_tx;
 	uint16_t nb_pkt;
+	uint16_t dropped;
 	uint16_t vlan_tci, vlan_tci_outer;
-	uint32_t retry;
 	uint64_t ol_flags = 0;
 	uint64_t tx_offloads;
 	uint64_t start_tsc = 0;
@@ -391,40 +391,14 @@  pkt_burst_transmit(struct fwd_stream *fs)
 		}
 	}
 
-	if (nb_pkt == 0)
-		return;
 
-	nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst, nb_pkt);
+	dropped = noisy_eth_tx_burst(fs, nb_pkt, pkts_burst);
 
-	/*
-	 * Retry if necessary
-	 */
-	if (unlikely(nb_tx < nb_pkt) && fs->retry_enabled) {
-		retry = 0;
-		while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
-			rte_delay_us(burst_tx_delay_time);
-			nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
-					&pkts_burst[nb_tx], nb_pkt - nb_tx);
-		}
-	}
-	fs->tx_packets += nb_tx;
+	if (nb_pkt == 0)
+		return;
 
 	if (txonly_multi_flow)
-		RTE_PER_LCORE(_ip_var) -= nb_pkt - nb_tx;
-
-	inc_tx_burst_stats(fs, nb_tx);
-	if (unlikely(nb_tx < nb_pkt)) {
-		if (verbose_level > 0 && fs->fwd_dropped == 0)
-			printf("port %d tx_queue %d - drop "
-			       "(nb_pkt:%u - nb_tx:%u)=%u packets\n",
-			       fs->tx_port, fs->tx_queue,
-			       (unsigned) nb_pkt, (unsigned) nb_tx,
-			       (unsigned) (nb_pkt - nb_tx));
-		fs->fwd_dropped += (nb_pkt - nb_tx);
-		do {
-			rte_pktmbuf_free(pkts_burst[nb_tx]);
-		} while (++nb_tx < nb_pkt);
-	}
+		RTE_PER_LCORE(_ip_var) -= dropped;
 
 	get_end_cycles(fs, start_tsc);
 }
diff --git a/doc/guides/testpmd_app_ug/run_app.rst b/doc/guides/testpmd_app_ug/run_app.rst
index 074f910fc9..e9fe6dc86a 100644
--- a/doc/guides/testpmd_app_ug/run_app.rst
+++ b/doc/guides/testpmd_app_ug/run_app.rst
@@ -484,34 +484,39 @@  The command line options are:
 *   ``--noisy-tx-sw-buffer-size``
 
     Set the number of maximum elements  of the FIFO queue to be created
-    for buffering packets. Only available with the noisy forwarding mode.
-    The default value is 0.
+    for buffering packets. Available with all forwarding modes except for
+    io and ieee1588. The default value is 0.
 
 *   ``--noisy-tx-sw-buffer-flushtime=N``
 
     Set the time before packets in the FIFO queue is flushed.
-    Only available with the noisy forwarding mode. The default value is 0.
+    Available with all forwarding modes except for io and ieee1588.
+    The default value is 0.
 
 *   ``--noisy-lkup-memory=N``
 
     Set the size of the noisy neighbor simulation memory buffer in MB to N.
-    Only available with the noisy forwarding mode. The default value is 0.
+    Available with all forwarding modes except for io and ieee1588.
+    The default value is 0.
 
 
 *   ``--noisy-lkup-num-reads=N``
 
     Set the number of reads to be done in noisy neighbor simulation memory buffer to N.
-    Only available with the noisy forwarding mode. The default value is 0.
+    Available with all forwarding modes except for io and ieee1588.
+    The default value is 0.
 
 *   ``--noisy-lkup-num-writes=N``
 
     Set the number of writes to be done in noisy neighbor simulation memory buffer to N.
-    Only available with the noisy forwarding mode. The default value is 0.
+    Available with all forwarding modes except for io and ieee1588.
+    The default value is 0.
 
 *   ``--noisy-lkup-num-reads-writes=N``
 
     Set the number of r/w accesses to be done in noisy neighbor simulation memory buffer to N.
-    Only available with the noisy forwarding mode. The default value is 0.
+    Available with all forwarding modes except for io and ieee1588.
+    The default value is 0.
 
 *   ``--no-iova-contig``