diff mbox series

[v6,2/5] app/flow-perf: add insertion rate calculation

Message ID 20200511110912.11535-3-wisamm@mellanox.com (mailing list archive)
State Rejected, archived
Headers show
Series [v6,1/5] app/flow-perf: add flow performance skeleton | expand

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Wisam Jaddo May 11, 2020, 11:09 a.m. UTC
Add insertion rate calculation feature into flow
performance application.

The application now provide the ability to test
insertion rate of specific rte_flow rule, by
stressing it to the NIC, and calculate the
insertion rate.

The application offers some options in the command
line, to configure which rule to apply.

After that the application will start producing
rules with same pattern but increasing the outer IP
source address by 1 each time, thus it will give
different flow each time, and all other items will
have open masks.

The current design have single core insertion rate.
In the future we may have a multi core insertion
rate measurement support in the app.

Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
---
 app/test-flow-perf/Makefile            |   3 +
 app/test-flow-perf/actions_gen.c       | 164 +++++++++
 app/test-flow-perf/actions_gen.h       |  29 ++
 app/test-flow-perf/config.h            |  16 +
 app/test-flow-perf/flow_gen.c          | 145 ++++++++
 app/test-flow-perf/flow_gen.h          |  37 ++
 app/test-flow-perf/items_gen.c         | 277 +++++++++++++++
 app/test-flow-perf/items_gen.h         |  31 ++
 app/test-flow-perf/main.c              | 472 ++++++++++++++++++++++++-
 app/test-flow-perf/meson.build         |   3 +
 doc/guides/rel_notes/release_20_05.rst |   3 +
 doc/guides/tools/flow-perf.rst         | 195 +++++++++-
 12 files changed, 1368 insertions(+), 7 deletions(-)
 create mode 100644 app/test-flow-perf/actions_gen.c
 create mode 100644 app/test-flow-perf/actions_gen.h
 create mode 100644 app/test-flow-perf/flow_gen.c
 create mode 100644 app/test-flow-perf/flow_gen.h
 create mode 100644 app/test-flow-perf/items_gen.c
 create mode 100644 app/test-flow-perf/items_gen.h

Comments

Andrew Rybchenko May 11, 2020, 12:05 p.m. UTC | #1
On 5/11/20 2:09 PM, Wisam Jaddo wrote:
> Add insertion rate calculation feature into flow
> performance application.
> 
> The application now provide the ability to test
> insertion rate of specific rte_flow rule, by
> stressing it to the NIC, and calculate the
> insertion rate.
> 
> The application offers some options in the command
> line, to configure which rule to apply.
> 
> After that the application will start producing
> rules with same pattern but increasing the outer IP
> source address by 1 each time, thus it will give
> different flow each time, and all other items will
> have open masks.
> 
> The current design have single core insertion rate.
> In the future we may have a multi core insertion
> rate measurement support in the app.
> 
> Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
> ---
>  app/test-flow-perf/Makefile            |   3 +
>  app/test-flow-perf/actions_gen.c       | 164 +++++++++
>  app/test-flow-perf/actions_gen.h       |  29 ++
>  app/test-flow-perf/config.h            |  16 +
>  app/test-flow-perf/flow_gen.c          | 145 ++++++++
>  app/test-flow-perf/flow_gen.h          |  37 ++
>  app/test-flow-perf/items_gen.c         | 277 +++++++++++++++
>  app/test-flow-perf/items_gen.h         |  31 ++
>  app/test-flow-perf/main.c              | 472 ++++++++++++++++++++++++-
>  app/test-flow-perf/meson.build         |   3 +
>  doc/guides/rel_notes/release_20_05.rst |   3 +
>  doc/guides/tools/flow-perf.rst         | 195 +++++++++-
>  12 files changed, 1368 insertions(+), 7 deletions(-)
>  create mode 100644 app/test-flow-perf/actions_gen.c
>  create mode 100644 app/test-flow-perf/actions_gen.h
>  create mode 100644 app/test-flow-perf/flow_gen.c
>  create mode 100644 app/test-flow-perf/flow_gen.h
>  create mode 100644 app/test-flow-perf/items_gen.c
>  create mode 100644 app/test-flow-perf/items_gen.h
> 
> diff --git a/app/test-flow-perf/Makefile b/app/test-flow-perf/Makefile
> index db043c17a..4f2db7591 100644
> --- a/app/test-flow-perf/Makefile
> +++ b/app/test-flow-perf/Makefile
> @@ -16,6 +16,9 @@ CFLAGS += $(WERROR_FLAGS)
>  #
>  # all source are stored in SRCS-y
>  #
> +SRCS-y += actions_gen.c
> +SRCS-y += flow_gen.c
> +SRCS-y += items_gen.c
>  SRCS-y += main.c
>  
>  include $(RTE_SDK)/mk/rte.app.mk
> diff --git a/app/test-flow-perf/actions_gen.c b/app/test-flow-perf/actions_gen.c
> new file mode 100644
> index 000000000..16bb3cf20
> --- /dev/null
> +++ b/app/test-flow-perf/actions_gen.c
> @@ -0,0 +1,164 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + *
> + * The file contains the implementations of actions generators.
> + * Each generator is responsible for preparing it's action instance
> + * and initializing it with needed data.
> + */
> +
> +#include <sys/types.h>
> +#include <rte_malloc.h>
> +#include <rte_flow.h>
> +#include <rte_ethdev.h>
> +
> +#include "actions_gen.h"
> +#include "config.h"
> +
> +/* Storage for struct rte_flow_action_rss including external data. */
> +struct action_rss_data {
> +	struct rte_flow_action_rss conf;
> +	uint8_t key[40];
> +	uint16_t queue[128];
> +};
> +
> +void
> +add_mark(struct rte_flow_action *actions,
> +	uint8_t actions_counter)
> +{
> +	static struct rte_flow_action_mark mark_action;

Function-local static variables a bit better than file-local
or global variable, but just a bit. See below.
At bare minimum it requires a check that the action is not
in use already. Same in many cases below.

> +
> +	do {
> +		mark_action.id = MARK_ID;
> +	} while (0);

Why do you use dummy do-while loop here? Many similar cases
below.

> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_MARK;
> +	actions[actions_counter].conf = &mark_action;
> +}
> +
> +void
> +add_queue(struct rte_flow_action *actions,
> +	uint8_t actions_counter, uint16_t queue)
> +{
> +	static struct rte_flow_action_queue queue_action;

It does not allow to use the action twice to deliver to
to queues.

> +
> +	do {
> +		queue_action.index = queue;
> +	} while (0);
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_QUEUE;
> +	actions[actions_counter].conf = &queue_action;
> +}
> +
> +void
> +add_jump(struct rte_flow_action *actions,
> +	uint8_t actions_counter, uint16_t next_table)
> +{
> +	static struct rte_flow_action_jump jump_action;
> +
> +	do {
> +		jump_action.group = next_table;
> +	} while (0);
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_JUMP;
> +	actions[actions_counter].conf = &jump_action;
> +}
> +
> +void
> +add_rss(struct rte_flow_action *actions,
> +	uint8_t actions_counter, uint16_t *queues,
> +	uint16_t queues_number)
> +{
> +	static struct rte_flow_action_rss *rss_action;
> +	static struct action_rss_data *rss_data;

It is better to add an empty line here to split static and
non-static variable and make it easy to catch the difference.

> +	uint16_t queue;
> +
> +	rss_data = rte_malloc("rss_data",
> +		sizeof(struct action_rss_data), 0);

Does it mean that the second invocation will make
a memory leak?

> +
> +	if (rss_data == NULL)
> +		rte_exit(EXIT_FAILURE, "No Memory available!");
> +
> +	*rss_data = (struct action_rss_data){
> +		.conf = (struct rte_flow_action_rss){
> +			.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
> +			.level = 0,
> +			.types = GET_RSS_HF(),
> +			.key_len = sizeof(rss_data->key),
> +			.queue_num = queues_number,
> +			.key = rss_data->key,
> +			.queue = rss_data->queue,
> +		},
> +		.key = { 1 },
> +		.queue = { 0 },
> +	};
> +
> +	for (queue = 0; queue < queues_number; queue++)
> +		rss_data->queue[queue] = queues[queue];
> +
> +	rss_action = &rss_data->conf;
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_RSS;
> +	actions[actions_counter++].conf = rss_action;
> +}
> +
> +void
> +add_set_meta(struct rte_flow_action *actions,
> +	uint8_t actions_counter)
> +{
> +	static struct rte_flow_action_set_meta meta_action;
> +
> +	do {
> +		meta_action.data = RTE_BE32(META_DATA);
> +		meta_action.mask = RTE_BE32(0xffffffff);
> +	} while (0);
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_SET_META;
> +	actions[actions_counter++].conf = &meta_action;
> +}
> +
> +void
> +add_set_tag(struct rte_flow_action *actions,
> +	uint8_t actions_counter)
> +{
> +	static struct rte_flow_action_set_tag tag_action;
> +
> +	do {
> +		tag_action.data = RTE_BE32(META_DATA);
> +		tag_action.mask = RTE_BE32(0xffffffff);
> +		tag_action.index = TAG_INDEX;
> +	} while (0);
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_SET_TAG;
> +	actions[actions_counter++].conf = &tag_action;
> +}
> +
> +void
> +add_port_id(struct rte_flow_action *actions,
> +	uint8_t actions_counter)
> +{
> +	static struct rte_flow_action_port_id port_id;
> +
> +	do {
> +		port_id.id = PORT_ID_DST;
> +	} while (0);
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_PORT_ID;
> +	actions[actions_counter++].conf = &port_id;
> +}
> +
> +void
> +add_drop(struct rte_flow_action *actions,
> +	uint8_t actions_counter)
> +{
> +	actions[actions_counter++].type = RTE_FLOW_ACTION_TYPE_DROP;
> +}
> +
> +void
> +add_count(struct rte_flow_action *actions,
> +	uint8_t actions_counter)
> +{
> +	static struct rte_flow_action_count count_action;

Again it means it is impossible to use the action twice in one
rule.

> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_COUNT;
> +	actions[actions_counter++].conf = &count_action;
> +}
> diff --git a/app/test-flow-perf/actions_gen.h b/app/test-flow-perf/actions_gen.h
> new file mode 100644
> index 000000000..bc7d084f3
> --- /dev/null
> +++ b/app/test-flow-perf/actions_gen.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + *
> + * This file contains the functions definitions to
> + * generate each supported action.
> + */
> +
> +#ifndef FLOW_PERF_ACTION_GEN
> +#define FLOW_PERF_ACTION_GEN
> +
> +#include <rte_flow.h>
> +
> +#include "config.h"
> +
> +void add_mark(struct rte_flow_action *actions, uint8_t actions_counter);
> +void add_queue(struct rte_flow_action *actions,
> +	uint8_t actions_counter, uint16_t queue);
> +void add_jump(struct rte_flow_action *actions,
> +	uint8_t actions_counter, uint16_t next_table);
> +void add_rss(struct rte_flow_action *actions,
> +	uint8_t actions_counter, uint16_t *queues,
> +	uint16_t queues_number);
> +void add_set_meta(struct rte_flow_action *actions, uint8_t actions_counter);
> +void add_set_tag(struct rte_flow_action *actions, uint8_t actions_counter);
> +void add_port_id(struct rte_flow_action *actions, uint8_t actions_counter);
> +void add_drop(struct rte_flow_action *actions, uint8_t actions_counter);
> +void add_count(struct rte_flow_action *actions, uint8_t actions_counter);
> +
> +#endif /* FLOW_PERF_ACTION_GEN */
> diff --git a/app/test-flow-perf/config.h b/app/test-flow-perf/config.h
> index cf41e0345..f16d0de77 100644
> --- a/app/test-flow-perf/config.h
> +++ b/app/test-flow-perf/config.h
> @@ -2,6 +2,7 @@
>   * Copyright 2020 Mellanox Technologies, Ltd
>   */
>  
> +#define FLOW_ITEM_MASK(_x) (UINT64_C(1) << _x)
>  #define GET_RSS_HF() (ETH_RSS_IP | ETH_RSS_TCP)
>  
>  /* Configuration */
> @@ -12,3 +13,18 @@
>  #define MBUF_CACHE_SIZE 512
>  #define NR_RXD  256
>  #define NR_TXD  256
> +
> +/* Items/Actions parameters */
> +#define JUMP_ACTION_TABLE 2
> +#define VLAN_VALUE 1
> +#define VNI_VALUE 1
> +#define GRE_PROTO  0x6558

lib/librte_net/rte_ether.h:303:#define RTE_ETHER_TYPE_TEB  0x6558 /**<
Transparent Ethernet Bridging. */

> +#define META_DATA 1
> +#define TAG_INDEX 0
> +#define PORT_ID_DST 1
> +#define MARK_ID 1
> +#define TEID_VALUE 1
> +
> +/* Flow items/acctions max size */
> +#define MAX_ITEMS_NUM 32
> +#define MAX_ACTIONS_NUM 32
> diff --git a/app/test-flow-perf/flow_gen.c b/app/test-flow-perf/flow_gen.c
> new file mode 100644
> index 000000000..50066d99e
> --- /dev/null
> +++ b/app/test-flow-perf/flow_gen.c
> @@ -0,0 +1,145 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + *
> + * The file contains the implementations of the method to
> + * fill items, actions & attributes in their corresponding
> + * arrays, and then generate rte_flow rule.
> + *
> + * After the generation. The rule goes to validation then
> + * creation state and then return the results.
> + */
> +
> +#include <stdint.h>
> +
> +#include "flow_gen.h"
> +#include "items_gen.h"
> +#include "actions_gen.h"
> +#include "config.h"
> +
> +static void
> +fill_attributes(struct rte_flow_attr *attr,
> +	uint32_t flow_attrs, uint16_t group)
> +{
> +	if (flow_attrs & INGRESS)
> +		attr->ingress = 1;
> +	if (flow_attrs & EGRESS)
> +		attr->egress = 1;
> +	if (flow_attrs & TRANSFER)
> +		attr->transfer = 1;
> +	attr->group = group;
> +}
> +
> +static void
> +fill_items(struct rte_flow_item *items,
> +	uint32_t flow_items, uint32_t outer_ip_src)

It looks like it is better to have the function inside
items_gen.c. It would allow to make all add_<item> functions
local to items_gen.c.

> +{
> +	uint8_t items_counter = 0;
> +
> +	/* Support outer items up to tunnel layer only. */
> +
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META))
> +		add_meta_data(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG))
> +		add_meta_tag(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH))
> +		add_ether(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN))
> +		add_vlan(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4))
> +		add_ipv4(items, items_counter++, outer_ip_src);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6))
> +		add_ipv6(items, items_counter++, outer_ip_src);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP))
> +		add_tcp(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP))
> +		add_udp(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN))
> +		add_vxlan(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE))
> +		add_vxlan_gpe(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE))
> +		add_gre(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE))
> +		add_geneve(items, items_counter++);
> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP))
> +		add_gtp(items, items_counter++);

It could be done in a loop: define an array of structures
FLOW_ITEM_MASK(proto) values and add function which should be
called. The only exception is IPv4/IPv6 which requires extra argument -
so all add callbacks should have add_data argument
which is a structure with possible tunings.

> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_END;
> +}
> +
> +static void
> +fill_actions(struct rte_flow_action *actions,
> +	uint32_t flow_actions, uint32_t counter, uint16_t next_table,
> +	uint16_t hairpinq)


It looks like it is better to have the function inside
actions_gen.c. It would allow to make all add_<action>
functions local to actions_gen.c.

> +{
> +	uint8_t actions_counter = 0;
> +	uint16_t hairpin_queues[hairpinq];
> +	uint16_t queues[RXQ_NUM];
> +	uint16_t i;
> +
> +	/* None-fate actions */
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK))
> +		add_mark(actions, actions_counter++);
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT))
> +		add_count(actions, actions_counter++);
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META))
> +		add_set_meta(actions, actions_counter++);
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG))
> +		add_set_tag(actions, actions_counter++);
> +
> +	/* Fate actions */
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE))
> +		add_queue(actions, actions_counter++, counter % RXQ_NUM);
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS)) {
> +		for (i = 0; i < RXQ_NUM; i++)
> +			queues[i] = i;
> +		add_rss(actions, actions_counter++, queues, RXQ_NUM);
> +	}
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP))
> +		add_jump(actions, actions_counter++, next_table);
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID))
> +		add_port_id(actions, actions_counter++);
> +	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP))
> +		add_drop(actions, actions_counter++);
> +	if (flow_actions & HAIRPIN_QUEUE_ACTION)
> +		add_queue(actions, actions_counter++,
> +			(counter % hairpinq) + RXQ_NUM);
> +	if (flow_actions & HAIRPIN_RSS_ACTION) {
> +		for (i = 0; i < hairpinq; i++)
> +			hairpin_queues[i] = i + RXQ_NUM;
> +		add_rss(actions, actions_counter++, hairpin_queues, hairpinq);
> +	}
> +
> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_END;
> +}
> +
> +struct rte_flow *
> +generate_flow(uint16_t port_id,
> +	uint16_t group,
> +	uint32_t flow_attrs,
> +	uint32_t flow_items,
> +	uint32_t flow_actions,
> +	uint16_t next_table,
> +	uint32_t outer_ip_src,
> +	uint16_t hairpinq,
> +	struct rte_flow_error *error)
> +{
> +	struct rte_flow_attr attr;
> +	struct rte_flow_item items[MAX_ITEMS_NUM];
> +	struct rte_flow_action actions[MAX_ACTIONS_NUM];
> +	struct rte_flow *flow = NULL;
> +
> +	memset(items, 0, sizeof(items));
> +	memset(actions, 0, sizeof(actions));
> +	memset(&attr, 0, sizeof(struct rte_flow_attr));
> +
> +	fill_attributes(&attr, flow_attrs, group);
> +
> +	fill_actions(actions, flow_actions,
> +		outer_ip_src, next_table, hairpinq);
> +
> +	fill_items(items, flow_items, outer_ip_src);
> +
> +	flow = rte_flow_create(port_id, &attr, items, actions, error);
> +	return flow;
> +}
> diff --git a/app/test-flow-perf/flow_gen.h b/app/test-flow-perf/flow_gen.h
> new file mode 100644
> index 000000000..6b30a4ae2
> --- /dev/null
> +++ b/app/test-flow-perf/flow_gen.h
> @@ -0,0 +1,37 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + *
> + * This file contains the items, actions and attributes
> + * definition. And the methods to prepare and fill items,
> + * actions and attributes to generate rte_flow rule.
> + */
> +
> +#ifndef FLOW_PERF_FLOW_GEN
> +#define FLOW_PERF_FLOW_GEN
> +
> +#include <stdint.h>
> +#include <rte_flow.h>
> +
> +#include "config.h"
> +
> +/* Actions */
> +#define HAIRPIN_QUEUE_ACTION FLOW_ITEM_MASK(0)
> +#define HAIRPIN_RSS_ACTION   FLOW_ITEM_MASK(1)

It should be FLOW_ACTION_MASK() and it should use
action defines as a shift similar to items.

> +
> +/* Attributes */
> +#define INGRESS              FLOW_ITEM_MASK(0)
> +#define EGRESS               FLOW_ITEM_MASK(1)
> +#define TRANSFER             FLOW_ITEM_MASK(2)

It should not be FLOW_ITEM_MASK, since it is
associated with flow items. If you really need it,
it should be FLOW_ATTR_MASK().

> +
> +struct rte_flow *
> +generate_flow(uint16_t port_id,
> +	uint16_t group,
> +	uint32_t flow_attrs,
> +	uint32_t flow_items,
> +	uint32_t flow_actions,

At I understand 3 above are masks and it is better to make it uint64_t
from the very beginning.

> +	uint16_t next_table,
> +	uint32_t outer_ip_src,
> +	uint16_t hairpinq,
> +	struct rte_flow_error *error);
> +
> +#endif /* FLOW_PERF_FLOW_GEN */
> diff --git a/app/test-flow-perf/items_gen.c b/app/test-flow-perf/items_gen.c
> new file mode 100644
> index 000000000..c84f45040
> --- /dev/null
> +++ b/app/test-flow-perf/items_gen.c
> @@ -0,0 +1,277 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + *
> + * This file contain the implementations of the items
> + * related methods. Each Item have a method to prepare
> + * the item and add it into items array in given index.
> + */
> +
> +#include <stdint.h>
> +#include <rte_flow.h>
> +
> +#include "items_gen.h"
> +#include "config.h"
> +
> +void
> +add_ether(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_eth eth_spec;
> +	static struct rte_flow_item_eth eth_mask;

Same as actions, it does not allow to have two Eth items
in one rule. However, it looks like current design does not
cover it already on mask level.

> +
> +	memset(&eth_spec, 0, sizeof(struct rte_flow_item_eth));
> +	memset(&eth_mask, 0, sizeof(struct rte_flow_item_eth));
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_ETH;
> +	items[items_counter].spec = &eth_spec;
> +	items[items_counter].mask = &eth_mask;
> +}
> +
> +void
> +add_vlan(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_vlan vlan_spec;
> +	static struct rte_flow_item_vlan vlan_mask;

Split static and local variables by empty line, please.

> +	uint16_t vlan_value = VLAN_VALUE;
> +
> +	memset(&vlan_spec, 0, sizeof(struct rte_flow_item_vlan));
> +	memset(&vlan_mask, 0, sizeof(struct rte_flow_item_vlan));
> +
> +	vlan_spec.tci = RTE_BE16(vlan_value);
> +	vlan_mask.tci = RTE_BE16(0xffff);
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_VLAN;
> +	items[items_counter].spec = &vlan_spec;
> +	items[items_counter].mask = &vlan_mask;
> +}
> +
> +void
> +add_ipv4(struct rte_flow_item *items,
> +	uint8_t items_counter, rte_be32_t src_ipv4)
> +{
> +	static struct rte_flow_item_ipv4 ipv4_spec;
> +	static struct rte_flow_item_ipv4 ipv4_mask;
> +
> +	memset(&ipv4_spec, 0, sizeof(struct rte_flow_item_ipv4));
> +	memset(&ipv4_mask, 0, sizeof(struct rte_flow_item_ipv4));
> +
> +	ipv4_spec.hdr.src_addr = src_ipv4;
> +	ipv4_mask.hdr.src_addr = RTE_BE32(0xffffffff);
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_IPV4;
> +	items[items_counter].spec = &ipv4_spec;
> +	items[items_counter].mask = &ipv4_mask;
> +}
> +
> +
> +void
> +add_ipv6(struct rte_flow_item *items,
> +	uint8_t items_counter, rte_be32_t src_ipv6)
> +{
> +	static struct rte_flow_item_ipv6 ipv6_spec;
> +	static struct rte_flow_item_ipv6 ipv6_mask;
> +
> +	memset(&ipv6_spec, 0, sizeof(struct rte_flow_item_ipv6));
> +	memset(&ipv6_mask, 0, sizeof(struct rte_flow_item_ipv6));
> +
> +	/** Set ipv6 src **/
> +	memset(&ipv6_spec.hdr.src_addr, src_ipv6,
> +		sizeof(ipv6_spec.hdr.src_addr) / 2);
> +
> +	/** Full mask **/
> +	memset(&ipv6_mask.hdr.src_addr, 0xff,
> +		sizeof(ipv6_spec.hdr.src_addr));
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_IPV6;
> +	items[items_counter].spec = &ipv6_spec;
> +	items[items_counter].mask = &ipv6_mask;
> +}
> +
> +void
> +add_tcp(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_tcp tcp_spec;
> +	static struct rte_flow_item_tcp tcp_mask;
> +
> +	memset(&tcp_spec, 0, sizeof(struct rte_flow_item_tcp));
> +	memset(&tcp_mask, 0, sizeof(struct rte_flow_item_tcp));
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_TCP;
> +	items[items_counter].spec = &tcp_spec;
> +	items[items_counter].mask = &tcp_mask;
> +}
> +
> +void
> +add_udp(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_udp udp_spec;
> +	static struct rte_flow_item_udp udp_mask;
> +
> +	memset(&udp_spec, 0, sizeof(struct rte_flow_item_udp));
> +	memset(&udp_mask, 0, sizeof(struct rte_flow_item_udp));
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_UDP;
> +	items[items_counter].spec = &udp_spec;
> +	items[items_counter].mask = &udp_mask;
> +}
> +
> +void
> +add_vxlan(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_vxlan vxlan_spec;
> +	static struct rte_flow_item_vxlan vxlan_mask;

Split static and local variables by empty line, please.

> +	uint32_t vni_value;
> +	uint8_t i;
> +
> +	vni_value = VNI_VALUE;
> +
> +	memset(&vxlan_spec, 0, sizeof(struct rte_flow_item_vxlan));
> +	memset(&vxlan_mask, 0, sizeof(struct rte_flow_item_vxlan));
> +
> +	/* Set standard vxlan vni */
> +	for (i = 0; i < 3; i++) {
> +		vxlan_spec.vni[2 - i] = vni_value >> (i * 8);
> +		vxlan_mask.vni[2 - i] = 0xff;
> +	}
> +
> +	/* Standard vxlan flags */
> +	vxlan_spec.flags = 0x8;
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_VXLAN;
> +	items[items_counter].spec = &vxlan_spec;
> +	items[items_counter].mask = &vxlan_mask;
> +}
> +
> +void
> +add_vxlan_gpe(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_vxlan_gpe vxlan_gpe_spec;
> +	static struct rte_flow_item_vxlan_gpe vxlan_gpe_mask;

Split static and local variables by empty line, please.

> +	uint32_t vni_value;
> +	uint8_t i;
> +
> +	vni_value = VNI_VALUE;
> +
> +	memset(&vxlan_gpe_spec, 0, sizeof(struct rte_flow_item_vxlan_gpe));
> +	memset(&vxlan_gpe_mask, 0, sizeof(struct rte_flow_item_vxlan_gpe));
> +
> +	/* Set vxlan-gpe vni */
> +	for (i = 0; i < 3; i++) {
> +		vxlan_gpe_spec.vni[2 - i] = vni_value >> (i * 8);
> +		vxlan_gpe_mask.vni[2 - i] = 0xff;
> +	}
> +
> +	/* vxlan-gpe flags */
> +	vxlan_gpe_spec.flags = 0x0c;
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_VXLAN_GPE;
> +	items[items_counter].spec = &vxlan_gpe_spec;
> +	items[items_counter].mask = &vxlan_gpe_mask;
> +}
> +
> +void
> +add_gre(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_gre gre_spec;
> +	static struct rte_flow_item_gre gre_mask;

Split static and local variables by empty line, please.

> +	uint16_t proto;
> +
> +	proto = GRE_PROTO;
> +
> +	memset(&gre_spec, 0, sizeof(struct rte_flow_item_gre));
> +	memset(&gre_mask, 0, sizeof(struct rte_flow_item_gre));
> +
> +	gre_spec.protocol = RTE_BE16(proto);
> +	gre_mask.protocol = RTE_BE16(0xffff);
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_GRE;
> +	items[items_counter].spec = &gre_spec;
> +	items[items_counter].mask = &gre_mask;
> +}
> +
> +void
> +add_geneve(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_geneve geneve_spec;
> +	static struct rte_flow_item_geneve geneve_mask;

Split static and local variables by empty line, please.

> +	uint32_t vni_value;
> +	uint8_t i;
> +
> +	vni_value = VNI_VALUE;
> +
> +	memset(&geneve_spec, 0, sizeof(struct rte_flow_item_geneve));
> +	memset(&geneve_mask, 0, sizeof(struct rte_flow_item_geneve));
> +
> +	for (i = 0; i < 3; i++) {
> +		geneve_spec.vni[2 - i] = vni_value >> (i * 8);
> +		geneve_mask.vni[2 - i] = 0xff;
> +	}
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_GENEVE;
> +	items[items_counter].spec = &geneve_spec;
> +	items[items_counter].mask = &geneve_mask;
> +}
> +
> +void
> +add_gtp(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_gtp gtp_spec;
> +	static struct rte_flow_item_gtp gtp_mask;

Split static and local variables by empty line, please.

> +	uint32_t teid_value;
> +
> +	teid_value = TEID_VALUE;
> +
> +	memset(&gtp_spec, 0, sizeof(struct rte_flow_item_gtp));
> +	memset(&gtp_mask, 0, sizeof(struct rte_flow_item_gtp));
> +
> +	gtp_spec.teid = RTE_BE32(teid_value);
> +	gtp_mask.teid = RTE_BE32(0xffffffff);
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_GTP;
> +	items[items_counter].spec = &gtp_spec;
> +	items[items_counter].mask = &gtp_mask;
> +}
> +
> +void
> +add_meta_data(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_meta meta_spec;
> +	static struct rte_flow_item_meta meta_mask;

Split static and local variables by empty line, please.

> +	uint32_t data;
> +
> +	data = META_DATA;
> +
> +	memset(&meta_spec, 0, sizeof(struct rte_flow_item_meta));
> +	memset(&meta_mask, 0, sizeof(struct rte_flow_item_meta));
> +
> +	meta_spec.data = RTE_BE32(data);
> +	meta_mask.data = RTE_BE32(0xffffffff);
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_META;
> +	items[items_counter].spec = &meta_spec;
> +	items[items_counter].mask = &meta_mask;
> +}
> +
> +
> +void
> +add_meta_tag(struct rte_flow_item *items, uint8_t items_counter)
> +{
> +	static struct rte_flow_item_tag tag_spec;
> +	static struct rte_flow_item_tag tag_mask;

Split static and local variables by empty line, please.

> +	uint32_t data;
> +	uint8_t index;
> +
> +	data = META_DATA;
> +	index = TAG_INDEX;
> +
> +	memset(&tag_spec, 0, sizeof(struct rte_flow_item_tag));
> +	memset(&tag_mask, 0, sizeof(struct rte_flow_item_tag));
> +
> +	tag_spec.data = RTE_BE32(data);
> +	tag_mask.data = RTE_BE32(0xffffffff);
> +	tag_spec.index = index;
> +	tag_mask.index = 0xff;
> +
> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_TAG;
> +	items[items_counter].spec = &tag_spec;
> +	items[items_counter].mask = &tag_mask;
> +}
> diff --git a/app/test-flow-perf/items_gen.h b/app/test-flow-perf/items_gen.h
> new file mode 100644
> index 000000000..0edbc0b37
> --- /dev/null
> +++ b/app/test-flow-perf/items_gen.h
> @@ -0,0 +1,31 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2020 Mellanox Technologies, Ltd
> + *
> + * This file contains the items related methods
> + */
> +
> +#ifndef FLOW_PERF_ITEMS_GEN
> +#define FLOW_PERF_ITEMS_GEN
> +
> +#include <stdint.h>
> +#include <rte_flow.h>
> +
> +#include "config.h"
> +
> +void add_ether(struct rte_flow_item *items, uint8_t items_counter);
> +void add_vlan(struct rte_flow_item *items, uint8_t items_counter);
> +void add_ipv4(struct rte_flow_item *items,
> +	uint8_t items_counter, rte_be32_t src_ipv4);
> +void add_ipv6(struct rte_flow_item *items,
> +	uint8_t items_counter, rte_be32_t src_ipv6);
> +void add_udp(struct rte_flow_item *items, uint8_t items_counter);
> +void add_tcp(struct rte_flow_item *items, uint8_t items_counter);
> +void add_vxlan(struct rte_flow_item *items, uint8_t items_counter);
> +void add_vxlan_gpe(struct rte_flow_item *items, uint8_t items_counter);
> +void add_gre(struct rte_flow_item *items, uint8_t items_counter);
> +void add_geneve(struct rte_flow_item *items, uint8_t items_counter);
> +void add_gtp(struct rte_flow_item *items, uint8_t items_counter);
> +void add_meta_data(struct rte_flow_item *items, uint8_t items_counter);
> +void add_meta_tag(struct rte_flow_item *items, uint8_t items_counter);
> +
> +#endif /* FLOW_PERF_ITEMS_GEN */
> diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
> index 8659870af..1feb73e6f 100644
> --- a/app/test-flow-perf/main.c
> +++ b/app/test-flow-perf/main.c
> @@ -26,6 +26,7 @@
>  #include <getopt.h>
>  #include <stdbool.h>
>  #include <sys/time.h>
> +#include <signal.h>
>  
>  #include <rte_malloc.h>
>  #include <rte_mempool.h>
> @@ -34,29 +35,257 @@
>  #include <rte_flow.h>
>  
>  #include "config.h"
> +#include "flow_gen.h"
>  
> -static uint32_t nb_lcores;
> +#define MAX_ITERATIONS             100
> +#define DEFAULT_RULES_COUNT    4000000
> +#define DEFAULT_ITERATION       100000
> +
> +struct rte_flow *flow;
> +static uint8_t flow_group;
> +
> +static uint32_t flow_items;
> +static uint32_t flow_actions;
> +static uint32_t flow_attrs;
> +static volatile bool force_quit;
> +static bool dump_iterations;
>  static struct rte_mempool *mbuf_mp;
> +static uint32_t nb_lcores;
> +static uint32_t flows_count;
> +static uint32_t iterations_number;
> +static uint32_t hairpinq;
>  
>  static void
>  usage(char *progname)
>  {
>  	printf("\nusage: %s\n", progname);
> +	printf("\nControl configurations:\n");
> +	printf("  --flows-count=N: to set the number of needed"
> +		" flows to insert, default is 4,000,000\n");
> +	printf("  --dump-iterations: To print rates for each"
> +		" iteration\n");
> +
> +	printf("To set flow attributes:\n");
> +	printf("  --ingress: set ingress attribute in flows\n");
> +	printf("  --egress: set egress attribute in flows\n");
> +	printf("  --transfer: set transfer attribute in flows\n");
> +	printf("  --group=N: set group for all flows,"
> +		" default is 0\n");
> +
> +	printf("To set flow items:\n");
> +	printf("  --ether: add ether layer in flow items\n");
> +	printf("  --vlan: add vlan layer in flow items\n");
> +	printf("  --ipv4: add ipv4 layer in flow items\n");
> +	printf("  --ipv6: add ipv6 layer in flow items\n");
> +	printf("  --tcp: add tcp layer in flow items\n");
> +	printf("  --udp: add udp layer in flow items\n");
> +	printf("  --vxlan: add vxlan layer in flow items\n");
> +	printf("  --vxlan-gpe: add vxlan-gpe layer in flow items\n");
> +	printf("  --gre: add gre layer in flow items\n");
> +	printf("  --geneve: add geneve layer in flow items\n");
> +	printf("  --gtp: add gtp layer in flow items\n");
> +	printf("  --meta: add meta layer in flow items\n");
> +	printf("  --tag: add tag layer in flow items\n");
> +
> +	printf("To set flow actions:\n");
> +	printf("  --port-id: add port-id action in flow actions\n");
> +	printf("  --rss: add rss action in flow actions\n");
> +	printf("  --queue: add queue action in flow actions\n");
> +	printf("  --jump: add jump action in flow actions\n");
> +	printf("  --mark: add mark action in flow actions\n");
> +	printf("  --count: add count action in flow actions\n");
> +	printf("  --set-meta: add set meta action in flow actions\n");
> +	printf("  --set-tag: add set tag action in flow actions\n");
> +	printf("  --drop: add drop action in flow actions\n");
> +	printf("  --hairpin-queue=N: add hairpin-queue action in flow actions\n");
> +	printf("  --hairpin-rss=N: add hairping-rss action in flow actions\n");
>  }
>  
>  static void
>  args_parse(int argc, char **argv)
>  {
>  	char **argvopt;
> -	int opt;
> +	int n, opt;
>  	int opt_idx;
> +	size_t i;
> +
> +	static const struct option_dict {
> +		const char *str;
> +		const uint64_t mask;
> +		uint32_t *bitmap;

Should be uint64_t

> +	} flow_options[] = {
> +		{
> +			.str = "ether",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "ipv4",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "ipv6",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "vlan",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "tcp",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "udp",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "vxlan",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "vxlan-gpe",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "gre",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "geneve",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "gtp",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "meta",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "tag",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG),
> +			.bitmap = &flow_items
> +		},
> +		{
> +			.str = "ingress",
> +			.mask = INGRESS,
> +			.bitmap = &flow_attrs
> +		},
> +		{
> +			.str = "egress",
> +			.mask = EGRESS,
> +			.bitmap = &flow_attrs
> +		},
> +		{
> +			.str = "transfer",
> +			.mask = TRANSFER,
> +			.bitmap = &flow_attrs
> +		},
> +		{
> +			.str = "port-id",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "rss",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "queue",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "jump",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "mark",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "count",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "set-meta",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "set-tag",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG),
> +			.bitmap = &flow_actions
> +		},
> +		{
> +			.str = "drop",
> +			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP),
> +			.bitmap = &flow_actions
> +		}
> +	};
> +
>  	static struct option lgopts[] = {

static const

>  		/* Control */
>  		{ "help",                       0, 0, 0 },
> +		{ "flows-count",                1, 0, 0 },
> +		{ "dump-iterations",            0, 0, 0 },

It looks like above it the path which should be defined
here.

> +		/* Attributes */
> +		{ "ingress",                    0, 0, 0 },
> +		{ "egress",                     0, 0, 0 },
> +		{ "transfer",                   0, 0, 0 },
> +		{ "group",                      1, 0, 0 },
> +		/* Items */
> +		{ "ether",                      0, 0, 0 },
> +		{ "vlan",                       0, 0, 0 },
> +		{ "ipv4",                       0, 0, 0 },
> +		{ "ipv6",                       0, 0, 0 },
> +		{ "tcp",                        0, 0, 0 },
> +		{ "udp",                        0, 0, 0 },
> +		{ "vxlan",                      0, 0, 0 },
> +		{ "vxlan-gpe",                  0, 0, 0 },
> +		{ "gre",                        0, 0, 0 },
> +		{ "geneve",                     0, 0, 0 },
> +		{ "gtp",                        0, 0, 0 },
> +		{ "meta",                       0, 0, 0 },
> +		{ "tag",                        0, 0, 0 },
> +		/* Actions */
> +		{ "port-id",                    0, 0, 0 },
> +		{ "rss",                        0, 0, 0 },
> +		{ "queue",                      0, 0, 0 },
> +		{ "jump",                       0, 0, 0 },
> +		{ "mark",                       0, 0, 0 },
> +		{ "count",                      0, 0, 0 },
> +		{ "set-meta",                   0, 0, 0 },
> +		{ "set-tag",                    0, 0, 0 },
> +		{ "drop",                       0, 0, 0 },
> +		{ "hairpin-queue",              1, 0, 0 },
> +		{ "hairpin-rss",                1, 0, 0 },

This part should be added by code which iterates by
flow_options. I.e. allocate lgopts dynamically, copy
static options there by memcpy() and add dynamic as
described above. May be flow_options require extra
field  'has_arg'.

>  	};
>  
> +	flow_items = 0;
> +	flow_actions = 0;
> +	flow_attrs = 0;
> +	hairpinq = 0;
>  	argvopt = argv;
>  
> +	printf(":: Flow -> ");
>  	while ((opt = getopt_long(argc, argvopt, "",
>  				lgopts, &opt_idx)) != EOF) {
>  		switch (opt) {
> @@ -65,6 +294,65 @@ args_parse(int argc, char **argv)
>  				usage(argv[0]);
>  				rte_exit(EXIT_SUCCESS, "Displayed help\n");
>  			}
> +
> +			if (strcmp(lgopts[opt_idx].name, "group") == 0) {
> +				n = atoi(optarg);
> +				if (n >= 0)
> +					flow_group = n;
> +				else
> +					rte_exit(EXIT_SUCCESS,
> +						"flow group should be >= 0");
> +				printf("group %d ", flow_group);
> +			}
> +
> +			for (i = 0; i < RTE_DIM(flow_options); i++)
> +				if (strcmp(lgopts[opt_idx].name,
> +						flow_options[i].str) == 0) {
> +					*flow_options[i].bitmap |=
> +						flow_options[i].mask;
> +					printf("%s / ", flow_options[i].str);
> +				}
> +
> +			if (strcmp(lgopts[opt_idx].name,
> +					"hairpin-rss") == 0) {
> +				n = atoi(optarg);
> +				if (n > 0)
> +					hairpinq = n;
> +				else
> +					rte_exit(EXIT_SUCCESS,
> +						"Hairpin queues should be > 0 ");
> +
> +				flow_actions |= HAIRPIN_RSS_ACTION;
> +				printf("hairpin-rss / ");
> +			}
> +			if (strcmp(lgopts[opt_idx].name,
> +					"hairpin-queue") == 0) {
> +				n = atoi(optarg);
> +				if (n > 0)
> +					hairpinq = n;
> +				else
> +					rte_exit(EXIT_SUCCESS,
> +						"Hairpin queues should be > 0 ");
> +
> +				flow_actions |= HAIRPIN_QUEUE_ACTION;
> +				printf("hairpin-queue / ");
> +			}
> +
> +			/* Control */
> +			if (strcmp(lgopts[opt_idx].name,
> +					"flows-count") == 0) {
> +				n = atoi(optarg);
> +				if (n > (int) iterations_number)
> +					flows_count = n;
> +				else {
> +					printf("\n\nflows_count should be > %d",
> +						iterations_number);
> +					rte_exit(EXIT_SUCCESS, " ");
> +				}
> +			}
> +			if (strcmp(lgopts[opt_idx].name,
> +					"dump-iterations") == 0)
> +				dump_iterations = true;
>  			break;
>  		default:
>  			fprintf(stderr, "Invalid option: %s\n", argv[optind]);
> @@ -73,6 +361,130 @@ args_parse(int argc, char **argv)
>  			break;
>  		}
>  	}
> +	printf("end_flow\n");
> +}
> +
> +static void
> +print_flow_error(struct rte_flow_error error)
> +{
> +	printf("Flow can't be created %d message: %s\n",
> +		error.type,
> +		error.message ? error.message : "(no stated reason)");
> +}
> +
> +static inline void
> +flows_handler(void)
> +{
> +	struct rte_flow_error error;
> +	clock_t start_iter, end_iter;
> +	double cpu_time_used;
> +	double flows_rate;
> +	double cpu_time_per_iter[MAX_ITERATIONS];
> +	double delta;
> +	uint16_t nr_ports;
> +	uint32_t i;
> +	int port_id;
> +	int iter_id;
> +	uint32_t eagain_counter = 0;
> +
> +	nr_ports = rte_eth_dev_count_avail();
> +
> +	for (i = 0; i < MAX_ITERATIONS; i++)
> +		cpu_time_per_iter[i] = -1;
> +
> +	if (iterations_number > flows_count)
> +		iterations_number = flows_count;
> +
> +	printf(":: Flows Count per port: %d\n", flows_count);
> +
> +	for (port_id = 0; port_id < nr_ports; port_id++) {
> +		cpu_time_used = 0;
> +		if (flow_group > 0) {
> +			/*
> +			 * Create global rule to jump into flow_group,
> +			 * this way the app will avoid the default rules.
> +			 *
> +			 * Golbal rule:
> +			 * group 0 eth / end actions jump group <flow_group>
> +			 *
> +			 */
> +			flow = generate_flow(port_id, 0, flow_attrs,
> +				FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH),
> +				FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP),
> +				flow_group, 0, 0, &error);
> +
> +			if (flow == NULL) {
> +				print_flow_error(error);
> +				rte_exit(EXIT_FAILURE, "error in creating flow");
> +			}
> +		}
> +
> +		/* Insertion Rate */
> +		printf("Flows insertion on port = %d\n", port_id);
> +		start_iter = clock();
> +		for (i = 0; i < flows_count; i++) {
> +			do {
> +				rte_errno = 0;
> +				flow = generate_flow(port_id, flow_group,
> +					flow_attrs, flow_items, flow_actions,
> +					JUMP_ACTION_TABLE, i, hairpinq, &error);
> +				if (flow == NULL)
> +					eagain_counter++;
> +			} while (rte_errno == EAGAIN);
> +
> +			if (force_quit)
> +				i = flows_count;
> +
> +			if (!flow) {
> +				print_flow_error(error);
> +				rte_exit(EXIT_FAILURE, "error in creating flow");
> +			}
> +
> +			if (i && !((i + 1) % iterations_number)) {
> +				/* Save the insertion rate of each iter */
> +				end_iter = clock();
> +				delta = (double) (end_iter - start_iter);
> +				iter_id = ((i + 1) / iterations_number) - 1;
> +				cpu_time_per_iter[iter_id] =
> +					delta / CLOCKS_PER_SEC;
> +				cpu_time_used += cpu_time_per_iter[iter_id];
> +				start_iter = clock();
> +			}
> +		}
> +
> +		/* Iteration rate per iteration */
> +		if (dump_iterations)
> +			for (i = 0; i < MAX_ITERATIONS; i++) {
> +				if (cpu_time_per_iter[i] == -1)
> +					continue;
> +				delta = (double)(iterations_number /
> +					cpu_time_per_iter[i]);
> +				flows_rate = delta / 1000;
> +				printf(":: Iteration #%d: %d flows "
> +					"in %f sec[ Rate = %f K/Sec ]\n",
> +					i, iterations_number,
> +					cpu_time_per_iter[i], flows_rate);
> +			}
> +
> +		/* Insertion rate for all flows */
> +		flows_rate = ((double) (flows_count / cpu_time_used) / 1000);
> +		printf("\n:: Total flow insertion rate -> %f K/Sec\n",
> +						flows_rate);
> +		printf(":: The time for creating %d in flows %f seconds\n",
> +						flows_count, cpu_time_used);
> +		printf(":: EAGAIN counter = %d\n", eagain_counter);
> +	}
> +}
> +
> +static void
> +signal_handler(int signum)
> +{
> +	if (signum == SIGINT || signum == SIGTERM) {
> +		printf("\n\nSignal %d received, preparing to exit...\n",
> +					signum);
> +		printf("Error: Stats are wrong due to sudden signal!\n\n");
> +		force_quit = true;
> +	}
>  }
>  
>  static void
> @@ -80,8 +492,13 @@ init_port(void)
>  {
>  	int ret;
>  	uint16_t std_queue;
> +	uint16_t hairpin_q;
>  	uint16_t port_id;
>  	uint16_t nr_ports;
> +	uint16_t nr_queues;
> +	struct rte_eth_hairpin_conf hairpin_conf = {
> +		.peer_count = 1,
> +	};
>  	struct rte_eth_conf port_conf = {
>  		.rx_adv_conf = {
>  			.rss_conf.rss_hf =
> @@ -92,6 +509,10 @@ init_port(void)
>  	struct rte_eth_rxconf rxq_conf;
>  	struct rte_eth_dev_info dev_info;
>  
> +	nr_queues = RXQ_NUM;
> +	if (hairpinq != 0)
> +		nr_queues = RXQ_NUM + hairpinq;
> +
>  	nr_ports = rte_eth_dev_count_avail();
>  	if (nr_ports == 0)
>  		rte_exit(EXIT_FAILURE, "Error: no port detected\n");
> @@ -116,8 +537,8 @@ init_port(void)
>  
>  		printf(":: initializing port: %d\n", port_id);
>  
> -		ret = rte_eth_dev_configure(port_id, RXQ_NUM,
> -				TXQ_NUM, &port_conf);
> +		ret = rte_eth_dev_configure(port_id, nr_queues,
> +				nr_queues, &port_conf);
>  		if (ret < 0)
>  			rte_exit(EXIT_FAILURE,
>  				":: cannot configure device: err=%d, port=%u\n",
> @@ -153,6 +574,38 @@ init_port(void)
>  				":: promiscuous mode enable failed: err=%s, port=%u\n",
>  				rte_strerror(-ret), port_id);
>  
> +		if (hairpinq != 0) {
> +			for (hairpin_q = RXQ_NUM, std_queue = 0;
> +					std_queue < nr_queues;
> +					hairpin_q++, std_queue++) {
> +				hairpin_conf.peers[0].port = port_id;
> +				hairpin_conf.peers[0].queue =
> +					std_queue + TXQ_NUM;
> +				ret = rte_eth_rx_hairpin_queue_setup(
> +						port_id, hairpin_q,
> +						NR_RXD, &hairpin_conf);
> +				if (ret != 0)
> +					rte_exit(EXIT_FAILURE,
> +						":: Hairpin rx queue setup failed: err=%d, port=%u\n",
> +						ret, port_id);
> +			}
> +
> +			for (hairpin_q = TXQ_NUM, std_queue = 0;
> +					std_queue < nr_queues;
> +					hairpin_q++, std_queue++) {
> +				hairpin_conf.peers[0].port = port_id;
> +				hairpin_conf.peers[0].queue =
> +					std_queue + RXQ_NUM;
> +				ret = rte_eth_tx_hairpin_queue_setup(
> +						port_id, hairpin_q,
> +						NR_TXD, &hairpin_conf);
> +				if (ret != 0)
> +					rte_exit(EXIT_FAILURE,
> +						":: Hairpin tx queue setup failed: err=%d, port=%u\n",
> +						ret, port_id);
> +			}
> +		}
> +
>  		ret = rte_eth_dev_start(port_id);
>  		if (ret < 0)
>  			rte_exit(EXIT_FAILURE,
> @@ -174,6 +627,15 @@ main(int argc, char **argv)
>  	if (ret < 0)
>  		rte_exit(EXIT_FAILURE, "EAL init failed\n");
>  
> +	force_quit = false;
> +	dump_iterations = false;
> +	flows_count = DEFAULT_RULES_COUNT;
> +	iterations_number = DEFAULT_ITERATION;
> +	flow_group = 0;
> +
> +	signal(SIGINT, signal_handler);
> +	signal(SIGTERM, signal_handler);
> +
>  	argc -= ret;
>  	argv += ret;
>  	if (argc > 1)
> @@ -185,6 +647,8 @@ main(int argc, char **argv)
>  	if (nb_lcores <= 1)
>  		rte_exit(EXIT_FAILURE, "This app needs at least two cores\n");
>  
> +	flows_handler();
> +
>  	RTE_ETH_FOREACH_DEV(port) {
>  		rte_flow_flush(port, &error);
>  		rte_eth_dev_stop(port);
> diff --git a/app/test-flow-perf/meson.build b/app/test-flow-perf/meson.build
> index 25711378f..6eaf83b41 100644
> --- a/app/test-flow-perf/meson.build
> +++ b/app/test-flow-perf/meson.build
> @@ -2,6 +2,9 @@
>  # Copyright(c) 2020 Mellanox Technologies, Ltd
>  
>  sources = files(
> +	'actions_gen.c',
> +	'flow_gen.c',
> +	'items_gen.c',
>  	'main.c',
>  )
>  
> diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst
> index 7abcae3aa..0e4dcf1ad 100644
> --- a/doc/guides/rel_notes/release_20_05.rst
> +++ b/doc/guides/rel_notes/release_20_05.rst
> @@ -216,6 +216,9 @@ New Features
>  
>    Add new application to test rte_flow performance.
>  
> +  Application features:
> +  * Measure rte_flow insertion rate.
> +
>  
>  Removed Items
>  -------------
> diff --git a/doc/guides/tools/flow-perf.rst b/doc/guides/tools/flow-perf.rst
> index 49eb450ae..6f3f7dafb 100644
> --- a/doc/guides/tools/flow-perf.rst
> +++ b/doc/guides/tools/flow-perf.rst
> @@ -1,10 +1,29 @@
>  ..	SPDX-License-Identifier: BSD-3-Clause
>  	Copyright 2020 Mellanox Technologies, Ltd
>  
> -Flow performance tool
> +Flow Performance Tool

It should be good from the very beginning in the first patch.

>  =====================
>  
>  Application for rte_flow performance testing.
> +The application provide the ability to test insertion rate of specific
> +rte_flow rule, by stressing it to the NIC, and calculate the insertion
> +rate.
> +
> +The application offers some options in the command line, to configure
> +which rule to apply.
> +
> +After that the application will start producing rules with same pattern
> +but increasing the outer IP source address by 1 each time, thus it will
> +give different flow each time, and all other items will have open masks.
> +
> +
> +Known Limitations
> +=================
> +
> +The current version has limitations which can be removed in future:
> +
> +* Support outer items up to tunnel layer only.
> +* Single core insertion only.
>  
>  
>  Compiling the Application
> @@ -27,7 +46,7 @@ or :doc:`EAL parameters (FreeBSD) <../freebsd_gsg/freebsd_eal_parameters>` for
>  a list of available EAL command-line options.
>  
>  
> -Flow performance Options
> +Flow Performance Options
>  ------------------------
>  
>  The following are the command-line options for the flow performance application.
> @@ -36,9 +55,179 @@ with a ``--`` separator:
>  
>  .. code-block:: console
>  
> -	sudo ./dpdk-test-flow-perf -n 4 -w 08:00.0 --
> +	sudo ./dpdk-test-flow_perf -n 4 -w 08:00.0 -- --ingress --ether --ipv4 --queue --flows-count=1000000
>  
>  The command line options are:
>  
>  *	``--help``
>  	Display a help message and quit.
> +
> +*	``--flows-count=N``
> +	Set the number of needed flows to insert,
> +	where 1 <= N <= "number of flows".
> +	The default value is 4,000,000.
> +
> +*	``--dump-iterations``
> +	Print rates for each iteration of flows.
> +	Default iteration is 1,00,000.
> +
> +
> +Attributes:
> +
> +*	``--ingress``
> +	Set Ingress attribute to all flows attributes.
> +
> +*	``--egress``
> +	Set Egress attribute to all flows attributes.
> +
> +*	``--transfer``
> +	Set Transfer attribute to all flows attributes.
> +
> +*	``--group=N``
> +	Set group for all flows, where N >= 0.
> +	Default group is 0.
> +
> +Items:
> +
> +*	``--ether``
> +	Add Ether item to all flows items, This item have open mask.
> +
> +*	``--vlan``
> +	Add VLAN item to all flows items,
> +	This item have VLAN value defined in user_parameters.h
> +	under ``VNI_VALUE`` with full mask, default value = 1.
> +	Other fields are open mask.
> +
> +*	``--ipv4``
> +	Add IPv4 item to all flows items,
> +	This item have incremental source IP, with full mask.
> +	Other fields are open mask.
> +
> +*	``--ipv6``
> +	Add IPv6 item to all flows item,
> +	This item have incremental source IP, with full mask.
> +	Other fields are open mask.
> +
> +*	``--tcp``
> +	Add TCP item to all flows items, This item have open mask.
> +
> +*	``--udp``
> +	Add UDP item to all flows items, This item have open mask.
> +
> +*	``--vxlan``
> +	Add VXLAN item to all flows items,
> +	This item have VNI value defined in user_parameters.h
> +	under ``VNI_VALUE`` with full mask, default value = 1.
> +	Other fields are open mask.
> +
> +*	``--vxlan-gpe``
> +	Add VXLAN-GPE item to all flows items,
> +	This item have VNI value defined in user_parameters.h
> +	under ``VNI_VALUE`` with full mask, default value = 1.
> +	Other fields are open mask.
> +
> +*	``--gre``
> +	Add GRE item to all flows items,
> +	This item have protocol value defined in user_parameters.h
> +	under ``GRE_PROTO`` with full mask, default protocol = 0x6558 "Ether"
> +	Other fields are open mask.
> +
> +*	``--geneve``
> +	Add GENEVE item to all flows items,
> +	This item have VNI value defined in user_parameters.h
> +	under ``VNI_VALUE`` with full mask, default value = 1.
> +	Other fields are open mask.
> +
> +*	``--gtp``
> +	Add GTP item to all flows items,
> +	This item have TEID value defined in user_parameters.h
> +	under ``TEID_VALUE`` with full mask, default value = 1.
> +	Other fields are open mask.
> +
> +*	``--meta``
> +	Add Meta item to all flows items,
> +	This item have data value defined in user_parameters.h
> +	under ``META_DATA`` with full mask, default value = 1.
> +	Other fields are open mask.
> +
> +*	``--tag``
> +	Add Tag item to all flows items,
> +	This item have data value defined in user_parameters.h
> +	under ``META_DATA`` with full mask, default value = 1.
> +
> +	Also it have tag value defined in user_parameters.h
> +	under ``TAG_INDEX`` with full mask, default value = 0.
> +	Other fields are open mask.
> +
> +
> +Actions:
> +
> +*	``--port-id``
> +	Add port redirection action to all flows actions.
> +	Port redirection destination is defined in user_parameters.h
> +	under PORT_ID_DST, default value = 1.
> +
> +*	``--rss``
> +	Add RSS action to all flows actions,
> +	The queues in RSS action will be all queues configured
> +	in the app.
> +
> +*	``--queue``
> +	Add queue action to all flows items,
> +	The queue will change in round robin state for each flow.
> +
> +	For example:
> +		The app running with 4 RX queues
> +		Flow #0: queue index 0
> +		Flow #1: queue index 1
> +		Flow #2: queue index 2
> +		Flow #3: queue index 3
> +		Flow #4: queue index 0
> +		...
> +
> +*	``--jump``
> +	Add jump action to all flows actions.
> +	Jump action destination is defined in user_parameters.h
> +	under ``JUMP_ACTION_TABLE``, default value = 2.
> +
> +*	``--mark``
> +	Add mark action to all flows actions.
> +	Mark action id is defined in user_parameters.h
> +	under ``MARK_ID``, default value = 1.
> +
> +*	``--count``
> +	Add count action to all flows actions.
> +
> +*	``--set-meta``
> +	Add set-meta action to all flows actions.
> +	Meta data is defined in user_parameters.h under ``META_DATA``
> +	with full mask, default value = 1.
> +
> +*	``--set-tag``
> +	Add set-tag action to all flows actions.
> +	Meta data is defined in user_parameters.h under ``META_DATA``
> +	with full mask, default value = 1.
> +
> +	Tag index is defined in user_parameters.h under ``TAG_INDEX``
> +	with full mask, default value = 0.
> +
> +*	``--drop``
> +	Add drop action to all flows actions.
> +
> +*	``--hairpin-queue=N``
> +	Add hairpin queue action to all flows actions.
> +	The queue will change in round robin state for each flow.
> +
> +	For example:
> +		The app running with 4 RX hairpin queues and 4 normal RX queues
> +		Flow #0: queue index 4
> +		Flow #1: queue index 5
> +		Flow #2: queue index 6
> +		Flow #3: queue index 7
> +		Flow #4: queue index 4
> +		...
> +
> +*	``--hairpin-rss=N``
> +	Add hairpin RSS action to all flows actions.
> +	The queues in RSS action will be all hairpin queues configured
> +	in the app.
>
Wisam Jaddo May 12, 2020, 10:34 a.m. UTC | #2
>-----Original Message-----
>From: Andrew Rybchenko <arybchenko@solarflare.com>
>Sent: Monday, May 11, 2020 3:05 PM
>To: Wisam Monther <wisamm@mellanox.com>; dev@dpdk.org; Jack Min
><jackmin@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
>jerinjacobk@gmail.com; ajit.khaparde@broadcom.com
>Subject: Re: [PATCH v6 2/5] app/flow-perf: add insertion rate calculation
>
>On 5/11/20 2:09 PM, Wisam Jaddo wrote:
>> Add insertion rate calculation feature into flow
>> performance application.
>>
>> The application now provide the ability to test
>> insertion rate of specific rte_flow rule, by
>> stressing it to the NIC, and calculate the
>> insertion rate.
>>
>> The application offers some options in the command
>> line, to configure which rule to apply.
>>
>> After that the application will start producing
>> rules with same pattern but increasing the outer IP
>> source address by 1 each time, thus it will give
>> different flow each time, and all other items will
>> have open masks.
>>
>> The current design have single core insertion rate.
>> In the future we may have a multi core insertion
>> rate measurement support in the app.
>>
>> Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
>> ---
>>  app/test-flow-perf/Makefile            |   3 +
>>  app/test-flow-perf/actions_gen.c       | 164 +++++++++
>>  app/test-flow-perf/actions_gen.h       |  29 ++
>>  app/test-flow-perf/config.h            |  16 +
>>  app/test-flow-perf/flow_gen.c          | 145 ++++++++
>>  app/test-flow-perf/flow_gen.h          |  37 ++
>>  app/test-flow-perf/items_gen.c         | 277 +++++++++++++++
>>  app/test-flow-perf/items_gen.h         |  31 ++
>>  app/test-flow-perf/main.c              | 472 ++++++++++++++++++++++++-
>>  app/test-flow-perf/meson.build         |   3 +
>>  doc/guides/rel_notes/release_20_05.rst |   3 +
>>  doc/guides/tools/flow-perf.rst         | 195 +++++++++-
>>  12 files changed, 1368 insertions(+), 7 deletions(-)
>>  create mode 100644 app/test-flow-perf/actions_gen.c
>>  create mode 100644 app/test-flow-perf/actions_gen.h
>>  create mode 100644 app/test-flow-perf/flow_gen.c
>>  create mode 100644 app/test-flow-perf/flow_gen.h
>>  create mode 100644 app/test-flow-perf/items_gen.c
>>  create mode 100644 app/test-flow-perf/items_gen.h
>>
>> diff --git a/app/test-flow-perf/Makefile b/app/test-flow-perf/Makefile
>> index db043c17a..4f2db7591 100644
>> --- a/app/test-flow-perf/Makefile
>> +++ b/app/test-flow-perf/Makefile
>> @@ -16,6 +16,9 @@ CFLAGS += $(WERROR_FLAGS)
>>  #
>>  # all source are stored in SRCS-y
>>  #
>> +SRCS-y += actions_gen.c
>> +SRCS-y += flow_gen.c
>> +SRCS-y += items_gen.c
>>  SRCS-y += main.c
>>
>>  include $(RTE_SDK)/mk/rte.app.mk
>> diff --git a/app/test-flow-perf/actions_gen.c b/app/test-flow-
>perf/actions_gen.c
>> new file mode 100644
>> index 000000000..16bb3cf20
>> --- /dev/null
>> +++ b/app/test-flow-perf/actions_gen.c
>> @@ -0,0 +1,164 @@
>> +/* SPDX-License-Identifier: BSD-3-Clause
>> + * Copyright 2020 Mellanox Technologies, Ltd
>> + *
>> + * The file contains the implementations of actions generators.
>> + * Each generator is responsible for preparing it's action instance
>> + * and initializing it with needed data.
>> + */
>> +
>> +#include <sys/types.h>
>> +#include <rte_malloc.h>
>> +#include <rte_flow.h>
>> +#include <rte_ethdev.h>
>> +
>> +#include "actions_gen.h"
>> +#include "config.h"
>> +
>> +/* Storage for struct rte_flow_action_rss including external data. */
>> +struct action_rss_data {
>> +	struct rte_flow_action_rss conf;
>> +	uint8_t key[40];
>> +	uint16_t queue[128];
>> +};
>> +
>> +void
>> +add_mark(struct rte_flow_action *actions,
>> +	uint8_t actions_counter)
>> +{
>> +	static struct rte_flow_action_mark mark_action;
>
>Function-local static variables a bit better than file-local
>or global variable, but just a bit. See below.
>At bare minimum it requires a check that the action is not
>in use already. Same in many cases below.

Yes it's better,
What you mean by " At bare minimum it requires a check that the action is not in use already"
Can you please elaborate?

>
>> +
>> +	do {
>> +		mark_action.id = MARK_ID;
>> +	} while (0);
>
>Why do you use dummy do-while loop here? Many similar cases
>below.

Sometimes, it create the flow before setting the correct id, I think it's compiler stuff
So the dummy loop to make sure the compiler finish it's execution and make sure
When the flow is created the action have correct value.

>
>> +
>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_MARK;
>> +	actions[actions_counter].conf = &mark_action;
>> +}
>> +
>> +void
>> +add_queue(struct rte_flow_action *actions,
>> +	uint8_t actions_counter, uint16_t queue)
>> +{
>> +	static struct rte_flow_action_queue queue_action;
>
>It does not allow to use the action twice to deliver to
>to queues.

Yes, it's needed only once

>
>> +
>> +	do {
>> +		queue_action.index = queue;
>> +	} while (0);
>> +
>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_QUEUE;
>> +	actions[actions_counter].conf = &queue_action;
>> +}
>> +
>> +void
>> +add_jump(struct rte_flow_action *actions,
>> +	uint8_t actions_counter, uint16_t next_table)
>> +{
>> +	static struct rte_flow_action_jump jump_action;
>> +
>> +	do {
>> +		jump_action.group = next_table;
>> +	} while (0);
>> +
>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_JUMP;
>> +	actions[actions_counter].conf = &jump_action;
>> +}
>> +
>> +void
>> +add_rss(struct rte_flow_action *actions,
>> +	uint8_t actions_counter, uint16_t *queues,
>> +	uint16_t queues_number)
>> +{
>> +	static struct rte_flow_action_rss *rss_action;
>> +	static struct action_rss_data *rss_data;
>
>It is better to add an empty line here to split static and
>non-static variable and make it easy to catch the difference.
>
>> +	uint16_t queue;
>> +
>> +	rss_data = rte_malloc("rss_data",
>> +		sizeof(struct action_rss_data), 0);
>
>Does it mean that the second invocation will make
>a memory leak?

Not exactly, because the second invocation may have different queues and need
To be used with different flow, so I think it's ok to re malloc it.

>
>> +
>> +	if (rss_data == NULL)
>> +		rte_exit(EXIT_FAILURE, "No Memory available!");
>> +
>> +	*rss_data = (struct action_rss_data){
>> +		.conf = (struct rte_flow_action_rss){
>> +			.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
>> +			.level = 0,
>> +			.types = GET_RSS_HF(),
>> +			.key_len = sizeof(rss_data->key),
>> +			.queue_num = queues_number,
>> +			.key = rss_data->key,
>> +			.queue = rss_data->queue,
>> +		},
>> +		.key = { 1 },
>> +		.queue = { 0 },
>> +	};
>> +
>> +	for (queue = 0; queue < queues_number; queue++)
>> +		rss_data->queue[queue] = queues[queue];
>> +
>> +	rss_action = &rss_data->conf;
>> +
>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_RSS;
>> +	actions[actions_counter++].conf = rss_action;
>> +}
>> +
>> +void
>> +add_count(struct rte_flow_action *actions,
>> +	uint8_t actions_counter)
>> +{
>> +	static struct rte_flow_action_count count_action;
>
>Again it means it is impossible to use the action twice in one
>rule.

Yes,
If I removed the static from the inner scope here,
The action will be freed when we reach the end of the file,
And since the design here to add the action into actions array to be used later in the creation,
It will not have correct action in case of none-static in the inner scope.

If any action needs to have privilege to be called twice in single rule
New support needs to be done.

This is the design for it, I'll add into the known limitation.

>> +static void
>> +fill_items(struct rte_flow_item *items,
>> +	uint32_t flow_items, uint32_t outer_ip_src)
>
>It looks like it is better to have the function inside
>items_gen.c. It would allow to make all add_<item> functions
>local to items_gen.c.
>
>> +{
>> +	uint8_t items_counter = 0;
>> +
>> +	/* Support outer items up to tunnel layer only. */
>> +
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META))
>> +		add_meta_data(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG))
>> +		add_meta_tag(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH))
>> +		add_ether(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN))
>> +		add_vlan(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4))
>> +		add_ipv4(items, items_counter++, outer_ip_src);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6))
>> +		add_ipv6(items, items_counter++, outer_ip_src);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP))
>> +		add_tcp(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP))
>> +		add_udp(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN))
>> +		add_vxlan(items, items_counter++);
>> +	if (flow_items &
>FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE))
>> +		add_vxlan_gpe(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE))
>> +		add_gre(items, items_counter++);
>> +	if (flow_items &
>FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE))
>> +		add_geneve(items, items_counter++);
>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP))
>> +		add_gtp(items, items_counter++);
>
>It could be done in a loop: define an array of structures
>FLOW_ITEM_MASK(proto) values and add function which should be
>called. The only exception is IPv4/IPv6 which requires extra argument -
>so all add callbacks should have add_data argument
>which is a structure with possible tunings.

Ok, let me re-phrase to make sure I got it:
1- Move it to items_gen.c
2- Create array of structures with all items.
The structure should be something like this:
 static const struct items_dict {
        const uint64_t mask;
        void (*funct)(struct rte_flow_item *items,
                        uint8_t items_counter, rte_be32_t src_ip);
        bool add_args;
    }
3- I need to change the signature "parameters" to all other than ipv4,ipv6 to have another parameter to
Have matched singture.
4- in none-ipv4-ipv6 item I need to call the RTE_SET_USED(src_ip); 
5- loop over the array to add items

Is this right?

>
>> +
>> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_END;
>> +}
>> +
>> +static void
>> +fill_actions(struct rte_flow_action *actions,
>> +	uint32_t flow_actions, uint32_t counter, uint16_t next_table,
>> +	uint16_t hairpinq)
>
>
>It looks like it is better to have the function inside
>actions_gen.c. It would allow to make all add_<action>
>functions local to actions_gen.c.

Sure, but  this one will remain as is, since the actions have different signature and other actions when be added
Will have more different parameters.

>
>> +{
>> +	uint8_t actions_counter = 0;
>> +	uint16_t hairpin_queues[hairpinq];
>> +	uint16_t queues[RXQ_NUM];
>> +	uint16_t i;
>> +
>> +	/* None-fate actions */
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK))
>> +		add_mark(actions, actions_counter++);
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT))
>> +		add_count(actions, actions_counter++);
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META))
>> +		add_set_meta(actions, actions_counter++);
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG))
>> +		add_set_tag(actions, actions_counter++);
>> +
>> +	/* Fate actions */
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE))
>> +		add_queue(actions, actions_counter++, counter %
>RXQ_NUM);
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS)) {
>> +		for (i = 0; i < RXQ_NUM; i++)
>> +			queues[i] = i;
>> +		add_rss(actions, actions_counter++, queues, RXQ_NUM);
>> +	}
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP))
>> +		add_jump(actions, actions_counter++, next_table);
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID))
>> +		add_port_id(actions, actions_counter++);
>> +	if (flow_actions &
>FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP))
>> +		add_drop(actions, actions_counter++);
>> +	if (flow_actions & HAIRPIN_QUEUE_ACTION)
>> +		add_queue(actions, actions_counter++,
>> +			(counter % hairpinq) + RXQ_NUM);
>> +	if (flow_actions & HAIRPIN_RSS_ACTION) {
>> +		for (i = 0; i < hairpinq; i++)
>> +			hairpin_queues[i] = i + RXQ_NUM;
>> +		add_rss(actions, actions_counter++, hairpin_queues,
>hairpinq);
>> +	}
>> +
>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_END;
>> +}
>> +
>> +
>> +void
>> +add_ether(struct rte_flow_item *items, uint8_t items_counter)
>> +{
>> +	static struct rte_flow_item_eth eth_spec;
>> +	static struct rte_flow_item_eth eth_mask;
>
>Same as actions, it does not allow to have two Eth items
>in one rule. However, it looks like current design does not
>cover it already on mask level.

Yes, indeed.
Already add this in flow_perf.rst as known limitation.

>
>>  		/* Control */
>>  		{ "help",                       0, 0, 0 },
>> +		{ "flows-count",                1, 0, 0 },
>> +		{ "dump-iterations",            0, 0, 0 },
>
>It looks like above it the path which should be defined
>here.

I'm not following, what do you mean?

>
>> +		/* Attributes */
>> +		{ "ingress",                    0, 0, 0 },
>> +		{ "egress",                     0, 0, 0 },
>> +		{ "transfer",                   0, 0, 0 },
>> +		{ "group",                      1, 0, 0 },
>> +		/* Items */
>> +		{ "ether",                      0, 0, 0 },
>> +		{ "vlan",                       0, 0, 0 },
>> +		{ "ipv4",                       0, 0, 0 },
>> +		{ "ipv6",                       0, 0, 0 },
>> +		{ "tcp",                        0, 0, 0 },
>> +		{ "udp",                        0, 0, 0 },
>> +		{ "vxlan",                      0, 0, 0 },
>> +		{ "vxlan-gpe",                  0, 0, 0 },
>> +		{ "gre",                        0, 0, 0 },
>> +		{ "geneve",                     0, 0, 0 },
>> +		{ "gtp",                        0, 0, 0 },
>> +		{ "meta",                       0, 0, 0 },
>> +		{ "tag",                        0, 0, 0 },
>> +		/* Actions */
>> +		{ "port-id",                    0, 0, 0 },
>> +		{ "rss",                        0, 0, 0 },
>> +		{ "queue",                      0, 0, 0 },
>> +		{ "jump",                       0, 0, 0 },
>> +		{ "mark",                       0, 0, 0 },
>> +		{ "count",                      0, 0, 0 },
>> +		{ "set-meta",                   0, 0, 0 },
>> +		{ "set-tag",                    0, 0, 0 },
>> +		{ "drop",                       0, 0, 0 },
>> +		{ "hairpin-queue",              1, 0, 0 },
>> +		{ "hairpin-rss",                1, 0, 0 },
>
>This part should be added by code which iterates by
>flow_options. I.e. allocate lgopts dynamically, copy
>static options there by memcpy() and add dynamic as
>described above. May be flow_options require extra
>field  'has_arg'.

Regard this one,
Some option have special code, like hairpin, group and control, so even with has_arg
It cannot be looped in pretty way, I tried it and it became ugly, I prefer to leave this as it.
I don't think it's critical to get rid of it? What do you think?
Andrew Rybchenko May 12, 2020, 11:07 a.m. UTC | #3
On 5/12/20 1:34 PM, Wisam Monther wrote:
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Monday, May 11, 2020 3:05 PM
>> To: Wisam Monther <wisamm@mellanox.com>; dev@dpdk.org; Jack Min
>> <jackmin@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
>> jerinjacobk@gmail.com; ajit.khaparde@broadcom.com
>> Subject: Re: [PATCH v6 2/5] app/flow-perf: add insertion rate calculation
>>
>> On 5/11/20 2:09 PM, Wisam Jaddo wrote:
>>> Add insertion rate calculation feature into flow
>>> performance application.
>>>
>>> The application now provide the ability to test
>>> insertion rate of specific rte_flow rule, by
>>> stressing it to the NIC, and calculate the
>>> insertion rate.
>>>
>>> The application offers some options in the command
>>> line, to configure which rule to apply.
>>>
>>> After that the application will start producing
>>> rules with same pattern but increasing the outer IP
>>> source address by 1 each time, thus it will give
>>> different flow each time, and all other items will
>>> have open masks.
>>>
>>> The current design have single core insertion rate.
>>> In the future we may have a multi core insertion
>>> rate measurement support in the app.
>>>
>>> Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
>>> ---
>>>  app/test-flow-perf/Makefile            |   3 +
>>>  app/test-flow-perf/actions_gen.c       | 164 +++++++++
>>>  app/test-flow-perf/actions_gen.h       |  29 ++
>>>  app/test-flow-perf/config.h            |  16 +
>>>  app/test-flow-perf/flow_gen.c          | 145 ++++++++
>>>  app/test-flow-perf/flow_gen.h          |  37 ++
>>>  app/test-flow-perf/items_gen.c         | 277 +++++++++++++++
>>>  app/test-flow-perf/items_gen.h         |  31 ++
>>>  app/test-flow-perf/main.c              | 472 ++++++++++++++++++++++++-
>>>  app/test-flow-perf/meson.build         |   3 +
>>>  doc/guides/rel_notes/release_20_05.rst |   3 +
>>>  doc/guides/tools/flow-perf.rst         | 195 +++++++++-
>>>  12 files changed, 1368 insertions(+), 7 deletions(-)
>>>  create mode 100644 app/test-flow-perf/actions_gen.c
>>>  create mode 100644 app/test-flow-perf/actions_gen.h
>>>  create mode 100644 app/test-flow-perf/flow_gen.c
>>>  create mode 100644 app/test-flow-perf/flow_gen.h
>>>  create mode 100644 app/test-flow-perf/items_gen.c
>>>  create mode 100644 app/test-flow-perf/items_gen.h
>>>
>>> diff --git a/app/test-flow-perf/Makefile b/app/test-flow-perf/Makefile
>>> index db043c17a..4f2db7591 100644
>>> --- a/app/test-flow-perf/Makefile
>>> +++ b/app/test-flow-perf/Makefile
>>> @@ -16,6 +16,9 @@ CFLAGS += $(WERROR_FLAGS)
>>>  #
>>>  # all source are stored in SRCS-y
>>>  #
>>> +SRCS-y += actions_gen.c
>>> +SRCS-y += flow_gen.c
>>> +SRCS-y += items_gen.c
>>>  SRCS-y += main.c
>>>
>>>  include $(RTE_SDK)/mk/rte.app.mk
>>> diff --git a/app/test-flow-perf/actions_gen.c b/app/test-flow-
>> perf/actions_gen.c
>>> new file mode 100644
>>> index 000000000..16bb3cf20
>>> --- /dev/null
>>> +++ b/app/test-flow-perf/actions_gen.c
>>> @@ -0,0 +1,164 @@
>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>> + * Copyright 2020 Mellanox Technologies, Ltd
>>> + *
>>> + * The file contains the implementations of actions generators.
>>> + * Each generator is responsible for preparing it's action instance
>>> + * and initializing it with needed data.
>>> + */
>>> +
>>> +#include <sys/types.h>
>>> +#include <rte_malloc.h>
>>> +#include <rte_flow.h>
>>> +#include <rte_ethdev.h>
>>> +
>>> +#include "actions_gen.h"
>>> +#include "config.h"
>>> +
>>> +/* Storage for struct rte_flow_action_rss including external data. */
>>> +struct action_rss_data {
>>> +	struct rte_flow_action_rss conf;
>>> +	uint8_t key[40];
>>> +	uint16_t queue[128];
>>> +};
>>> +
>>> +void
>>> +add_mark(struct rte_flow_action *actions,
>>> +	uint8_t actions_counter)
>>> +{
>>> +	static struct rte_flow_action_mark mark_action;
>>
>> Function-local static variables a bit better than file-local
>> or global variable, but just a bit. See below.
>> At bare minimum it requires a check that the action is not
>> in use already. Same in many cases below.
> 
> Yes it's better,
> What you mean by " At bare minimum it requires a check that the action is not in use already"
> Can you please elaborate?

In theory, nothing prevents to call the function twice in
attempt to add MARK action twice. Right now design
design guaranees that it will be used only once since
bitmask is used to mark required actions.

However the design using bitmask is bad since it does
not allow to control order of actions in flow rule.
Same concern is applicable to items bitmask. Application
user should control order, not internal logic.

So, if the design drawback is fixed, something
should ensure that the action is not added twice.
It is not that important for MASK, since it is constant now,
but important for QUEUE.

>>
>>> +
>>> +	do {
>>> +		mark_action.id = MARK_ID;
>>> +	} while (0);
>>
>> Why do you use dummy do-while loop here? Many similar cases
>> below.
> 
> Sometimes, it create the flow before setting the correct id, I think it's compiler stuff
> So the dummy loop to make sure the compiler finish it's execution and make sure
> When the flow is created the action have correct value.

As far as I know, C does not work like this. It will not help
to cope with race conditions if your design has race
conditions. Right now it is hard to judge for me, but do/while
loop is definitely unnecessary/useless here.

>>
>>> +
>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_MARK;
>>> +	actions[actions_counter].conf = &mark_action;
>>> +}
>>> +
>>> +void
>>> +add_queue(struct rte_flow_action *actions,
>>> +	uint8_t actions_counter, uint16_t queue)
>>> +{
>>> +	static struct rte_flow_action_queue queue_action;
>>
>> It does not allow to use the action twice to deliver to
>> to queues.
> 
> Yes, it's needed only once

What if I would like to test a rule which delivers to two
QUEUEs using DUP? I'm just highlighting hard limitation in
the the design. Not a blocker itself, but should be taken
into account.

> 
>>
>>> +
>>> +	do {
>>> +		queue_action.index = queue;
>>> +	} while (0);
>>> +
>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_QUEUE;
>>> +	actions[actions_counter].conf = &queue_action;
>>> +}
>>> +
>>> +void
>>> +add_jump(struct rte_flow_action *actions,
>>> +	uint8_t actions_counter, uint16_t next_table)
>>> +{
>>> +	static struct rte_flow_action_jump jump_action;
>>> +
>>> +	do {
>>> +		jump_action.group = next_table;
>>> +	} while (0);
>>> +
>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_JUMP;
>>> +	actions[actions_counter].conf = &jump_action;
>>> +}
>>> +
>>> +void
>>> +add_rss(struct rte_flow_action *actions,
>>> +	uint8_t actions_counter, uint16_t *queues,
>>> +	uint16_t queues_number)
>>> +{
>>> +	static struct rte_flow_action_rss *rss_action;
>>> +	static struct action_rss_data *rss_data;
>>
>> It is better to add an empty line here to split static and
>> non-static variable and make it easy to catch the difference.
>>
>>> +	uint16_t queue;
>>> +
>>> +	rss_data = rte_malloc("rss_data",
>>> +		sizeof(struct action_rss_data), 0);
>>
>> Does it mean that the second invocation will make
>> a memory leak?
> 
> Not exactly, because the second invocation may have different queues and need
> To be used with different flow, so I think it's ok to re malloc it.

Sorry, but it does not answer my question. Is it a memory leak?
If yes, why it is not a problem here?

>>
>>> +
>>> +	if (rss_data == NULL)
>>> +		rte_exit(EXIT_FAILURE, "No Memory available!");
>>> +
>>> +	*rss_data = (struct action_rss_data){
>>> +		.conf = (struct rte_flow_action_rss){
>>> +			.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
>>> +			.level = 0,
>>> +			.types = GET_RSS_HF(),
>>> +			.key_len = sizeof(rss_data->key),
>>> +			.queue_num = queues_number,
>>> +			.key = rss_data->key,
>>> +			.queue = rss_data->queue,
>>> +		},
>>> +		.key = { 1 },
>>> +		.queue = { 0 },
>>> +	};
>>> +
>>> +	for (queue = 0; queue < queues_number; queue++)
>>> +		rss_data->queue[queue] = queues[queue];
>>> +
>>> +	rss_action = &rss_data->conf;
>>> +
>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_RSS;
>>> +	actions[actions_counter++].conf = rss_action;
>>> +}
>>> +
>>> +void
>>> +add_count(struct rte_flow_action *actions,
>>> +	uint8_t actions_counter)
>>> +{
>>> +	static struct rte_flow_action_count count_action;
>>
>> Again it means it is impossible to use the action twice in one
>> rule.
> 
> Yes,
> If I removed the static from the inner scope here,
> The action will be freed when we reach the end of the file,

Not sure that I understand.

> And since the design here to add the action into actions array to be used later in the creation,
> It will not have correct action in case of none-static in the inner scope.
> 
> If any action needs to have privilege to be called twice in single rule
> New support needs to be done.
> 
> This is the design for it, I'll add into the known limitation.
> 
>>> +static void
>>> +fill_items(struct rte_flow_item *items,
>>> +	uint32_t flow_items, uint32_t outer_ip_src)
>>
>> It looks like it is better to have the function inside
>> items_gen.c. It would allow to make all add_<item> functions
>> local to items_gen.c.
>>
>>> +{
>>> +	uint8_t items_counter = 0;
>>> +
>>> +	/* Support outer items up to tunnel layer only. */
>>> +
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META))
>>> +		add_meta_data(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG))
>>> +		add_meta_tag(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH))
>>> +		add_ether(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN))
>>> +		add_vlan(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4))
>>> +		add_ipv4(items, items_counter++, outer_ip_src);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6))
>>> +		add_ipv6(items, items_counter++, outer_ip_src);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP))
>>> +		add_tcp(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP))
>>> +		add_udp(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN))
>>> +		add_vxlan(items, items_counter++);
>>> +	if (flow_items &
>> FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE))
>>> +		add_vxlan_gpe(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE))
>>> +		add_gre(items, items_counter++);
>>> +	if (flow_items &
>> FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE))
>>> +		add_geneve(items, items_counter++);
>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP))
>>> +		add_gtp(items, items_counter++);
>>
>> It could be done in a loop: define an array of structures
>> FLOW_ITEM_MASK(proto) values and add function which should be
>> called. The only exception is IPv4/IPv6 which requires extra argument -
>> so all add callbacks should have add_data argument
>> which is a structure with possible tunings.
> 
> Ok, let me re-phrase to make sure I got it:
> 1- Move it to items_gen.c
> 2- Create array of structures with all items.
> The structure should be something like this:
>  static const struct items_dict {
>         const uint64_t mask;

const is not required here

>         void (*funct)(struct rte_flow_item *items,
>                         uint8_t items_counter, rte_be32_t src_ip);

The last argument should be a pointer to a structure
with src_ip field to make it generic and be able to
add more argument values there.

>         bool add_args;
>     }
> 3- I need to change the signature "parameters" to all other than ipv4,ipv6 to have another parameter to
> Have matched singture.
> 4- in none-ipv4-ipv6 item I need to call the RTE_SET_USED(src_ip); 
> 5- loop over the array to add items
> 
> Is this right?

Yes, with above corrections.

>>
>>> +
>>> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_END;
>>> +}
>>> +
>>> +static void
>>> +fill_actions(struct rte_flow_action *actions,
>>> +	uint32_t flow_actions, uint32_t counter, uint16_t next_table,
>>> +	uint16_t hairpinq)
>>
>>
>> It looks like it is better to have the function inside
>> actions_gen.c. It would allow to make all add_<action>
>> functions local to actions_gen.c.
> 
> Sure, but  this one will remain as is, since the actions have different signature and other actions when be added
> Will have more different parameters.

Sorry, still don't understand why.

>>
>>> +{
>>> +	uint8_t actions_counter = 0;
>>> +	uint16_t hairpin_queues[hairpinq];
>>> +	uint16_t queues[RXQ_NUM];
>>> +	uint16_t i;
>>> +
>>> +	/* None-fate actions */
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK))
>>> +		add_mark(actions, actions_counter++);
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT))
>>> +		add_count(actions, actions_counter++);
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META))
>>> +		add_set_meta(actions, actions_counter++);
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG))
>>> +		add_set_tag(actions, actions_counter++);
>>> +
>>> +	/* Fate actions */
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE))
>>> +		add_queue(actions, actions_counter++, counter %
>> RXQ_NUM);
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS)) {
>>> +		for (i = 0; i < RXQ_NUM; i++)
>>> +			queues[i] = i;
>>> +		add_rss(actions, actions_counter++, queues, RXQ_NUM);
>>> +	}
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP))
>>> +		add_jump(actions, actions_counter++, next_table);
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID))
>>> +		add_port_id(actions, actions_counter++);
>>> +	if (flow_actions &
>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP))
>>> +		add_drop(actions, actions_counter++);
>>> +	if (flow_actions & HAIRPIN_QUEUE_ACTION)
>>> +		add_queue(actions, actions_counter++,
>>> +			(counter % hairpinq) + RXQ_NUM);
>>> +	if (flow_actions & HAIRPIN_RSS_ACTION) {
>>> +		for (i = 0; i < hairpinq; i++)
>>> +			hairpin_queues[i] = i + RXQ_NUM;
>>> +		add_rss(actions, actions_counter++, hairpin_queues,
>> hairpinq);
>>> +	}
>>> +
>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_END;
>>> +}
>>> +
>>> +
>>> +void
>>> +add_ether(struct rte_flow_item *items, uint8_t items_counter)
>>> +{
>>> +	static struct rte_flow_item_eth eth_spec;
>>> +	static struct rte_flow_item_eth eth_mask;
>>
>> Same as actions, it does not allow to have two Eth items
>> in one rule. However, it looks like current design does not
>> cover it already on mask level.
> 
> Yes, indeed.
> Already add this in flow_perf.rst as known limitation.
> 
>>
>>>  		/* Control */
>>>  		{ "help",                       0, 0, 0 },
>>> +		{ "flows-count",                1, 0, 0 },
>>> +		{ "dump-iterations",            0, 0, 0 },
>>
>> It looks like above it the path which should be defined
>> here.
> 
> I'm not following, what do you mean?

See below. I was just trying to separately static and
dynamically added parts.

>>
>>> +		/* Attributes */
>>> +		{ "ingress",                    0, 0, 0 },
>>> +		{ "egress",                     0, 0, 0 },
>>> +		{ "transfer",                   0, 0, 0 },
>>> +		{ "group",                      1, 0, 0 },
>>> +		/* Items */
>>> +		{ "ether",                      0, 0, 0 },
>>> +		{ "vlan",                       0, 0, 0 },
>>> +		{ "ipv4",                       0, 0, 0 },
>>> +		{ "ipv6",                       0, 0, 0 },
>>> +		{ "tcp",                        0, 0, 0 },
>>> +		{ "udp",                        0, 0, 0 },
>>> +		{ "vxlan",                      0, 0, 0 },
>>> +		{ "vxlan-gpe",                  0, 0, 0 },
>>> +		{ "gre",                        0, 0, 0 },
>>> +		{ "geneve",                     0, 0, 0 },
>>> +		{ "gtp",                        0, 0, 0 },
>>> +		{ "meta",                       0, 0, 0 },
>>> +		{ "tag",                        0, 0, 0 },
>>> +		/* Actions */
>>> +		{ "port-id",                    0, 0, 0 },
>>> +		{ "rss",                        0, 0, 0 },
>>> +		{ "queue",                      0, 0, 0 },
>>> +		{ "jump",                       0, 0, 0 },
>>> +		{ "mark",                       0, 0, 0 },
>>> +		{ "count",                      0, 0, 0 },
>>> +		{ "set-meta",                   0, 0, 0 },
>>> +		{ "set-tag",                    0, 0, 0 },
>>> +		{ "drop",                       0, 0, 0 },
>>> +		{ "hairpin-queue",              1, 0, 0 },
>>> +		{ "hairpin-rss",                1, 0, 0 },
>>
>> This part should be added by code which iterates by
>> flow_options. I.e. allocate lgopts dynamically, copy
>> static options there by memcpy() and add dynamic as
>> described above. May be flow_options require extra
>> field  'has_arg'.
> 
> Regard this one,
> Some option have special code, like hairpin, group and control, so even with has_arg
> It cannot be looped in pretty way, I tried it and it became ugly, I prefer to leave this as it.
> I don't think it's critical to get rid of it? What do you think?

Yes, it is not critical, but looking at above array, I don't
understand how it could become ugly.
Wisam Jaddo June 2, 2020, 12:43 p.m. UTC | #4
Hey Andrew,
Sorry for late response "we were in holiday".
Please see comments inline, anything that I didn't reply on, will fixed in next version.
I'll wait for your reply before sending next version in order to reduce review and don't encounter
Duplicate comments.

Thanks for review up to now really appreciate it.

>-----Original Message-----
>From: Andrew Rybchenko <arybchenko@solarflare.com>
>Sent: Tuesday, May 12, 2020 2:08 PM
>To: Wisam Monther <wisamm@mellanox.com>; dev@dpdk.org; Jack Min
><jackmin@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
>jerinjacobk@gmail.com; ajit.khaparde@broadcom.com
>Subject: Re: [dpdk-dev] [PATCH v6 2/5] app/flow-perf: add insertion rate
>calculation
>
>On 5/12/20 1:34 PM, Wisam Monther wrote:
>>> -----Original Message-----
>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>>> Sent: Monday, May 11, 2020 3:05 PM
>>> To: Wisam Monther <wisamm@mellanox.com>; dev@dpdk.org; Jack Min
>>> <jackmin@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
>>> jerinjacobk@gmail.com; ajit.khaparde@broadcom.com
>>> Subject: Re: [PATCH v6 2/5] app/flow-perf: add insertion rate
>>> calculation
>>>
>>> On 5/11/20 2:09 PM, Wisam Jaddo wrote:
>>>> Add insertion rate calculation feature into flow performance
>>>> application.
>>>>
>>>> The application now provide the ability to test insertion rate of
>>>> specific rte_flow rule, by stressing it to the NIC, and calculate
>>>> the insertion rate.
>>>>
>>>> The application offers some options in the command line, to
>>>> configure which rule to apply.
>>>>
>>>> After that the application will start producing rules with same
>>>> pattern but increasing the outer IP source address by 1 each time,
>>>> thus it will give different flow each time, and all other items will
>>>> have open masks.
>>>>
>>>> The current design have single core insertion rate.
>>>> In the future we may have a multi core insertion rate measurement
>>>> support in the app.
>>>>
>>>> Signed-off-by: Wisam Jaddo <wisamm@mellanox.com>
>>>> ---
>>>>  app/test-flow-perf/Makefile            |   3 +
>>>>  app/test-flow-perf/actions_gen.c       | 164 +++++++++
>>>>  app/test-flow-perf/actions_gen.h       |  29 ++
>>>>  app/test-flow-perf/config.h            |  16 +
>>>>  app/test-flow-perf/flow_gen.c          | 145 ++++++++
>>>>  app/test-flow-perf/flow_gen.h          |  37 ++
>>>>  app/test-flow-perf/items_gen.c         | 277 +++++++++++++++
>>>>  app/test-flow-perf/items_gen.h         |  31 ++
>>>>  app/test-flow-perf/main.c              | 472 ++++++++++++++++++++++++-
>>>>  app/test-flow-perf/meson.build         |   3 +
>>>>  doc/guides/rel_notes/release_20_05.rst |   3 +
>>>>  doc/guides/tools/flow-perf.rst         | 195 +++++++++-
>>>>  12 files changed, 1368 insertions(+), 7 deletions(-)  create mode
>>>> 100644 app/test-flow-perf/actions_gen.c  create mode 100644
>>>> app/test-flow-perf/actions_gen.h  create mode 100644
>>>> app/test-flow-perf/flow_gen.c  create mode 100644
>>>> app/test-flow-perf/flow_gen.h  create mode 100644
>>>> app/test-flow-perf/items_gen.c  create mode 100644
>>>> app/test-flow-perf/items_gen.h
>>>>
>>>> diff --git a/app/test-flow-perf/Makefile
>>>> b/app/test-flow-perf/Makefile index db043c17a..4f2db7591 100644
>>>> --- a/app/test-flow-perf/Makefile
>>>> +++ b/app/test-flow-perf/Makefile
>>>> @@ -16,6 +16,9 @@ CFLAGS += $(WERROR_FLAGS)  #  # all source are
>>>> stored in SRCS-y  #
>>>> +SRCS-y += actions_gen.c
>>>> +SRCS-y += flow_gen.c
>>>> +SRCS-y += items_gen.c
>>>>  SRCS-y += main.c
>>>>
>>>>  include $(RTE_SDK)/mk/rte.app.mk
>>>> diff --git a/app/test-flow-perf/actions_gen.c b/app/test-flow-
>>> perf/actions_gen.c
>>>> new file mode 100644
>>>> index 000000000..16bb3cf20
>>>> --- /dev/null
>>>> +++ b/app/test-flow-perf/actions_gen.c
>>>> @@ -0,0 +1,164 @@
>>>> +/* SPDX-License-Identifier: BSD-3-Clause
>>>> + * Copyright 2020 Mellanox Technologies, Ltd
>>>> + *
>>>> + * The file contains the implementations of actions generators.
>>>> + * Each generator is responsible for preparing it's action instance
>>>> + * and initializing it with needed data.
>>>> + */
>>>> +
>>>> +#include <sys/types.h>
>>>> +#include <rte_malloc.h>
>>>> +#include <rte_flow.h>
>>>> +#include <rte_ethdev.h>
>>>> +
>>>> +#include "actions_gen.h"
>>>> +#include "config.h"
>>>> +
>>>> +/* Storage for struct rte_flow_action_rss including external data.
>>>> +*/ struct action_rss_data {
>>>> +	struct rte_flow_action_rss conf;
>>>> +	uint8_t key[40];
>>>> +	uint16_t queue[128];
>>>> +};
>>>> +
>>>> +void
>>>> +add_mark(struct rte_flow_action *actions,
>>>> +	uint8_t actions_counter)
>>>> +{
>>>> +	static struct rte_flow_action_mark mark_action;
>>>
>>> Function-local static variables a bit better than file-local or
>>> global variable, but just a bit. See below.
>>> At bare minimum it requires a check that the action is not in use
>>> already. Same in many cases below.
>>
>> Yes it's better,
>> What you mean by " At bare minimum it requires a check that the action is
>not in use already"
>> Can you please elaborate?
>
>In theory, nothing prevents to call the function twice in attempt to add MARK
>action twice. Right now design design guaranees that it will be used only once
>since bitmask is used to mark required actions.
>
>However the design using bitmask is bad since it does not allow to control
>order of actions in flow rule.
>Same concern is applicable to items bitmask. Application user should control
>order, not internal logic.
>
>So, if the design drawback is fixed, something should ensure that the action is
>not added twice.
>It is not that important for MASK, since it is constant now, but important for
>QUEUE.
>

But this application is used for performance stuff aka insertion & pps measurements.
It's not PMD to check what's valid or not, even in future if the bit wise design changed I think
We will create the flow items/actions according to the user, it's the user responsibility to enter valid flow structre
Otherwise the PMD should reject the rte flow, for example: due to two QUEUE actions.
If we are gonna add constrains for such stuff we will need to write many condition including items/actions
While those checks are PMD responsibility.
What do you think?

>>>
>>>> +
>>>> +	do {
>>>> +		mark_action.id = MARK_ID;
>>>> +	} while (0);
>>>
>>> Why do you use dummy do-while loop here? Many similar cases below.
>>
>> Sometimes, it create the flow before setting the correct id, I think
>> it's compiler stuff So the dummy loop to make sure the compiler finish
>> it's execution and make sure When the flow is created the action have
>correct value.
>
>As far as I know, C does not work like this. It will not help to cope with race
>conditions if your design has race conditions. Right now it is hard to judge for
>me, but do/while loop is definitely unnecessary/useless here.
>

Don't know, but I did it with and without the loop the value changes,
Without the loop the first flow will have wrong value at least. 
With break point here or with loop the flow will have correct value.

Don't know really what to do here :/

>>>
>>>> +
>>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_MARK;
>>>> +	actions[actions_counter].conf = &mark_action; }
>>>> +
>>>> +void
>>>> +add_queue(struct rte_flow_action *actions,
>>>> +	uint8_t actions_counter, uint16_t queue) {
>>>> +	static struct rte_flow_action_queue queue_action;
>>>
>>> It does not allow to use the action twice to deliver to to queues.
>>
>> Yes, it's needed only once
>
>What if I would like to test a rule which delivers to two QUEUEs using DUP? I'm
>just highlighting hard limitation in the the design. Not a blocker itself, but
>should be taken into account.
>

In future we can change the bit wise design to allow such things,
I'll take this responsibility on my self when adding the inner items as well,
I'll provide new design in future for this area to allow the user to create as many as he want
From items and actions even if it's dup.

>>
>>>
>>>> +
>>>> +	do {
>>>> +		queue_action.index = queue;
>>>> +	} while (0);
>>>> +
>>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_QUEUE;
>>>> +	actions[actions_counter].conf = &queue_action; }
>>>> +
>>>> +void
>>>> +add_jump(struct rte_flow_action *actions,
>>>> +	uint8_t actions_counter, uint16_t next_table) {
>>>> +	static struct rte_flow_action_jump jump_action;
>>>> +
>>>> +	do {
>>>> +		jump_action.group = next_table;
>>>> +	} while (0);
>>>> +
>>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_JUMP;
>>>> +	actions[actions_counter].conf = &jump_action; }
>>>> +
>>>> +void
>>>> +add_rss(struct rte_flow_action *actions,
>>>> +	uint8_t actions_counter, uint16_t *queues,
>>>> +	uint16_t queues_number)
>>>> +{
>>>> +	static struct rte_flow_action_rss *rss_action;
>>>> +	static struct action_rss_data *rss_data;
>>>
>>> It is better to add an empty line here to split static and non-static
>>> variable and make it easy to catch the difference.
>>>
>>>> +	uint16_t queue;
>>>> +
>>>> +	rss_data = rte_malloc("rss_data",
>>>> +		sizeof(struct action_rss_data), 0);
>>>
>>> Does it mean that the second invocation will make a memory leak?
>>
>> Not exactly, because the second invocation may have different queues
>> and need To be used with different flow, so I think it's ok to re malloc it.
>
>Sorry, but it does not answer my question. Is it a memory leak?
>If yes, why it is not a problem here?
>

No I don't think such thing is memory leak,
Why?

You see at the end of each add_function() we have a line like:
actions[actions_counter++].conf = rss_action;

So I'm actually still having this memory in track for the rte flow,
Next call for same function will malloc another memory but also will be kept in track
In the items list passed in parameter.
So I don't see memory leak here.

>>>
>>>> +
>>>> +	if (rss_data == NULL)
>>>> +		rte_exit(EXIT_FAILURE, "No Memory available!");
>>>> +
>>>> +	*rss_data = (struct action_rss_data){
>>>> +		.conf = (struct rte_flow_action_rss){
>>>> +			.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
>>>> +			.level = 0,
>>>> +			.types = GET_RSS_HF(),
>>>> +			.key_len = sizeof(rss_data->key),
>>>> +			.queue_num = queues_number,
>>>> +			.key = rss_data->key,
>>>> +			.queue = rss_data->queue,
>>>> +		},
>>>> +		.key = { 1 },
>>>> +		.queue = { 0 },
>>>> +	};
>>>> +
>>>> +	for (queue = 0; queue < queues_number; queue++)
>>>> +		rss_data->queue[queue] = queues[queue];
>>>> +
>>>> +	rss_action = &rss_data->conf;
>>>> +
>>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_RSS;
>>>> +	actions[actions_counter++].conf = rss_action; }
>>>> +
>>>> +void
>>>> +add_count(struct rte_flow_action *actions,
>>>> +	uint8_t actions_counter)
>>>> +{
>>>> +	static struct rte_flow_action_count count_action;
>>>
>>> Again it means it is impossible to use the action twice in one rule.
>>
>> Yes,
>> If I removed the static from the inner scope here, The action will be
>> freed when we reach the end of the file,
>
>Not sure that I understand.

since the design here to add the action into actions array to be
used later in the creation, It will not have correct action in case of none-
static in the inner scope since the memory will be freed at the end of the function for the inner
variables.

>
>> And since the design here to add the action into actions array to be
>> used later in the creation, It will not have correct action in case of none-
>static in the inner scope.
>>
>> If any action needs to have privilege to be called twice in single
>> rule New support needs to be done.
>>
>> This is the design for it, I'll add into the known limitation.
>>
>>>> +static void
>>>> +fill_items(struct rte_flow_item *items,
>>>> +	uint32_t flow_items, uint32_t outer_ip_src)
>>>
>>> It looks like it is better to have the function inside items_gen.c.
>>> It would allow to make all add_<item> functions local to items_gen.c.
>>>
>>>> +{
>>>> +	uint8_t items_counter = 0;
>>>> +
>>>> +	/* Support outer items up to tunnel layer only. */
>>>> +
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META))
>>>> +		add_meta_data(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG))
>>>> +		add_meta_tag(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH))
>>>> +		add_ether(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN))
>>>> +		add_vlan(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4))
>>>> +		add_ipv4(items, items_counter++, outer_ip_src);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6))
>>>> +		add_ipv6(items, items_counter++, outer_ip_src);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP))
>>>> +		add_tcp(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP))
>>>> +		add_udp(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN))
>>>> +		add_vxlan(items, items_counter++);
>>>> +	if (flow_items &
>>> FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE))
>>>> +		add_vxlan_gpe(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE))
>>>> +		add_gre(items, items_counter++);
>>>> +	if (flow_items &
>>> FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE))
>>>> +		add_geneve(items, items_counter++);
>>>> +	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP))
>>>> +		add_gtp(items, items_counter++);
>>>
>>> It could be done in a loop: define an array of structures
>>> FLOW_ITEM_MASK(proto) values and add function which should be called.
>>> The only exception is IPv4/IPv6 which requires extra argument - so
>>> all add callbacks should have add_data argument which is a structure
>>> with possible tunings.
>>
>> Ok, let me re-phrase to make sure I got it:
>> 1- Move it to items_gen.c
>> 2- Create array of structures with all items.
>> The structure should be something like this:
>>  static const struct items_dict {
>>         const uint64_t mask;
>
>const is not required here
>
>>         void (*funct)(struct rte_flow_item *items,
>>                         uint8_t items_counter, rte_be32_t src_ip);
>
>The last argument should be a pointer to a structure with src_ip field to make
>it generic and be able to add more argument values there.
>
>>         bool add_args;
>>     }
>> 3- I need to change the signature "parameters" to all other than
>> ipv4,ipv6 to have another parameter to Have matched singture.
>> 4- in none-ipv4-ipv6 item I need to call the RTE_SET_USED(src_ip);
>> 5- loop over the array to add items
>>
>> Is this right?
>
>Yes, with above corrections.
>
>>>
>>>> +
>>>> +	items[items_counter].type = RTE_FLOW_ITEM_TYPE_END; }
>>>> +
>>>> +static void
>>>> +fill_actions(struct rte_flow_action *actions,
>>>> +	uint32_t flow_actions, uint32_t counter, uint16_t next_table,
>>>> +	uint16_t hairpinq)
>>>
>>>
>>> It looks like it is better to have the function inside actions_gen.c.
>>> It would allow to make all add_<action> functions local to
>>> actions_gen.c.
>>
>> Sure, but  this one will remain as is, since the actions have
>> different signature and other actions when be added Will have more
>different parameters.
>
>Sorry, still don't understand why.
>
>>>
>>>> +{
>>>> +	uint8_t actions_counter = 0;
>>>> +	uint16_t hairpin_queues[hairpinq];
>>>> +	uint16_t queues[RXQ_NUM];
>>>> +	uint16_t i;
>>>> +
>>>> +	/* None-fate actions */
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK))
>>>> +		add_mark(actions, actions_counter++);
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT))
>>>> +		add_count(actions, actions_counter++);
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META))
>>>> +		add_set_meta(actions, actions_counter++);
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG))
>>>> +		add_set_tag(actions, actions_counter++);
>>>> +
>>>> +	/* Fate actions */
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE))
>>>> +		add_queue(actions, actions_counter++, counter %
>>> RXQ_NUM);
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS)) {
>>>> +		for (i = 0; i < RXQ_NUM; i++)
>>>> +			queues[i] = i;
>>>> +		add_rss(actions, actions_counter++, queues, RXQ_NUM);
>>>> +	}
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP))
>>>> +		add_jump(actions, actions_counter++, next_table);
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID))
>>>> +		add_port_id(actions, actions_counter++);
>>>> +	if (flow_actions &
>>> FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP))
>>>> +		add_drop(actions, actions_counter++);
>>>> +	if (flow_actions & HAIRPIN_QUEUE_ACTION)
>>>> +		add_queue(actions, actions_counter++,
>>>> +			(counter % hairpinq) + RXQ_NUM);
>>>> +	if (flow_actions & HAIRPIN_RSS_ACTION) {
>>>> +		for (i = 0; i < hairpinq; i++)
>>>> +			hairpin_queues[i] = i + RXQ_NUM;
>>>> +		add_rss(actions, actions_counter++, hairpin_queues,
>>> hairpinq);
>>>> +	}
>>>> +
>>>> +	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_END; }
>>>> +
>>>> +
>>>> +void
>>>> +add_ether(struct rte_flow_item *items, uint8_t items_counter) {
>>>> +	static struct rte_flow_item_eth eth_spec;
>>>> +	static struct rte_flow_item_eth eth_mask;
>>>
>>> Same as actions, it does not allow to have two Eth items in one rule.
>>> However, it looks like current design does not cover it already on
>>> mask level.
>>
>> Yes, indeed.
>> Already add this in flow_perf.rst as known limitation.
>>
>>>
>>>>  		/* Control */
>>>>  		{ "help",                       0, 0, 0 },
>>>> +		{ "flows-count",                1, 0, 0 },
>>>> +		{ "dump-iterations",            0, 0, 0 },
>>>
>>> It looks like above it the path which should be defined here.
>>
>> I'm not following, what do you mean?
>
>See below. I was just trying to separately static and dynamically added parts.

But to do something like this I need to define the size of this statically,
And anyone want to add any option/item or action later on he need to do modification in size here
And in loop perhaps.

I still see this is better and easier to maintain.
What do you think?

>
>>>
>>>> +		/* Attributes */
>>>> +		{ "ingress",                    0, 0, 0 },
>>>> +		{ "egress",                     0, 0, 0 },
>>>> +		{ "transfer",                   0, 0, 0 },
>>>> +		{ "group",                      1, 0, 0 },
>>>> +		/* Items */
>>>> +		{ "ether",                      0, 0, 0 },
>>>> +		{ "vlan",                       0, 0, 0 },
>>>> +		{ "ipv4",                       0, 0, 0 },
>>>> +		{ "ipv6",                       0, 0, 0 },
>>>> +		{ "tcp",                        0, 0, 0 },
>>>> +		{ "udp",                        0, 0, 0 },
>>>> +		{ "vxlan",                      0, 0, 0 },
>>>> +		{ "vxlan-gpe",                  0, 0, 0 },
>>>> +		{ "gre",                        0, 0, 0 },
>>>> +		{ "geneve",                     0, 0, 0 },
>>>> +		{ "gtp",                        0, 0, 0 },
>>>> +		{ "meta",                       0, 0, 0 },
>>>> +		{ "tag",                        0, 0, 0 },
>>>> +		/* Actions */
>>>> +		{ "port-id",                    0, 0, 0 },
>>>> +		{ "rss",                        0, 0, 0 },
>>>> +		{ "queue",                      0, 0, 0 },
>>>> +		{ "jump",                       0, 0, 0 },
>>>> +		{ "mark",                       0, 0, 0 },
>>>> +		{ "count",                      0, 0, 0 },
>>>> +		{ "set-meta",                   0, 0, 0 },
>>>> +		{ "set-tag",                    0, 0, 0 },
>>>> +		{ "drop",                       0, 0, 0 },
>>>> +		{ "hairpin-queue",              1, 0, 0 },
>>>> +		{ "hairpin-rss",                1, 0, 0 },
>>>
>>> This part should be added by code which iterates by flow_options.
>>> I.e. allocate lgopts dynamically, copy static options there by
>>> memcpy() and add dynamic as described above. May be flow_options
>>> require extra field  'has_arg'.
>>
>> Regard this one,
>> Some option have special code, like hairpin, group and control, so
>> even with has_arg It cannot be looped in pretty way, I tried it and it became
>ugly, I prefer to leave this as it.
>> I don't think it's critical to get rid of it? What do you think?
>
>Yes, it is not critical, but looking at above array, I don't understand how it could
>become ugly.
diff mbox series

Patch

diff --git a/app/test-flow-perf/Makefile b/app/test-flow-perf/Makefile
index db043c17a..4f2db7591 100644
--- a/app/test-flow-perf/Makefile
+++ b/app/test-flow-perf/Makefile
@@ -16,6 +16,9 @@  CFLAGS += $(WERROR_FLAGS)
 #
 # all source are stored in SRCS-y
 #
+SRCS-y += actions_gen.c
+SRCS-y += flow_gen.c
+SRCS-y += items_gen.c
 SRCS-y += main.c
 
 include $(RTE_SDK)/mk/rte.app.mk
diff --git a/app/test-flow-perf/actions_gen.c b/app/test-flow-perf/actions_gen.c
new file mode 100644
index 000000000..16bb3cf20
--- /dev/null
+++ b/app/test-flow-perf/actions_gen.c
@@ -0,0 +1,164 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ *
+ * The file contains the implementations of actions generators.
+ * Each generator is responsible for preparing it's action instance
+ * and initializing it with needed data.
+ */
+
+#include <sys/types.h>
+#include <rte_malloc.h>
+#include <rte_flow.h>
+#include <rte_ethdev.h>
+
+#include "actions_gen.h"
+#include "config.h"
+
+/* Storage for struct rte_flow_action_rss including external data. */
+struct action_rss_data {
+	struct rte_flow_action_rss conf;
+	uint8_t key[40];
+	uint16_t queue[128];
+};
+
+void
+add_mark(struct rte_flow_action *actions,
+	uint8_t actions_counter)
+{
+	static struct rte_flow_action_mark mark_action;
+
+	do {
+		mark_action.id = MARK_ID;
+	} while (0);
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_MARK;
+	actions[actions_counter].conf = &mark_action;
+}
+
+void
+add_queue(struct rte_flow_action *actions,
+	uint8_t actions_counter, uint16_t queue)
+{
+	static struct rte_flow_action_queue queue_action;
+
+	do {
+		queue_action.index = queue;
+	} while (0);
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_QUEUE;
+	actions[actions_counter].conf = &queue_action;
+}
+
+void
+add_jump(struct rte_flow_action *actions,
+	uint8_t actions_counter, uint16_t next_table)
+{
+	static struct rte_flow_action_jump jump_action;
+
+	do {
+		jump_action.group = next_table;
+	} while (0);
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_JUMP;
+	actions[actions_counter].conf = &jump_action;
+}
+
+void
+add_rss(struct rte_flow_action *actions,
+	uint8_t actions_counter, uint16_t *queues,
+	uint16_t queues_number)
+{
+	static struct rte_flow_action_rss *rss_action;
+	static struct action_rss_data *rss_data;
+	uint16_t queue;
+
+	rss_data = rte_malloc("rss_data",
+		sizeof(struct action_rss_data), 0);
+
+	if (rss_data == NULL)
+		rte_exit(EXIT_FAILURE, "No Memory available!");
+
+	*rss_data = (struct action_rss_data){
+		.conf = (struct rte_flow_action_rss){
+			.func = RTE_ETH_HASH_FUNCTION_DEFAULT,
+			.level = 0,
+			.types = GET_RSS_HF(),
+			.key_len = sizeof(rss_data->key),
+			.queue_num = queues_number,
+			.key = rss_data->key,
+			.queue = rss_data->queue,
+		},
+		.key = { 1 },
+		.queue = { 0 },
+	};
+
+	for (queue = 0; queue < queues_number; queue++)
+		rss_data->queue[queue] = queues[queue];
+
+	rss_action = &rss_data->conf;
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_RSS;
+	actions[actions_counter++].conf = rss_action;
+}
+
+void
+add_set_meta(struct rte_flow_action *actions,
+	uint8_t actions_counter)
+{
+	static struct rte_flow_action_set_meta meta_action;
+
+	do {
+		meta_action.data = RTE_BE32(META_DATA);
+		meta_action.mask = RTE_BE32(0xffffffff);
+	} while (0);
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_SET_META;
+	actions[actions_counter++].conf = &meta_action;
+}
+
+void
+add_set_tag(struct rte_flow_action *actions,
+	uint8_t actions_counter)
+{
+	static struct rte_flow_action_set_tag tag_action;
+
+	do {
+		tag_action.data = RTE_BE32(META_DATA);
+		tag_action.mask = RTE_BE32(0xffffffff);
+		tag_action.index = TAG_INDEX;
+	} while (0);
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_SET_TAG;
+	actions[actions_counter++].conf = &tag_action;
+}
+
+void
+add_port_id(struct rte_flow_action *actions,
+	uint8_t actions_counter)
+{
+	static struct rte_flow_action_port_id port_id;
+
+	do {
+		port_id.id = PORT_ID_DST;
+	} while (0);
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_PORT_ID;
+	actions[actions_counter++].conf = &port_id;
+}
+
+void
+add_drop(struct rte_flow_action *actions,
+	uint8_t actions_counter)
+{
+	actions[actions_counter++].type = RTE_FLOW_ACTION_TYPE_DROP;
+}
+
+void
+add_count(struct rte_flow_action *actions,
+	uint8_t actions_counter)
+{
+	static struct rte_flow_action_count count_action;
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_COUNT;
+	actions[actions_counter++].conf = &count_action;
+}
diff --git a/app/test-flow-perf/actions_gen.h b/app/test-flow-perf/actions_gen.h
new file mode 100644
index 000000000..bc7d084f3
--- /dev/null
+++ b/app/test-flow-perf/actions_gen.h
@@ -0,0 +1,29 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ *
+ * This file contains the functions definitions to
+ * generate each supported action.
+ */
+
+#ifndef FLOW_PERF_ACTION_GEN
+#define FLOW_PERF_ACTION_GEN
+
+#include <rte_flow.h>
+
+#include "config.h"
+
+void add_mark(struct rte_flow_action *actions, uint8_t actions_counter);
+void add_queue(struct rte_flow_action *actions,
+	uint8_t actions_counter, uint16_t queue);
+void add_jump(struct rte_flow_action *actions,
+	uint8_t actions_counter, uint16_t next_table);
+void add_rss(struct rte_flow_action *actions,
+	uint8_t actions_counter, uint16_t *queues,
+	uint16_t queues_number);
+void add_set_meta(struct rte_flow_action *actions, uint8_t actions_counter);
+void add_set_tag(struct rte_flow_action *actions, uint8_t actions_counter);
+void add_port_id(struct rte_flow_action *actions, uint8_t actions_counter);
+void add_drop(struct rte_flow_action *actions, uint8_t actions_counter);
+void add_count(struct rte_flow_action *actions, uint8_t actions_counter);
+
+#endif /* FLOW_PERF_ACTION_GEN */
diff --git a/app/test-flow-perf/config.h b/app/test-flow-perf/config.h
index cf41e0345..f16d0de77 100644
--- a/app/test-flow-perf/config.h
+++ b/app/test-flow-perf/config.h
@@ -2,6 +2,7 @@ 
  * Copyright 2020 Mellanox Technologies, Ltd
  */
 
+#define FLOW_ITEM_MASK(_x) (UINT64_C(1) << _x)
 #define GET_RSS_HF() (ETH_RSS_IP | ETH_RSS_TCP)
 
 /* Configuration */
@@ -12,3 +13,18 @@ 
 #define MBUF_CACHE_SIZE 512
 #define NR_RXD  256
 #define NR_TXD  256
+
+/* Items/Actions parameters */
+#define JUMP_ACTION_TABLE 2
+#define VLAN_VALUE 1
+#define VNI_VALUE 1
+#define GRE_PROTO  0x6558
+#define META_DATA 1
+#define TAG_INDEX 0
+#define PORT_ID_DST 1
+#define MARK_ID 1
+#define TEID_VALUE 1
+
+/* Flow items/acctions max size */
+#define MAX_ITEMS_NUM 32
+#define MAX_ACTIONS_NUM 32
diff --git a/app/test-flow-perf/flow_gen.c b/app/test-flow-perf/flow_gen.c
new file mode 100644
index 000000000..50066d99e
--- /dev/null
+++ b/app/test-flow-perf/flow_gen.c
@@ -0,0 +1,145 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ *
+ * The file contains the implementations of the method to
+ * fill items, actions & attributes in their corresponding
+ * arrays, and then generate rte_flow rule.
+ *
+ * After the generation. The rule goes to validation then
+ * creation state and then return the results.
+ */
+
+#include <stdint.h>
+
+#include "flow_gen.h"
+#include "items_gen.h"
+#include "actions_gen.h"
+#include "config.h"
+
+static void
+fill_attributes(struct rte_flow_attr *attr,
+	uint32_t flow_attrs, uint16_t group)
+{
+	if (flow_attrs & INGRESS)
+		attr->ingress = 1;
+	if (flow_attrs & EGRESS)
+		attr->egress = 1;
+	if (flow_attrs & TRANSFER)
+		attr->transfer = 1;
+	attr->group = group;
+}
+
+static void
+fill_items(struct rte_flow_item *items,
+	uint32_t flow_items, uint32_t outer_ip_src)
+{
+	uint8_t items_counter = 0;
+
+	/* Support outer items up to tunnel layer only. */
+
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META))
+		add_meta_data(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG))
+		add_meta_tag(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH))
+		add_ether(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN))
+		add_vlan(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4))
+		add_ipv4(items, items_counter++, outer_ip_src);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6))
+		add_ipv6(items, items_counter++, outer_ip_src);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP))
+		add_tcp(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP))
+		add_udp(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN))
+		add_vxlan(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE))
+		add_vxlan_gpe(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE))
+		add_gre(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE))
+		add_geneve(items, items_counter++);
+	if (flow_items & FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP))
+		add_gtp(items, items_counter++);
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_END;
+}
+
+static void
+fill_actions(struct rte_flow_action *actions,
+	uint32_t flow_actions, uint32_t counter, uint16_t next_table,
+	uint16_t hairpinq)
+{
+	uint8_t actions_counter = 0;
+	uint16_t hairpin_queues[hairpinq];
+	uint16_t queues[RXQ_NUM];
+	uint16_t i;
+
+	/* None-fate actions */
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK))
+		add_mark(actions, actions_counter++);
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT))
+		add_count(actions, actions_counter++);
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META))
+		add_set_meta(actions, actions_counter++);
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG))
+		add_set_tag(actions, actions_counter++);
+
+	/* Fate actions */
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE))
+		add_queue(actions, actions_counter++, counter % RXQ_NUM);
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS)) {
+		for (i = 0; i < RXQ_NUM; i++)
+			queues[i] = i;
+		add_rss(actions, actions_counter++, queues, RXQ_NUM);
+	}
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP))
+		add_jump(actions, actions_counter++, next_table);
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID))
+		add_port_id(actions, actions_counter++);
+	if (flow_actions & FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP))
+		add_drop(actions, actions_counter++);
+	if (flow_actions & HAIRPIN_QUEUE_ACTION)
+		add_queue(actions, actions_counter++,
+			(counter % hairpinq) + RXQ_NUM);
+	if (flow_actions & HAIRPIN_RSS_ACTION) {
+		for (i = 0; i < hairpinq; i++)
+			hairpin_queues[i] = i + RXQ_NUM;
+		add_rss(actions, actions_counter++, hairpin_queues, hairpinq);
+	}
+
+	actions[actions_counter].type = RTE_FLOW_ACTION_TYPE_END;
+}
+
+struct rte_flow *
+generate_flow(uint16_t port_id,
+	uint16_t group,
+	uint32_t flow_attrs,
+	uint32_t flow_items,
+	uint32_t flow_actions,
+	uint16_t next_table,
+	uint32_t outer_ip_src,
+	uint16_t hairpinq,
+	struct rte_flow_error *error)
+{
+	struct rte_flow_attr attr;
+	struct rte_flow_item items[MAX_ITEMS_NUM];
+	struct rte_flow_action actions[MAX_ACTIONS_NUM];
+	struct rte_flow *flow = NULL;
+
+	memset(items, 0, sizeof(items));
+	memset(actions, 0, sizeof(actions));
+	memset(&attr, 0, sizeof(struct rte_flow_attr));
+
+	fill_attributes(&attr, flow_attrs, group);
+
+	fill_actions(actions, flow_actions,
+		outer_ip_src, next_table, hairpinq);
+
+	fill_items(items, flow_items, outer_ip_src);
+
+	flow = rte_flow_create(port_id, &attr, items, actions, error);
+	return flow;
+}
diff --git a/app/test-flow-perf/flow_gen.h b/app/test-flow-perf/flow_gen.h
new file mode 100644
index 000000000..6b30a4ae2
--- /dev/null
+++ b/app/test-flow-perf/flow_gen.h
@@ -0,0 +1,37 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ *
+ * This file contains the items, actions and attributes
+ * definition. And the methods to prepare and fill items,
+ * actions and attributes to generate rte_flow rule.
+ */
+
+#ifndef FLOW_PERF_FLOW_GEN
+#define FLOW_PERF_FLOW_GEN
+
+#include <stdint.h>
+#include <rte_flow.h>
+
+#include "config.h"
+
+/* Actions */
+#define HAIRPIN_QUEUE_ACTION FLOW_ITEM_MASK(0)
+#define HAIRPIN_RSS_ACTION   FLOW_ITEM_MASK(1)
+
+/* Attributes */
+#define INGRESS              FLOW_ITEM_MASK(0)
+#define EGRESS               FLOW_ITEM_MASK(1)
+#define TRANSFER             FLOW_ITEM_MASK(2)
+
+struct rte_flow *
+generate_flow(uint16_t port_id,
+	uint16_t group,
+	uint32_t flow_attrs,
+	uint32_t flow_items,
+	uint32_t flow_actions,
+	uint16_t next_table,
+	uint32_t outer_ip_src,
+	uint16_t hairpinq,
+	struct rte_flow_error *error);
+
+#endif /* FLOW_PERF_FLOW_GEN */
diff --git a/app/test-flow-perf/items_gen.c b/app/test-flow-perf/items_gen.c
new file mode 100644
index 000000000..c84f45040
--- /dev/null
+++ b/app/test-flow-perf/items_gen.c
@@ -0,0 +1,277 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ *
+ * This file contain the implementations of the items
+ * related methods. Each Item have a method to prepare
+ * the item and add it into items array in given index.
+ */
+
+#include <stdint.h>
+#include <rte_flow.h>
+
+#include "items_gen.h"
+#include "config.h"
+
+void
+add_ether(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_eth eth_spec;
+	static struct rte_flow_item_eth eth_mask;
+
+	memset(&eth_spec, 0, sizeof(struct rte_flow_item_eth));
+	memset(&eth_mask, 0, sizeof(struct rte_flow_item_eth));
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_ETH;
+	items[items_counter].spec = &eth_spec;
+	items[items_counter].mask = &eth_mask;
+}
+
+void
+add_vlan(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_vlan vlan_spec;
+	static struct rte_flow_item_vlan vlan_mask;
+	uint16_t vlan_value = VLAN_VALUE;
+
+	memset(&vlan_spec, 0, sizeof(struct rte_flow_item_vlan));
+	memset(&vlan_mask, 0, sizeof(struct rte_flow_item_vlan));
+
+	vlan_spec.tci = RTE_BE16(vlan_value);
+	vlan_mask.tci = RTE_BE16(0xffff);
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_VLAN;
+	items[items_counter].spec = &vlan_spec;
+	items[items_counter].mask = &vlan_mask;
+}
+
+void
+add_ipv4(struct rte_flow_item *items,
+	uint8_t items_counter, rte_be32_t src_ipv4)
+{
+	static struct rte_flow_item_ipv4 ipv4_spec;
+	static struct rte_flow_item_ipv4 ipv4_mask;
+
+	memset(&ipv4_spec, 0, sizeof(struct rte_flow_item_ipv4));
+	memset(&ipv4_mask, 0, sizeof(struct rte_flow_item_ipv4));
+
+	ipv4_spec.hdr.src_addr = src_ipv4;
+	ipv4_mask.hdr.src_addr = RTE_BE32(0xffffffff);
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_IPV4;
+	items[items_counter].spec = &ipv4_spec;
+	items[items_counter].mask = &ipv4_mask;
+}
+
+
+void
+add_ipv6(struct rte_flow_item *items,
+	uint8_t items_counter, rte_be32_t src_ipv6)
+{
+	static struct rte_flow_item_ipv6 ipv6_spec;
+	static struct rte_flow_item_ipv6 ipv6_mask;
+
+	memset(&ipv6_spec, 0, sizeof(struct rte_flow_item_ipv6));
+	memset(&ipv6_mask, 0, sizeof(struct rte_flow_item_ipv6));
+
+	/** Set ipv6 src **/
+	memset(&ipv6_spec.hdr.src_addr, src_ipv6,
+		sizeof(ipv6_spec.hdr.src_addr) / 2);
+
+	/** Full mask **/
+	memset(&ipv6_mask.hdr.src_addr, 0xff,
+		sizeof(ipv6_spec.hdr.src_addr));
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_IPV6;
+	items[items_counter].spec = &ipv6_spec;
+	items[items_counter].mask = &ipv6_mask;
+}
+
+void
+add_tcp(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_tcp tcp_spec;
+	static struct rte_flow_item_tcp tcp_mask;
+
+	memset(&tcp_spec, 0, sizeof(struct rte_flow_item_tcp));
+	memset(&tcp_mask, 0, sizeof(struct rte_flow_item_tcp));
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_TCP;
+	items[items_counter].spec = &tcp_spec;
+	items[items_counter].mask = &tcp_mask;
+}
+
+void
+add_udp(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_udp udp_spec;
+	static struct rte_flow_item_udp udp_mask;
+
+	memset(&udp_spec, 0, sizeof(struct rte_flow_item_udp));
+	memset(&udp_mask, 0, sizeof(struct rte_flow_item_udp));
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_UDP;
+	items[items_counter].spec = &udp_spec;
+	items[items_counter].mask = &udp_mask;
+}
+
+void
+add_vxlan(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_vxlan vxlan_spec;
+	static struct rte_flow_item_vxlan vxlan_mask;
+	uint32_t vni_value;
+	uint8_t i;
+
+	vni_value = VNI_VALUE;
+
+	memset(&vxlan_spec, 0, sizeof(struct rte_flow_item_vxlan));
+	memset(&vxlan_mask, 0, sizeof(struct rte_flow_item_vxlan));
+
+	/* Set standard vxlan vni */
+	for (i = 0; i < 3; i++) {
+		vxlan_spec.vni[2 - i] = vni_value >> (i * 8);
+		vxlan_mask.vni[2 - i] = 0xff;
+	}
+
+	/* Standard vxlan flags */
+	vxlan_spec.flags = 0x8;
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_VXLAN;
+	items[items_counter].spec = &vxlan_spec;
+	items[items_counter].mask = &vxlan_mask;
+}
+
+void
+add_vxlan_gpe(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_vxlan_gpe vxlan_gpe_spec;
+	static struct rte_flow_item_vxlan_gpe vxlan_gpe_mask;
+	uint32_t vni_value;
+	uint8_t i;
+
+	vni_value = VNI_VALUE;
+
+	memset(&vxlan_gpe_spec, 0, sizeof(struct rte_flow_item_vxlan_gpe));
+	memset(&vxlan_gpe_mask, 0, sizeof(struct rte_flow_item_vxlan_gpe));
+
+	/* Set vxlan-gpe vni */
+	for (i = 0; i < 3; i++) {
+		vxlan_gpe_spec.vni[2 - i] = vni_value >> (i * 8);
+		vxlan_gpe_mask.vni[2 - i] = 0xff;
+	}
+
+	/* vxlan-gpe flags */
+	vxlan_gpe_spec.flags = 0x0c;
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_VXLAN_GPE;
+	items[items_counter].spec = &vxlan_gpe_spec;
+	items[items_counter].mask = &vxlan_gpe_mask;
+}
+
+void
+add_gre(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_gre gre_spec;
+	static struct rte_flow_item_gre gre_mask;
+	uint16_t proto;
+
+	proto = GRE_PROTO;
+
+	memset(&gre_spec, 0, sizeof(struct rte_flow_item_gre));
+	memset(&gre_mask, 0, sizeof(struct rte_flow_item_gre));
+
+	gre_spec.protocol = RTE_BE16(proto);
+	gre_mask.protocol = RTE_BE16(0xffff);
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_GRE;
+	items[items_counter].spec = &gre_spec;
+	items[items_counter].mask = &gre_mask;
+}
+
+void
+add_geneve(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_geneve geneve_spec;
+	static struct rte_flow_item_geneve geneve_mask;
+	uint32_t vni_value;
+	uint8_t i;
+
+	vni_value = VNI_VALUE;
+
+	memset(&geneve_spec, 0, sizeof(struct rte_flow_item_geneve));
+	memset(&geneve_mask, 0, sizeof(struct rte_flow_item_geneve));
+
+	for (i = 0; i < 3; i++) {
+		geneve_spec.vni[2 - i] = vni_value >> (i * 8);
+		geneve_mask.vni[2 - i] = 0xff;
+	}
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_GENEVE;
+	items[items_counter].spec = &geneve_spec;
+	items[items_counter].mask = &geneve_mask;
+}
+
+void
+add_gtp(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_gtp gtp_spec;
+	static struct rte_flow_item_gtp gtp_mask;
+	uint32_t teid_value;
+
+	teid_value = TEID_VALUE;
+
+	memset(&gtp_spec, 0, sizeof(struct rte_flow_item_gtp));
+	memset(&gtp_mask, 0, sizeof(struct rte_flow_item_gtp));
+
+	gtp_spec.teid = RTE_BE32(teid_value);
+	gtp_mask.teid = RTE_BE32(0xffffffff);
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_GTP;
+	items[items_counter].spec = &gtp_spec;
+	items[items_counter].mask = &gtp_mask;
+}
+
+void
+add_meta_data(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_meta meta_spec;
+	static struct rte_flow_item_meta meta_mask;
+	uint32_t data;
+
+	data = META_DATA;
+
+	memset(&meta_spec, 0, sizeof(struct rte_flow_item_meta));
+	memset(&meta_mask, 0, sizeof(struct rte_flow_item_meta));
+
+	meta_spec.data = RTE_BE32(data);
+	meta_mask.data = RTE_BE32(0xffffffff);
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_META;
+	items[items_counter].spec = &meta_spec;
+	items[items_counter].mask = &meta_mask;
+}
+
+
+void
+add_meta_tag(struct rte_flow_item *items, uint8_t items_counter)
+{
+	static struct rte_flow_item_tag tag_spec;
+	static struct rte_flow_item_tag tag_mask;
+	uint32_t data;
+	uint8_t index;
+
+	data = META_DATA;
+	index = TAG_INDEX;
+
+	memset(&tag_spec, 0, sizeof(struct rte_flow_item_tag));
+	memset(&tag_mask, 0, sizeof(struct rte_flow_item_tag));
+
+	tag_spec.data = RTE_BE32(data);
+	tag_mask.data = RTE_BE32(0xffffffff);
+	tag_spec.index = index;
+	tag_mask.index = 0xff;
+
+	items[items_counter].type = RTE_FLOW_ITEM_TYPE_TAG;
+	items[items_counter].spec = &tag_spec;
+	items[items_counter].mask = &tag_mask;
+}
diff --git a/app/test-flow-perf/items_gen.h b/app/test-flow-perf/items_gen.h
new file mode 100644
index 000000000..0edbc0b37
--- /dev/null
+++ b/app/test-flow-perf/items_gen.h
@@ -0,0 +1,31 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright 2020 Mellanox Technologies, Ltd
+ *
+ * This file contains the items related methods
+ */
+
+#ifndef FLOW_PERF_ITEMS_GEN
+#define FLOW_PERF_ITEMS_GEN
+
+#include <stdint.h>
+#include <rte_flow.h>
+
+#include "config.h"
+
+void add_ether(struct rte_flow_item *items, uint8_t items_counter);
+void add_vlan(struct rte_flow_item *items, uint8_t items_counter);
+void add_ipv4(struct rte_flow_item *items,
+	uint8_t items_counter, rte_be32_t src_ipv4);
+void add_ipv6(struct rte_flow_item *items,
+	uint8_t items_counter, rte_be32_t src_ipv6);
+void add_udp(struct rte_flow_item *items, uint8_t items_counter);
+void add_tcp(struct rte_flow_item *items, uint8_t items_counter);
+void add_vxlan(struct rte_flow_item *items, uint8_t items_counter);
+void add_vxlan_gpe(struct rte_flow_item *items, uint8_t items_counter);
+void add_gre(struct rte_flow_item *items, uint8_t items_counter);
+void add_geneve(struct rte_flow_item *items, uint8_t items_counter);
+void add_gtp(struct rte_flow_item *items, uint8_t items_counter);
+void add_meta_data(struct rte_flow_item *items, uint8_t items_counter);
+void add_meta_tag(struct rte_flow_item *items, uint8_t items_counter);
+
+#endif /* FLOW_PERF_ITEMS_GEN */
diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 8659870af..1feb73e6f 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -26,6 +26,7 @@ 
 #include <getopt.h>
 #include <stdbool.h>
 #include <sys/time.h>
+#include <signal.h>
 
 #include <rte_malloc.h>
 #include <rte_mempool.h>
@@ -34,29 +35,257 @@ 
 #include <rte_flow.h>
 
 #include "config.h"
+#include "flow_gen.h"
 
-static uint32_t nb_lcores;
+#define MAX_ITERATIONS             100
+#define DEFAULT_RULES_COUNT    4000000
+#define DEFAULT_ITERATION       100000
+
+struct rte_flow *flow;
+static uint8_t flow_group;
+
+static uint32_t flow_items;
+static uint32_t flow_actions;
+static uint32_t flow_attrs;
+static volatile bool force_quit;
+static bool dump_iterations;
 static struct rte_mempool *mbuf_mp;
+static uint32_t nb_lcores;
+static uint32_t flows_count;
+static uint32_t iterations_number;
+static uint32_t hairpinq;
 
 static void
 usage(char *progname)
 {
 	printf("\nusage: %s\n", progname);
+	printf("\nControl configurations:\n");
+	printf("  --flows-count=N: to set the number of needed"
+		" flows to insert, default is 4,000,000\n");
+	printf("  --dump-iterations: To print rates for each"
+		" iteration\n");
+
+	printf("To set flow attributes:\n");
+	printf("  --ingress: set ingress attribute in flows\n");
+	printf("  --egress: set egress attribute in flows\n");
+	printf("  --transfer: set transfer attribute in flows\n");
+	printf("  --group=N: set group for all flows,"
+		" default is 0\n");
+
+	printf("To set flow items:\n");
+	printf("  --ether: add ether layer in flow items\n");
+	printf("  --vlan: add vlan layer in flow items\n");
+	printf("  --ipv4: add ipv4 layer in flow items\n");
+	printf("  --ipv6: add ipv6 layer in flow items\n");
+	printf("  --tcp: add tcp layer in flow items\n");
+	printf("  --udp: add udp layer in flow items\n");
+	printf("  --vxlan: add vxlan layer in flow items\n");
+	printf("  --vxlan-gpe: add vxlan-gpe layer in flow items\n");
+	printf("  --gre: add gre layer in flow items\n");
+	printf("  --geneve: add geneve layer in flow items\n");
+	printf("  --gtp: add gtp layer in flow items\n");
+	printf("  --meta: add meta layer in flow items\n");
+	printf("  --tag: add tag layer in flow items\n");
+
+	printf("To set flow actions:\n");
+	printf("  --port-id: add port-id action in flow actions\n");
+	printf("  --rss: add rss action in flow actions\n");
+	printf("  --queue: add queue action in flow actions\n");
+	printf("  --jump: add jump action in flow actions\n");
+	printf("  --mark: add mark action in flow actions\n");
+	printf("  --count: add count action in flow actions\n");
+	printf("  --set-meta: add set meta action in flow actions\n");
+	printf("  --set-tag: add set tag action in flow actions\n");
+	printf("  --drop: add drop action in flow actions\n");
+	printf("  --hairpin-queue=N: add hairpin-queue action in flow actions\n");
+	printf("  --hairpin-rss=N: add hairping-rss action in flow actions\n");
 }
 
 static void
 args_parse(int argc, char **argv)
 {
 	char **argvopt;
-	int opt;
+	int n, opt;
 	int opt_idx;
+	size_t i;
+
+	static const struct option_dict {
+		const char *str;
+		const uint64_t mask;
+		uint32_t *bitmap;
+	} flow_options[] = {
+		{
+			.str = "ether",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "ipv4",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV4),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "ipv6",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_IPV6),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "vlan",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VLAN),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "tcp",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TCP),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "udp",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_UDP),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "vxlan",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "vxlan-gpe",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_VXLAN_GPE),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "gre",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GRE),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "geneve",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GENEVE),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "gtp",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_GTP),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "meta",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_META),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "tag",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_TAG),
+			.bitmap = &flow_items
+		},
+		{
+			.str = "ingress",
+			.mask = INGRESS,
+			.bitmap = &flow_attrs
+		},
+		{
+			.str = "egress",
+			.mask = EGRESS,
+			.bitmap = &flow_attrs
+		},
+		{
+			.str = "transfer",
+			.mask = TRANSFER,
+			.bitmap = &flow_attrs
+		},
+		{
+			.str = "port-id",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_PORT_ID),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "rss",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_RSS),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "queue",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_QUEUE),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "jump",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "mark",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_MARK),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "count",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_COUNT),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "set-meta",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_META),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "set-tag",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_SET_TAG),
+			.bitmap = &flow_actions
+		},
+		{
+			.str = "drop",
+			.mask = FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_DROP),
+			.bitmap = &flow_actions
+		}
+	};
+
 	static struct option lgopts[] = {
 		/* Control */
 		{ "help",                       0, 0, 0 },
+		{ "flows-count",                1, 0, 0 },
+		{ "dump-iterations",            0, 0, 0 },
+		/* Attributes */
+		{ "ingress",                    0, 0, 0 },
+		{ "egress",                     0, 0, 0 },
+		{ "transfer",                   0, 0, 0 },
+		{ "group",                      1, 0, 0 },
+		/* Items */
+		{ "ether",                      0, 0, 0 },
+		{ "vlan",                       0, 0, 0 },
+		{ "ipv4",                       0, 0, 0 },
+		{ "ipv6",                       0, 0, 0 },
+		{ "tcp",                        0, 0, 0 },
+		{ "udp",                        0, 0, 0 },
+		{ "vxlan",                      0, 0, 0 },
+		{ "vxlan-gpe",                  0, 0, 0 },
+		{ "gre",                        0, 0, 0 },
+		{ "geneve",                     0, 0, 0 },
+		{ "gtp",                        0, 0, 0 },
+		{ "meta",                       0, 0, 0 },
+		{ "tag",                        0, 0, 0 },
+		/* Actions */
+		{ "port-id",                    0, 0, 0 },
+		{ "rss",                        0, 0, 0 },
+		{ "queue",                      0, 0, 0 },
+		{ "jump",                       0, 0, 0 },
+		{ "mark",                       0, 0, 0 },
+		{ "count",                      0, 0, 0 },
+		{ "set-meta",                   0, 0, 0 },
+		{ "set-tag",                    0, 0, 0 },
+		{ "drop",                       0, 0, 0 },
+		{ "hairpin-queue",              1, 0, 0 },
+		{ "hairpin-rss",                1, 0, 0 },
 	};
 
+	flow_items = 0;
+	flow_actions = 0;
+	flow_attrs = 0;
+	hairpinq = 0;
 	argvopt = argv;
 
+	printf(":: Flow -> ");
 	while ((opt = getopt_long(argc, argvopt, "",
 				lgopts, &opt_idx)) != EOF) {
 		switch (opt) {
@@ -65,6 +294,65 @@  args_parse(int argc, char **argv)
 				usage(argv[0]);
 				rte_exit(EXIT_SUCCESS, "Displayed help\n");
 			}
+
+			if (strcmp(lgopts[opt_idx].name, "group") == 0) {
+				n = atoi(optarg);
+				if (n >= 0)
+					flow_group = n;
+				else
+					rte_exit(EXIT_SUCCESS,
+						"flow group should be >= 0");
+				printf("group %d ", flow_group);
+			}
+
+			for (i = 0; i < RTE_DIM(flow_options); i++)
+				if (strcmp(lgopts[opt_idx].name,
+						flow_options[i].str) == 0) {
+					*flow_options[i].bitmap |=
+						flow_options[i].mask;
+					printf("%s / ", flow_options[i].str);
+				}
+
+			if (strcmp(lgopts[opt_idx].name,
+					"hairpin-rss") == 0) {
+				n = atoi(optarg);
+				if (n > 0)
+					hairpinq = n;
+				else
+					rte_exit(EXIT_SUCCESS,
+						"Hairpin queues should be > 0 ");
+
+				flow_actions |= HAIRPIN_RSS_ACTION;
+				printf("hairpin-rss / ");
+			}
+			if (strcmp(lgopts[opt_idx].name,
+					"hairpin-queue") == 0) {
+				n = atoi(optarg);
+				if (n > 0)
+					hairpinq = n;
+				else
+					rte_exit(EXIT_SUCCESS,
+						"Hairpin queues should be > 0 ");
+
+				flow_actions |= HAIRPIN_QUEUE_ACTION;
+				printf("hairpin-queue / ");
+			}
+
+			/* Control */
+			if (strcmp(lgopts[opt_idx].name,
+					"flows-count") == 0) {
+				n = atoi(optarg);
+				if (n > (int) iterations_number)
+					flows_count = n;
+				else {
+					printf("\n\nflows_count should be > %d",
+						iterations_number);
+					rte_exit(EXIT_SUCCESS, " ");
+				}
+			}
+			if (strcmp(lgopts[opt_idx].name,
+					"dump-iterations") == 0)
+				dump_iterations = true;
 			break;
 		default:
 			fprintf(stderr, "Invalid option: %s\n", argv[optind]);
@@ -73,6 +361,130 @@  args_parse(int argc, char **argv)
 			break;
 		}
 	}
+	printf("end_flow\n");
+}
+
+static void
+print_flow_error(struct rte_flow_error error)
+{
+	printf("Flow can't be created %d message: %s\n",
+		error.type,
+		error.message ? error.message : "(no stated reason)");
+}
+
+static inline void
+flows_handler(void)
+{
+	struct rte_flow_error error;
+	clock_t start_iter, end_iter;
+	double cpu_time_used;
+	double flows_rate;
+	double cpu_time_per_iter[MAX_ITERATIONS];
+	double delta;
+	uint16_t nr_ports;
+	uint32_t i;
+	int port_id;
+	int iter_id;
+	uint32_t eagain_counter = 0;
+
+	nr_ports = rte_eth_dev_count_avail();
+
+	for (i = 0; i < MAX_ITERATIONS; i++)
+		cpu_time_per_iter[i] = -1;
+
+	if (iterations_number > flows_count)
+		iterations_number = flows_count;
+
+	printf(":: Flows Count per port: %d\n", flows_count);
+
+	for (port_id = 0; port_id < nr_ports; port_id++) {
+		cpu_time_used = 0;
+		if (flow_group > 0) {
+			/*
+			 * Create global rule to jump into flow_group,
+			 * this way the app will avoid the default rules.
+			 *
+			 * Golbal rule:
+			 * group 0 eth / end actions jump group <flow_group>
+			 *
+			 */
+			flow = generate_flow(port_id, 0, flow_attrs,
+				FLOW_ITEM_MASK(RTE_FLOW_ITEM_TYPE_ETH),
+				FLOW_ITEM_MASK(RTE_FLOW_ACTION_TYPE_JUMP),
+				flow_group, 0, 0, &error);
+
+			if (flow == NULL) {
+				print_flow_error(error);
+				rte_exit(EXIT_FAILURE, "error in creating flow");
+			}
+		}
+
+		/* Insertion Rate */
+		printf("Flows insertion on port = %d\n", port_id);
+		start_iter = clock();
+		for (i = 0; i < flows_count; i++) {
+			do {
+				rte_errno = 0;
+				flow = generate_flow(port_id, flow_group,
+					flow_attrs, flow_items, flow_actions,
+					JUMP_ACTION_TABLE, i, hairpinq, &error);
+				if (flow == NULL)
+					eagain_counter++;
+			} while (rte_errno == EAGAIN);
+
+			if (force_quit)
+				i = flows_count;
+
+			if (!flow) {
+				print_flow_error(error);
+				rte_exit(EXIT_FAILURE, "error in creating flow");
+			}
+
+			if (i && !((i + 1) % iterations_number)) {
+				/* Save the insertion rate of each iter */
+				end_iter = clock();
+				delta = (double) (end_iter - start_iter);
+				iter_id = ((i + 1) / iterations_number) - 1;
+				cpu_time_per_iter[iter_id] =
+					delta / CLOCKS_PER_SEC;
+				cpu_time_used += cpu_time_per_iter[iter_id];
+				start_iter = clock();
+			}
+		}
+
+		/* Iteration rate per iteration */
+		if (dump_iterations)
+			for (i = 0; i < MAX_ITERATIONS; i++) {
+				if (cpu_time_per_iter[i] == -1)
+					continue;
+				delta = (double)(iterations_number /
+					cpu_time_per_iter[i]);
+				flows_rate = delta / 1000;
+				printf(":: Iteration #%d: %d flows "
+					"in %f sec[ Rate = %f K/Sec ]\n",
+					i, iterations_number,
+					cpu_time_per_iter[i], flows_rate);
+			}
+
+		/* Insertion rate for all flows */
+		flows_rate = ((double) (flows_count / cpu_time_used) / 1000);
+		printf("\n:: Total flow insertion rate -> %f K/Sec\n",
+						flows_rate);
+		printf(":: The time for creating %d in flows %f seconds\n",
+						flows_count, cpu_time_used);
+		printf(":: EAGAIN counter = %d\n", eagain_counter);
+	}
+}
+
+static void
+signal_handler(int signum)
+{
+	if (signum == SIGINT || signum == SIGTERM) {
+		printf("\n\nSignal %d received, preparing to exit...\n",
+					signum);
+		printf("Error: Stats are wrong due to sudden signal!\n\n");
+		force_quit = true;
+	}
 }
 
 static void
@@ -80,8 +492,13 @@  init_port(void)
 {
 	int ret;
 	uint16_t std_queue;
+	uint16_t hairpin_q;
 	uint16_t port_id;
 	uint16_t nr_ports;
+	uint16_t nr_queues;
+	struct rte_eth_hairpin_conf hairpin_conf = {
+		.peer_count = 1,
+	};
 	struct rte_eth_conf port_conf = {
 		.rx_adv_conf = {
 			.rss_conf.rss_hf =
@@ -92,6 +509,10 @@  init_port(void)
 	struct rte_eth_rxconf rxq_conf;
 	struct rte_eth_dev_info dev_info;
 
+	nr_queues = RXQ_NUM;
+	if (hairpinq != 0)
+		nr_queues = RXQ_NUM + hairpinq;
+
 	nr_ports = rte_eth_dev_count_avail();
 	if (nr_ports == 0)
 		rte_exit(EXIT_FAILURE, "Error: no port detected\n");
@@ -116,8 +537,8 @@  init_port(void)
 
 		printf(":: initializing port: %d\n", port_id);
 
-		ret = rte_eth_dev_configure(port_id, RXQ_NUM,
-				TXQ_NUM, &port_conf);
+		ret = rte_eth_dev_configure(port_id, nr_queues,
+				nr_queues, &port_conf);
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE,
 				":: cannot configure device: err=%d, port=%u\n",
@@ -153,6 +574,38 @@  init_port(void)
 				":: promiscuous mode enable failed: err=%s, port=%u\n",
 				rte_strerror(-ret), port_id);
 
+		if (hairpinq != 0) {
+			for (hairpin_q = RXQ_NUM, std_queue = 0;
+					std_queue < nr_queues;
+					hairpin_q++, std_queue++) {
+				hairpin_conf.peers[0].port = port_id;
+				hairpin_conf.peers[0].queue =
+					std_queue + TXQ_NUM;
+				ret = rte_eth_rx_hairpin_queue_setup(
+						port_id, hairpin_q,
+						NR_RXD, &hairpin_conf);
+				if (ret != 0)
+					rte_exit(EXIT_FAILURE,
+						":: Hairpin rx queue setup failed: err=%d, port=%u\n",
+						ret, port_id);
+			}
+
+			for (hairpin_q = TXQ_NUM, std_queue = 0;
+					std_queue < nr_queues;
+					hairpin_q++, std_queue++) {
+				hairpin_conf.peers[0].port = port_id;
+				hairpin_conf.peers[0].queue =
+					std_queue + RXQ_NUM;
+				ret = rte_eth_tx_hairpin_queue_setup(
+						port_id, hairpin_q,
+						NR_TXD, &hairpin_conf);
+				if (ret != 0)
+					rte_exit(EXIT_FAILURE,
+						":: Hairpin tx queue setup failed: err=%d, port=%u\n",
+						ret, port_id);
+			}
+		}
+
 		ret = rte_eth_dev_start(port_id);
 		if (ret < 0)
 			rte_exit(EXIT_FAILURE,
@@ -174,6 +627,15 @@  main(int argc, char **argv)
 	if (ret < 0)
 		rte_exit(EXIT_FAILURE, "EAL init failed\n");
 
+	force_quit = false;
+	dump_iterations = false;
+	flows_count = DEFAULT_RULES_COUNT;
+	iterations_number = DEFAULT_ITERATION;
+	flow_group = 0;
+
+	signal(SIGINT, signal_handler);
+	signal(SIGTERM, signal_handler);
+
 	argc -= ret;
 	argv += ret;
 	if (argc > 1)
@@ -185,6 +647,8 @@  main(int argc, char **argv)
 	if (nb_lcores <= 1)
 		rte_exit(EXIT_FAILURE, "This app needs at least two cores\n");
 
+	flows_handler();
+
 	RTE_ETH_FOREACH_DEV(port) {
 		rte_flow_flush(port, &error);
 		rte_eth_dev_stop(port);
diff --git a/app/test-flow-perf/meson.build b/app/test-flow-perf/meson.build
index 25711378f..6eaf83b41 100644
--- a/app/test-flow-perf/meson.build
+++ b/app/test-flow-perf/meson.build
@@ -2,6 +2,9 @@ 
 # Copyright(c) 2020 Mellanox Technologies, Ltd
 
 sources = files(
+	'actions_gen.c',
+	'flow_gen.c',
+	'items_gen.c',
 	'main.c',
 )
 
diff --git a/doc/guides/rel_notes/release_20_05.rst b/doc/guides/rel_notes/release_20_05.rst
index 7abcae3aa..0e4dcf1ad 100644
--- a/doc/guides/rel_notes/release_20_05.rst
+++ b/doc/guides/rel_notes/release_20_05.rst
@@ -216,6 +216,9 @@  New Features
 
   Add new application to test rte_flow performance.
 
+  Application features:
+  * Measure rte_flow insertion rate.
+
 
 Removed Items
 -------------
diff --git a/doc/guides/tools/flow-perf.rst b/doc/guides/tools/flow-perf.rst
index 49eb450ae..6f3f7dafb 100644
--- a/doc/guides/tools/flow-perf.rst
+++ b/doc/guides/tools/flow-perf.rst
@@ -1,10 +1,29 @@ 
 ..	SPDX-License-Identifier: BSD-3-Clause
 	Copyright 2020 Mellanox Technologies, Ltd
 
-Flow performance tool
+Flow Performance Tool
 =====================
 
 Application for rte_flow performance testing.
+The application provide the ability to test insertion rate of specific
+rte_flow rule, by stressing it to the NIC, and calculate the insertion
+rate.
+
+The application offers some options in the command line, to configure
+which rule to apply.
+
+After that the application will start producing rules with same pattern
+but increasing the outer IP source address by 1 each time, thus it will
+give different flow each time, and all other items will have open masks.
+
+
+Known Limitations
+=================
+
+The current version has limitations which can be removed in future:
+
+* Support outer items up to tunnel layer only.
+* Single core insertion only.
 
 
 Compiling the Application
@@ -27,7 +46,7 @@  or :doc:`EAL parameters (FreeBSD) <../freebsd_gsg/freebsd_eal_parameters>` for
 a list of available EAL command-line options.
 
 
-Flow performance Options
+Flow Performance Options
 ------------------------
 
 The following are the command-line options for the flow performance application.
@@ -36,9 +55,179 @@  with a ``--`` separator:
 
 .. code-block:: console
 
-	sudo ./dpdk-test-flow-perf -n 4 -w 08:00.0 --
+	sudo ./dpdk-test-flow_perf -n 4 -w 08:00.0 -- --ingress --ether --ipv4 --queue --flows-count=1000000
 
 The command line options are:
 
 *	``--help``
 	Display a help message and quit.
+
+*	``--flows-count=N``
+	Set the number of needed flows to insert,
+	where 1 <= N <= "number of flows".
+	The default value is 4,000,000.
+
+*	``--dump-iterations``
+	Print rates for each iteration of flows.
+	Default iteration is 1,00,000.
+
+
+Attributes:
+
+*	``--ingress``
+	Set Ingress attribute to all flows attributes.
+
+*	``--egress``
+	Set Egress attribute to all flows attributes.
+
+*	``--transfer``
+	Set Transfer attribute to all flows attributes.
+
+*	``--group=N``
+	Set group for all flows, where N >= 0.
+	Default group is 0.
+
+Items:
+
+*	``--ether``
+	Add Ether item to all flows items, This item have open mask.
+
+*	``--vlan``
+	Add VLAN item to all flows items,
+	This item have VLAN value defined in user_parameters.h
+	under ``VNI_VALUE`` with full mask, default value = 1.
+	Other fields are open mask.
+
+*	``--ipv4``
+	Add IPv4 item to all flows items,
+	This item have incremental source IP, with full mask.
+	Other fields are open mask.
+
+*	``--ipv6``
+	Add IPv6 item to all flows item,
+	This item have incremental source IP, with full mask.
+	Other fields are open mask.
+
+*	``--tcp``
+	Add TCP item to all flows items, This item have open mask.
+
+*	``--udp``
+	Add UDP item to all flows items, This item have open mask.
+
+*	``--vxlan``
+	Add VXLAN item to all flows items,
+	This item have VNI value defined in user_parameters.h
+	under ``VNI_VALUE`` with full mask, default value = 1.
+	Other fields are open mask.
+
+*	``--vxlan-gpe``
+	Add VXLAN-GPE item to all flows items,
+	This item have VNI value defined in user_parameters.h
+	under ``VNI_VALUE`` with full mask, default value = 1.
+	Other fields are open mask.
+
+*	``--gre``
+	Add GRE item to all flows items,
+	This item have protocol value defined in user_parameters.h
+	under ``GRE_PROTO`` with full mask, default protocol = 0x6558 "Ether"
+	Other fields are open mask.
+
+*	``--geneve``
+	Add GENEVE item to all flows items,
+	This item have VNI value defined in user_parameters.h
+	under ``VNI_VALUE`` with full mask, default value = 1.
+	Other fields are open mask.
+
+*	``--gtp``
+	Add GTP item to all flows items,
+	This item have TEID value defined in user_parameters.h
+	under ``TEID_VALUE`` with full mask, default value = 1.
+	Other fields are open mask.
+
+*	``--meta``
+	Add Meta item to all flows items,
+	This item have data value defined in user_parameters.h
+	under ``META_DATA`` with full mask, default value = 1.
+	Other fields are open mask.
+
+*	``--tag``
+	Add Tag item to all flows items,
+	This item have data value defined in user_parameters.h
+	under ``META_DATA`` with full mask, default value = 1.
+
+	Also it have tag value defined in user_parameters.h
+	under ``TAG_INDEX`` with full mask, default value = 0.
+	Other fields are open mask.
+
+
+Actions:
+
+*	``--port-id``
+	Add port redirection action to all flows actions.
+	Port redirection destination is defined in user_parameters.h
+	under PORT_ID_DST, default value = 1.
+
+*	``--rss``
+	Add RSS action to all flows actions,
+	The queues in RSS action will be all queues configured
+	in the app.
+
+*	``--queue``
+	Add queue action to all flows items,
+	The queue will change in round robin state for each flow.
+
+	For example:
+		The app running with 4 RX queues
+		Flow #0: queue index 0
+		Flow #1: queue index 1
+		Flow #2: queue index 2
+		Flow #3: queue index 3
+		Flow #4: queue index 0
+		...
+
+*	``--jump``
+	Add jump action to all flows actions.
+	Jump action destination is defined in user_parameters.h
+	under ``JUMP_ACTION_TABLE``, default value = 2.
+
+*	``--mark``
+	Add mark action to all flows actions.
+	Mark action id is defined in user_parameters.h
+	under ``MARK_ID``, default value = 1.
+
+*	``--count``
+	Add count action to all flows actions.
+
+*	``--set-meta``
+	Add set-meta action to all flows actions.
+	Meta data is defined in user_parameters.h under ``META_DATA``
+	with full mask, default value = 1.
+
+*	``--set-tag``
+	Add set-tag action to all flows actions.
+	Meta data is defined in user_parameters.h under ``META_DATA``
+	with full mask, default value = 1.
+
+	Tag index is defined in user_parameters.h under ``TAG_INDEX``
+	with full mask, default value = 0.
+
+*	``--drop``
+	Add drop action to all flows actions.
+
+*	``--hairpin-queue=N``
+	Add hairpin queue action to all flows actions.
+	The queue will change in round robin state for each flow.
+
+	For example:
+		The app running with 4 RX hairpin queues and 4 normal RX queues
+		Flow #0: queue index 4
+		Flow #1: queue index 5
+		Flow #2: queue index 6
+		Flow #3: queue index 7
+		Flow #4: queue index 4
+		...
+
+*	``--hairpin-rss=N``
+	Add hairpin RSS action to all flows actions.
+	The queues in RSS action will be all hairpin queues configured
+	in the app.