ethdev: add flow tag

Message ID 20190704232302.19582-1-yskoh@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Ferruh Yigit
Headers
Series ethdev: add flow tag |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation fail apply issues

Commit Message

Yongseok Koh July 4, 2019, 11:23 p.m. UTC
  A tag is a transient data which can be used during flow match. This can be
used to store match result from a previous table so that the same pattern
need not be matched again on the next table. Even if outer header is
decapsulated on the previous match, the match result can be kept.

Some device expose internal registers of its flow processing pipeline and
those registers are quite useful for stateful connection tracking as it
keeps status of flow matching. Multiple tags are supported by specifying
index.

Example testpmd commands are:

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress pattern ... / end
    actions set_tag index 2 value 0xcc00 mask 0xff00 /
            set_tag index 3 value 0x123456 mask 0xffffff /
            vxlan_decap / jump group 1 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
            eth ... / end
    actions ... jump group 2 / end

  flow create 0 ingress group 1
    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
            tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

  flow create 0 ingress group 2
    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
            eth ... / end
    actions ... / end

Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
---
 app/test-pmd/cmdline_flow.c            | 75 ++++++++++++++++++++++++++
 doc/guides/prog_guide/rte_flow.rst     | 50 +++++++++++++++++
 doc/guides/rel_notes/release_19_08.rst |  4 ++
 lib/librte_ethdev/rte_flow.h           | 54 +++++++++++++++++++
 4 files changed, 183 insertions(+)
  

Comments

Adrien Mazarguil July 5, 2019, 1:54 p.m. UTC | #1
On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> A tag is a transient data which can be used during flow match. This can be
> used to store match result from a previous table so that the same pattern
> need not be matched again on the next table. Even if outer header is
> decapsulated on the previous match, the match result can be kept.
> 
> Some device expose internal registers of its flow processing pipeline and
> those registers are quite useful for stateful connection tracking as it
> keeps status of flow matching. Multiple tags are supported by specifying
> index.
> 
> Example testpmd commands are:
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress pattern ... / end
>     actions set_tag index 2 value 0xcc00 mask 0xff00 /
>             set_tag index 3 value 0x123456 mask 0xffffff /
>             vxlan_decap / jump group 1 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>             eth ... / end
>     actions ... jump group 2 / end
> 
>   flow create 0 ingress group 1
>     pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>             tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
>   flow create 0 ingress group 2
>     pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>             eth ... / end
>     actions ... / end
> 
> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>

Hi Yongseok,

Only high level questions for now, while it unquestionably looks useful,
from a user standpoint exposing the separate index seems redundant and not
necessarily convenient. Using the following example to illustrate:

 actions set_tag index 3 value 0x123456 mask 0xfffff

 pattern tag index is 3 value spec 0x123456 value mask 0xffffff

I might be missing something, but why isn't this enough:

 pattern tag index is 3 # match whatever is stored at index 3

Assuming it can work, then why bother with providing value spec/mask on
set_tag? A flow rule pattern matches something, sets some arbitrary tag to
be matched by a subsequent flow rule and that's it. It even seems like
relying on the index only on both occasions is enough for identification.

Same question for the opposite approach; relying on the value, never
mentioning the index.

I'm under the impression that the index is a hardware-specific constraint
that shouldn't be exposed (especially since it's an 8-bit field). If so, a
PMD could keep track of used indices without having them exposed through the
public API.
  
Yongseok Koh July 5, 2019, 6:05 p.m. UTC | #2
> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> 
> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>> A tag is a transient data which can be used during flow match. This can be
>> used to store match result from a previous table so that the same pattern
>> need not be matched again on the next table. Even if outer header is
>> decapsulated on the previous match, the match result can be kept.
>> 
>> Some device expose internal registers of its flow processing pipeline and
>> those registers are quite useful for stateful connection tracking as it
>> keeps status of flow matching. Multiple tags are supported by specifying
>> index.
>> 
>> Example testpmd commands are:
>> 
>>  flow create 0 ingress pattern ... / end
>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>            vxlan_decap / jump group 1 / end
>> 
>>  flow create 0 ingress pattern ... / end
>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>            vxlan_decap / jump group 1 / end
>> 
>>  flow create 0 ingress group 1
>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>            eth ... / end
>>    actions ... jump group 2 / end
>> 
>>  flow create 0 ingress group 1
>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>            eth ... / end
>>    actions ... / end
>> 
>>  flow create 0 ingress group 2
>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>            eth ... / end
>>    actions ... / end
>> 
>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> 
> Hi Yongseok,
> 
> Only high level questions for now, while it unquestionably looks useful,
> from a user standpoint exposing the separate index seems redundant and not
> necessarily convenient. Using the following example to illustrate:
> 
> actions set_tag index 3 value 0x123456 mask 0xfffff
> 
> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> 
> I might be missing something, but why isn't this enough:
> 
> pattern tag index is 3 # match whatever is stored at index 3
> 
> Assuming it can work, then why bother with providing value spec/mask on
> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> be matched by a subsequent flow rule and that's it. It even seems like
> relying on the index only on both occasions is enough for identification.
> 
> Same question for the opposite approach; relying on the value, never
> mentioning the index.
> 
> I'm under the impression that the index is a hardware-specific constraint
> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> PMD could keep track of used indices without having them exposed through the
> public API.


Thank you for review, Adrien.
Hope you are doing well. It's been long since we talked each other. :-)

Your approach will work too in general but we have a request from customer that
they want to partition this limited tag storage. Assuming that HW exposes 32bit
tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
table id and 8bit flow id. As they want to split one 32bit storage, I thought it
is better to provide mask when setting/matching the value. Even some customer
wants to store multiple flags bit by bit like ol_flags. They do want to alter
only partial bits.

And for the index, it is to reference an entry of tags array as HW can provide
larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
storage which users can use for their own sake.
	tag[0], tag[1], tag[2], tag[3]


Thanks,
Yongseok
  
Yongseok Koh July 8, 2019, 11:32 p.m. UTC | #3
> On Jul 5, 2019, at 11:05 AM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> 
> 
>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
>> 
>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>> A tag is a transient data which can be used during flow match. This can be
>>> used to store match result from a previous table so that the same pattern
>>> need not be matched again on the next table. Even if outer header is
>>> decapsulated on the previous match, the match result can be kept.
>>> 
>>> Some device expose internal registers of its flow processing pipeline and
>>> those registers are quite useful for stateful connection tracking as it
>>> keeps status of flow matching. Multiple tags are supported by specifying
>>> index.
>>> 
>>> Example testpmd commands are:
>>> 
>>> flow create 0 ingress pattern ... / end
>>>   actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>           set_tag index 3 value 0x123456 mask 0xffffff /
>>>           vxlan_decap / jump group 1 / end
>>> 
>>> flow create 0 ingress pattern ... / end
>>>   actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>           set_tag index 3 value 0x123456 mask 0xffffff /
>>>           vxlan_decap / jump group 1 / end
>>> 
>>> flow create 0 ingress group 1
>>>   pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>           eth ... / end
>>>   actions ... jump group 2 / end
>>> 
>>> flow create 0 ingress group 1
>>>   pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>           tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>           eth ... / end
>>>   actions ... / end
>>> 
>>> flow create 0 ingress group 2
>>>   pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>           eth ... / end
>>>   actions ... / end
>>> 
>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>> 
>> Hi Yongseok,
>> 
>> Only high level questions for now, while it unquestionably looks useful,
>> from a user standpoint exposing the separate index seems redundant and not
>> necessarily convenient. Using the following example to illustrate:
>> 
>> actions set_tag index 3 value 0x123456 mask 0xfffff
>> 
>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>> 
>> I might be missing something, but why isn't this enough:
>> 
>> pattern tag index is 3 # match whatever is stored at index 3
>> 
>> Assuming it can work, then why bother with providing value spec/mask on
>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>> be matched by a subsequent flow rule and that's it. It even seems like
>> relying on the index only on both occasions is enough for identification.
>> 
>> Same question for the opposite approach; relying on the value, never
>> mentioning the index.
>> 
>> I'm under the impression that the index is a hardware-specific constraint
>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>> PMD could keep track of used indices without having them exposed through the
>> public API.
> 
> 
> Thank you for review, Adrien.
> Hope you are doing well. It's been long since we talked each other. :-)
> 
> Your approach will work too in general but we have a request from customer that
> they want to partition this limited tag storage. Assuming that HW exposes 32bit
> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> is better to provide mask when setting/matching the value. Even some customer
> wants to store multiple flags bit by bit like ol_flags. They do want to alter
> only partial bits.
> 
> And for the index, it is to reference an entry of tags array as HW can provide
> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> storage which users can use for their own sake.
> 	tag[0], tag[1], tag[2], tag[3]

Adrien,

Are you okay with this?

Thanks,
Yongseok
  
Adrien Mazarguil July 9, 2019, 8:38 a.m. UTC | #4
On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> > On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> > 
> > On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> >> A tag is a transient data which can be used during flow match. This can be
> >> used to store match result from a previous table so that the same pattern
> >> need not be matched again on the next table. Even if outer header is
> >> decapsulated on the previous match, the match result can be kept.
> >> 
> >> Some device expose internal registers of its flow processing pipeline and
> >> those registers are quite useful for stateful connection tracking as it
> >> keeps status of flow matching. Multiple tags are supported by specifying
> >> index.
> >> 
> >> Example testpmd commands are:
> >> 
> >>  flow create 0 ingress pattern ... / end
> >>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> >>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>            vxlan_decap / jump group 1 / end
> >> 
> >>  flow create 0 ingress pattern ... / end
> >>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> >>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>            vxlan_decap / jump group 1 / end
> >> 
> >>  flow create 0 ingress group 1
> >>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> >>            eth ... / end
> >>    actions ... jump group 2 / end
> >> 
> >>  flow create 0 ingress group 1
> >>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> >>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>            eth ... / end
> >>    actions ... / end
> >> 
> >>  flow create 0 ingress group 2
> >>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>            eth ... / end
> >>    actions ... / end
> >> 
> >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > 
> > Hi Yongseok,
> > 
> > Only high level questions for now, while it unquestionably looks useful,
> > from a user standpoint exposing the separate index seems redundant and not
> > necessarily convenient. Using the following example to illustrate:
> > 
> > actions set_tag index 3 value 0x123456 mask 0xfffff
> > 
> > pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> > 
> > I might be missing something, but why isn't this enough:
> > 
> > pattern tag index is 3 # match whatever is stored at index 3
> > 
> > Assuming it can work, then why bother with providing value spec/mask on
> > set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> > be matched by a subsequent flow rule and that's it. It even seems like
> > relying on the index only on both occasions is enough for identification.
> > 
> > Same question for the opposite approach; relying on the value, never
> > mentioning the index.
> > 
> > I'm under the impression that the index is a hardware-specific constraint
> > that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> > PMD could keep track of used indices without having them exposed through the
> > public API.
> 
> 
> Thank you for review, Adrien.
> Hope you are doing well. It's been long since we talked each other. :-)

Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
answer these days...

 <dev@dpdk.org> hey!
 <dev@dpdk.org> no private talks!

Back to the topic:

> Your approach will work too in general but we have a request from customer that
> they want to partition this limited tag storage. Assuming that HW exposes 32bit
> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> is better to provide mask when setting/matching the value. Even some customer
> wants to store multiple flags bit by bit like ol_flags. They do want to alter
> only partial bits.
> 
> And for the index, it is to reference an entry of tags array as HW can provide
> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> storage which users can use for their own sake.
> 	tag[0], tag[1], tag[2], tag[3]

OK, looks like I missed the point then. I initially took it for a funky
alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
(ingress extended [1]) but while it could be used like that, it's more of a
way to temporarily store and retrieve a small amount of data, correct?

Out of curiosity, are these registers independent from META and other
items/actions in mlx5, otherwise what happens if they are combined?

Are there other uses for these registers? Say, referencing their contents
from other places in a flow rule so they don't have to be hard-coded?

Right now I'm still uncomfortable with such a feature in the public API
because compared to META [1], this approach looks very hardware-specific and
seemingly difficult to map on different HW architectures.

However, the main problem is that as described, its end purpose seems
redundant with META, which I think can cover the use cases you gave. So what
can an application do with this that couldn't be done in a more generic
fashion through META?

I may still be missing something and I'm open to ideas, but assuming it
doesn't make it into the public rte_flow API, it remains an interesting
feature on its own merit which could be added to DPDK as PMD-specific
pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
for PMDs to expose a public header that dedicated applications can include
to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
No problem with that.

[1] "[PATCH] ethdev: extend flow metadata"
    https://mails.dpdk.org/archives/dev/2019-July/137305.html

[2] "Negative types"
    https://doc.dpdk.org/guides/prog_guide/rte_flow.html#negative-types
  
Yongseok Koh July 11, 2019, 1:59 a.m. UTC | #5
On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> > > On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
> > > 
> > > On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> > >> A tag is a transient data which can be used during flow match. This can be
> > >> used to store match result from a previous table so that the same pattern
> > >> need not be matched again on the next table. Even if outer header is
> > >> decapsulated on the previous match, the match result can be kept.
> > >> 
> > >> Some device expose internal registers of its flow processing pipeline and
> > >> those registers are quite useful for stateful connection tracking as it
> > >> keeps status of flow matching. Multiple tags are supported by specifying
> > >> index.
> > >> 
> > >> Example testpmd commands are:
> > >> 
> > >>  flow create 0 ingress pattern ... / end
> > >>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> > >>            set_tag index 3 value 0x123456 mask 0xffffff /
> > >>            vxlan_decap / jump group 1 / end
> > >> 
> > >>  flow create 0 ingress pattern ... / end
> > >>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> > >>            set_tag index 3 value 0x123456 mask 0xffffff /
> > >>            vxlan_decap / jump group 1 / end
> > >> 
> > >>  flow create 0 ingress group 1
> > >>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> > >>            eth ... / end
> > >>    actions ... jump group 2 / end
> > >> 
> > >>  flow create 0 ingress group 1
> > >>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> > >>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> > >>            eth ... / end
> > >>    actions ... / end
> > >> 
> > >>  flow create 0 ingress group 2
> > >>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> > >>            eth ... / end
> > >>    actions ... / end
> > >> 
> > >> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> > > 
> > > Hi Yongseok,
> > > 
> > > Only high level questions for now, while it unquestionably looks useful,
> > > from a user standpoint exposing the separate index seems redundant and not
> > > necessarily convenient. Using the following example to illustrate:
> > > 
> > > actions set_tag index 3 value 0x123456 mask 0xfffff
> > > 
> > > pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> > > 
> > > I might be missing something, but why isn't this enough:
> > > 
> > > pattern tag index is 3 # match whatever is stored at index 3
> > > 
> > > Assuming it can work, then why bother with providing value spec/mask on
> > > set_tag? A flow rule pattern matches something, sets some arbitrary tag to
> > > be matched by a subsequent flow rule and that's it. It even seems like
> > > relying on the index only on both occasions is enough for identification.
> > > 
> > > Same question for the opposite approach; relying on the value, never
> > > mentioning the index.
> > > 
> > > I'm under the impression that the index is a hardware-specific constraint
> > > that shouldn't be exposed (especially since it's an 8-bit field). If so, a
> > > PMD could keep track of used indices without having them exposed through the
> > > public API.
> > 
> > 
> > Thank you for review, Adrien.
> > Hope you are doing well. It's been long since we talked each other. :-)
> 
> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
> answer these days...
> 
>  <dev@dpdk.org> hey!
>  <dev@dpdk.org> no private talks!
> 
> Back to the topic:
> 
> > Your approach will work too in general but we have a request from customer that
> > they want to partition this limited tag storage. Assuming that HW exposes 32bit
> > tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
> > store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
> > table id and 8bit flow id. As they want to split one 32bit storage, I thought it
> > is better to provide mask when setting/matching the value. Even some customer
> > wants to store multiple flags bit by bit like ol_flags. They do want to alter
> > only partial bits.
> > 
> > And for the index, it is to reference an entry of tags array as HW can provide
> > larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
> > storage which users can use for their own sake.
> > 	tag[0], tag[1], tag[2], tag[3]
> 
> OK, looks like I missed the point then. I initially took it for a funky
> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
> (ingress extended [1]) but while it could be used like that, it's more of a
> way to temporarily store and retrieve a small amount of data, correct?

Correct.

> Out of curiosity, are these registers independent from META and other
> items/actions in mlx5, otherwise what happens if they are combined?

I thought about combining it but I chose this way. Because it is transient. META
can be set by packet descriptor on Tx and can be delivered to host via mbuf on
Rx, but this TAG item can't. If I combine it, users have to query this
capability for each 32b storage. And also, there should be a way to request data
from such storages (i.e. new action , e.g. copy_meta). Let's say there are 4x32b
storages - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
mbuf->metadata, it should be something like,
	ingress / pattern .. /
	actions ... set_meta index i data x / copy_meta_to_rx index i
And if user wants to set meta[i] via mbuf on Tx,
	egress / pattern meta index is i data is x ... /
	actions ... copy_meta_to_tx index i

For sure, user is also responsible for querying these capabilities per each
meta[] storage.

As copy_meta_to_tx/rx isn't a real action, this example would confuse user.
	egress / pattern meta index is i data is x ... /
	actions ... copy_meta_to_tx index i

User might misunderstand the order of two things - item meta and copy_meta
action. I also thought about having capability bits per each meta[] storage but
it also looked complex.

I do think rte_flow item/action is better to be simple, atomic and intuitive.
That's why I made this choice.

> Are there other uses for these registers? Say, referencing their contents
> from other places in a flow rule so they don't have to be hard-coded?

Possible.
Actually, this feature is needed by connection tracking of OVS-DPDK.

> Right now I'm still uncomfortable with such a feature in the public API
> because compared to META [1], this approach looks very hardware-specific and
> seemingly difficult to map on different HW architectures.

I wouldn't say it is HW-specific. Like I explained above, I just define this new
item/action to make things easy-to-use and intuitive.

> However, the main problem is that as described, its end purpose seems
> redundant with META, which I think can cover the use cases you gave. So what
> can an application do with this that couldn't be done in a more generic
> fashion through META?
> 
> I may still be missing something and I'm open to ideas, but assuming it
> doesn't make it into the public rte_flow API, it remains an interesting
> feature on its own merit which could be added to DPDK as PMD-specific
> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
> for PMDs to expose a public header that dedicated applications can include
> to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
> No problem with that.

That's good info. Thanks. But still considering connection-tracking-like
use-cases, this transient storage on multi-table flow pipeline is quite useful.


thanks,
Yongseok

> [1] "[PATCH] ethdev: extend flow metadata"
>     https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2019-July%2F137305.html&amp;data=02%7C01%7Cyskoh%40mellanox.com%7Ccd2d2d88786f43d9603708d70448c623%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636982582929119170&amp;sdata=4xI5tJ9pcVn1ooTwmZ1f0O%2BaY9p%2FL%2F8O23gr2OW7ZpI%3D&amp;reserved=0
> 
> [2] "Negative types"
>     https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdoc.dpdk.org%2Fguides%2Fprog_guide%2Frte_flow.html%23negative-types&amp;data=02%7C01%7Cyskoh%40mellanox.com%7Ccd2d2d88786f43d9603708d70448c623%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636982582929119170&amp;sdata=gFYRsOd8RzINShMvMR%2FXFKwV5RHAwThsDrvwnCrDIiQ%3D&amp;reserved=0
  
Ferruh Yigit Oct. 8, 2019, 12:57 p.m. UTC | #6
On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
>> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
>>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil <adrien.mazarguil@6wind.com> wrote:
>>>>
>>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
>>>>> A tag is a transient data which can be used during flow match. This can be
>>>>> used to store match result from a previous table so that the same pattern
>>>>> need not be matched again on the next table. Even if outer header is
>>>>> decapsulated on the previous match, the match result can be kept.
>>>>>
>>>>> Some device expose internal registers of its flow processing pipeline and
>>>>> those registers are quite useful for stateful connection tracking as it
>>>>> keeps status of flow matching. Multiple tags are supported by specifying
>>>>> index.
>>>>>
>>>>> Example testpmd commands are:
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress pattern ... / end
>>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
>>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
>>>>>            vxlan_decap / jump group 1 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
>>>>>            eth ... / end
>>>>>    actions ... jump group 2 / end
>>>>>
>>>>>  flow create 0 ingress group 1
>>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
>>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>>  flow create 0 ingress group 2
>>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
>>>>>            eth ... / end
>>>>>    actions ... / end
>>>>>
>>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
>>>>
>>>> Hi Yongseok,
>>>>
>>>> Only high level questions for now, while it unquestionably looks useful,
>>>> from a user standpoint exposing the separate index seems redundant and not
>>>> necessarily convenient. Using the following example to illustrate:
>>>>
>>>> actions set_tag index 3 value 0x123456 mask 0xfffff
>>>>
>>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
>>>>
>>>> I might be missing something, but why isn't this enough:
>>>>
>>>> pattern tag index is 3 # match whatever is stored at index 3
>>>>
>>>> Assuming it can work, then why bother with providing value spec/mask on
>>>> set_tag? A flow rule pattern matches something, sets some arbitrary tag to
>>>> be matched by a subsequent flow rule and that's it. It even seems like
>>>> relying on the index only on both occasions is enough for identification.
>>>>
>>>> Same question for the opposite approach; relying on the value, never
>>>> mentioning the index.
>>>>
>>>> I'm under the impression that the index is a hardware-specific constraint
>>>> that shouldn't be exposed (especially since it's an 8-bit field). If so, a
>>>> PMD could keep track of used indices without having them exposed through the
>>>> public API.
>>>
>>>
>>> Thank you for review, Adrien.
>>> Hope you are doing well. It's been long since we talked each other. :-)
>>
>> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence slow to
>> answer these days...
>>
>>  <dev@dpdk.org> hey!
>>  <dev@dpdk.org> no private talks!
>>
>> Back to the topic:
>>
>>> Your approach will work too in general but we have a request from customer that
>>> they want to partition this limited tag storage. Assuming that HW exposes 32bit
>>> tags (those are 'registers' in HW pipeline in mlx5 HW). Then, customers want to
>>> store multiple data even in a 32-bit storage. For example, 16bit vlan tag, 8bit
>>> table id and 8bit flow id. As they want to split one 32bit storage, I thought it
>>> is better to provide mask when setting/matching the value. Even some customer
>>> wants to store multiple flags bit by bit like ol_flags. They do want to alter
>>> only partial bits.
>>>
>>> And for the index, it is to reference an entry of tags array as HW can provide
>>> larger registers than 32-bit. For example, mlx5 HW would provide 4 of 32b
>>> storage which users can use for their own sake.
>>> 	tag[0], tag[1], tag[2], tag[3]
>>
>> OK, looks like I missed the point then. I initially took it for a funky
>> alternative to RTE_FLOW_ITEM_TYPE_META & RTE_FLOW_ACTION_TYPE_SET_META
>> (ingress extended [1]) but while it could be used like that, it's more of a
>> way to temporarily store and retrieve a small amount of data, correct?
> 
> Correct.
> 
>> Out of curiosity, are these registers independent from META and other
>> items/actions in mlx5, otherwise what happens if they are combined?
> 
> I thought about combining it but I chose this way. Because it is transient. META
> can be set by packet descriptor on Tx and can be delivered to host via mbuf on
> Rx, but this TAG item can't. If I combine it, users have to query this
> capability for each 32b storage. And also, there should be a way to request data
> from such storages (i.e. new action , e.g. copy_meta). Let's say there are 4x32b
> storages - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
> mbuf->metadata, it should be something like,
> 	ingress / pattern .. /
> 	actions ... set_meta index i data x / copy_meta_to_rx index i
> And if user wants to set meta[i] via mbuf on Tx,
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> For sure, user is also responsible for querying these capabilities per each
> meta[] storage.
> 
> As copy_meta_to_tx/rx isn't a real action, this example would confuse user.
> 	egress / pattern meta index is i data is x ... /
> 	actions ... copy_meta_to_tx index i
> 
> User might misunderstand the order of two things - item meta and copy_meta
> action. I also thought about having capability bits per each meta[] storage but
> it also looked complex.
> 
> I do think rte_flow item/action is better to be simple, atomic and intuitive.
> That's why I made this choice.
> 
>> Are there other uses for these registers? Say, referencing their contents
>> from other places in a flow rule so they don't have to be hard-coded?
> 
> Possible.
> Actually, this feature is needed by connection tracking of OVS-DPDK.
> 
>> Right now I'm still uncomfortable with such a feature in the public API
>> because compared to META [1], this approach looks very hardware-specific and
>> seemingly difficult to map on different HW architectures.
> 
> I wouldn't say it is HW-specific. Like I explained above, I just define this new
> item/action to make things easy-to-use and intuitive.
> 
>> However, the main problem is that as described, its end purpose seems
>> redundant with META, which I think can cover the use cases you gave. So what
>> can an application do with this that couldn't be done in a more generic
>> fashion through META?
>>
>> I may still be missing something and I'm open to ideas, but assuming it
>> doesn't make it into the public rte_flow API, it remains an interesting
>> feature on its own merit which could be added to DPDK as PMD-specific
>> pattern items/actions [2]. mlx5 doesn't have any yet, but it's pretty common
>> for PMDs to expose a public header that dedicated applications can include
>> to use this kind of features (look for rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
>> No problem with that.
> 
> That's good info. Thanks. But still considering connection-tracking-like
> use-cases, this transient storage on multi-table flow pipeline is quite useful.
> 
> 
> thanks,
> Yongseok
> 
>> [1] "[PATCH] ethdev: extend flow metadata"
>>     https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmails.dpdk.org%2Farchives%2Fdev%2F2019-July%2F137305.html&amp;data=02%7C01%7Cyskoh%40mellanox.com%7Ccd2d2d88786f43d9603708d70448c623%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636982582929119170&amp;sdata=4xI5tJ9pcVn1ooTwmZ1f0O%2BaY9p%2FL%2F8O23gr2OW7ZpI%3D&amp;reserved=0
>>
>> [2] "Negative types"
>>     https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdoc.dpdk.org%2Fguides%2Fprog_guide%2Frte_flow.html%23negative-types&amp;data=02%7C01%7Cyskoh%40mellanox.com%7Ccd2d2d88786f43d9603708d70448c623%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C636982582929119170&amp;sdata=gFYRsOd8RzINShMvMR%2FXFKwV5RHAwThsDrvwnCrDIiQ%3D&amp;reserved=0

Is this RFC still valid, will there be any follow up?
If not am marking it as rejected in next a few days.
  
Slava Ovsiienko Oct. 8, 2019, 1:18 p.m. UTC | #7
> -----Original Message-----
> From: Yigit, Ferruh <ferruh.yigit@linux.intel.com>
> Sent: Tuesday, October 8, 2019 15:57
> To: Yongseok Koh <yskoh@mellanox.com>; Adrien Mazarguil
> <adrien.mazarguil@6wind.com>
> Cc: Shahaf Shuler <shahafs@mellanox.com>; Thomas Monjalon
> <thomas@monjalon.net>; Ferruh Yigit <ferruh.yigit@intel.com>; Andrew
> Rybchenko <arybchenko@solarflare.com>; Olivier Matz
> <olivier.matz@6wind.com>; dev <dev@dpdk.org>; Slava Ovsiienko
> <viacheslavo@mellanox.com>
> Subject: Re: [dpdk-dev] [PATCH] ethdev: add flow tag
> 
> On 7/11/2019 2:59 AM, Yongseok Koh wrote:
> > On Tue, Jul 09, 2019 at 10:38:06AM +0200, Adrien Mazarguil wrote:
> >> On Fri, Jul 05, 2019 at 06:05:50PM +0000, Yongseok Koh wrote:
> >>>> On Jul 5, 2019, at 6:54 AM, Adrien Mazarguil
> <adrien.mazarguil@6wind.com> wrote:
> >>>>
> >>>> On Thu, Jul 04, 2019 at 04:23:02PM -0700, Yongseok Koh wrote:
> >>>>> A tag is a transient data which can be used during flow match.
> >>>>> This can be used to store match result from a previous table so
> >>>>> that the same pattern need not be matched again on the next table.
> >>>>> Even if outer header is decapsulated on the previous match, the match
> result can be kept.
> >>>>>
> >>>>> Some device expose internal registers of its flow processing
> >>>>> pipeline and those registers are quite useful for stateful
> >>>>> connection tracking as it keeps status of flow matching. Multiple
> >>>>> tags are supported by specifying index.
> >>>>>
> >>>>> Example testpmd commands are:
> >>>>>
> >>>>>  flow create 0 ingress pattern ... / end
> >>>>>    actions set_tag index 2 value 0xaa00bb mask 0xffff00ff /
> >>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>>>>            vxlan_decap / jump group 1 / end
> >>>>>
> >>>>>  flow create 0 ingress pattern ... / end
> >>>>>    actions set_tag index 2 value 0xcc00 mask 0xff00 /
> >>>>>            set_tag index 3 value 0x123456 mask 0xffffff /
> >>>>>            vxlan_decap / jump group 1 / end
> >>>>>
> >>>>>  flow create 0 ingress group 1
> >>>>>    pattern tag index is 2 value spec 0xaa00bb value mask 0xffff00ff /
> >>>>>            eth ... / end
> >>>>>    actions ... jump group 2 / end
> >>>>>
> >>>>>  flow create 0 ingress group 1
> >>>>>    pattern tag index is 2 value spec 0xcc00 value mask 0xff00 /
> >>>>>            tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>>>>            eth ... / end
> >>>>>    actions ... / end
> >>>>>
> >>>>>  flow create 0 ingress group 2
> >>>>>    pattern tag index is 3 value spec 0x123456 value mask 0xffffff /
> >>>>>            eth ... / end
> >>>>>    actions ... / end
> >>>>>
> >>>>> Signed-off-by: Yongseok Koh <yskoh@mellanox.com>
> >>>>
> >>>> Hi Yongseok,
> >>>>
> >>>> Only high level questions for now, while it unquestionably looks
> >>>> useful, from a user standpoint exposing the separate index seems
> >>>> redundant and not necessarily convenient. Using the following example
> to illustrate:
> >>>>
> >>>> actions set_tag index 3 value 0x123456 mask 0xfffff
> >>>>
> >>>> pattern tag index is 3 value spec 0x123456 value mask 0xffffff
> >>>>
> >>>> I might be missing something, but why isn't this enough:
> >>>>
> >>>> pattern tag index is 3 # match whatever is stored at index 3
> >>>>
> >>>> Assuming it can work, then why bother with providing value
> >>>> spec/mask on set_tag? A flow rule pattern matches something, sets
> >>>> some arbitrary tag to be matched by a subsequent flow rule and
> >>>> that's it. It even seems like relying on the index only on both occasions
> is enough for identification.
> >>>>
> >>>> Same question for the opposite approach; relying on the value,
> >>>> never mentioning the index.
> >>>>
> >>>> I'm under the impression that the index is a hardware-specific
> >>>> constraint that shouldn't be exposed (especially since it's an
> >>>> 8-bit field). If so, a PMD could keep track of used indices without
> >>>> having them exposed through the public API.
> >>>
> >>>
> >>> Thank you for review, Adrien.
> >>> Hope you are doing well. It's been long since we talked each other.
> >>> :-)
> >>
> >> Yeah clearly! Hope you're doing well too. I'm somewhat busy hence
> >> slow to answer these days...
> >>
> >>  <dev@dpdk.org> hey!
> >>  <dev@dpdk.org> no private talks!
> >>
> >> Back to the topic:
> >>
> >>> Your approach will work too in general but we have a request from
> >>> customer that they want to partition this limited tag storage.
> >>> Assuming that HW exposes 32bit tags (those are 'registers' in HW
> >>> pipeline in mlx5 HW). Then, customers want to store multiple data
> >>> even in a 32-bit storage. For example, 16bit vlan tag, 8bit table id
> >>> and 8bit flow id. As they want to split one 32bit storage, I thought
> >>> it is better to provide mask when setting/matching the value. Even
> >>> some customer wants to store multiple flags bit by bit like ol_flags. They
> do want to alter only partial bits.
> >>>
> >>> And for the index, it is to reference an entry of tags array as HW
> >>> can provide larger registers than 32-bit. For example, mlx5 HW would
> >>> provide 4 of 32b storage which users can use for their own sake.
> >>> 	tag[0], tag[1], tag[2], tag[3]
> >>
> >> OK, looks like I missed the point then. I initially took it for a
> >> funky alternative to RTE_FLOW_ITEM_TYPE_META &
> >> RTE_FLOW_ACTION_TYPE_SET_META (ingress extended [1]) but while it
> >> could be used like that, it's more of a way to temporarily store and
> retrieve a small amount of data, correct?
> >
> > Correct.
> >
> >> Out of curiosity, are these registers independent from META and other
> >> items/actions in mlx5, otherwise what happens if they are combined?
> >
> > I thought about combining it but I chose this way. Because it is
> > transient. META can be set by packet descriptor on Tx and can be
> > delivered to host via mbuf on Rx, but this TAG item can't. If I
> > combine it, users have to query this capability for each 32b storage.
> > And also, there should be a way to request data from such storages
> > (i.e. new action , e.g. copy_meta). Let's say there are 4x32b storages
> > - meta[4]. If user wants to get one 32b data (meta[i]) out of them to
> > mbuf->metadata, it should be something like,
> > 	ingress / pattern .. /
> > 	actions ... set_meta index i data x / copy_meta_to_rx index i And if
> > user wants to set meta[i] via mbuf on Tx,
> > 	egress / pattern meta index is i data is x ... /
> > 	actions ... copy_meta_to_tx index i
> >
> > For sure, user is also responsible for querying these capabilities per
> > each meta[] storage.
> >
> > As copy_meta_to_tx/rx isn't a real action, this example would confuse
> user.
> > 	egress / pattern meta index is i data is x ... /
> > 	actions ... copy_meta_to_tx index i
> >
> > User might misunderstand the order of two things - item meta and
> > copy_meta action. I also thought about having capability bits per each
> > meta[] storage but it also looked complex.
> >
> > I do think rte_flow item/action is better to be simple, atomic and intuitive.
> > That's why I made this choice.
> >
> >> Are there other uses for these registers? Say, referencing their
> >> contents from other places in a flow rule so they don't have to be hard-
> coded?
> >
> > Possible.
> > Actually, this feature is needed by connection tracking of OVS-DPDK.
> >
> >> Right now I'm still uncomfortable with such a feature in the public
> >> API because compared to META [1], this approach looks very
> >> hardware-specific and seemingly difficult to map on different HW
> architectures.
> >
> > I wouldn't say it is HW-specific. Like I explained above, I just
> > define this new item/action to make things easy-to-use and intuitive.
> >
> >> However, the main problem is that as described, its end purpose seems
> >> redundant with META, which I think can cover the use cases you gave.
> >> So what can an application do with this that couldn't be done in a
> >> more generic fashion through META?
> >>
> >> I may still be missing something and I'm open to ideas, but assuming
> >> it doesn't make it into the public rte_flow API, it remains an
> >> interesting feature on its own merit which could be added to DPDK as
> >> PMD-specific pattern items/actions [2]. mlx5 doesn't have any yet,
> >> but it's pretty common for PMDs to expose a public header that
> >> dedicated applications can include to use this kind of features (look for
> rte_pmd_*.h, e.g. rte_pmd_ixgbe.h).
> >> No problem with that.
> >
> > That's good info. Thanks. But still considering
> > connection-tracking-like use-cases, this transient storage on multi-table
> flow pipeline is quite useful.
> >
> >
> > thanks,
> > Yongseok
> >
> >> [1] "[PATCH] ethdev: extend flow metadata"
> >>
> >>
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fmai
> >> ls.dpdk.org%2Farchives%2Fdev%2F2019-
> July%2F137305.html&amp;data=02%7C
> >>
> 01%7Cviacheslavo%40mellanox.com%7Cc0402133b8b2422fc23308d74bef1
> 4fd%7C
> >>
> a652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C637061362537116332
> &amp;sda
> >>
> ta=I%2B%2BERHK8FXzLxXkbbjGTmNDf2e%2FsVRvQ%2FIJW4ZmaYrk%3D&a
> mp;reserve
> >> d=0
> >>
> >> [2] "Negative types"
> >>
> >>
> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fdoc
> >> .dpdk.org%2Fguides%2Fprog_guide%2Frte_flow.html%23negative-
> types&amp;
> >>
> data=02%7C01%7Cviacheslavo%40mellanox.com%7Cc0402133b8b2422fc23
> 308d74
> >>
> bef14fd%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7C0%7C63706136
> 25371163
> >>
> 32&amp;sdata=o6hcNuwWnv9fADGxNcy6S9B0xwCNdlNhbloIKRiMiNo%3D&
> amp;reser
> >> ved=0
> 
> Is this RFC still valid, will there be any follow up?
> If not am marking it as rejected in next a few days.

Yes, RFC is valid, v2 and support in mlx5 Is coming.

WBR, Slava
  

Patch

diff --git a/app/test-pmd/cmdline_flow.c b/app/test-pmd/cmdline_flow.c
index eda5c5491f..37a692db54 100644
--- a/app/test-pmd/cmdline_flow.c
+++ b/app/test-pmd/cmdline_flow.c
@@ -181,6 +181,9 @@  enum index {
 	ITEM_ICMP6_ND_OPT_TLA_ETH_TLA,
 	ITEM_META,
 	ITEM_META_DATA,
+	ITEM_TAG,
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
 
 	/* Validate/create actions. */
 	ACTIONS,
@@ -275,6 +278,10 @@  enum index {
 	ACTION_SET_META,
 	ACTION_SET_META_DATA,
 	ACTION_SET_META_MASK,
+	ACTION_SET_TAG,
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
 };
 
 /** Maximum size for pattern in struct rte_flow_item_raw. */
@@ -613,6 +620,7 @@  static const enum index next_item[] = {
 	ITEM_ICMP6_ND_OPT_SLA_ETH,
 	ITEM_ICMP6_ND_OPT_TLA_ETH,
 	ITEM_META,
+	ITEM_TAG,
 	ZERO,
 };
 
@@ -839,6 +847,13 @@  static const enum index item_meta[] = {
 	ZERO,
 };
 
+static const enum index item_tag[] = {
+	ITEM_TAG_DATA,
+	ITEM_TAG_INDEX,
+	ITEM_NEXT,
+	ZERO,
+};
+
 static const enum index next_action[] = {
 	ACTION_END,
 	ACTION_VOID,
@@ -889,6 +904,7 @@  static const enum index next_action[] = {
 	ACTION_SET_MAC_SRC,
 	ACTION_SET_MAC_DST,
 	ACTION_SET_META,
+	ACTION_SET_TAG,
 	ZERO,
 };
 
@@ -1058,6 +1074,14 @@  static const enum index action_set_meta[] = {
 	ZERO,
 };
 
+static const enum index action_set_tag[] = {
+	ACTION_SET_TAG_INDEX,
+	ACTION_SET_TAG_DATA,
+	ACTION_SET_TAG_MASK,
+	ACTION_NEXT,
+	ZERO,
+};
+
 static int parse_init(struct context *, const struct token *,
 		      const char *, unsigned int,
 		      void *, unsigned int);
@@ -2161,6 +2185,26 @@  static const struct token token_list[] = {
 		.args = ARGS(ARGS_ENTRY_MASK_HTON(struct rte_flow_item_meta,
 						  data, "\xff\xff\xff\xff")),
 	},
+	[ITEM_TAG] = {
+		.name = "tag",
+		.help = "match tag value",
+		.priv = PRIV_ITEM(TAG, sizeof(struct rte_flow_item_tag)),
+		.next = NEXT(item_tag),
+		.call = parse_vc,
+	},
+	[ITEM_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED), item_param),
+		.args = ARGS(ARGS_ENTRY_HTON(struct rte_flow_item_tag, data)),
+	},
+	[ITEM_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array to match",
+		.next = NEXT(item_tag, NEXT_ENTRY(UNSIGNED),
+			     NEXT_ENTRY(ITEM_PARAM_IS)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_item_tag, index)),
+	},
 
 	/* Validate/create actions. */
 	[ACTIONS] = {
@@ -2889,6 +2933,37 @@  static const struct token token_list[] = {
 			     (struct rte_flow_action_set_meta, mask)),
 		.call = parse_vc_conf,
 	},
+	[ACTION_SET_TAG] = {
+		.name = "set_tag",
+		.help = "set tag",
+		.priv = PRIV_ACTION(SET_TAG,
+			sizeof(struct rte_flow_action_set_tag)),
+		.next = NEXT(action_set_tag),
+		.call = parse_vc,
+	},
+	[ACTION_SET_TAG_INDEX] = {
+		.name = "index",
+		.help = "index of tag array",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY(struct rte_flow_action_set_tag, index)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_DATA] = {
+		.name = "data",
+		.help = "tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, data)),
+		.call = parse_vc_conf,
+	},
+	[ACTION_SET_TAG_MASK] = {
+		.name = "mask",
+		.help = "mask for tag value",
+		.next = NEXT(action_set_tag, NEXT_ENTRY(UNSIGNED)),
+		.args = ARGS(ARGS_ENTRY_HTON
+			     (struct rte_flow_action_set_tag, mask)),
+		.call = parse_vc_conf,
+	},
 };
 
 /** Remove and return last entry from argument stack. */
diff --git a/doc/guides/prog_guide/rte_flow.rst b/doc/guides/prog_guide/rte_flow.rst
index 5092f0074e..7880425e9e 100644
--- a/doc/guides/prog_guide/rte_flow.rst
+++ b/doc/guides/prog_guide/rte_flow.rst
@@ -684,6 +684,34 @@  action sets metadata for a packet and the metadata will be reported via
    | ``mask`` | ``data`` | bit-mask applies to "spec" and "last" |
    +----------+----------+---------------------------------------+
 
+Item: ``TAG``
+^^^^^^^^^^^^^
+
+Matches tag item set by other flows. Multiple tags are supported by specifying
+``index``.
+
+- Default ``mask`` matches the specified tag value and index.
+
+.. _table_rte_flow_item_tag:
+
+.. table:: TAG
+
+   +----------+----------+----------------------------------------+
+   | Field    | Subfield  | Value                                 |
+   +==========+===========+=======================================+
+   | ``spec`` | ``data``  | 32 bit flow tag value                 |
+   |          +-----------+---------------------------------------+
+   |          | ``index`` | index of flow tag                     |
+   +----------+-----------+---------------------------------------+
+   | ``last`` | ``data``  | upper range value                     |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+   | ``mask`` | ``data``  | bit-mask applies to "spec" and "last" |
+   |          +-----------+                                       |
+   |          | ``index`` |                                       |
+   +----------+-----------+---------------------------------------+
+
 Data matching item types
 ~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -2376,6 +2404,28 @@  the other path depending on HW capability.
    | ``mask`` | bit-mask applies to "data" |
    +----------+----------------------------+
 
+Action: ``SET_TAG``
+^^^^^^^^^^^^^^^^^^^
+
+Set Tag.
+
+Tag is a transient data used during flow matching. This is not delivered to
+application. Multiple tags are supported by specifying index.
+
+.. _table_rte_flow_action_set_tag:
+
+.. table:: SET_TAG
+
+   +-----------+----------------------------+
+   | Field     | Value                      |
+   +===========+============================+
+   | ``data``  | 32 bit tag value           |
+   +-----------+----------------------------+
+   | ``mask``  | bit-mask applies to "data" |
+   +-----------+----------------------------+
+   | ``index`` | index of tag to set        |
+   +-----------+----------------------------+
+
 Negative types
 ~~~~~~~~~~~~~~
 
diff --git a/doc/guides/rel_notes/release_19_08.rst b/doc/guides/rel_notes/release_19_08.rst
index e087266da0..7c9da3fdae 100644
--- a/doc/guides/rel_notes/release_19_08.rst
+++ b/doc/guides/rel_notes/release_19_08.rst
@@ -78,6 +78,10 @@  New Features
   * Rx metadata is delivered to host via ``metadata`` field of ``rte_mbuf``
     with PKT_RX_METADATA.
 
+* **Added flow tag in rte_flow.**
+  SET_TAG action and TAG item have been added to support transient flow
+  tag.
+
 * **Updated the bnxt PMD.**
 
   Updated the bnxt PMD. The major enhancements include:
diff --git a/lib/librte_ethdev/rte_flow.h b/lib/librte_ethdev/rte_flow.h
index cda8628183..ffb8a13098 100644
--- a/lib/librte_ethdev/rte_flow.h
+++ b/lib/librte_ethdev/rte_flow.h
@@ -422,6 +422,15 @@  enum rte_flow_item_type {
 	 * See struct rte_flow_item_meta.
 	 */
 	RTE_FLOW_ITEM_TYPE_META,
+
+	/**
+	 * [META]
+	 *
+	 * Matches a tag value.
+	 *
+	 * See struct rte_flow_item_tag.
+	 */
+	RTE_FLOW_ITEM_TYPE_TAG,
 };
 
 /**
@@ -1187,6 +1196,27 @@  static const struct rte_flow_item_meta rte_flow_item_meta_mask = {
 };
 #endif
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ITEM_TYPE_TAG
+ *
+ * Matches a specified tag value at the specified index.
+ */
+struct rte_flow_item_tag {
+	uint32_t data;
+	uint8_t index;
+};
+
+/** Default mask for RTE_FLOW_ITEM_TYPE_TAG. */
+#ifndef __cplusplus
+static const struct rte_flow_item_tag rte_flow_item_rx_meta_mask = {
+	.data = 0xffffffff,
+	.index = 0xff,
+};
+#endif
+
 /**
  * @warning
  * @b EXPERIMENTAL: this structure may change without prior notice
@@ -1665,6 +1695,15 @@  enum rte_flow_action_type {
 	 * See struct rte_flow_action_set_meta.
 	 */
 	RTE_FLOW_ACTION_TYPE_SET_META,
+
+	/**
+	 * Set Tag.
+	 *
+	 * Tag is not delivered to application.
+	 *
+	 * See struct rte_flow_action_set_tag.
+	 */
+	RTE_FLOW_ACTION_TYPE_SET_TAG,
 };
 
 /**
@@ -2168,6 +2207,21 @@  struct rte_flow_action_set_meta {
 	rte_be32_t mask;
 };
 
+/**
+ * @warning
+ * @b EXPERIMENTAL: this structure may change without prior notice
+ *
+ * RTE_FLOW_ACTION_TYPE_SET_TAG
+ *
+ * Set a tag which is a transient data used during flow matching. This is not
+ * delivered to application. Multiple tags are supported by specifying index.
+ */
+struct rte_flow_action_set_tag {
+	uint32_t data;
+	uint32_t mask;
+	uint8_t index;
+};
+
 /*
  * Definition of a single action.
  *