mbox series

[0/7] replace rte atomics with GCC builtin atomics

Message ID 1679084388-19267-1-git-send-email-roretzla@linux.microsoft.com (mailing list archive)
Headers
Series replace rte atomics with GCC builtin atomics |

Message

Tyler Retzlaff March 17, 2023, 8:19 p.m. UTC
  Replace the use of rte_atomic.h types and functions, instead use GCC
supplied C++11 memory model builtins.

This series covers the libraries and drivers that are built on Windows.

The code has be converted to use the __atomic builtins but there are
additional during conversion i notice that there may be some issues
that need to be addressed.

I'll comment in the patches where my concerns are so the maintainers
may comment.

Tyler Retzlaff (7):
  ring: replace rte atomics with GCC builtin atomics
  stack: replace rte atomics with GCC builtin atomics
  dma/idxd: replace rte atomics with GCC builtin atomics
  net/ice: replace rte atomics with GCC builtin atomics
  net/ixgbe: replace rte atomics with GCC builtin atomics
  net/null: replace rte atomics with GCC builtin atomics
  net/ring: replace rte atomics with GCC builtin atomics

 drivers/dma/idxd/idxd_internal.h |  3 +--
 drivers/dma/idxd/idxd_pci.c      |  6 +++---
 drivers/net/ice/ice_dcf.c        |  1 -
 drivers/net/ice/ice_dcf_ethdev.c |  1 -
 drivers/net/ice/ice_ethdev.c     | 10 ++++++----
 drivers/net/ixgbe/ixgbe_bypass.c |  1 -
 drivers/net/ixgbe/ixgbe_ethdev.c | 12 ++++++------
 drivers/net/ixgbe/ixgbe_ethdev.h |  3 ++-
 drivers/net/ixgbe/ixgbe_flow.c   |  1 -
 drivers/net/ixgbe/ixgbe_rxtx.c   |  1 -
 drivers/net/null/rte_eth_null.c  | 20 ++++++++++----------
 drivers/net/ring/rte_eth_ring.c  | 20 ++++++++++----------
 lib/ring/rte_ring_core.h         |  1 -
 lib/ring/rte_ring_generic_pvt.h  | 10 ++++++----
 lib/stack/rte_stack_lf_generic.h | 11 +++++------
 15 files changed, 49 insertions(+), 52 deletions(-)
  

Comments

Stephen Hemminger March 17, 2023, 9:42 p.m. UTC | #1
On Fri, 17 Mar 2023 13:19:41 -0700
Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:

> Replace the use of rte_atomic.h types and functions, instead use GCC
> supplied C++11 memory model builtins.
> 
> This series covers the libraries and drivers that are built on Windows.
> 
> The code has be converted to use the __atomic builtins but there are
> additional during conversion i notice that there may be some issues
> that need to be addressed.

I don't think all these cmpset need to use SEQ_CST.
Especially for the places where it is used a loop, might
be more efficient with some of the other memory models.
  
Tyler Retzlaff March 17, 2023, 9:49 p.m. UTC | #2
On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> On Fri, 17 Mar 2023 13:19:41 -0700
> Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> 
> > Replace the use of rte_atomic.h types and functions, instead use GCC
> > supplied C++11 memory model builtins.
> > 
> > This series covers the libraries and drivers that are built on Windows.
> > 
> > The code has be converted to use the __atomic builtins but there are
> > additional during conversion i notice that there may be some issues
> > that need to be addressed.
> 
> I don't think all these cmpset need to use SEQ_CST.
> Especially for the places where it is used a loop, might
> be more efficient with some of the other memory models.

i agree.

however, i'm not trying to improve the code with this change, just
decouple it from rte_atomics.h so trying my best to avoid any
unnecessary semantic change.

certainly if the maintainers of this code wish to weaken the ordering
where appropriate after the change is merged they can do so and handily
this change has enabled them to do so easily allowing them to test just
their change in isolation.
  
Morten Brørup March 22, 2023, 11:28 a.m. UTC | #3
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Friday, 17 March 2023 22.49
> 
> On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> > On Fri, 17 Mar 2023 13:19:41 -0700
> > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> >
> > > Replace the use of rte_atomic.h types and functions, instead use GCC
> > > supplied C++11 memory model builtins.
> > >
> > > This series covers the libraries and drivers that are built on Windows.
> > >
> > > The code has be converted to use the __atomic builtins but there are
> > > additional during conversion i notice that there may be some issues
> > > that need to be addressed.
> >
> > I don't think all these cmpset need to use SEQ_CST.
> > Especially for the places where it is used a loop, might
> > be more efficient with some of the other memory models.
> 
> i agree.
> 
> however, i'm not trying to improve the code with this change, just
> decouple it from rte_atomics.h so trying my best to avoid any
> unnecessary semantic change.
> 
> certainly if the maintainers of this code wish to weaken the ordering
> where appropriate after the change is merged they can do so and handily
> this change has enabled them to do so easily allowing them to test just
> their change in isolation.

I agree with the two-step approach, where this first step is a simple search-and-replacement; but I insist that you add a FIXME or similar note where you have blindly used SEQ_CST, indicating that the memory order needs to be reviewed and potentially corrected.

Also, in a couple of the drivers, you are using int64_t for packet counters. These cannot be negative and should be uint64_t. And AFAIK, such counters can use RELAXED memory order.
  
Tyler Retzlaff March 22, 2023, 2:21 p.m. UTC | #4
On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > Sent: Friday, 17 March 2023 22.49
> > 
> > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> > > On Fri, 17 Mar 2023 13:19:41 -0700
> > > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> > >
> > > > Replace the use of rte_atomic.h types and functions, instead use GCC
> > > > supplied C++11 memory model builtins.
> > > >
> > > > This series covers the libraries and drivers that are built on Windows.
> > > >
> > > > The code has be converted to use the __atomic builtins but there are
> > > > additional during conversion i notice that there may be some issues
> > > > that need to be addressed.
> > >
> > > I don't think all these cmpset need to use SEQ_CST.
> > > Especially for the places where it is used a loop, might
> > > be more efficient with some of the other memory models.
> > 
> > i agree.
> > 
> > however, i'm not trying to improve the code with this change, just
> > decouple it from rte_atomics.h so trying my best to avoid any
> > unnecessary semantic change.
> > 
> > certainly if the maintainers of this code wish to weaken the ordering
> > where appropriate after the change is merged they can do so and handily
> > this change has enabled them to do so easily allowing them to test just
> > their change in isolation.
> 
> I agree with the two-step approach, where this first step is a simple search-and-replacement; but I insist that you add a FIXME or similar note where you have blindly used SEQ_CST, indicating that the memory order needs to be reviewed and potentially corrected.

i think the maintainers need to take some responsibility, if they see
optimizations they missed when previously writing the code they need to
follow up with a patch themselves. i can't do everything for them and
marking things i'm not sure about will only lead to me having to churn
patch series to remove the unwanted comments later.

keep in mind i have to touch each of these again when converting to
standard so that's a better time to review ~everything in more detail
because when converting to standard that's when suddenly you get a bunch
of code generation that is "fallback" to seq_cst that isn't happening now.

the series that converts to standard needs to be up for review as soon
as possible to maximize available time for feedback before 23.11 so it
would be better to get the simpler cut & paste normalizing the code out
of the way to unblock that series submission.

> 
> Also, in a couple of the drivers, you are using int64_t for packet counters. These cannot be negative and should be uint64_t. And AFAIK, such counters can use RELAXED memory order.

i know you don't mean to say i selected the types and rather that the
types that were selected are not quite correct for their usage. again
on the review that actually adopts std atomics is a better place to make
any potential type changes since we are "breaking" the API for 23.11
anyway. further, the std atomics series technically changes all the
types so it's probably better to make one type change then rather than
one now and one later.

i think it would be best to get these validated and merged asap so we
can get to the std atomics review. when that series is up let's discuss
further how i can mark areas of concern, with that series i expect there
will have to be some changes in order to avoid minor regressions.

thanks!
  
Morten Brørup March 22, 2023, 2:58 p.m. UTC | #5
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Wednesday, 22 March 2023 15.22
> 
> On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > Sent: Friday, 17 March 2023 22.49
> > >
> > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> > > > On Fri, 17 Mar 2023 13:19:41 -0700
> > > > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> > > >
> > > > > Replace the use of rte_atomic.h types and functions, instead use GCC
> > > > > supplied C++11 memory model builtins.
> > > > >
> > > > > This series covers the libraries and drivers that are built on
> Windows.
> > > > >
> > > > > The code has be converted to use the __atomic builtins but there are
> > > > > additional during conversion i notice that there may be some issues
> > > > > that need to be addressed.
> > > >
> > > > I don't think all these cmpset need to use SEQ_CST.
> > > > Especially for the places where it is used a loop, might
> > > > be more efficient with some of the other memory models.
> > >
> > > i agree.
> > >
> > > however, i'm not trying to improve the code with this change, just
> > > decouple it from rte_atomics.h so trying my best to avoid any
> > > unnecessary semantic change.
> > >
> > > certainly if the maintainers of this code wish to weaken the ordering
> > > where appropriate after the change is merged they can do so and handily
> > > this change has enabled them to do so easily allowing them to test just
> > > their change in isolation.
> >
> > I agree with the two-step approach, where this first step is a simple
> search-and-replacement; but I insist that you add a FIXME or similar note
> where you have blindly used SEQ_CST, indicating that the memory order needs to
> be reviewed and potentially corrected.
> 
> i think the maintainers need to take some responsibility, if they see
> optimizations they missed when previously writing the code they need to
> follow up with a patch themselves. i can't do everything for them and
> marking things i'm not sure about will only lead to me having to churn
> patch series to remove the unwanted comments later.

The previous atomic functions didn't have the "memory order" parameter, so the maintainers didn't have to think about it - and thus they didn't miss any optimizations when accepting the code.

I also agree 100 % that it is not your responsibility to consider or determine which memory order is appropriate!

But I think you should mark the locations where you are changing from the old rte_atomic functions (where no memory order optimization was available) to the new functions - to highlight where the option of memory ordering has been introduced and knowingly ignored (by you).

> 
> keep in mind i have to touch each of these again when converting to
> standard so that's a better time to review ~everything in more detail
> because when converting to standard that's when suddenly you get a bunch
> of code generation that is "fallback" to seq_cst that isn't happening now.

I think you should to do it when replacing the rte_atomic functions with the __atomic functions. It will make it easier to see where the memory order was knowingly ignored, and should be reviewed for optimization.

> 
> the series that converts to standard needs to be up for review as soon
> as possible to maximize available time for feedback before 23.11 so it
> would be better to get the simpler cut & paste normalizing the code out
> of the way to unblock that series submission.
> 
> >
> > Also, in a couple of the drivers, you are using int64_t for packet counters.
> These cannot be negative and should be uint64_t. And AFAIK, such counters can
> use RELAXED memory order.
> 
> i know you don't mean to say i selected the types and rather that the
> types that were selected are not quite correct for their usage.

Yes; the previous types were also signed, and you didn't change that.

> again
> on the review that actually adopts std atomics is a better place to make
> any potential type changes since we are "breaking" the API for 23.11
> anyway. further, the std atomics series technically changes all the
> types so it's probably better to make one type change then rather than
> one now and one later.
> 
> i think it would be best to get these validated and merged asap so we
> can get to the std atomics review. when that series is up let's discuss
> further how i can mark areas of concern, with that series i expect there
> will have to be some changes in order to avoid minor regressions.
> 
> thanks!

I thought it would be better to catch these details (i.e. memory ordering and signedness) early on, but I now understand that you planned to do it in a later step. So I'll let you proceed as you have planned.

Thanks for all your work on this, Tyler. It is much appreciated!

-Morten
  
Tyler Retzlaff March 22, 2023, 3:29 p.m. UTC | #6
On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > Sent: Wednesday, 22 March 2023 15.22
> > 
> > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > Sent: Friday, 17 March 2023 22.49
> > > >
> > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> > > > > On Fri, 17 Mar 2023 13:19:41 -0700
> > > > > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> > > > >
> > > > > > Replace the use of rte_atomic.h types and functions, instead use GCC
> > > > > > supplied C++11 memory model builtins.
> > > > > >
> > > > > > This series covers the libraries and drivers that are built on
> > Windows.
> > > > > >
> > > > > > The code has be converted to use the __atomic builtins but there are
> > > > > > additional during conversion i notice that there may be some issues
> > > > > > that need to be addressed.
> > > > >
> > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > Especially for the places where it is used a loop, might
> > > > > be more efficient with some of the other memory models.
> > > >
> > > > i agree.
> > > >
> > > > however, i'm not trying to improve the code with this change, just
> > > > decouple it from rte_atomics.h so trying my best to avoid any
> > > > unnecessary semantic change.
> > > >
> > > > certainly if the maintainers of this code wish to weaken the ordering
> > > > where appropriate after the change is merged they can do so and handily
> > > > this change has enabled them to do so easily allowing them to test just
> > > > their change in isolation.
> > >
> > > I agree with the two-step approach, where this first step is a simple
> > search-and-replacement; but I insist that you add a FIXME or similar note
> > where you have blindly used SEQ_CST, indicating that the memory order needs to
> > be reviewed and potentially corrected.
> > 
> > i think the maintainers need to take some responsibility, if they see
> > optimizations they missed when previously writing the code they need to
> > follow up with a patch themselves. i can't do everything for them and
> > marking things i'm not sure about will only lead to me having to churn
> > patch series to remove the unwanted comments later.
> 
> The previous atomic functions didn't have the "memory order" parameter, so the maintainers didn't have to think about it - and thus they didn't miss any optimizations when accepting the code.
> 
> I also agree 100 % that it is not your responsibility to consider or determine which memory order is appropriate!
> 
> But I think you should mark the locations where you are changing from the old rte_atomic functions (where no memory order optimization was available) to the new functions - to highlight where the option of memory ordering has been introduced and knowingly ignored (by you).
> 

first, i have to apologize i confused myself about which of the many
patch series i have up right now that you were commenting on.

let me ask for clarification in relation to this series.

isn't that every single usage of the rte_atomic APIs? i mean are you
literally asking for the entire patch series to look like the following
patch snippet with the expectation that maintainers will come along and
clean up/review after this series is merged?

-rte_atomic_add32(&o, v);
+//FIXME: opportunity for relaxing ordering constraint, please review
+__atomic_fetch_add(&o, v, order);

this would just be a mechanical addition to this series so i can
certainly accomodate that, i thought something more complicated was
being asked for. if this is all, then sure no problem.

> > keep in mind i have to touch each of these again when converting to
> > standard so that's a better time to review ~everything in more detail
> > because when converting to standard that's when suddenly you get a bunch
> > of code generation that is "fallback" to seq_cst that isn't happening now.
> 
> I think you should to do it when replacing the rte_atomic functions with the __atomic functions. It will make it easier to see where the memory order was knowingly ignored, and should be reviewed for optimization.
> 
> > 
> > the series that converts to standard needs to be up for review as soon
> > as possible to maximize available time for feedback before 23.11 so it
> > would be better to get the simpler cut & paste normalizing the code out
> > of the way to unblock that series submission.
> > 
> > >
> > > Also, in a couple of the drivers, you are using int64_t for packet counters.
> > These cannot be negative and should be uint64_t. And AFAIK, such counters can
> > use RELAXED memory order.
> > 
> > i know you don't mean to say i selected the types and rather that the
> > types that were selected are not quite correct for their usage.
> 
> Yes; the previous types were also signed, and you didn't change that.
> 
> > again
> > on the review that actually adopts std atomics is a better place to make
> > any potential type changes since we are "breaking" the API for 23.11
> > anyway. further, the std atomics series technically changes all the
> > types so it's probably better to make one type change then rather than
> > one now and one later.
> > 
> > i think it would be best to get these validated and merged asap so we
> > can get to the std atomics review. when that series is up let's discuss
> > further how i can mark areas of concern, with that series i expect there
> > will have to be some changes in order to avoid minor regressions.
> > 
> > thanks!
> 
> I thought it would be better to catch these details (i.e. memory ordering and signedness) early on, but I now understand that you planned to do it in a later step. So I'll let you proceed as you have planned.
> 
> Thanks for all your work on this, Tyler. It is much appreciated!

again, sorry for the confusion the sooner i can get some of these merged
the easier it will be for me to manage the final series. i hope
david/thomas can merge the simple normalization patches as soon as 23.03
cycle is complete.

> 
> -Morten
  
Morten Brørup March 22, 2023, 4:13 p.m. UTC | #7
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Wednesday, 22 March 2023 16.30
> 
> On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > Sent: Wednesday, 22 March 2023 15.22
> > >
> > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > Sent: Friday, 17 March 2023 22.49
> > > > >
> > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger wrote:
> > > > > > On Fri, 17 Mar 2023 13:19:41 -0700
> > > > > > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> > > > > >
> > > > > > > Replace the use of rte_atomic.h types and functions, instead use
> GCC
> > > > > > > supplied C++11 memory model builtins.
> > > > > > >
> > > > > > > This series covers the libraries and drivers that are built on
> > > Windows.
> > > > > > >
> > > > > > > The code has be converted to use the __atomic builtins but there
> are
> > > > > > > additional during conversion i notice that there may be some
> issues
> > > > > > > that need to be addressed.
> > > > > >
> > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > Especially for the places where it is used a loop, might
> > > > > > be more efficient with some of the other memory models.
> > > > >
> > > > > i agree.
> > > > >
> > > > > however, i'm not trying to improve the code with this change, just
> > > > > decouple it from rte_atomics.h so trying my best to avoid any
> > > > > unnecessary semantic change.
> > > > >
> > > > > certainly if the maintainers of this code wish to weaken the ordering
> > > > > where appropriate after the change is merged they can do so and
> handily
> > > > > this change has enabled them to do so easily allowing them to test
> just
> > > > > their change in isolation.
> > > >
> > > > I agree with the two-step approach, where this first step is a simple
> > > search-and-replacement; but I insist that you add a FIXME or similar note
> > > where you have blindly used SEQ_CST, indicating that the memory order
> needs to
> > > be reviewed and potentially corrected.
> > >
> > > i think the maintainers need to take some responsibility, if they see
> > > optimizations they missed when previously writing the code they need to
> > > follow up with a patch themselves. i can't do everything for them and
> > > marking things i'm not sure about will only lead to me having to churn
> > > patch series to remove the unwanted comments later.
> >
> > The previous atomic functions didn't have the "memory order" parameter, so
> the maintainers didn't have to think about it - and thus they didn't miss any
> optimizations when accepting the code.
> >
> > I also agree 100 % that it is not your responsibility to consider or
> determine which memory order is appropriate!
> >
> > But I think you should mark the locations where you are changing from the
> old rte_atomic functions (where no memory order optimization was available) to
> the new functions - to highlight where the option of memory ordering has been
> introduced and knowingly ignored (by you).
> >
> 
> first, i have to apologize i confused myself about which of the many
> patch series i have up right now that you were commenting on.

No worries... you are rushing through quite an effort for this, so a little confusion is perfectly understandable. Especially when I'm replying to an ageing email. :-)

> 
> let me ask for clarification in relation to this series.
> 
> isn't that every single usage of the rte_atomic APIs?

Probably, yes.

> i mean are you
> literally asking for the entire patch series to look like the following
> patch snippet with the expectation that maintainers will come along and
> clean up/review after this series is merged?
> 
> -rte_atomic_add32(&o, v);
> +//FIXME: opportunity for relaxing ordering constraint, please review
> +__atomic_fetch_add(&o, v, order);

Exactly. And something similar for the rte_atomicXX_t variables changed to intXX_t, such as the packet counters.

Realistically, I don't expect the maintainers to clean them up anytime soon. The purpose is to make the FIXMEs stick until someone eventually cleans them up, so they are not forgotten as time passes.

> 
> this would just be a mechanical addition to this series so i can
> certainly accomodate that, i thought something more complicated was
> being asked for. if this is all, then sure no problem.

Great.

> 
> > > keep in mind i have to touch each of these again when converting to
> > > standard so that's a better time to review ~everything in more detail
> > > because when converting to standard that's when suddenly you get a bunch
> > > of code generation that is "fallback" to seq_cst that isn't happening now.
> >
> > I think you should to do it when replacing the rte_atomic functions with the
> __atomic functions. It will make it easier to see where the memory order was
> knowingly ignored, and should be reviewed for optimization.
> >
> > >
> > > the series that converts to standard needs to be up for review as soon
> > > as possible to maximize available time for feedback before 23.11 so it
> > > would be better to get the simpler cut & paste normalizing the code out
> > > of the way to unblock that series submission.
> > >
> > > >
> > > > Also, in a couple of the drivers, you are using int64_t for packet
> counters.
> > > These cannot be negative and should be uint64_t. And AFAIK, such counters
> can
> > > use RELAXED memory order.
> > >
> > > i know you don't mean to say i selected the types and rather that the
> > > types that were selected are not quite correct for their usage.
> >
> > Yes; the previous types were also signed, and you didn't change that.
> >
> > > again
> > > on the review that actually adopts std atomics is a better place to make
> > > any potential type changes since we are "breaking" the API for 23.11
> > > anyway. further, the std atomics series technically changes all the
> > > types so it's probably better to make one type change then rather than
> > > one now and one later.
> > >
> > > i think it would be best to get these validated and merged asap so we
> > > can get to the std atomics review. when that series is up let's discuss
> > > further how i can mark areas of concern, with that series i expect there
> > > will have to be some changes in order to avoid minor regressions.
> > >
> > > thanks!
> >
> > I thought it would be better to catch these details (i.e. memory ordering
> and signedness) early on, but I now understand that you planned to do it in a
> later step. So I'll let you proceed as you have planned.
> >
> > Thanks for all your work on this, Tyler. It is much appreciated!
> 
> again, sorry for the confusion the sooner i can get some of these merged
> the easier it will be for me to manage the final series. i hope
> david/thomas can merge the simple normalization patches as soon as 23.03
> cycle is complete.

Yes. An early merge would also provide more time for reviewing and optimizing the memory order of the most important atomic operations.
  
Honnappa Nagarahalli March 22, 2023, 4:40 p.m. UTC | #8
> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, March 22, 2023 11:14 AM
> To: Tyler Retzlaff <roretzla@linux.microsoft.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; thomas@monjalon.net
> Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> 
> > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > Sent: Wednesday, 22 March 2023 16.30
> >
> > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > Sent: Wednesday, 22 March 2023 15.22
> > > >
> > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > >
> > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger
> wrote:
> > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > <roretzla@linux.microsoft.com> wrote:
> > > > > > >
> > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > instead use
> > GCC
> > > > > > > > supplied C++11 memory model builtins.
> > > > > > > >
> > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > built on
> > > > Windows.
> > > > > > > >
> > > > > > > > The code has be converted to use the __atomic builtins but
> > > > > > > > there
> > are
> > > > > > > > additional during conversion i notice that there may be
> > > > > > > > some
> > issues
> > > > > > > > that need to be addressed.
> > > > > > >
> > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > Especially for the places where it is used a loop, might be
> > > > > > > more efficient with some of the other memory models.
> > > > > >
> > > > > > i agree.
> > > > > >
> > > > > > however, i'm not trying to improve the code with this change,
> > > > > > just decouple it from rte_atomics.h so trying my best to avoid
> > > > > > any unnecessary semantic change.
> > > > > >
> > > > > > certainly if the maintainers of this code wish to weaken the
> > > > > > ordering where appropriate after the change is merged they can
> > > > > > do so and
> > handily
> > > > > > this change has enabled them to do so easily allowing them to
> > > > > > test
> > just
> > > > > > their change in isolation.
> > > > >
> > > > > I agree with the two-step approach, where this first step is a
> > > > > simple
> > > > search-and-replacement; but I insist that you add a FIXME or
> > > > similar note where you have blindly used SEQ_CST, indicating that
> > > > the memory order
> > needs to
> > > > be reviewed and potentially corrected.
> > > >
> > > > i think the maintainers need to take some responsibility, if they
> > > > see optimizations they missed when previously writing the code
> > > > they need to follow up with a patch themselves. i can't do
> > > > everything for them and marking things i'm not sure about will
> > > > only lead to me having to churn patch series to remove the unwanted
> comments later.
> > >
> > > The previous atomic functions didn't have the "memory order"
> > > parameter, so
> > the maintainers didn't have to think about it - and thus they didn't
> > miss any optimizations when accepting the code.
> > >
> > > I also agree 100 % that it is not your responsibility to consider or
> > determine which memory order is appropriate!
> > >
> > > But I think you should mark the locations where you are changing
> > > from the
> > old rte_atomic functions (where no memory order optimization was
> > available) to the new functions - to highlight where the option of
> > memory ordering has been introduced and knowingly ignored (by you).
> > >
> >
> > first, i have to apologize i confused myself about which of the many
> > patch series i have up right now that you were commenting on.
> 
> No worries... you are rushing through quite an effort for this, so a little
> confusion is perfectly understandable. Especially when I'm replying to an ageing
> email. :-)
> 
> >
> > let me ask for clarification in relation to this series.
> >
> > isn't that every single usage of the rte_atomic APIs?
> 
> Probably, yes.
> 
> > i mean are you
> > literally asking for the entire patch series to look like the
> > following patch snippet with the expectation that maintainers will
> > come along and clean up/review after this series is merged?
> >
> > -rte_atomic_add32(&o, v);
> > +//FIXME: opportunity for relaxing ordering constraint, please review
> > +__atomic_fetch_add(&o, v, order);
> 
> Exactly. And something similar for the rte_atomicXX_t variables changed to
> intXX_t, such as the packet counters.
> 
> Realistically, I don't expect the maintainers to clean them up anytime soon. The
> purpose is to make the FIXMEs stick until someone eventually cleans them up, so
> they are not forgotten as time passes.
Cleaning up the rte_atomic APIs is a different effort. There is already lot of effort that has gone into this and there is more effort happening (rte_ring being a painful one)

Instead of having FIXME, why not just send a separate patch with SEQ_CST (still a search and replace)? We can leave the tougher ones like rte_ring as they are being worked on.

> 
> >
> > this would just be a mechanical addition to this series so i can
> > certainly accomodate that, i thought something more complicated was
> > being asked for. if this is all, then sure no problem.
> 
> Great.
> 
> >
> > > > keep in mind i have to touch each of these again when converting
> > > > to standard so that's a better time to review ~everything in more
> > > > detail because when converting to standard that's when suddenly
> > > > you get a bunch of code generation that is "fallback" to seq_cst that isn't
> happening now.
> > >
> > > I think you should to do it when replacing the rte_atomic functions
> > > with the
> > __atomic functions. It will make it easier to see where the memory
> > order was knowingly ignored, and should be reviewed for optimization.
> > >
> > > >
> > > > the series that converts to standard needs to be up for review as
> > > > soon as possible to maximize available time for feedback before
> > > > 23.11 so it would be better to get the simpler cut & paste
> > > > normalizing the code out of the way to unblock that series submission.
> > > >
> > > > >
> > > > > Also, in a couple of the drivers, you are using int64_t for
> > > > > packet
> > counters.
> > > > These cannot be negative and should be uint64_t. And AFAIK, such
> > > > counters
> > can
> > > > use RELAXED memory order.
> > > >
> > > > i know you don't mean to say i selected the types and rather that
> > > > the types that were selected are not quite correct for their usage.
> > >
> > > Yes; the previous types were also signed, and you didn't change that.
> > >
> > > > again
> > > > on the review that actually adopts std atomics is a better place
> > > > to make any potential type changes since we are "breaking" the API
> > > > for 23.11 anyway. further, the std atomics series technically
> > > > changes all the types so it's probably better to make one type
> > > > change then rather than one now and one later.
> > > >
> > > > i think it would be best to get these validated and merged asap so
> > > > we can get to the std atomics review. when that series is up let's
> > > > discuss further how i can mark areas of concern, with that series
> > > > i expect there will have to be some changes in order to avoid minor
> regressions.
> > > >
> > > > thanks!
> > >
> > > I thought it would be better to catch these details (i.e. memory
> > > ordering
> > and signedness) early on, but I now understand that you planned to do
> > it in a later step. So I'll let you proceed as you have planned.
> > >
> > > Thanks for all your work on this, Tyler. It is much appreciated!
> >
> > again, sorry for the confusion the sooner i can get some of these
> > merged the easier it will be for me to manage the final series. i hope
> > david/thomas can merge the simple normalization patches as soon as
> > 23.03 cycle is complete.
> 
> Yes. An early merge would also provide more time for reviewing and optimizing
> the memory order of the most important atomic operations.
  
Morten Brørup March 22, 2023, 5:07 p.m. UTC | #9
> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Wednesday, 22 March 2023 17.40
> 
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, March 22, 2023 11:14 AM
> >
> > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > Sent: Wednesday, 22 March 2023 16.30
> > >
> > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > Sent: Wednesday, 22 March 2023 15.22
> > > > >
> > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > > >
> > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen Hemminger
> > wrote:
> > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > > <roretzla@linux.microsoft.com> wrote:
> > > > > > > >
> > > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > > instead use
> > > GCC
> > > > > > > > > supplied C++11 memory model builtins.
> > > > > > > > >
> > > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > > built on
> > > > > Windows.
> > > > > > > > >
> > > > > > > > > The code has be converted to use the __atomic builtins
> but
> > > > > > > > > there
> > > are
> > > > > > > > > additional during conversion i notice that there may be
> > > > > > > > > some
> > > issues
> > > > > > > > > that need to be addressed.
> > > > > > > >
> > > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > > Especially for the places where it is used a loop, might
> be
> > > > > > > > more efficient with some of the other memory models.
> > > > > > >
> > > > > > > i agree.
> > > > > > >
> > > > > > > however, i'm not trying to improve the code with this
> change,
> > > > > > > just decouple it from rte_atomics.h so trying my best to
> avoid
> > > > > > > any unnecessary semantic change.
> > > > > > >
> > > > > > > certainly if the maintainers of this code wish to weaken the
> > > > > > > ordering where appropriate after the change is merged they
> can
> > > > > > > do so and
> > > handily
> > > > > > > this change has enabled them to do so easily allowing them
> to
> > > > > > > test
> > > just
> > > > > > > their change in isolation.
> > > > > >
> > > > > > I agree with the two-step approach, where this first step is a
> > > > > > simple
> > > > > search-and-replacement; but I insist that you add a FIXME or
> > > > > similar note where you have blindly used SEQ_CST, indicating
> that
> > > > > the memory order
> > > needs to
> > > > > be reviewed and potentially corrected.
> > > > >
> > > > > i think the maintainers need to take some responsibility, if
> they
> > > > > see optimizations they missed when previously writing the code
> > > > > they need to follow up with a patch themselves. i can't do
> > > > > everything for them and marking things i'm not sure about will
> > > > > only lead to me having to churn patch series to remove the
> unwanted
> > comments later.
> > > >
> > > > The previous atomic functions didn't have the "memory order"
> > > > parameter, so
> > > the maintainers didn't have to think about it - and thus they didn't
> > > miss any optimizations when accepting the code.
> > > >
> > > > I also agree 100 % that it is not your responsibility to consider
> or
> > > determine which memory order is appropriate!
> > > >
> > > > But I think you should mark the locations where you are changing
> > > > from the
> > > old rte_atomic functions (where no memory order optimization was
> > > available) to the new functions - to highlight where the option of
> > > memory ordering has been introduced and knowingly ignored (by you).
> > > >
> > >
> > > first, i have to apologize i confused myself about which of the many
> > > patch series i have up right now that you were commenting on.
> >
> > No worries... you are rushing through quite an effort for this, so a
> little
> > confusion is perfectly understandable. Especially when I'm replying to
> an ageing
> > email. :-)
> >
> > >
> > > let me ask for clarification in relation to this series.
> > >
> > > isn't that every single usage of the rte_atomic APIs?
> >
> > Probably, yes.
> >
> > > i mean are you
> > > literally asking for the entire patch series to look like the
> > > following patch snippet with the expectation that maintainers will
> > > come along and clean up/review after this series is merged?
> > >
> > > -rte_atomic_add32(&o, v);
> > > +//FIXME: opportunity for relaxing ordering constraint, please
> review
> > > +__atomic_fetch_add(&o, v, order);
> >
> > Exactly. And something similar for the rte_atomicXX_t variables
> changed to
> > intXX_t, such as the packet counters.
> >
> > Realistically, I don't expect the maintainers to clean them up anytime
> soon. The
> > purpose is to make the FIXMEs stick until someone eventually cleans
> them up, so
> > they are not forgotten as time passes.
> Cleaning up the rte_atomic APIs is a different effort. There is already
> lot of effort that has gone into this and there is more effort happening
> (rte_ring being a painful one)
> 
> Instead of having FIXME, why not just send a separate patch with SEQ_CST
> (still a search and replace)? We can leave the tougher ones like
> rte_ring as they are being worked on.

The FIXME makes it possible in the future to differentiate between the instances that still need review and the instances that have been reviewed where SEQ_CST was the correct choice. (Similarly for the choice of type for variables previously rte_atomicNN_t.)

> 
> >
> > >
> > > this would just be a mechanical addition to this series so i can
> > > certainly accomodate that, i thought something more complicated was
> > > being asked for. if this is all, then sure no problem.
> >
> > Great.
> >
> > >
> > > > > keep in mind i have to touch each of these again when converting
> > > > > to standard so that's a better time to review ~everything in
> more
> > > > > detail because when converting to standard that's when suddenly
> > > > > you get a bunch of code generation that is "fallback" to seq_cst
> that isn't
> > happening now.
> > > >
> > > > I think you should to do it when replacing the rte_atomic
> functions
> > > > with the
> > > __atomic functions. It will make it easier to see where the memory
> > > order was knowingly ignored, and should be reviewed for
> optimization.
> > > >
> > > > >
> > > > > the series that converts to standard needs to be up for review
> as
> > > > > soon as possible to maximize available time for feedback before
> > > > > 23.11 so it would be better to get the simpler cut & paste
> > > > > normalizing the code out of the way to unblock that series
> submission.
> > > > >
> > > > > >
> > > > > > Also, in a couple of the drivers, you are using int64_t for
> > > > > > packet
> > > counters.
> > > > > These cannot be negative and should be uint64_t. And AFAIK, such
> > > > > counters
> > > can
> > > > > use RELAXED memory order.
> > > > >
> > > > > i know you don't mean to say i selected the types and rather
> that
> > > > > the types that were selected are not quite correct for their
> usage.
> > > >
> > > > Yes; the previous types were also signed, and you didn't change
> that.
> > > >
> > > > > again
> > > > > on the review that actually adopts std atomics is a better place
> > > > > to make any potential type changes since we are "breaking" the
> API
> > > > > for 23.11 anyway. further, the std atomics series technically
> > > > > changes all the types so it's probably better to make one type
> > > > > change then rather than one now and one later.
> > > > >
> > > > > i think it would be best to get these validated and merged asap
> so
> > > > > we can get to the std atomics review. when that series is up
> let's
> > > > > discuss further how i can mark areas of concern, with that
> series
> > > > > i expect there will have to be some changes in order to avoid
> minor
> > regressions.
> > > > >
> > > > > thanks!
> > > >
> > > > I thought it would be better to catch these details (i.e. memory
> > > > ordering
> > > and signedness) early on, but I now understand that you planned to
> do
> > > it in a later step. So I'll let you proceed as you have planned.
> > > >
> > > > Thanks for all your work on this, Tyler. It is much appreciated!
> > >
> > > again, sorry for the confusion the sooner i can get some of these
> > > merged the easier it will be for me to manage the final series. i
> hope
> > > david/thomas can merge the simple normalization patches as soon as
> > > 23.03 cycle is complete.
> >
> > Yes. An early merge would also provide more time for reviewing and
> optimizing
> > the memory order of the most important atomic operations.
>
  
Honnappa Nagarahalli March 22, 2023, 5:38 p.m. UTC | #10
> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Wednesday, March 22, 2023 12:08 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Tyler Retzlaff
> <roretzla@linux.microsoft.com>
> Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; thomas@monjalon.net; nd
> <nd@arm.com>; nd <nd@arm.com>
> Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> 
> > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > Sent: Wednesday, 22 March 2023 17.40
> >
> > > From: Morten Brørup <mb@smartsharesystems.com>
> > > Sent: Wednesday, March 22, 2023 11:14 AM
> > >
> > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > Sent: Wednesday, 22 March 2023 16.30
> > > >
> > > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > Sent: Wednesday, 22 March 2023 15.22
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > > > >
> > > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen
> > > > > > > > Hemminger
> > > wrote:
> > > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > > > <roretzla@linux.microsoft.com> wrote:
> > > > > > > > >
> > > > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > > > instead use
> > > > GCC
> > > > > > > > > > supplied C++11 memory model builtins.
> > > > > > > > > >
> > > > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > > > built on
> > > > > > Windows.
> > > > > > > > > >
> > > > > > > > > > The code has be converted to use the __atomic builtins
> > but
> > > > > > > > > > there
> > > > are
> > > > > > > > > > additional during conversion i notice that there may
> > > > > > > > > > be some
> > > > issues
> > > > > > > > > > that need to be addressed.
> > > > > > > > >
> > > > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > > > Especially for the places where it is used a loop, might
> > be
> > > > > > > > > more efficient with some of the other memory models.
> > > > > > > >
> > > > > > > > i agree.
> > > > > > > >
> > > > > > > > however, i'm not trying to improve the code with this
> > change,
> > > > > > > > just decouple it from rte_atomics.h so trying my best to
> > avoid
> > > > > > > > any unnecessary semantic change.
> > > > > > > >
> > > > > > > > certainly if the maintainers of this code wish to weaken
> > > > > > > > the ordering where appropriate after the change is merged
> > > > > > > > they
> > can
> > > > > > > > do so and
> > > > handily
> > > > > > > > this change has enabled them to do so easily allowing them
> > to
> > > > > > > > test
> > > > just
> > > > > > > > their change in isolation.
> > > > > > >
> > > > > > > I agree with the two-step approach, where this first step is
> > > > > > > a simple
> > > > > > search-and-replacement; but I insist that you add a FIXME or
> > > > > > similar note where you have blindly used SEQ_CST, indicating
> > that
> > > > > > the memory order
> > > > needs to
> > > > > > be reviewed and potentially corrected.
> > > > > >
> > > > > > i think the maintainers need to take some responsibility, if
> > they
> > > > > > see optimizations they missed when previously writing the code
> > > > > > they need to follow up with a patch themselves. i can't do
> > > > > > everything for them and marking things i'm not sure about will
> > > > > > only lead to me having to churn patch series to remove the
> > unwanted
> > > comments later.
> > > > >
> > > > > The previous atomic functions didn't have the "memory order"
> > > > > parameter, so
> > > > the maintainers didn't have to think about it - and thus they
> > > > didn't miss any optimizations when accepting the code.
> > > > >
> > > > > I also agree 100 % that it is not your responsibility to
> > > > > consider
> > or
> > > > determine which memory order is appropriate!
> > > > >
> > > > > But I think you should mark the locations where you are changing
> > > > > from the
> > > > old rte_atomic functions (where no memory order optimization was
> > > > available) to the new functions - to highlight where the option of
> > > > memory ordering has been introduced and knowingly ignored (by you).
> > > > >
> > > >
> > > > first, i have to apologize i confused myself about which of the
> > > > many patch series i have up right now that you were commenting on.
> > >
> > > No worries... you are rushing through quite an effort for this, so a
> > little
> > > confusion is perfectly understandable. Especially when I'm replying
> > > to
> > an ageing
> > > email. :-)
> > >
> > > >
> > > > let me ask for clarification in relation to this series.
> > > >
> > > > isn't that every single usage of the rte_atomic APIs?
> > >
> > > Probably, yes.
> > >
> > > > i mean are you
> > > > literally asking for the entire patch series to look like the
> > > > following patch snippet with the expectation that maintainers will
> > > > come along and clean up/review after this series is merged?
> > > >
> > > > -rte_atomic_add32(&o, v);
> > > > +//FIXME: opportunity for relaxing ordering constraint, please
> > review
> > > > +__atomic_fetch_add(&o, v, order);
> > >
> > > Exactly. And something similar for the rte_atomicXX_t variables
> > changed to
> > > intXX_t, such as the packet counters.
> > >
> > > Realistically, I don't expect the maintainers to clean them up
> > > anytime
> > soon. The
> > > purpose is to make the FIXMEs stick until someone eventually cleans
> > them up, so
> > > they are not forgotten as time passes.
> > Cleaning up the rte_atomic APIs is a different effort. There is
> > already lot of effort that has gone into this and there is more effort
> > happening (rte_ring being a painful one)
> >
> > Instead of having FIXME, why not just send a separate patch with
> > SEQ_CST (still a search and replace)? We can leave the tougher ones
> > like rte_ring as they are being worked on.
> 
> The FIXME makes it possible in the future to differentiate between the instances
> that still need review and the instances that have been reviewed where
> SEQ_CST was the correct choice. (Similarly for the choice of type for variables
> previously rte_atomicNN_t.)
Apologies, relooked at the heading of this patch, got confused with other patches.

The changes Arm had done for rte_atomic_ to __atomic_xxx were not direct replacements. The algorithms were studied, relaxed where required, race conditions fixed, performance benchmarked. IMO, we need to go through the same steps here.

I looked at the series, we should just review the patch and make suggested changes. Are we constrained by any deadlines for this work?

I would suggest to drop 1/7. Arm is working on removing the non-C11 algorithm for rte_ring (not sure if we will be successful). I think it is better to explore this approach rather than the changes in patch 1/7.

> 
> >
> > >
> > > >
> > > > this would just be a mechanical addition to this series so i can
> > > > certainly accomodate that, i thought something more complicated
> > > > was being asked for. if this is all, then sure no problem.
> > >
> > > Great.
> > >
> > > >
> > > > > > keep in mind i have to touch each of these again when
> > > > > > converting to standard so that's a better time to review
> > > > > > ~everything in
> > more
> > > > > > detail because when converting to standard that's when
> > > > > > suddenly you get a bunch of code generation that is "fallback"
> > > > > > to seq_cst
> > that isn't
> > > happening now.
> > > > >
> > > > > I think you should to do it when replacing the rte_atomic
> > functions
> > > > > with the
> > > > __atomic functions. It will make it easier to see where the memory
> > > > order was knowingly ignored, and should be reviewed for
> > optimization.
> > > > >
> > > > > >
> > > > > > the series that converts to standard needs to be up for review
> > as
> > > > > > soon as possible to maximize available time for feedback
> > > > > > before
> > > > > > 23.11 so it would be better to get the simpler cut & paste
> > > > > > normalizing the code out of the way to unblock that series
> > submission.
> > > > > >
> > > > > > >
> > > > > > > Also, in a couple of the drivers, you are using int64_t for
> > > > > > > packet
> > > > counters.
> > > > > > These cannot be negative and should be uint64_t. And AFAIK,
> > > > > > such counters
> > > > can
> > > > > > use RELAXED memory order.
> > > > > >
> > > > > > i know you don't mean to say i selected the types and rather
> > that
> > > > > > the types that were selected are not quite correct for their
> > usage.
> > > > >
> > > > > Yes; the previous types were also signed, and you didn't change
> > that.
> > > > >
> > > > > > again
> > > > > > on the review that actually adopts std atomics is a better
> > > > > > place to make any potential type changes since we are
> > > > > > "breaking" the
> > API
> > > > > > for 23.11 anyway. further, the std atomics series technically
> > > > > > changes all the types so it's probably better to make one type
> > > > > > change then rather than one now and one later.
> > > > > >
> > > > > > i think it would be best to get these validated and merged
> > > > > > asap
> > so
> > > > > > we can get to the std atomics review. when that series is up
> > let's
> > > > > > discuss further how i can mark areas of concern, with that
> > series
> > > > > > i expect there will have to be some changes in order to avoid
> > minor
> > > regressions.
> > > > > >
> > > > > > thanks!
> > > > >
> > > > > I thought it would be better to catch these details (i.e. memory
> > > > > ordering
> > > > and signedness) early on, but I now understand that you planned to
> > do
> > > > it in a later step. So I'll let you proceed as you have planned.
> > > > >
> > > > > Thanks for all your work on this, Tyler. It is much appreciated!
> > > >
> > > > again, sorry for the confusion the sooner i can get some of these
> > > > merged the easier it will be for me to manage the final series. i
> > hope
> > > > david/thomas can merge the simple normalization patches as soon as
> > > > 23.03 cycle is complete.
> > >
> > > Yes. An early merge would also provide more time for reviewing and
> > optimizing
> > > the memory order of the most important atomic operations.
> >
  
Tyler Retzlaff March 22, 2023, 6:06 p.m. UTC | #11
On Wed, Mar 22, 2023 at 05:38:12PM +0000, Honnappa Nagarahalli wrote:
> 
> 
> > -----Original Message-----
> > From: Morten Brørup <mb@smartsharesystems.com>
> > Sent: Wednesday, March 22, 2023 12:08 PM
> > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Tyler Retzlaff
> > <roretzla@linux.microsoft.com>
> > Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> > Ruifeng Wang <Ruifeng.Wang@arm.com>; thomas@monjalon.net; nd
> > <nd@arm.com>; nd <nd@arm.com>
> > Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> > 
> > > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > > Sent: Wednesday, 22 March 2023 17.40
> > >
> > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > Sent: Wednesday, March 22, 2023 11:14 AM
> > > >
> > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > Sent: Wednesday, 22 March 2023 16.30
> > > > >
> > > > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > Sent: Wednesday, 22 March 2023 15.22
> > > > > > >
> > > > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > > > > >
> > > > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen
> > > > > > > > > Hemminger
> > > > wrote:
> > > > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > > > > <roretzla@linux.microsoft.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > > > > instead use
> > > > > GCC
> > > > > > > > > > > supplied C++11 memory model builtins.
> > > > > > > > > > >
> > > > > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > > > > built on
> > > > > > > Windows.
> > > > > > > > > > >
> > > > > > > > > > > The code has be converted to use the __atomic builtins
> > > but
> > > > > > > > > > > there
> > > > > are
> > > > > > > > > > > additional during conversion i notice that there may
> > > > > > > > > > > be some
> > > > > issues
> > > > > > > > > > > that need to be addressed.
> > > > > > > > > >
> > > > > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > > > > Especially for the places where it is used a loop, might
> > > be
> > > > > > > > > > more efficient with some of the other memory models.
> > > > > > > > >
> > > > > > > > > i agree.
> > > > > > > > >
> > > > > > > > > however, i'm not trying to improve the code with this
> > > change,
> > > > > > > > > just decouple it from rte_atomics.h so trying my best to
> > > avoid
> > > > > > > > > any unnecessary semantic change.
> > > > > > > > >
> > > > > > > > > certainly if the maintainers of this code wish to weaken
> > > > > > > > > the ordering where appropriate after the change is merged
> > > > > > > > > they
> > > can
> > > > > > > > > do so and
> > > > > handily
> > > > > > > > > this change has enabled them to do so easily allowing them
> > > to
> > > > > > > > > test
> > > > > just
> > > > > > > > > their change in isolation.
> > > > > > > >
> > > > > > > > I agree with the two-step approach, where this first step is
> > > > > > > > a simple
> > > > > > > search-and-replacement; but I insist that you add a FIXME or
> > > > > > > similar note where you have blindly used SEQ_CST, indicating
> > > that
> > > > > > > the memory order
> > > > > needs to
> > > > > > > be reviewed and potentially corrected.
> > > > > > >
> > > > > > > i think the maintainers need to take some responsibility, if
> > > they
> > > > > > > see optimizations they missed when previously writing the code
> > > > > > > they need to follow up with a patch themselves. i can't do
> > > > > > > everything for them and marking things i'm not sure about will
> > > > > > > only lead to me having to churn patch series to remove the
> > > unwanted
> > > > comments later.
> > > > > >
> > > > > > The previous atomic functions didn't have the "memory order"
> > > > > > parameter, so
> > > > > the maintainers didn't have to think about it - and thus they
> > > > > didn't miss any optimizations when accepting the code.
> > > > > >
> > > > > > I also agree 100 % that it is not your responsibility to
> > > > > > consider
> > > or
> > > > > determine which memory order is appropriate!
> > > > > >
> > > > > > But I think you should mark the locations where you are changing
> > > > > > from the
> > > > > old rte_atomic functions (where no memory order optimization was
> > > > > available) to the new functions - to highlight where the option of
> > > > > memory ordering has been introduced and knowingly ignored (by you).
> > > > > >
> > > > >
> > > > > first, i have to apologize i confused myself about which of the
> > > > > many patch series i have up right now that you were commenting on.
> > > >
> > > > No worries... you are rushing through quite an effort for this, so a
> > > little
> > > > confusion is perfectly understandable. Especially when I'm replying
> > > > to
> > > an ageing
> > > > email. :-)
> > > >
> > > > >
> > > > > let me ask for clarification in relation to this series.
> > > > >
> > > > > isn't that every single usage of the rte_atomic APIs?
> > > >
> > > > Probably, yes.
> > > >
> > > > > i mean are you
> > > > > literally asking for the entire patch series to look like the
> > > > > following patch snippet with the expectation that maintainers will
> > > > > come along and clean up/review after this series is merged?
> > > > >
> > > > > -rte_atomic_add32(&o, v);
> > > > > +//FIXME: opportunity for relaxing ordering constraint, please
> > > review
> > > > > +__atomic_fetch_add(&o, v, order);
> > > >
> > > > Exactly. And something similar for the rte_atomicXX_t variables
> > > changed to
> > > > intXX_t, such as the packet counters.
> > > >
> > > > Realistically, I don't expect the maintainers to clean them up
> > > > anytime
> > > soon. The
> > > > purpose is to make the FIXMEs stick until someone eventually cleans
> > > them up, so
> > > > they are not forgotten as time passes.
> > > Cleaning up the rte_atomic APIs is a different effort. There is
> > > already lot of effort that has gone into this and there is more effort
> > > happening (rte_ring being a painful one)
> > >
> > > Instead of having FIXME, why not just send a separate patch with
> > > SEQ_CST (still a search and replace)? We can leave the tougher ones
> > > like rte_ring as they are being worked on.
> > 
> > The FIXME makes it possible in the future to differentiate between the instances
> > that still need review and the instances that have been reviewed where
> > SEQ_CST was the correct choice. (Similarly for the choice of type for variables
> > previously rte_atomicNN_t.)
> Apologies, relooked at the heading of this patch, got confused with other patches.

yeah, i did the same thing this morning :)

> 
> The changes Arm had done for rte_atomic_ to __atomic_xxx were not direct replacements. The algorithms were studied, relaxed where required, race conditions fixed, performance benchmarked. IMO, we need to go through the same steps here.
> 
> I looked at the series, we should just review the patch and make suggested changes. Are we constrained by any deadlines for this work?

i'm going to say yes but i'll qualify. the use of the rte_atomic_xxx
APIs drags in extra work when creating a series that performs the actual
conversions to the standard atomics.

if i don't decouple ring from rte_atomic_xxx that means i have to go
convert all the rte_atomic.h to standard atomics and working around some
of the implementation detail to do it is very time consuming. which
then has further flow on effects because then i have to go fix every
single driver that is still using rte_atomic.h.

incidentally i have a work in progress to decouple everything from
rte_atomic.h (including all drivers) but it would really negatively
impact getting standard atomics introduced if we had to serialize the
introduction behind a total removal of rte_atomic or had to make
changes to every consumer of the old rte_atomic APIs.

if we can get by with a comment on the rte_atomic_xxx lines in this
series it would be helpful. when we bring the next series for standard
atomics i'm not adverse to introducing changes to the ordering in that series
if requested so long as i can get the series up 'soon' so there is lots
of review time runway for 23.11.

> 
> I would suggest to drop 1/7. Arm is working on removing the non-C11 algorithm for rte_ring (not sure if we will be successful). I think it is better to explore this approach rather than the changes in patch 1/7.

i think my answer here is timing. i'd rather take the work from arm but
if it isn't coming for a while then it becomes a blocker.

we're waiting for the 23.07 start before this series can be merged. how
about we re-evaluate where arm is at when the merge window opens. we can
then decide to drop 1/7 or not at that time?

ty
  
Tyler Retzlaff May 2, 2023, 3:37 a.m. UTC | #12
On Wed, Mar 22, 2023 at 11:06:08AM -0700, Tyler Retzlaff wrote:
> On Wed, Mar 22, 2023 at 05:38:12PM +0000, Honnappa Nagarahalli wrote:
> > 
> > 
> > > -----Original Message-----
> > > From: Morten Brørup <mb@smartsharesystems.com>
> > > Sent: Wednesday, March 22, 2023 12:08 PM
> > > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Tyler Retzlaff
> > > <roretzla@linux.microsoft.com>
> > > Cc: Stephen Hemminger <stephen@networkplumber.org>; dev@dpdk.org;
> > > Ruifeng Wang <Ruifeng.Wang@arm.com>; thomas@monjalon.net; nd
> > > <nd@arm.com>; nd <nd@arm.com>
> > > Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> > > 
> > > > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > > > Sent: Wednesday, 22 March 2023 17.40
> > > >
> > > > > From: Morten Brørup <mb@smartsharesystems.com>
> > > > > Sent: Wednesday, March 22, 2023 11:14 AM
> > > > >
> > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > Sent: Wednesday, 22 March 2023 16.30
> > > > > >
> > > > > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Brørup wrote:
> > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > > Sent: Wednesday, 22 March 2023 15.22
> > > > > > > >
> > > > > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Brørup wrote:
> > > > > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen
> > > > > > > > > > Hemminger
> > > > > wrote:
> > > > > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > > > > > <roretzla@linux.microsoft.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Replace the use of rte_atomic.h types and functions,
> > > > > > > > > > > > instead use
> > > > > > GCC
> > > > > > > > > > > > supplied C++11 memory model builtins.
> > > > > > > > > > > >
> > > > > > > > > > > > This series covers the libraries and drivers that are
> > > > > > > > > > > > built on
> > > > > > > > Windows.
> > > > > > > > > > > >
> > > > > > > > > > > > The code has be converted to use the __atomic builtins
> > > > but
> > > > > > > > > > > > there
> > > > > > are
> > > > > > > > > > > > additional during conversion i notice that there may
> > > > > > > > > > > > be some
> > > > > > issues
> > > > > > > > > > > > that need to be addressed.
> > > > > > > > > > >
> > > > > > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > > > > > Especially for the places where it is used a loop, might
> > > > be
> > > > > > > > > > > more efficient with some of the other memory models.
> > > > > > > > > >
> > > > > > > > > > i agree.
> > > > > > > > > >
> > > > > > > > > > however, i'm not trying to improve the code with this
> > > > change,
> > > > > > > > > > just decouple it from rte_atomics.h so trying my best to
> > > > avoid
> > > > > > > > > > any unnecessary semantic change.
> > > > > > > > > >
> > > > > > > > > > certainly if the maintainers of this code wish to weaken
> > > > > > > > > > the ordering where appropriate after the change is merged
> > > > > > > > > > they
> > > > can
> > > > > > > > > > do so and
> > > > > > handily
> > > > > > > > > > this change has enabled them to do so easily allowing them
> > > > to
> > > > > > > > > > test
> > > > > > just
> > > > > > > > > > their change in isolation.
> > > > > > > > >
> > > > > > > > > I agree with the two-step approach, where this first step is
> > > > > > > > > a simple
> > > > > > > > search-and-replacement; but I insist that you add a FIXME or
> > > > > > > > similar note where you have blindly used SEQ_CST, indicating
> > > > that
> > > > > > > > the memory order
> > > > > > needs to
> > > > > > > > be reviewed and potentially corrected.
> > > > > > > >
> > > > > > > > i think the maintainers need to take some responsibility, if
> > > > they
> > > > > > > > see optimizations they missed when previously writing the code
> > > > > > > > they need to follow up with a patch themselves. i can't do
> > > > > > > > everything for them and marking things i'm not sure about will
> > > > > > > > only lead to me having to churn patch series to remove the
> > > > unwanted
> > > > > comments later.
> > > > > > >
> > > > > > > The previous atomic functions didn't have the "memory order"
> > > > > > > parameter, so
> > > > > > the maintainers didn't have to think about it - and thus they
> > > > > > didn't miss any optimizations when accepting the code.
> > > > > > >
> > > > > > > I also agree 100 % that it is not your responsibility to
> > > > > > > consider
> > > > or
> > > > > > determine which memory order is appropriate!
> > > > > > >
> > > > > > > But I think you should mark the locations where you are changing
> > > > > > > from the
> > > > > > old rte_atomic functions (where no memory order optimization was
> > > > > > available) to the new functions - to highlight where the option of
> > > > > > memory ordering has been introduced and knowingly ignored (by you).
> > > > > > >
> > > > > >
> > > > > > first, i have to apologize i confused myself about which of the
> > > > > > many patch series i have up right now that you were commenting on.
> > > > >
> > > > > No worries... you are rushing through quite an effort for this, so a
> > > > little
> > > > > confusion is perfectly understandable. Especially when I'm replying
> > > > > to
> > > > an ageing
> > > > > email. :-)
> > > > >
> > > > > >
> > > > > > let me ask for clarification in relation to this series.
> > > > > >
> > > > > > isn't that every single usage of the rte_atomic APIs?
> > > > >
> > > > > Probably, yes.
> > > > >
> > > > > > i mean are you
> > > > > > literally asking for the entire patch series to look like the
> > > > > > following patch snippet with the expectation that maintainers will
> > > > > > come along and clean up/review after this series is merged?
> > > > > >
> > > > > > -rte_atomic_add32(&o, v);
> > > > > > +//FIXME: opportunity for relaxing ordering constraint, please
> > > > review
> > > > > > +__atomic_fetch_add(&o, v, order);
> > > > >
> > > > > Exactly. And something similar for the rte_atomicXX_t variables
> > > > changed to
> > > > > intXX_t, such as the packet counters.
> > > > >
> > > > > Realistically, I don't expect the maintainers to clean them up
> > > > > anytime
> > > > soon. The
> > > > > purpose is to make the FIXMEs stick until someone eventually cleans
> > > > them up, so
> > > > > they are not forgotten as time passes.
> > > > Cleaning up the rte_atomic APIs is a different effort. There is
> > > > already lot of effort that has gone into this and there is more effort
> > > > happening (rte_ring being a painful one)
> > > >
> > > > Instead of having FIXME, why not just send a separate patch with
> > > > SEQ_CST (still a search and replace)? We can leave the tougher ones
> > > > like rte_ring as they are being worked on.
> > > 
> > > The FIXME makes it possible in the future to differentiate between the instances
> > > that still need review and the instances that have been reviewed where
> > > SEQ_CST was the correct choice. (Similarly for the choice of type for variables
> > > previously rte_atomicNN_t.)
> > Apologies, relooked at the heading of this patch, got confused with other patches.
> 
> yeah, i did the same thing this morning :)
> 
> > 
> > The changes Arm had done for rte_atomic_ to __atomic_xxx were not direct replacements. The algorithms were studied, relaxed where required, race conditions fixed, performance benchmarked. IMO, we need to go through the same steps here.
> > 
> > I looked at the series, we should just review the patch and make suggested changes. Are we constrained by any deadlines for this work?
> 
> i'm going to say yes but i'll qualify. the use of the rte_atomic_xxx
> APIs drags in extra work when creating a series that performs the actual
> conversions to the standard atomics.
> 
> if i don't decouple ring from rte_atomic_xxx that means i have to go
> convert all the rte_atomic.h to standard atomics and working around some
> of the implementation detail to do it is very time consuming. which
> then has further flow on effects because then i have to go fix every
> single driver that is still using rte_atomic.h.
> 
> incidentally i have a work in progress to decouple everything from
> rte_atomic.h (including all drivers) but it would really negatively
> impact getting standard atomics introduced if we had to serialize the
> introduction behind a total removal of rte_atomic or had to make
> changes to every consumer of the old rte_atomic APIs.
> 
> if we can get by with a comment on the rte_atomic_xxx lines in this
> series it would be helpful. when we bring the next series for standard
> atomics i'm not adverse to introducing changes to the ordering in that series
> if requested so long as i can get the series up 'soon' so there is lots
> of review time runway for 23.11.
> 
> > 
> > I would suggest to drop 1/7. Arm is working on removing the non-C11 algorithm for rte_ring (not sure if we will be successful). I think it is better to explore this approach rather than the changes in patch 1/7.
> 
> i think my answer here is timing. i'd rather take the work from arm but
> if it isn't coming for a while then it becomes a blocker.
> 
> we're waiting for the 23.07 start before this series can be merged. how
> about we re-evaluate where arm is at when the merge window opens. we can
> then decide to drop 1/7 or not at that time?

ping?

any update if there is going to be a series from arm as an acceptable
replacement for patch 1/7? otherwise i think we should take the patch as
is. it isn't altering the semantics of the code and is fairly low line
count change so shouldn't distrupt any out of tree work as a result of
the churn.

please update asap, this is one of the two series that is preventing
submission of the first series converting to standard atomics for
review.

thanks!

> 
> ty
  
Honnappa Nagarahalli May 2, 2023, 4:31 a.m. UTC | #13
> -----Original Message-----
> From: Tyler Retzlaff <roretzla@linux.microsoft.com>
> Sent: Monday, May 1, 2023 10:38 PM
> To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
> Cc: Morten Brørup <mb@smartsharesystems.com>; Stephen Hemminger
> <stephen@networkplumber.org>; dev@dpdk.org; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; thomas@monjalon.net; nd <nd@arm.com>
> Subject: Re: [PATCH 0/7] replace rte atomics with GCC builtin atomics
> 
> 
> On Wed, Mar 22, 2023 at 11:06:08AM -0700, Tyler Retzlaff wrote:
> > On Wed, Mar 22, 2023 at 05:38:12PM +0000, Honnappa Nagarahalli wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: Morten Br�rup <mb@smartsharesystems.com>
> > > > Sent: Wednesday, March 22, 2023 12:08 PM
> > > > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Tyler
> > > > Retzlaff <roretzla@linux.microsoft.com>
> > > > Cc: Stephen Hemminger <stephen@networkplumber.org>;
> dev@dpdk.org;
> > > > Ruifeng Wang <Ruifeng.Wang@arm.com>; thomas@monjalon.net; nd
> > > > <nd@arm.com>; nd <nd@arm.com>
> > > > Subject: RE: [PATCH 0/7] replace rte atomics with GCC builtin
> > > > atomics
> > > >
> > > > > From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> > > > > Sent: Wednesday, 22 March 2023 17.40
> > > > >
> > > > > > From: Morten Br�rup <mb@smartsharesystems.com>
> > > > > > Sent: Wednesday, March 22, 2023 11:14 AM
> > > > > >
> > > > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > > > Sent: Wednesday, 22 March 2023 16.30
> > > > > > >
> > > > > > > On Wed, Mar 22, 2023 at 03:58:07PM +0100, Morten Br�rup
> wrote:
> > > > > > > > > From: Tyler Retzlaff
> > > > > > > > > [mailto:roretzla@linux.microsoft.com]
> > > > > > > > > Sent: Wednesday, 22 March 2023 15.22
> > > > > > > > >
> > > > > > > > > On Wed, Mar 22, 2023 at 12:28:44PM +0100, Morten Br�rup
> wrote:
> > > > > > > > > > > From: Tyler Retzlaff
> > > > > > > > > > > [mailto:roretzla@linux.microsoft.com]
> > > > > > > > > > > Sent: Friday, 17 March 2023 22.49
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 17, 2023 at 02:42:26PM -0700, Stephen
> > > > > > > > > > > Hemminger
> > > > > > wrote:
> > > > > > > > > > > > On Fri, 17 Mar 2023 13:19:41 -0700 Tyler Retzlaff
> > > > > > > > > > > > <roretzla@linux.microsoft.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Replace the use of rte_atomic.h types and
> > > > > > > > > > > > > functions, instead use
> > > > > > > GCC
> > > > > > > > > > > > > supplied C++11 memory model builtins.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This series covers the libraries and drivers
> > > > > > > > > > > > > that are built on
> > > > > > > > > Windows.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The code has be converted to use the __atomic
> > > > > > > > > > > > > builtins
> > > > > but
> > > > > > > > > > > > > there
> > > > > > > are
> > > > > > > > > > > > > additional during conversion i notice that there
> > > > > > > > > > > > > may be some
> > > > > > > issues
> > > > > > > > > > > > > that need to be addressed.
> > > > > > > > > > > >
> > > > > > > > > > > > I don't think all these cmpset need to use SEQ_CST.
> > > > > > > > > > > > Especially for the places where it is used a loop,
> > > > > > > > > > > > might
> > > > > be
> > > > > > > > > > > > more efficient with some of the other memory models.
> > > > > > > > > > >
> > > > > > > > > > > i agree.
> > > > > > > > > > >
> > > > > > > > > > > however, i'm not trying to improve the code with
> > > > > > > > > > > this
> > > > > change,
> > > > > > > > > > > just decouple it from rte_atomics.h so trying my
> > > > > > > > > > > best to
> > > > > avoid
> > > > > > > > > > > any unnecessary semantic change.
> > > > > > > > > > >
> > > > > > > > > > > certainly if the maintainers of this code wish to
> > > > > > > > > > > weaken the ordering where appropriate after the
> > > > > > > > > > > change is merged they
> > > > > can
> > > > > > > > > > > do so and
> > > > > > > handily
> > > > > > > > > > > this change has enabled them to do so easily
> > > > > > > > > > > allowing them
> > > > > to
> > > > > > > > > > > test
> > > > > > > just
> > > > > > > > > > > their change in isolation.
> > > > > > > > > >
> > > > > > > > > > I agree with the two-step approach, where this first
> > > > > > > > > > step is a simple
> > > > > > > > > search-and-replacement; but I insist that you add a
> > > > > > > > > FIXME or similar note where you have blindly used
> > > > > > > > > SEQ_CST, indicating
> > > > > that
> > > > > > > > > the memory order
> > > > > > > needs to
> > > > > > > > > be reviewed and potentially corrected.
> > > > > > > > >
> > > > > > > > > i think the maintainers need to take some
> > > > > > > > > responsibility, if
> > > > > they
> > > > > > > > > see optimizations they missed when previously writing
> > > > > > > > > the code they need to follow up with a patch themselves.
> > > > > > > > > i can't do everything for them and marking things i'm
> > > > > > > > > not sure about will only lead to me having to churn
> > > > > > > > > patch series to remove the
> > > > > unwanted
> > > > > > comments later.
> > > > > > > >
> > > > > > > > The previous atomic functions didn't have the "memory order"
> > > > > > > > parameter, so
> > > > > > > the maintainers didn't have to think about it - and thus
> > > > > > > they didn't miss any optimizations when accepting the code.
> > > > > > > >
> > > > > > > > I also agree 100 % that it is not your responsibility to
> > > > > > > > consider
> > > > > or
> > > > > > > determine which memory order is appropriate!
> > > > > > > >
> > > > > > > > But I think you should mark the locations where you are
> > > > > > > > changing from the
> > > > > > > old rte_atomic functions (where no memory order optimization
> > > > > > > was
> > > > > > > available) to the new functions - to highlight where the
> > > > > > > option of memory ordering has been introduced and knowingly
> ignored (by you).
> > > > > > > >
> > > > > > >
> > > > > > > first, i have to apologize i confused myself about which of
> > > > > > > the many patch series i have up right now that you were commenting
> on.
> > > > > >
> > > > > > No worries... you are rushing through quite an effort for
> > > > > > this, so a
> > > > > little
> > > > > > confusion is perfectly understandable. Especially when I'm
> > > > > > replying to
> > > > > an ageing
> > > > > > email. :-)
> > > > > >
> > > > > > >
> > > > > > > let me ask for clarification in relation to this series.
> > > > > > >
> > > > > > > isn't that every single usage of the rte_atomic APIs?
> > > > > >
> > > > > > Probably, yes.
> > > > > >
> > > > > > > i mean are you
> > > > > > > literally asking for the entire patch series to look like
> > > > > > > the following patch snippet with the expectation that
> > > > > > > maintainers will come along and clean up/review after this series is
> merged?
> > > > > > >
> > > > > > > -rte_atomic_add32(&o, v);
> > > > > > > +//FIXME: opportunity for relaxing ordering constraint,
> > > > > > > +please
> > > > > review
> > > > > > > +__atomic_fetch_add(&o, v, order);
> > > > > >
> > > > > > Exactly. And something similar for the rte_atomicXX_t
> > > > > > variables
> > > > > changed to
> > > > > > intXX_t, such as the packet counters.
> > > > > >
> > > > > > Realistically, I don't expect the maintainers to clean them up
> > > > > > anytime
> > > > > soon. The
> > > > > > purpose is to make the FIXMEs stick until someone eventually
> > > > > > cleans
> > > > > them up, so
> > > > > > they are not forgotten as time passes.
> > > > > Cleaning up the rte_atomic APIs is a different effort. There is
> > > > > already lot of effort that has gone into this and there is more
> > > > > effort happening (rte_ring being a painful one)
> > > > >
> > > > > Instead of having FIXME, why not just send a separate patch with
> > > > > SEQ_CST (still a search and replace)? We can leave the tougher
> > > > > ones like rte_ring as they are being worked on.
> > > >
> > > > The FIXME makes it possible in the future to differentiate between
> > > > the instances that still need review and the instances that have
> > > > been reviewed where SEQ_CST was the correct choice. (Similarly for
> > > > the choice of type for variables previously rte_atomicNN_t.)
> > > Apologies, relooked at the heading of this patch, got confused with other
> patches.
> >
> > yeah, i did the same thing this morning :)
> >
> > >
> > > The changes Arm had done for rte_atomic_ to __atomic_xxx were not direct
> replacements. The algorithms were studied, relaxed where required, race
> conditions fixed, performance benchmarked. IMO, we need to go through the
> same steps here.
> > >
> > > I looked at the series, we should just review the patch and make suggested
> changes. Are we constrained by any deadlines for this work?
> >
> > i'm going to say yes but i'll qualify. the use of the rte_atomic_xxx
> > APIs drags in extra work when creating a series that performs the
> > actual conversions to the standard atomics.
> >
> > if i don't decouple ring from rte_atomic_xxx that means i have to go
> > convert all the rte_atomic.h to standard atomics and working around
> > some of the implementation detail to do it is very time consuming.
> > which then has further flow on effects because then i have to go fix
> > every single driver that is still using rte_atomic.h.
> >
> > incidentally i have a work in progress to decouple everything from
> > rte_atomic.h (including all drivers) but it would really negatively
> > impact getting standard atomics introduced if we had to serialize the
> > introduction behind a total removal of rte_atomic or had to make
> > changes to every consumer of the old rte_atomic APIs.
> >
> > if we can get by with a comment on the rte_atomic_xxx lines in this
> > series it would be helpful. when we bring the next series for standard
> > atomics i'm not adverse to introducing changes to the ordering in that
> > series if requested so long as i can get the series up 'soon' so there
> > is lots of review time runway for 23.11.
> >
> > >
> > > I would suggest to drop 1/7. Arm is working on removing the non-C11
> algorithm for rte_ring (not sure if we will be successful). I think it is better to
> explore this approach rather than the changes in patch 1/7.
> >
> > i think my answer here is timing. i'd rather take the work from arm
> > but if it isn't coming for a while then it becomes a blocker.
> >
> > we're waiting for the 23.07 start before this series can be merged.
> > how about we re-evaluate where arm is at when the merge window opens.
> > we can then decide to drop 1/7 or not at that time?
> 
> ping?
> 
> any update if there is going to be a series from arm as an acceptable
> replacement for patch 1/7? otherwise i think we should take the patch as is. it
> isn't altering the semantics of the code and is fairly low line count change so
> shouldn't distrupt any out of tree work as a result of the churn.
Yes, we are working on a patch. There is a RFC [1], but we are still working on proving if the algorithm is correct. But, the plan is to find a solution (or present alternatives if there are no solutions) in 23.07 release.

[1] https://patchwork.dpdk.org/project/dpdk/patch/20230421191642.217011-1-wathsala.vithanage@arm.com/

> 
> please update asap, this is one of the two series that is preventing submission of
> the first series converting to standard atomics for review.
> 
> thanks!
> 
> >
> > ty