[v2,1/1] net: fix aliasing issue in checksum computation
Commit Message
This removes a superfluous cast and eliminates aliasing through a uint8_t
pointer. NB: The C standard specifies that an unsigned char pointer may
alias any object, whereas it includes no such requirement for uint8_t
pointers.
Also simplify the loop, since a modern C compiler can speed it up (i.e.
auto-vectorize it) in a similar way. For example, GCC auto-vectorizes it
for Haswell using AVX registers while halving the number of instructions
in the generated code.
Signed-off-by: Georg Sauthoff <mail@gms.tf>
---
lib/net/rte_ip.h | 27 ++++++++-------------------
1 file changed, 8 insertions(+), 19 deletions(-)
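The aliasing argument in the commit message can be illustrated with a small sketch. It uses only standard C; pad_odd_byte and pad_odd_byte_layout_ok are hypothetical names chosen for illustration and are not DPDK functions. The sketch shows why the odd trailing byte can be copied through an unsigned char pointer, and why the result does not depend on byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of DPDK: copy one trailing byte into the
 * first byte of a zeroed 16-bit word, mirroring the odd-length branch. */
static uint16_t
pad_odd_byte(const void *last)
{
	uint16_t left = 0;

	/* Character-type lvalues may access any object (C11 6.5p7), so
	 * these two casts need no compiler extension. uint8_t is only a
	 * typedef and is not guaranteed to be a character type. */
	*(unsigned char *)&left = *(const unsigned char *)last;
	return left;
}

/* The in-memory layout is { byte, 0 } on any byte order, which is why the
 * word can be added to a sum of words loaded straight from the buffer. */
static int
pad_odd_byte_layout_ok(unsigned char byte)
{
	const unsigned char expect[2] = { byte, 0 };
	uint16_t left = pad_odd_byte(&byte);

	return memcmp(&left, expect, sizeof(left)) == 0;
}
```

The numeric value of the returned word differs between little-endian and big-endian hosts, but its memory layout does not, and that layout is what matters for a checksum accumulated in host byte order.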
Comments
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Georg Sauthoff
> Sent: Sunday, 17 October 2021 22.37
+Ferruh, as delegate to v1 in Patchwork.
Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
On Mon, Oct 18, 2021 at 09:35:41AM +0200, Morten Brørup wrote:
> Great work documenting your thoughts behind this patch, Georg! I, for one, didn't know about the aliasing difference between uint8_t and unsigned char. :-)
>
> After taking a good look at v2 and the Godbolt reference to confirm the claimed benefits, there can be no doubts about this patch.
+1, thanks for the good documentation
> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
Acked-by: Olivier Matz <olivier.matz@6wind.com>
On 10/18/2021 8:58 AM, Olivier Matz wrote:
> +1, thanks for the good documentation
>
>> Reviewed-by: Morten Brørup <mb@smartsharesystems.com>
>
> Acked-by: Olivier Matz <olivier.matz@6wind.com>
>
Updated the patch title to: "net: fix aliasing in checksum computation"
Added Fixes tags:
Fixes: 6006818cfb26 ("net: new checksum functions")
Fixes: e079655c41fb ("net: fix build with gcc 4.4.7 and strict aliasing")
Cc: stable@dpdk.org
Fixed the following warning in next-net:
ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)"
#65: FILE: lib/net/rte_ip.h:168:
+ *(unsigned char*)&left = *(const unsigned char *)end;
Applied to dpdk-next-net/main, thanks.
diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
index 05948b69b7..1b8c6519a9 100644
--- a/lib/net/rte_ip.h
+++ b/lib/net/rte_ip.h
@@ -141,29 +141,18 @@ rte_ipv4_hdr_len(const struct rte_ipv4_hdr *ipv4_hdr)
 static inline uint32_t
 __rte_raw_cksum(const void *buf, size_t len, uint32_t sum)
 {
-	/* workaround gcc strict-aliasing warning */
-	uintptr_t ptr = (uintptr_t)buf;
+	/* extend strict-aliasing rules */
 	typedef uint16_t __attribute__((__may_alias__)) u16_p;
-	const u16_p *u16_buf = (const u16_p *)ptr;
-
-	while (len >= (sizeof(*u16_buf) * 4)) {
-		sum += u16_buf[0];
-		sum += u16_buf[1];
-		sum += u16_buf[2];
-		sum += u16_buf[3];
-		len -= sizeof(*u16_buf) * 4;
-		u16_buf += 4;
-	}
-	while (len >= sizeof(*u16_buf)) {
+	const u16_p *u16_buf = (const u16_p *)buf;
+	const u16_p *end = u16_buf + len / sizeof(*u16_buf);
+
+	for (; u16_buf != end; ++u16_buf)
 		sum += *u16_buf;
-		len -= sizeof(*u16_buf);
-		u16_buf += 1;
-	}
 
-	/* if length is in odd bytes */
-	if (len == 1) {
+	/* if length is odd, keeping it byte order independent */
+	if (unlikely(len % 2)) {
 		uint16_t left = 0;
-		*(uint8_t *)&left = *(const uint8_t *)u16_buf;
+		*(unsigned char*)&left = *(const unsigned char *)end;
 		sum += left;
 	}
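
For readers who want to try the simplified loop outside of DPDK, the following is a minimal standalone sketch of the post-patch logic, not the code as applied. It makes three assumptions: the compiler is GCC or Clang (needed for __may_alias__), the buffer is suitably aligned for 16-bit loads, and DPDK's unlikely() hint is replaced by a plain condition. The names raw_cksum_sketch and raw_cksum_memcpy are hypothetical. The second function is a standard-C comparison that loads each word with memcpy instead of relying on the attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sketch of the simplified loop; requires the GCC/Clang __may_alias__
 * attribute and a buffer aligned for 16-bit loads on the target. */
static uint32_t
raw_cksum_sketch(const void *buf, size_t len, uint32_t sum)
{
	typedef uint16_t __attribute__((__may_alias__)) u16_p;
	const u16_p *u16_buf = (const u16_p *)buf;
	const u16_p *end = u16_buf + len / sizeof(*u16_buf);

	/* Plain counted loop; the compiler is free to vectorize it. */
	for (; u16_buf != end; ++u16_buf)
		sum += *u16_buf;

	/* Odd length: pad the last byte with a zero byte in memory. */
	if (len % 2) {
		uint16_t left = 0;
		*(unsigned char *)&left = *(const unsigned char *)end;
		sum += left;
	}
	return sum;
}

/* Comparison variant in standard C only: memcpy performs the loads, so
 * neither the attribute nor any alignment assumption is needed. */
static uint32_t
raw_cksum_memcpy(const void *buf, size_t len, uint32_t sum)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i + 1 < len; i += 2) {
		uint16_t word;
		memcpy(&word, p + i, sizeof(word));
		sum += word;
	}
	if (len % 2) {
		uint16_t left = 0;
		memcpy(&left, p + i, 1);
		sum += left;
	}
	return sum;
}
```

Both variants return the unfolded 32-bit sum in host byte order, as __rte_raw_cksum does; folding to 16 bits and complementing are left to the caller.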