rte_ether: force format string for unformat_addr

Message ID 20190710183342.6459-1-aconole@redhat.com
State New
Delegated to: Ferruh Yigit
Series
  • rte_ether: force format string for unformat_addr

Checks

Context                           Check    Description
ci/Intel-compilation              success  Compilation OK
ci/intel-Performance-Testing      success  Performance Testing PASS
ci/mellanox-Performance-Testing   success  Performance Testing PASS
ci/checkpatch                     success  coding style OK

Commit Message

Aaron Conole July 10, 2019, 6:33 p.m.
rte_ether_unformat_addr is now very lax in what it accepts, including
Ethernet addresses formatted ambiguously as "x:xx:x:xx:x:xx".  Previously,
stricter formatting was enforced by my_ether_aton, which would reject
ambiguously formatted values.

Reported-by: Michael Santana <msantana@redhat.com>
Fixes: 596d31092d32 ("net: add function to convert string to ethernet address")
Signed-off-by: Aaron Conole <aconole@redhat.com>
---
 lib/librte_net/rte_ether.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Stephen Hemminger July 10, 2019, 6:42 p.m. | #1
On Wed, 10 Jul 2019 14:33:42 -0400
Aaron Conole <aconole@redhat.com> wrote:

> rte_ether_unformat_addr is now very lax in what it accepts, including
> Ethernet addresses formatted ambiguously as "x:xx:x:xx:x:xx".  Previously,
> stricter formatting was enforced by my_ether_aton, which would reject
> ambiguously formatted values.
> 
> Reported-by: Michael Santana <msantana@redhat.com>
> Fixes: 596d31092d32 ("net: add function to convert string to ethernet address")
> Signed-off-by: Aaron Conole <aconole@redhat.com>
> ---
>  lib/librte_net/rte_ether.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/librte_net/rte_ether.c b/lib/librte_net/rte_ether.c
> index 8d040173c..4f252b813 100644
> --- a/lib/librte_net/rte_ether.c
> +++ b/lib/librte_net/rte_ether.c
> @@ -45,7 +45,8 @@ rte_ether_unformat_addr(const char *s, struct rte_ether_addr *ea)
>  	if (n == 6) {
>  		/* Standard format XX:XX:XX:XX:XX:XX */
>  		if (o0 > UINT8_MAX || o1 > UINT8_MAX || o2 > UINT8_MAX ||
> -		    o3 > UINT8_MAX || o4 > UINT8_MAX || o5 > UINT8_MAX) {
> +		    o3 > UINT8_MAX || o4 > UINT8_MAX || o5 > UINT8_MAX ||
> +		    strlen(s) != RTE_ETHER_ADDR_FMT_SIZE - 1) {
>  			rte_errno = ERANGE;
>  			return -1;
>  		}
> @@ -58,7 +59,8 @@ rte_ether_unformat_addr(const char *s, struct rte_ether_addr *ea)
>  		ea->addr_bytes[5] = o5;
>  	} else if (n == 3) {
>  		/* Support the format XXXX:XXXX:XXXX */
> -		if (o0 > UINT16_MAX || o1 > UINT16_MAX || o2 > UINT16_MAX) {
> +		if (o0 > UINT16_MAX || o1 > UINT16_MAX || o2 > UINT16_MAX ||
> +		    strlen(s) != RTE_ETHER_ADDR_FMT_SIZE - 4) {
>  			rte_errno = ERANGE;
>  			return -1;
>  		}

NAK
Skipping a leading zero should be OK. There is no need for this patch.

The current behavior is a superset of what the standard ether_aton accepts.

Aaron Conole July 10, 2019, 7:13 p.m. | #2
Stephen Hemminger <stephen@networkplumber.org> writes:

> NAK
> Skipping leading zero should be ok. There is no need for this patch.

Is it intended to skip the leading 0?  Why not the trailing 0?  I'm not
familiar with the format that is used here (for example, X:XX:X:XX:X).

It isn't described in any RFC I could find (but I only did a brief
search).  Even in IEEE, the format always uses a full octet.

> The current behavior is superset of what standard ether_aton accepts.

Okay, but it introduces a test failure for the cmdline tests and then
that test will need a few lines removed for 'unsuccessful' formats.

ether_aton is much more rigid in the formats it accepts, so the test
case is enforcing that.  I guess either the current behavior of this
function changes (and since it is a new behavior of the cmdline parser,
I would think it should be changed) or the test case should be changed
to adopt it.

Stephen Hemminger July 10, 2019, 7:27 p.m. | #3
On Wed, 10 Jul 2019 15:13:02 -0400
Aaron Conole <aconole@redhat.com> wrote:

> > The current behavior is superset of what standard ether_aton accepts.  
> 
> Okay, but it introduces a test failure for the cmdline tests and then
> that test will need a few lines removed for 'unsuccessful' formats.
> 
> ether_aton is much more rigid in the formats it accepts, so the test
> case is enforcing that.  I guess either the current behavior of this
> function changes (and since it is a new behavior of the cmdline parser,
> I would think it should be changed) or the test case should be changed
> to adopt it.

BSD ether_aton is:
/*
 * Convert an ASCII representation of an ethernet address to binary form.
 */
struct ether_addr *
ether_aton_r(const char *a, struct ether_addr *e)
{
	int i;
	unsigned int o0, o1, o2, o3, o4, o5;

	i = sscanf(a, "%x:%x:%x:%x:%x:%x", &o0, &o1, &o2, &o3, &o4, &o5);
	if (i != 6)
		return (NULL);
	e->octet[0]=o0;
	e->octet[1]=o1;
	e->octet[2]=o2;
	e->octet[3]=o3;
	e->octet[4]=o4;
	e->octet[5]=o5;
	return (e);
}
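
To make the trade-off concrete, here is a minimal sketch of a parser built on the same sscanf() pattern as the BSD routine quoted above. The names demo_ether_addr and demo_bsd_style_aton are hypothetical and exist only for illustration. The sketch shows that such a routine accepts single-digit groups and trailing characters, and silently truncates a field larger than 0xff.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical 6-byte address type, standing in for struct ether_addr. */
struct demo_ether_addr {
	uint8_t octet[6];
};

/*
 * Same sscanf() pattern as the BSD routine quoted above: it succeeds
 * whenever six hex fields are converted, with no range check and no
 * check for characters left over after the sixth field.
 */
static int
demo_bsd_style_aton(const char *a, struct demo_ether_addr *e)
{
	unsigned int o[6];
	int i;

	if (sscanf(a, "%x:%x:%x:%x:%x:%x",
		   &o[0], &o[1], &o[2], &o[3], &o[4], &o[5]) != 6)
		return -1;
	for (i = 0; i < 6; i++)
		e->octet[i] = (uint8_t)o[i];	/* values above 0xff are truncated */
	return 0;
}
```

With this sketch, "1:2:3:4:5:6" and "00:11:22:33:44:55#garbage" are both accepted, and "01:23:45:67:89:ABC" is accepted with the last octet stored as 0xBC.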
Aaron Conole July 10, 2019, 8:31 p.m. | #4
Stephen Hemminger <stephen@networkplumber.org> writes:

> BSD ether_aton is:
> /*
>  * Convert an ASCII representation of an ethernet address to binary form.
>  */
> struct ether_addr *
> ether_aton_r(const char *a, struct ether_addr *e)
> {
> 	int i;
> 	unsigned int o0, o1, o2, o3, o4, o5;
>
> 	i = sscanf(a, "%x:%x:%x:%x:%x:%x", &o0, &o1, &o2, &o3, &o4, &o5);
> 	if (i != 6)
> 		return (NULL);
> 	e->octet[0]=o0;
> 	e->octet[1]=o1;
> 	e->octet[2]=o2;
> 	e->octet[3]=o3;
> 	e->octet[4]=o4;
> 	e->octet[5]=o5;
> 	return (e);
> }

Your implementation fixes the above by bounds checking each octet,
so that in the 6-octet form each octet is limited to the range 00-ff.

The BSD example only accepts a 6-octet form.  Your version is intended
to accept both colon forms, so x:x:x will parse successfully as well
(interpreted as the XXXX:XXXX:XXXX form; i.e. mac 02:03:04 or 2:3:4
would be accepted).  Further, accidentally passing an IPv6 address to
this routine (something a user of a cmdline interface might do) could be
parsed as valid (example: 2001:db8:2::1), which would be the wrong
thing.  I think it would be strange for length limits to be enforced in
the cmdline parser *after* calling this, but that might be an option for
fixing it (i.e. patch cmdline_parse_etheraddr to do a length check after
the unformat_addr call).

I guess I'm not sure what the *best* fix would be.  I think the most
sane fix is what I've put in, since it only allows the commonly
accepted notation and does not allow ad-hoc accidents.  Higher layers
(like cmdline parsers) are free to implement routines that reformat the
lax forms (which you might want to allow a user to pass) into the more
restrictive forms required by a lower layer (like librte_net).  I
concede that there could be a more friendly thing to do in some specific
cases, but then we must validate the *form* more strictly (i.e. we
have a scanf where one form is a subset of another and which tolerates
some kinds of invalid characters being inserted, allowing things
like IPv6 addresses to look like Ethernet hardware addresses).
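
The IPv6 concern can be checked with a small sketch. It assumes the function relies on a six-field sscanf() pattern like the BSD one quoted earlier in the thread; the actual format string is not visible in the patch context, and demo_count_fields is a hypothetical helper. It only reports how many fields sscanf() converts, which is the value the n == 6 and n == 3 branches test.

```c
#include <stdio.h>

/*
 * Returns how many hex fields sscanf() converts with a six-field colon
 * pattern. A result of 3 is what would select the XXXX:XXXX:XXXX branch
 * shown in the patch context.
 */
static int
demo_count_fields(const char *s)
{
	unsigned int o0, o1, o2, o3, o4, o5;

	return sscanf(s, "%x:%x:%x:%x:%x:%x", &o0, &o1, &o2, &o3, &o4, &o5);
}
```

Under that assumption, "2001:db8:2::1" stops converting at the second colon of "::" and yields 3, the same count as "2:3:4", so both would be treated as three 16-bit groups unless a further check rejects them.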
Stephen Hemminger July 10, 2019, 11:13 p.m. | #5
On Wed, 10 Jul 2019 16:31:59 -0400
Aaron Conole <aconole@redhat.com> wrote:

> Your implementation fixes the above by bounds checking each octet
> to enforce that in the 6-octet form, each octet is bound to the region
> 00-ff.
> 
> The BSD example only accepts a 6-octet form.  Your version is intended
> to accept both colon forms so x:x:x will successfully parse as well
> (interpreted on the XXXX:XXXX:XXXX side) (ie: mac 02:03:04 or 2:3:4
> would be accepted).  Further, accidentally passing an ipv6 address to
> this routine (something a user of a cmdline interface might do) could be
> parsed as valid (example: 2001:db8:2::1) - which would be the wrong
> thing.  I think it would be strange for length limits to be enforced in
> cmdline parser *after* calling this, but that might be an option for
> fixing (so patch cmdline_parse_etheraddr to do a length check after the
> unformat_addr call).

Being liberal in what you accept as input is a good core principle.
BSD goes too far, but what you propose is too restrictive.

> I guess I'm not sure what the *best* fix would be.  I think the most
> sane fix is what I've put in since it will only allow the commonly
> accepted notation, and not allow ad-hoc accidents.  Higher layers (like
> cmdline parsers) are free to implement routines that reformat the lax
> forms (like you might want to allow a user to pass) into more
> restrictive forms required by a lower layer (like librte_net).  I
> concede that there could be a more friendly thing to do in some specific
> cases - but then we must more strictly validate the *form* (ie: we
> have a scanf where one form is a subset of another and will be okay with
> some kinds of invalid characters being inserted - allowing for things
> like IPV6 addresses looking like ethernet hardware addresses).

Fix the cmdline test.

Stephen Hemminger July 17, 2019, 6:42 p.m. | #6
On Wed, 10 Jul 2019 16:31:59 -0400
Aaron Conole <aconole@redhat.com> wrote:

> Your implementation fixes the above by bounds checking each octet
> to enforce that in the 6-octet form, each octet is bound to the region
> 00-ff.
> 
> The BSD example only accepts a 6-octet form.  Your version is intended
> to accept both colon forms so x:x:x will successfully parse as well
> (interpreted on the XXXX:XXXX:XXXX side) (ie: mac 02:03:04 or 2:3:4
> would be accepted).  Further, accidentally passing an ipv6 address to
> this routine (something a user of a cmdline interface might do) could be
> parsed as valid (example: 2001:db8:2::1) - which would be the wrong
> thing.  I think it would be strange for length limits to be enforced in
> cmdline parser *after* calling this, but that might be an option for
> fixing (so patch cmdline_parse_etheraddr to do a length check after the
> unformat_addr call).
> 
> I guess I'm not sure what the *best* fix would be.  I think the most
> sane fix is what I've put in since it will only allow the commonly
> accepted notation, and not allow ad-hoc accidents.  Higher layers (like
> cmdline parsers) are free to implement routines that reformat the lax
> forms (like you might want to allow a user to pass) into more
> restrictive forms required by a lower layer (like librte_net).  I
> concede that there could be a more friendly thing to do in some specific
> cases - but then we must more strictly validate the *form* (ie: we
> have a scanf where one form is a subset of another and will be okay with
> some kinds of invalid characters being inserted - allowing for things
> like IPV6 addresses looking like ethernet hardware addresses).


I have a new version that is closer to the original implementation
in cmdline_parse_etheraddr.

Comparison chart relative to ether_aton

Input                         	glibc	BSD	ORIG	NEW
01:23:45:67:89:AB             	ok	ok	ok	ok
4567:89AB:CDEF                	BAD	BAD	ok	ok
00:11:22:33:44:55#garbage     	ok	ok	BAD	BAD
00:11:22:33:44:55 garbage     	ok	ok	BAD	BAD
0011:2233:4455#garbage        	BAD	BAD	BAD	BAD
0123:45:67:89:AB              	BAD	BAD	BAD	BAD
01:23:4567:89:AB              	BAD	BAD	BAD	BAD
01:23:45:67:89AB              	BAD	BAD	BAD	BAD
012:345:678:9AB               	BAD	BAD	BAD	BAD
01:23:45:67:89:ABC            	ok	ok	BAD	BAD
01:23:45:67:89:A              	ok	ok	ok	BAD
01:23:45:67:89                	BAD	BAD	BAD	BAD
01:23:45:67:89:AB:CD          	ok	ok	BAD	BAD
IN:VA:LI:DC:HA:RS             	BAD	BAD	BAD	BAD
INVA:LIDC:HARS                	BAD	BAD	BAD	BAD
01 23 45 67 89 AB             	BAD	BAD	BAD	BAD
01-23-45-67-89-AB             	BAD	BAD	BAD	BAD
01.23.45.67.89.AB             	BAD	BAD	BAD	BAD
01,23,45,67,89,AB             	BAD	BAD	BAD	BAD
01:23:45                      	BAD	BAD	ok	BAD
01:23:45#:67:89:AB            	BAD	BAD	BAD	BAD
random invalid text           	BAD	BAD	BAD	BAD
random text                   	BAD	BAD	BAD	BAD
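
As an illustration of the strictness the chart's NEW column describes, the sketch below parses by counting digits instead of using sscanf(). It is not the merged implementation; demo_parse_groups and demo_strict_unformat are hypothetical names, and the rule it assumes is "exactly six groups of two hex digits, or exactly three groups of four, with nothing after the last group".

```c
#include <ctype.h>
#include <stdint.h>
#include <string.h>

/*
 * Parse exactly `groups` colon-separated groups of `digits` hex digits
 * each, and require the string to end right after the last group.
 * Hypothetical helper, not taken from DPDK.
 */
static int
demo_parse_groups(const char *s, int groups, int digits, uint8_t out[6])
{
	int g, d, n = 0;

	for (g = 0; g < groups; g++) {
		unsigned int v = 0;

		if (g > 0) {
			if (*s != ':')
				return -1;
			s++;
		}
		for (d = 0; d < digits; d++, s++) {
			unsigned char c = (unsigned char)*s;

			if (!isxdigit(c))
				return -1;
			v = v * 16 + (isdigit(c) ? (unsigned int)(c - '0') :
				      (unsigned int)(tolower(c) - 'a' + 10));
		}
		/* Store the group most-significant byte first. */
		for (d = digits / 2 - 1; d >= 0; d--)
			out[n++] = (uint8_t)(v >> (8 * d));
	}
	return *s == '\0' ? 0 : -1;
}

/* Accept only XX:XX:XX:XX:XX:XX or XXXX:XXXX:XXXX; leave out untouched on error. */
static int
demo_strict_unformat(const char *s, uint8_t out[6])
{
	uint8_t tmp[6];

	if (demo_parse_groups(s, 6, 2, tmp) != 0 &&
	    demo_parse_groups(s, 3, 4, tmp) != 0)
		return -1;
	memcpy(out, tmp, sizeof(tmp));
	return 0;
}
```

A parser along these lines accepts the first two rows of the chart and rejects shortened groups such as "01:23:45:67:89:A", truncated input such as "01:23:45", and anything with trailing characters.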
Ferruh Yigit July 19, 2019, 5:59 p.m. | #7
On 7/17/2019 7:42 PM, Stephen Hemminger wrote:
> I have a new version that is closer to original implementation
> in cmdline_parse_etheraddr.
> 
> Comparison chart relative to ether_aton
> 
> Input                         	glibc	BSD	ORIG	NEW
> 01:23:45:67:89:AB             	ok	ok	ok	ok
> 4567:89AB:CDEF                	BAD	BAD	ok	ok
> 00:11:22:33:44:55#garbage     	ok	ok	BAD	BAD
> 00:11:22:33:44:55 garbage     	ok	ok	BAD	BAD
> 0011:2233:4455#garbage        	BAD	BAD	BAD	BAD
> 0123:45:67:89:AB              	BAD	BAD	BAD	BAD
> 01:23:4567:89:AB              	BAD	BAD	BAD	BAD
> 01:23:45:67:89AB              	BAD	BAD	BAD	BAD
> 012:345:678:9AB               	BAD	BAD	BAD	BAD
> 01:23:45:67:89:ABC            	ok	ok	BAD	BAD
> 01:23:45:67:89:A              	ok	ok	ok	BAD
> 01:23:45:67:89                	BAD	BAD	BAD	BAD
> 01:23:45:67:89:AB:CD          	ok	ok	BAD	BAD
> IN:VA:LI:DC:HA:RS             	BAD	BAD	BAD	BAD
> INVA:LIDC:HARS                	BAD	BAD	BAD	BAD
> 01 23 45 67 89 AB             	BAD	BAD	BAD	BAD
> 01-23-45-67-89-AB             	BAD	BAD	BAD	BAD
> 01.23.45.67.89.AB             	BAD	BAD	BAD	BAD
> 01,23,45,67,89,AB             	BAD	BAD	BAD	BAD
> 01:23:45                      	BAD	BAD	ok	BAD
> 01:23:45#:67:89:AB            	BAD	BAD	BAD	BAD
> random invalid text           	BAD	BAD	BAD	BAD
> random text                   	BAD	BAD	BAD	BAD
> 

Hi Aaron,

Can you please check whether you are OK with this after the merged patch:
https://patches.dpdk.org/patch/56737/

If so, can you please update the patch status to 'rejected'?

Patch

diff --git a/lib/librte_net/rte_ether.c b/lib/librte_net/rte_ether.c
index 8d040173c..4f252b813 100644
--- a/lib/librte_net/rte_ether.c
+++ b/lib/librte_net/rte_ether.c
@@ -45,7 +45,8 @@  rte_ether_unformat_addr(const char *s, struct rte_ether_addr *ea)
 	if (n == 6) {
 		/* Standard format XX:XX:XX:XX:XX:XX */
 		if (o0 > UINT8_MAX || o1 > UINT8_MAX || o2 > UINT8_MAX ||
-		    o3 > UINT8_MAX || o4 > UINT8_MAX || o5 > UINT8_MAX) {
+		    o3 > UINT8_MAX || o4 > UINT8_MAX || o5 > UINT8_MAX ||
+		    strlen(s) != RTE_ETHER_ADDR_FMT_SIZE - 1) {
 			rte_errno = ERANGE;
 			return -1;
 		}
@@ -58,7 +59,8 @@  rte_ether_unformat_addr(const char *s, struct rte_ether_addr *ea)
 		ea->addr_bytes[5] = o5;
 	} else if (n == 3) {
 		/* Support the format XXXX:XXXX:XXXX */
-		if (o0 > UINT16_MAX || o1 > UINT16_MAX || o2 > UINT16_MAX) {
+		if (o0 > UINT16_MAX || o1 > UINT16_MAX || o2 > UINT16_MAX ||
+		    strlen(s) != RTE_ETHER_ADDR_FMT_SIZE - 4) {
 			rte_errno = ERANGE;
 			return -1;
 		}
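
The two strlen() comparisons in the patch are easier to read with the arithmetic spelled out. The sketch below assumes RTE_ETHER_ADDR_FMT_SIZE is 18 (17 characters of "XX:XX:XX:XX:XX:XX" plus the terminating NUL), which I believe is its value in rte_ether.h; the DEMO_ macro and the demo_len_ok_* helpers are hypothetical stand-ins so the example compiles without DPDK headers.

```c
#include <string.h>

/* Assumed value of RTE_ETHER_ADDR_FMT_SIZE: 17 characters plus the NUL. */
#define DEMO_ETHER_ADDR_FMT_SIZE 18

/* Nonzero if s has the length of "XX:XX:XX:XX:XX:XX" (18 - 1 = 17). */
static int
demo_len_ok_6(const char *s)
{
	return strlen(s) == DEMO_ETHER_ADDR_FMT_SIZE - 1;
}

/* Nonzero if s has the length of "XXXX:XXXX:XXXX" (18 - 4 = 14). */
static int
demo_len_ok_3(const char *s)
{
	return strlen(s) == DEMO_ETHER_ADDR_FMT_SIZE - 4;
}
```

Under that assumption, "01:23:45:67:89:AB" passes the first check while "1:23:4:56:7:89" (14 characters) does not, and "2001:db8:2::1" (13 characters) fails the second. A length comparison on its own does not prove that every group has the expected number of digits; a string padded with trailing spaces to 17 characters would still satisfy demo_len_ok_6.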