[v5,2/2] ethdev: support xstats reset telemetry command

Message ID 20230209023203.35269-3-fengchengwen@huawei.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Headers
Series support dmadev/ethdev stats reset |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation success Compilation OK
ci/iol-broadcom-Functional fail Functional Testing issues
ci/intel-Testing success Testing PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-intel-Functional fail Functional Testing issues
ci/iol-intel-Performance success Performance Testing PASS
ci/github-robot: build success github build: passed
ci/iol-aarch64-unit-testing success Testing PASS
ci/iol-testing success Testing PASS
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing success Testing PASS
ci/iol-aarch64-compile-testing success Testing PASS
ci/iol-abi-testing success Testing PASS

Commit Message

Chengwen Feng Feb. 9, 2023, 2:32 a.m. UTC
  The xstats reset is useful for debugging, so add it to the ethdev
telemetry command lists.

Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
---
 lib/ethdev/rte_ethdev.c | 31 +++++++++++++++++++++++++++++++
 1 file changed, 31 insertions(+)
  

Comments

Dongdong Liu Feb. 15, 2023, 3:19 a.m. UTC | #1
Hi Chengwen

On 2023/2/9 10:32, Chengwen Feng wrote:
> The xstats reset is useful for debugging, so add it to the ethdev
> telemetry command lists.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
This patch looks good, so
Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>

A minior question
Do we need to support stats reset ?

Thanks,
Dongdong
> ---
>  lib/ethdev/rte_ethdev.c | 31 +++++++++++++++++++++++++++++++
>  1 file changed, 31 insertions(+)
>
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index d25db35f7f..e85c98f307 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -5915,6 +5915,35 @@ eth_dev_handle_port_xstats(const char *cmd __rte_unused,
>  	return 0;
>  }
>
> +static int
> +eth_dev_handle_port_xstats_reset(const char *cmd __rte_unused,
> +		const char *params,
> +		struct rte_tel_data *d)
> +{
> +	int port_id, ret;
> +	char *end_param;
> +
> +	if (params == NULL || strlen(params) == 0 || !isdigit(*params))
> +		return -1;
> +
> +	port_id = strtoul(params, &end_param, 0);
> +	if (*end_param != '\0')
> +		RTE_ETHDEV_LOG(NOTICE,
> +			"Extra parameters passed to ethdev telemetry command, ignoring\n");
> +	if (!rte_eth_dev_is_valid_port(port_id))
> +		return -1;
> +
> +	ret = rte_eth_xstats_reset(port_id);
> +	if (ret == 0) {
> +		rte_tel_data_string(d, "success");
> +		RTE_ETHDEV_LOG(NOTICE, "Port %d reset xstats success\n", port_id);
> +	} else {
> +		RTE_ETHDEV_LOG(ERR, "Port %d reset xstats failed! ret: %d\n", port_id, ret);
> +	}
> +
> +	return ret;
> +}
> +
>  #ifndef RTE_EXEC_ENV_WINDOWS
>  static int
>  eth_dev_handle_port_dump_priv(const char *cmd __rte_unused,
> @@ -6329,6 +6358,8 @@ RTE_INIT(ethdev_init_telemetry)
>  			"Returns the common stats for a port. Parameters: int port_id");
>  	rte_telemetry_register_cmd("/ethdev/xstats", eth_dev_handle_port_xstats,
>  			"Returns the extended stats for a port. Parameters: int port_id");
> +	rte_telemetry_register_cmd("/ethdev/xstats_reset", eth_dev_handle_port_xstats_reset,
> +			"Reset the extended stats for a port. Parameters: int port_id");
>  #ifndef RTE_EXEC_ENV_WINDOWS
>  	rte_telemetry_register_cmd("/ethdev/dump_priv", eth_dev_handle_port_dump_priv,
>  			"Returns dump private information for a port. Parameters: int port_id");
>
  
Chengwen Feng Feb. 16, 2023, 11:53 a.m. UTC | #2
On 2023/2/15 11:19, Dongdong Liu wrote:
> Hi Chengwen
> 
> On 2023/2/9 10:32, Chengwen Feng wrote:
>> The xstats reset is useful for debugging, so add it to the ethdev
>> telemetry command lists.
>>
>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> This patch looks good, so
> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
> 
> A minior question
> Do we need to support stats reset ?

Stats is contained by xstats, and future direction I think is xstats.
So I think we don't need support stats reset.

Thanks.

> 
> Thanks,
> Dongdong
>> ---
>>  lib/ethdev/rte_ethdev.c | 31 +++++++++++++++++++++++++++++++
>>  1 file changed, 31 insertions(+)
>>
>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>> index d25db35f7f..e85c98f307 100644
>> --- a/lib/ethdev/rte_ethdev.c
>> +++ b/lib/ethdev/rte_ethdev.c
>> @@ -5915,6 +5915,35 @@ eth_dev_handle_port_xstats(const char *cmd __rte_unused,
>>      return 0;
>>  }
>>
>> +static int
>> +eth_dev_handle_port_xstats_reset(const char *cmd __rte_unused,
>> +        const char *params,
>> +        struct rte_tel_data *d)
>> +{
>> +    int port_id, ret;
>> +    char *end_param;
>> +
>> +    if (params == NULL || strlen(params) == 0 || !isdigit(*params))
>> +        return -1;
>> +
>> +    port_id = strtoul(params, &end_param, 0);
>> +    if (*end_param != '\0')
>> +        RTE_ETHDEV_LOG(NOTICE,
>> +            "Extra parameters passed to ethdev telemetry command, ignoring\n");
>> +    if (!rte_eth_dev_is_valid_port(port_id))
>> +        return -1;
>> +
>> +    ret = rte_eth_xstats_reset(port_id);
>> +    if (ret == 0) {
>> +        rte_tel_data_string(d, "success");
>> +        RTE_ETHDEV_LOG(NOTICE, "Port %d reset xstats success\n", port_id);
>> +    } else {
>> +        RTE_ETHDEV_LOG(ERR, "Port %d reset xstats failed! ret: %d\n", port_id, ret);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>>  #ifndef RTE_EXEC_ENV_WINDOWS
>>  static int
>>  eth_dev_handle_port_dump_priv(const char *cmd __rte_unused,
>> @@ -6329,6 +6358,8 @@ RTE_INIT(ethdev_init_telemetry)
>>              "Returns the common stats for a port. Parameters: int port_id");
>>      rte_telemetry_register_cmd("/ethdev/xstats", eth_dev_handle_port_xstats,
>>              "Returns the extended stats for a port. Parameters: int port_id");
>> +    rte_telemetry_register_cmd("/ethdev/xstats_reset", eth_dev_handle_port_xstats_reset,
>> +            "Reset the extended stats for a port. Parameters: int port_id");
>>  #ifndef RTE_EXEC_ENV_WINDOWS
>>      rte_telemetry_register_cmd("/ethdev/dump_priv", eth_dev_handle_port_dump_priv,
>>              "Returns dump private information for a port. Parameters: int port_id");
>>
> .
  
Ferruh Yigit Feb. 16, 2023, 12:06 p.m. UTC | #3
On 2/16/2023 11:53 AM, fengchengwen wrote:
> On 2023/2/15 11:19, Dongdong Liu wrote:
>> Hi Chengwen
>>
>> On 2023/2/9 10:32, Chengwen Feng wrote:
>>> The xstats reset is useful for debugging, so add it to the ethdev
>>> telemetry command lists.
>>>
>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>> This patch looks good, so
>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
>>
>> A minior question
>> Do we need to support stats reset ?
> 
> Stats is contained by xstats, and future direction I think is xstats.
> So I think we don't need support stats reset.
> 

I have similar question with Dongdong, readonly values are safe for
telemetry, but modifying data can be more tricky since we don't have
locking in ethdev APIs, this can cause concurrency issues.

Overall do we want telemetry go that way and become something that
alters ethdev data/config?


> Thanks.
> 
>>
>> Thanks,
>> Dongdong
>>> ---
>>>  lib/ethdev/rte_ethdev.c | 31 +++++++++++++++++++++++++++++++
>>>  1 file changed, 31 insertions(+)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>> index d25db35f7f..e85c98f307 100644
>>> --- a/lib/ethdev/rte_ethdev.c
>>> +++ b/lib/ethdev/rte_ethdev.c
>>> @@ -5915,6 +5915,35 @@ eth_dev_handle_port_xstats(const char *cmd __rte_unused,
>>>      return 0;
>>>  }
>>>
>>> +static int
>>> +eth_dev_handle_port_xstats_reset(const char *cmd __rte_unused,
>>> +        const char *params,
>>> +        struct rte_tel_data *d)
>>> +{
>>> +    int port_id, ret;
>>> +    char *end_param;
>>> +
>>> +    if (params == NULL || strlen(params) == 0 || !isdigit(*params))
>>> +        return -1;
>>> +
>>> +    port_id = strtoul(params, &end_param, 0);
>>> +    if (*end_param != '\0')
>>> +        RTE_ETHDEV_LOG(NOTICE,
>>> +            "Extra parameters passed to ethdev telemetry command, ignoring\n");
>>> +    if (!rte_eth_dev_is_valid_port(port_id))
>>> +        return -1;
>>> +
>>> +    ret = rte_eth_xstats_reset(port_id);
>>> +    if (ret == 0) {
>>> +        rte_tel_data_string(d, "success");
>>> +        RTE_ETHDEV_LOG(NOTICE, "Port %d reset xstats success\n", port_id);
>>> +    } else {
>>> +        RTE_ETHDEV_LOG(ERR, "Port %d reset xstats failed! ret: %d\n", port_id, ret);
>>> +    }
>>> +
>>> +    return ret;
>>> +}
>>> +
>>>  #ifndef RTE_EXEC_ENV_WINDOWS
>>>  static int
>>>  eth_dev_handle_port_dump_priv(const char *cmd __rte_unused,
>>> @@ -6329,6 +6358,8 @@ RTE_INIT(ethdev_init_telemetry)
>>>              "Returns the common stats for a port. Parameters: int port_id");
>>>      rte_telemetry_register_cmd("/ethdev/xstats", eth_dev_handle_port_xstats,
>>>              "Returns the extended stats for a port. Parameters: int port_id");
>>> +    rte_telemetry_register_cmd("/ethdev/xstats_reset", eth_dev_handle_port_xstats_reset,
>>> +            "Reset the extended stats for a port. Parameters: int port_id");
>>>  #ifndef RTE_EXEC_ENV_WINDOWS
>>>      rte_telemetry_register_cmd("/ethdev/dump_priv", eth_dev_handle_port_dump_priv,
>>>              "Returns dump private information for a port. Parameters: int port_id");
>>>
>> .
  
Chengwen Feng Feb. 16, 2023, 12:42 p.m. UTC | #4
On 2023/2/16 20:06, Ferruh Yigit wrote:
> On 2/16/2023 11:53 AM, fengchengwen wrote:
>> On 2023/2/15 11:19, Dongdong Liu wrote:
>>> Hi Chengwen
>>>
>>> On 2023/2/9 10:32, Chengwen Feng wrote:
>>>> The xstats reset is useful for debugging, so add it to the ethdev
>>>> telemetry command lists.
>>>>
>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>> This patch looks good, so
>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
>>>
>>> A minior question
>>> Do we need to support stats reset ?
>>
>> Stats is contained by xstats, and future direction I think is xstats.
>> So I think we don't need support stats reset.
>>
> 
> I have similar question with Dongdong, readonly values are safe for
> telemetry, but modifying data can be more tricky since we don't have
> locking in ethdev APIs, this can cause concurrency issues.

Yes, it indeed has concurrency issues.

> 
> Overall do we want telemetry go that way and become something that
> alters ethdev data/config?

There are at least two part of data: config and status.
For stats (which belong status data) could help for debugging, I think it's acceptable.

As for concurrency issues. People should know what to do and when to do, just like
the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.

> 
> 
>> Thanks.
>>
>>>
>>> Thanks,
>>> Dongdong
>>>> ---
>>>>  lib/ethdev/rte_ethdev.c | 31 +++++++++++++++++++++++++++++++
>>>>  1 file changed, 31 insertions(+)
>>>>
>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>>> index d25db35f7f..e85c98f307 100644
>>>> --- a/lib/ethdev/rte_ethdev.c
>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>> @@ -5915,6 +5915,35 @@ eth_dev_handle_port_xstats(const char *cmd __rte_unused,
>>>>      return 0;
>>>>  }
>>>>
>>>> +static int
>>>> +eth_dev_handle_port_xstats_reset(const char *cmd __rte_unused,
>>>> +        const char *params,
>>>> +        struct rte_tel_data *d)
>>>> +{
>>>> +    int port_id, ret;
>>>> +    char *end_param;
>>>> +
>>>> +    if (params == NULL || strlen(params) == 0 || !isdigit(*params))
>>>> +        return -1;
>>>> +
>>>> +    port_id = strtoul(params, &end_param, 0);
>>>> +    if (*end_param != '\0')
>>>> +        RTE_ETHDEV_LOG(NOTICE,
>>>> +            "Extra parameters passed to ethdev telemetry command, ignoring\n");
>>>> +    if (!rte_eth_dev_is_valid_port(port_id))
>>>> +        return -1;
>>>> +
>>>> +    ret = rte_eth_xstats_reset(port_id);
>>>> +    if (ret == 0) {
>>>> +        rte_tel_data_string(d, "success");
>>>> +        RTE_ETHDEV_LOG(NOTICE, "Port %d reset xstats success\n", port_id);
>>>> +    } else {
>>>> +        RTE_ETHDEV_LOG(ERR, "Port %d reset xstats failed! ret: %d\n", port_id, ret);
>>>> +    }
>>>> +
>>>> +    return ret;
>>>> +}
>>>> +
>>>>  #ifndef RTE_EXEC_ENV_WINDOWS
>>>>  static int
>>>>  eth_dev_handle_port_dump_priv(const char *cmd __rte_unused,
>>>> @@ -6329,6 +6358,8 @@ RTE_INIT(ethdev_init_telemetry)
>>>>              "Returns the common stats for a port. Parameters: int port_id");
>>>>      rte_telemetry_register_cmd("/ethdev/xstats", eth_dev_handle_port_xstats,
>>>>              "Returns the extended stats for a port. Parameters: int port_id");
>>>> +    rte_telemetry_register_cmd("/ethdev/xstats_reset", eth_dev_handle_port_xstats_reset,
>>>> +            "Reset the extended stats for a port. Parameters: int port_id");
>>>>  #ifndef RTE_EXEC_ENV_WINDOWS
>>>>      rte_telemetry_register_cmd("/ethdev/dump_priv", eth_dev_handle_port_dump_priv,
>>>>              "Returns dump private information for a port. Parameters: int port_id");
>>>>
>>> .
> 
> .
>
  
Bruce Richardson Feb. 16, 2023, 12:54 p.m. UTC | #5
On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
> On 2023/2/16 20:06, Ferruh Yigit wrote:
> > On 2/16/2023 11:53 AM, fengchengwen wrote:
> >> On 2023/2/15 11:19, Dongdong Liu wrote:
> >>> Hi Chengwen
> >>>
> >>> On 2023/2/9 10:32, Chengwen Feng wrote:
> >>>> The xstats reset is useful for debugging, so add it to the ethdev
> >>>> telemetry command lists.
> >>>>
> >>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>> This patch looks good, so
> >>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
> >>>
> >>> A minior question
> >>> Do we need to support stats reset ?
> >>
> >> Stats is contained by xstats, and future direction I think is xstats.
> >> So I think we don't need support stats reset.
> >>
> > 
> > I have similar question with Dongdong, readonly values are safe for
> > telemetry, but modifying data can be more tricky since we don't have
> > locking in ethdev APIs, this can cause concurrency issues.
> 
> Yes, it indeed has concurrency issues.
> 
> > 
> > Overall do we want telemetry go that way and become something that
> > alters ethdev data/config?
> 
> There are at least two part of data: config and status.
> For stats (which belong status data) could help for debugging, I think it's acceptable.
> 
> As for concurrency issues. People should know what to do and when to do, just like
> the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.
> 
While this is probably ok for now, I think in next release we should look
to add some sort of support for locking for destructive ops in a future
release. For example, we could:

1. Add support for marking a callback as "destructive" and only allow it to
be called if only one connection is present or

2. Make it possible for callbacks to query the number of connections so
that the callback itself is non-destructive in more than one connection is
open.

[Both of these will require locking support so that new connections aren't
openned when the callback is in-flight!]

Any other thoughts or suggestions?

/Bruce
  
Bruce Richardson Feb. 16, 2023, 12:55 p.m. UTC | #6
On Thu, Feb 16, 2023 at 12:54:20PM +0000, Bruce Richardson wrote:
> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
> > On 2023/2/16 20:06, Ferruh Yigit wrote:
> > > On 2/16/2023 11:53 AM, fengchengwen wrote:
> > >> On 2023/2/15 11:19, Dongdong Liu wrote:
> > >>> Hi Chengwen
> > >>>
> > >>> On 2023/2/9 10:32, Chengwen Feng wrote:
> > >>>> The xstats reset is useful for debugging, so add it to the ethdev
> > >>>> telemetry command lists.
> > >>>>
> > >>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >>> This patch looks good, so
> > >>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
> > >>>
> > >>> A minior question
> > >>> Do we need to support stats reset ?
> > >>
> > >> Stats is contained by xstats, and future direction I think is xstats.
> > >> So I think we don't need support stats reset.
> > >>
> > > 
> > > I have similar question with Dongdong, readonly values are safe for
> > > telemetry, but modifying data can be more tricky since we don't have
> > > locking in ethdev APIs, this can cause concurrency issues.
> > 
> > Yes, it indeed has concurrency issues.
> > 
> > > 
> > > Overall do we want telemetry go that way and become something that
> > > alters ethdev data/config?
> > 
> > There are at least two part of data: config and status.
> > For stats (which belong status data) could help for debugging, I think it's acceptable.
> > 
> > As for concurrency issues. People should know what to do and when to do, just like
> > the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.
> > 
> While this is probably ok for now, I think in next release we should look
> to add some sort of support for locking for destructive ops in a future
> release. For example, we could:
> 
> 1. Add support for marking a callback as "destructive" and only allow it to
> be called if only one connection is present or
> 
> 2. Make it possible for callbacks to query the number of connections so
> that the callback itself is non-destructive in more than one connection is
> open.
> 
> [Both of these will require locking support so that new connections aren't
> openned when the callback is in-flight!]
> 
> Any other thoughts or suggestions?
> 
Actually, another thought - we could also make the max-connections
configurable at runtime, so that the user can enforce only a single
connection.

/Bruce
  
Chengwen Feng Feb. 17, 2023, 9:44 a.m. UTC | #7
On 2023/2/16 20:54, Bruce Richardson wrote:
> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
>> On 2023/2/16 20:06, Ferruh Yigit wrote:
>>> On 2/16/2023 11:53 AM, fengchengwen wrote:
>>>> On 2023/2/15 11:19, Dongdong Liu wrote:
>>>>> Hi Chengwen
>>>>>
>>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
>>>>>> The xstats reset is useful for debugging, so add it to the ethdev
>>>>>> telemetry command lists.
>>>>>>
>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>> This patch looks good, so
>>>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
>>>>>
>>>>> A minior question
>>>>> Do we need to support stats reset ?
>>>>
>>>> Stats is contained by xstats, and future direction I think is xstats.
>>>> So I think we don't need support stats reset.
>>>>
>>>
>>> I have similar question with Dongdong, readonly values are safe for
>>> telemetry, but modifying data can be more tricky since we don't have
>>> locking in ethdev APIs, this can cause concurrency issues.
>>
>> Yes, it indeed has concurrency issues.
>>
>>>
>>> Overall do we want telemetry go that way and become something that
>>> alters ethdev data/config?
>>
>> There are at least two part of data: config and status.
>> For stats (which belong status data) could help for debugging, I think it's acceptable.
>>
>> As for concurrency issues. People should know what to do and when to do, just like
>> the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.
>>
> While this is probably ok for now, I think in next release we should look
> to add some sort of support for locking for destructive ops in a future
> release. For example, we could:
> 
> 1. Add support for marking a callback as "destructive" and only allow it to
> be called if only one connection is present or
> 
> 2. Make it possible for callbacks to query the number of connections so
> that the callback itself is non-destructive in more than one connection is
> open.
> 
> [Both of these will require locking support so that new connections aren't
> openned when the callback is in-flight!]

Except telemetry, the application may have other console could execute DPDK API.
So I think trying to keep it simple, it's up to the user to invoke.

> 
> Any other thoughts or suggestions?
> 
> /Bruce
> .
>
  
Thomas Monjalon Feb. 20, 2023, 1:05 p.m. UTC | #8
17/02/2023 10:44, fengchengwen:
> On 2023/2/16 20:54, Bruce Richardson wrote:
> > On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
> >> On 2023/2/16 20:06, Ferruh Yigit wrote:
> >>> On 2/16/2023 11:53 AM, fengchengwen wrote:
> >>>> On 2023/2/15 11:19, Dongdong Liu wrote:
> >>>>> Hi Chengwen
> >>>>>
> >>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
> >>>>>> The xstats reset is useful for debugging, so add it to the ethdev
> >>>>>> telemetry command lists.
> >>>>>>
> >>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>>> This patch looks good, so
> >>>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
> >>>>>
> >>>>> A minior question
> >>>>> Do we need to support stats reset ?
> >>>>
> >>>> Stats is contained by xstats, and future direction I think is xstats.
> >>>> So I think we don't need support stats reset.
> >>>>
> >>>
> >>> I have similar question with Dongdong, readonly values are safe for
> >>> telemetry, but modifying data can be more tricky since we don't have
> >>> locking in ethdev APIs, this can cause concurrency issues.
> >>
> >> Yes, it indeed has concurrency issues.
> >>
> >>>
> >>> Overall do we want telemetry go that way and become something that
> >>> alters ethdev data/config?
> >>
> >> There are at least two part of data: config and status.
> >> For stats (which belong status data) could help for debugging, I think it's acceptable.
> >>
> >> As for concurrency issues. People should know what to do and when to do, just like
> >> the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.
> >>
> > While this is probably ok for now, I think in next release we should look
> > to add some sort of support for locking for destructive ops in a future
> > release. For example, we could:
> > 
> > 1. Add support for marking a callback as "destructive" and only allow it to
> > be called if only one connection is present or
> > 
> > 2. Make it possible for callbacks to query the number of connections so
> > that the callback itself is non-destructive in more than one connection is
> > open.
> > 
> > [Both of these will require locking support so that new connections aren't
> > openned when the callback is in-flight!]
> 
> Except telemetry, the application may have other console could execute DPDK API.
> So I think trying to keep it simple, it's up to the user to invoke.

No, the user should not be responsible for concurrency issues.
We can ask the app developper to take care,
but not to the user who has no control on what happens in the app.

On a more general note, I feel the expansion of telemetry is not controlled enough.
I would like to stop on adding more telemetry until we have a clear guideline
about what is telemetry for and how to use it.
  
Chengwen Feng July 3, 2023, 3:58 a.m. UTC | #9
On 2023/2/20 21:05, Thomas Monjalon wrote:
> 17/02/2023 10:44, fengchengwen:
>> On 2023/2/16 20:54, Bruce Richardson wrote:
>>> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
>>>> On 2023/2/16 20:06, Ferruh Yigit wrote:
>>>>> On 2/16/2023 11:53 AM, fengchengwen wrote:
>>>>>> On 2023/2/15 11:19, Dongdong Liu wrote:
>>>>>>> Hi Chengwen
>>>>>>>
>>>>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
>>>>>>>> The xstats reset is useful for debugging, so add it to the ethdev
>>>>>>>> telemetry command lists.
>>>>>>>>
>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>>> This patch looks good, so
>>>>>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
>>>>>>>
>>>>>>> A minior question
>>>>>>> Do we need to support stats reset ?
>>>>>>
>>>>>> Stats is contained by xstats, and future direction I think is xstats.
>>>>>> So I think we don't need support stats reset.
>>>>>>
>>>>>
>>>>> I have similar question with Dongdong, readonly values are safe for
>>>>> telemetry, but modifying data can be more tricky since we don't have
>>>>> locking in ethdev APIs, this can cause concurrency issues.
>>>>
>>>> Yes, it indeed has concurrency issues.
>>>>
>>>>>
>>>>> Overall do we want telemetry go that way and become something that
>>>>> alters ethdev data/config?
>>>>
>>>> There are at least two part of data: config and status.
>>>> For stats (which belong status data) could help for debugging, I think it's acceptable.
>>>>
>>>> As for concurrency issues. People should know what to do and when to do, just like
>>>> the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.
>>>>
>>> While this is probably ok for now, I think in next release we should look
>>> to add some sort of support for locking for destructive ops in a future
>>> release. For example, we could:
>>>
>>> 1. Add support for marking a callback as "destructive" and only allow it to
>>> be called if only one connection is present or
>>>
>>> 2. Make it possible for callbacks to query the number of connections so
>>> that the callback itself is non-destructive in more than one connection is
>>> open.
>>>
>>> [Both of these will require locking support so that new connections aren't
>>> openned when the callback is in-flight!]
>>
>> Except telemetry, the application may have other console could execute DPDK API.
>> So I think trying to keep it simple, it's up to the user to invoke.
> 
> No, the user should not be responsible for concurrency issues.
> We can ask the app developper to take care,
> but not to the user who has no control on what happens in the app.
> 
> On a more general note, I feel the expansion of telemetry is not controlled enough.
> I would like to stop on adding more telemetry until we have a clear guideline
> about what is telemetry for and how to use it.

Hi Thomas,

Should this be discussed on TB?

Thanks.

> 
> 
> .
>
  
Thomas Monjalon July 3, 2023, 7:20 a.m. UTC | #10
03/07/2023 05:58, fengchengwen:
> 
> On 2023/2/20 21:05, Thomas Monjalon wrote:
> > 17/02/2023 10:44, fengchengwen:
> >> On 2023/2/16 20:54, Bruce Richardson wrote:
> >>> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
> >>>> On 2023/2/16 20:06, Ferruh Yigit wrote:
> >>>>> On 2/16/2023 11:53 AM, fengchengwen wrote:
> >>>>>> On 2023/2/15 11:19, Dongdong Liu wrote:
> >>>>>>> Hi Chengwen
> >>>>>>>
> >>>>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
> >>>>>>>> The xstats reset is useful for debugging, so add it to the ethdev
> >>>>>>>> telemetry command lists.
> >>>>>>>>
> >>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >>>>>>> This patch looks good, so
> >>>>>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
> >>>>>>>
> >>>>>>> A minior question
> >>>>>>> Do we need to support stats reset ?
> >>>>>>
> >>>>>> Stats is contained by xstats, and future direction I think is xstats.
> >>>>>> So I think we don't need support stats reset.
> >>>>>>
> >>>>>
> >>>>> I have similar question with Dongdong, readonly values are safe for
> >>>>> telemetry, but modifying data can be more tricky since we don't have
> >>>>> locking in ethdev APIs, this can cause concurrency issues.
> >>>>
> >>>> Yes, it indeed has concurrency issues.
> >>>>
> >>>>>
> >>>>> Overall do we want telemetry go that way and become something that
> >>>>> alters ethdev data/config?
> >>>>
> >>>> There are at least two part of data: config and status.
> >>>> For stats (which belong status data) could help for debugging, I think it's acceptable.
> >>>>
> >>>> As for concurrency issues. People should know what to do and when to do, just like
> >>>> the don't invoke config API (e.g. dev_configure/dev_start/...) concurrency.
> >>>>
> >>> While this is probably ok for now, I think in next release we should look
> >>> to add some sort of support for locking for destructive ops in a future
> >>> release. For example, we could:
> >>>
> >>> 1. Add support for marking a callback as "destructive" and only allow it to
> >>> be called if only one connection is present or
> >>>
> >>> 2. Make it possible for callbacks to query the number of connections so
> >>> that the callback itself is non-destructive in more than one connection is
> >>> open.
> >>>
> >>> [Both of these will require locking support so that new connections aren't
> >>> openned when the callback is in-flight!]
> >>
> >> Except telemetry, the application may have other console could execute DPDK API.
> >> So I think trying to keep it simple, it's up to the user to invoke.
> > 
> > No, the user should not be responsible for concurrency issues.
> > We can ask the app developper to take care,
> > but not to the user who has no control on what happens in the app.
> > 
> > On a more general note, I feel the expansion of telemetry is not controlled enough.
> > I would like to stop on adding more telemetry until we have a clear guideline
> > about what is telemetry for and how to use it.
> 
> Hi Thomas,
> 
> Should this be discussed on TB?

What would be your question exactly?
  
Morten Brørup July 3, 2023, 1:44 p.m. UTC | #11
> From: Thomas Monjalon [mailto:thomas@monjalon.net]
> Sent: Monday, 3 July 2023 09.20
> 
> 03/07/2023 05:58, fengchengwen:
> >
> > On 2023/2/20 21:05, Thomas Monjalon wrote:
> > > 17/02/2023 10:44, fengchengwen:
> > >> On 2023/2/16 20:54, Bruce Richardson wrote:
> > >>> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
> > >>>> On 2023/2/16 20:06, Ferruh Yigit wrote:
> > >>>>> On 2/16/2023 11:53 AM, fengchengwen wrote:
> > >>>>>> On 2023/2/15 11:19, Dongdong Liu wrote:
> > >>>>>>> Hi Chengwen
> > >>>>>>>
> > >>>>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
> > >>>>>>>> The xstats reset is useful for debugging, so add it to the
> ethdev
> > >>>>>>>> telemetry command lists.
> > >>>>>>>>
> > >>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> > >>>>>>> This patch looks good, so
> > >>>>>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
> > >>>>>>>
> > >>>>>>> A minior question
> > >>>>>>> Do we need to support stats reset ?
> > >>>>>>
> > >>>>>> Stats is contained by xstats, and future direction I think is
> xstats.
> > >>>>>> So I think we don't need support stats reset.
> > >>>>>>
> > >>>>>
> > >>>>> I have similar question with Dongdong, readonly values are safe
> for
> > >>>>> telemetry, but modifying data can be more tricky since we don't
> have
> > >>>>> locking in ethdev APIs, this can cause concurrency issues.
> > >>>>
> > >>>> Yes, it indeed has concurrency issues.
> > >>>>
> > >>>>>
> > >>>>> Overall do we want telemetry go that way and become something
> that
> > >>>>> alters ethdev data/config?
> > >>>>
> > >>>> There are at least two part of data: config and status.
> > >>>> For stats (which belong status data) could help for debugging, I
> think it's acceptable.
> > >>>>
> > >>>> As for concurrency issues. People should know what to do and when
> to do, just like
> > >>>> the don't invoke config API (e.g. dev_configure/dev_start/...)
> concurrency.
> > >>>>
> > >>> While this is probably ok for now, I think in next release we
> should look
> > >>> to add some sort of support for locking for destructive ops in a
> future
> > >>> release. For example, we could:
> > >>>
> > >>> 1. Add support for marking a callback as "destructive" and only
> allow it to
> > >>> be called if only one connection is present or
> > >>>
> > >>> 2. Make it possible for callbacks to query the number of
> connections so
> > >>> that the callback itself is non-destructive in more than one
> connection is
> > >>> open.
> > >>>
> > >>> [Both of these will require locking support so that new
> connections aren't
> > >>> openned when the callback is in-flight!]
> > >>
> > >> Except telemetry, the application may have other console could
> execute DPDK API.
> > >> So I think trying to keep it simple, it's up to the user to invoke.
> > >
> > > No, the user should not be responsible for concurrency issues.
> > > We can ask the app developper to take care,
> > > but not to the user who has no control on what happens in the app.
> > >
> > > On a more general note, I feel the expansion of telemetry is not
> controlled enough.
> > > I would like to stop on adding more telemetry until we have a clear
> guideline
> > > about what is telemetry for and how to use it.
> >
> > Hi Thomas,
> >
> > Should this be discussed on TB?
> 
> What would be your question exactly?

A general comment about telemetry:

If an application exposes telemetry through an end user facing API, e.g. http(s) REST, it would be nice if non-read-only telemetry paths are easy to identify by following some DPDK standard convention, so the application does not need to manually maintain an allow-list of read-only paths.

Bruce's documentation about trace/log/telemetry/dump might also need to be updated regarding non-read-only telemetry actions.
  
Chengwen Feng July 4, 2023, 6:41 a.m. UTC | #12
Hi Thomas and Morten,

On 2023/7/3 21:44, Morten Brørup wrote:
>> From: Thomas Monjalon [mailto:thomas@monjalon.net]
>> Sent: Monday, 3 July 2023 09.20
>>
>> 03/07/2023 05:58, fengchengwen:
>>>
>>> On 2023/2/20 21:05, Thomas Monjalon wrote:
>>>> 17/02/2023 10:44, fengchengwen:
>>>>> On 2023/2/16 20:54, Bruce Richardson wrote:
>>>>>> On Thu, Feb 16, 2023 at 08:42:34PM +0800, fengchengwen wrote:
>>>>>>> On 2023/2/16 20:06, Ferruh Yigit wrote:
>>>>>>>> On 2/16/2023 11:53 AM, fengchengwen wrote:
>>>>>>>>> On 2023/2/15 11:19, Dongdong Liu wrote:
>>>>>>>>>> Hi Chengwen
>>>>>>>>>>
>>>>>>>>>> On 2023/2/9 10:32, Chengwen Feng wrote:
>>>>>>>>>>> The xstats reset is useful for debugging, so add it to the
>> ethdev
>>>>>>>>>>> telemetry command lists.
>>>>>>>>>>>
>>>>>>>>>>> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
>>>>>>>>>> This patch looks good, so
>>>>>>>>>> Reviewed-by: Dongdong Liu <liudongdong3@huawei.com>
>>>>>>>>>>
>>>>>>>>>> A minior question
>>>>>>>>>> Do we need to support stats reset ?
>>>>>>>>>
>>>>>>>>> Stats is contained by xstats, and future direction I think is
>> xstats.
>>>>>>>>> So I think we don't need support stats reset.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I have similar question with Dongdong, readonly values are safe
>> for
>>>>>>>> telemetry, but modifying data can be more tricky since we don't
>> have
>>>>>>>> locking in ethdev APIs, this can cause concurrency issues.
>>>>>>>
>>>>>>> Yes, it indeed has concurrency issues.
>>>>>>>
>>>>>>>>
>>>>>>>> Overall do we want telemetry go that way and become something
>> that
>>>>>>>> alters ethdev data/config?
>>>>>>>
>>>>>>> There are at least two part of data: config and status.
>>>>>>> For stats (which belong status data) could help for debugging, I
>> think it's acceptable.
>>>>>>>
>>>>>>> As for concurrency issues. People should know what to do and when
>> to do, just like
>>>>>>> the don't invoke config API (e.g. dev_configure/dev_start/...)
>> concurrency.
>>>>>>>
>>>>>> While this is probably ok for now, I think in next release we
>> should look
>>>>>> to add some sort of support for locking for destructive ops in a
>> future
>>>>>> release. For example, we could:
>>>>>>
>>>>>> 1. Add support for marking a callback as "destructive" and only
>> allow it to
>>>>>> be called if only one connection is present or
>>>>>>
>>>>>> 2. Make it possible for callbacks to query the number of
>> connections so
>>>>>> that the callback itself is non-destructive in more than one
>> connection is
>>>>>> open.
>>>>>>
>>>>>> [Both of these will require locking support so that new
>> connections aren't
>>>>>> openned when the callback is in-flight!]
>>>>>
>>>>> Except telemetry, the application may have other console could
>> execute DPDK API.
>>>>> So I think trying to keep it simple, it's up to the user to invoke.
>>>>
>>>> No, the user should not be responsible for concurrency issues.
>>>> We can ask the app developper to take care,
>>>> but not to the user who has no control on what happens in the app.
>>>>
>>>> On a more general note, I feel the expansion of telemetry is not
>> controlled enough.
>>>> I would like to stop on adding more telemetry until we have a clear
>> guideline
>>>> about what is telemetry for and how to use it.
>>>
>>> Hi Thomas,
>>>
>>> Should this be discussed on TB?
>>
>> What would be your question exactly?
> 
> A general comment about telemetry:
> 
> If an application exposes telemetry through an end user facing API, e.g. http(s) REST, it would be nice if non-read-only telemetry paths are easy to identify by following some DPDK standard convention, so the application does not need to manually maintain an allow-list of read-only paths.

+1 for this point.

> 
> Bruce's documentation about trace/log/telemetry/dump might also need to be updated regarding non-read-only telemetry actions.

I just check Bruce's patch [1], and notice that the telemetry callback must be 'read-only': (Telemetry callbacks should not modify any program state, but be "read-only").

From internal product usage, we think xstats-reset is valid to identify problem, but this callback is not read-only.

We think telemetry callback should not limit to 'read-only'. Perhaps we could develop some strategy to better manage non-read-only callbacks (just like Morten's advise).

[1]: https://patchwork.dpdk.org/project/dpdk/patch/20230620170728.74117-3-bruce.richardson@intel.com/

> 
> 
> .
>
  

Patch

diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
index d25db35f7f..e85c98f307 100644
--- a/lib/ethdev/rte_ethdev.c
+++ b/lib/ethdev/rte_ethdev.c
@@ -5915,6 +5915,35 @@  eth_dev_handle_port_xstats(const char *cmd __rte_unused,
 	return 0;
 }
 
+static int
+eth_dev_handle_port_xstats_reset(const char *cmd __rte_unused,
+		const char *params,
+		struct rte_tel_data *d)
+{
+	int port_id, ret;
+	char *end_param;
+
+	if (params == NULL || strlen(params) == 0 || !isdigit(*params))
+		return -1;
+
+	port_id = strtoul(params, &end_param, 0);
+	if (*end_param != '\0')
+		RTE_ETHDEV_LOG(NOTICE,
+			"Extra parameters passed to ethdev telemetry command, ignoring\n");
+	if (!rte_eth_dev_is_valid_port(port_id))
+		return -1;
+
+	ret = rte_eth_xstats_reset(port_id);
+	if (ret == 0) {
+		rte_tel_data_string(d, "success");
+		RTE_ETHDEV_LOG(NOTICE, "Port %d reset xstats success\n", port_id);
+	} else {
+		RTE_ETHDEV_LOG(ERR, "Port %d reset xstats failed! ret: %d\n", port_id, ret);
+	}
+
+	return ret;
+}
+
 #ifndef RTE_EXEC_ENV_WINDOWS
 static int
 eth_dev_handle_port_dump_priv(const char *cmd __rte_unused,
@@ -6329,6 +6358,8 @@  RTE_INIT(ethdev_init_telemetry)
 			"Returns the common stats for a port. Parameters: int port_id");
 	rte_telemetry_register_cmd("/ethdev/xstats", eth_dev_handle_port_xstats,
 			"Returns the extended stats for a port. Parameters: int port_id");
+	rte_telemetry_register_cmd("/ethdev/xstats_reset", eth_dev_handle_port_xstats_reset,
+			"Reset the extended stats for a port. Parameters: int port_id");
 #ifndef RTE_EXEC_ENV_WINDOWS
 	rte_telemetry_register_cmd("/ethdev/dump_priv", eth_dev_handle_port_dump_priv,
 			"Returns dump private information for a port. Parameters: int port_id");