ethdev: add dump regs for telemetry

Message ID 20231214015650.3738578-1-haijie1@huawei.com (mailing list archive)
State Changes Requested, archived
Delegated to: Ferruh Yigit
Series: ethdev: add dump regs for telemetry

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/loongarch-compilation success Compilation OK
ci/loongarch-unit-testing success Unit Testing PASS
ci/Intel-compilation fail Compilation issues
ci/github-robot: build fail github build: failed
ci/intel-Testing success Testing PASS
ci/intel-Functional success Functional PASS
ci/iol-mellanox-Performance success Performance Testing PASS
ci/iol-intel-Performance success Performance Testing PASS
ci/iol-intel-Functional success Functional Testing PASS
ci/iol-abi-testing success Testing PASS
ci/iol-unit-amd64-testing fail Testing issues
ci/iol-compile-amd64-testing fail Testing issues
ci/iol-unit-arm64-testing success Testing PASS
ci/iol-broadcom-Performance success Performance Testing PASS
ci/iol-broadcom-Functional success Functional Testing PASS
ci/iol-compile-arm64-testing fail Testing issues
ci/iol-sample-apps-testing success Testing PASS

Commit Message

Jie Hai Dec. 14, 2023, 1:56 a.m. UTC
  The ethdev library now registers a telemetry command for
dump regs.

An example usage is shown below:
--> /ethdev/regs,test
{
  "/ethdev/regs": {
    "regs_offset": 0,
    "regs_length": 3192,
    "regs_width": 4,
    "device_version": "0x1080f00",
    "regs_file": "port_0_regs_test"
  }
}

Signed-off-by: Jie Hai <haijie1@huawei.com>
---
 lib/ethdev/rte_ethdev_telemetry.c | 93 +++++++++++++++++++++++++++++++
 1 file changed, 93 insertions(+)
  

Comments

Ferruh Yigit Dec. 14, 2023, 12:49 p.m. UTC | #1
On 12/14/2023 1:56 AM, Jie Hai wrote:
> The ethdev library now registers a telemetry command for
> dump regs.
> 
> An example usage is shown below:
> --> /ethdev/regs,test
> {
>   "/ethdev/regs": {
>     "regs_offset": 0,
>     "regs_length": 3192,
>     "regs_width": 4,
>     "device_version": "0x1080f00",
>     "regs_file": "port_0_regs_test"
>   }
> }
> 
> Signed-off-by: Jie Hai <haijie1@huawei.com>
> ---
>  lib/ethdev/rte_ethdev_telemetry.c | 93 +++++++++++++++++++++++++++++++
>  1 file changed, 93 insertions(+)
> 
> diff --git a/lib/ethdev/rte_ethdev_telemetry.c b/lib/ethdev/rte_ethdev_telemetry.c
> index b01028ce9b60..33ec4739aa9b 100644
> --- a/lib/ethdev/rte_ethdev_telemetry.c
> +++ b/lib/ethdev/rte_ethdev_telemetry.c
> @@ -5,6 +5,7 @@
>  #include <ctype.h>
>  #include <stdlib.h>
>  
> +#include <rte_malloc.h>
>  #include <rte_kvargs.h>
>  #include <rte_telemetry.h>
>  
> @@ -1395,6 +1396,96 @@ eth_dev_handle_port_tm_node_caps(const char *cmd __rte_unused,
>  	return ret;
>  }
>  
> +static int
> +eth_dev_get_port_regs(uint16_t port_id, struct rte_dev_reg_info *reg_info,
> +		      const char *file_name)
> +{
> +	uint64_t buf_size;
> +	size_t nr_written;
> +	void *data;
> +	FILE *fp;
> +	int ret;
> +
> +	ret = rte_eth_dev_get_reg_info(port_id, reg_info);
> +	if (ret != 0) {
> +		RTE_ETHDEV_LOG(ERR,
> +			"Error getting device reg info: %d\n", ret);
> +		return ret;
> +	}
> +
> +	buf_size = reg_info->length * reg_info->width;
> +	data = rte_zmalloc(NULL, buf_size, 0);
> +	if (!data) {
> +		RTE_ETHDEV_LOG(ERR,
> +			"Error allocating %zu bytes buffer\n", buf_size);
> +		return -ENOMEM;
> +	}
> +
> +	reg_info->data = data;
> +	ret = rte_eth_dev_get_reg_info(port_id, reg_info);
> +	if (ret != 0) {
> +		RTE_ETHDEV_LOG(ERR,
> +			"Error getting regs from device: %d\n", ret);
> +		goto out;
> +	}
> +
> +	fp = fopen(file_name, "wb");
> +	if (fp == NULL) {
> +		printf("Error during opening '%s' for writing: %s\n",
> +			file_name, strerror(errno));
> +		ret = -EINVAL;
> +	} else {
> +		nr_written = fwrite(reg_info->data, 1, buf_size, fp);
>

Above code writes register data to a file.

I am not sure about this use of a telemetry command, where the command
causes data to be written to a file.

My understanding is that telemetry is about what the telemetry client
receives.
What do you think about returning only the 'reg_info' fields, without the
register data and without writing to a file?
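
For readers following the quoted function, here is a rough standalone sketch of the two-call pattern it relies on: query once with a NULL data pointer to learn the register count and width, allocate length * width bytes, then query again to fill the buffer. It also shows a reply that carries only the metadata fields. Everything here is an assumption made for illustration: struct reg_info, mock_get_reg_info(), fetch_regs() and format_reply() are hypothetical stand-ins and not DPDK APIs, and the mock device has only 8 registers.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct rte_dev_reg_info (fields as used in the patch). */
struct reg_info {
	void *data;       /* NULL on the first call: query metadata only */
	uint32_t offset;
	uint32_t length;  /* number of registers */
	uint32_t width;   /* bytes per register */
	uint32_t version;
};

/* Mock driver callback: 8 registers of 4 bytes; fills data only if a buffer is given. */
static int
mock_get_reg_info(struct reg_info *info)
{
	info->offset = 0;
	info->length = 8;
	info->width = 4;
	info->version = 0x1080f00;
	if (info->data != NULL) {
		uint32_t *regs = info->data;

		for (uint32_t i = 0; i < info->length; i++)
			regs[i] = 0x1000 + i;
	}
	return 0;
}

/* Two-pass query: size first, then fetch. The caller frees info->data. */
static int
fetch_regs(struct reg_info *info, uint64_t *buf_size)
{
	int ret;

	assert(info != NULL && buf_size != NULL);
	memset(info, 0, sizeof(*info));
	ret = mock_get_reg_info(info);  /* pass 1: data == NULL */
	if (ret != 0)
		return ret;

	/* Widen before multiplying so the product is computed in 64 bits. */
	*buf_size = (uint64_t)info->length * info->width;
	info->data = calloc(1, (size_t)*buf_size);
	if (info->data == NULL)
		return -1;

	ret = mock_get_reg_info(info);  /* pass 2: fill the buffer */
	if (ret != 0) {
		free(info->data);
		info->data = NULL;
	}
	return ret;
}

/* Metadata-only reply: what a telemetry client would receive without a file. */
static int
format_reply(const struct reg_info *info, char *out, size_t out_len)
{
	return snprintf(out, out_len,
		"{\"regs_offset\": %" PRIu32 ", \"regs_length\": %" PRIu32
		", \"regs_width\": %" PRIu32 ", \"device_version\": \"0x%" PRIx32 "\"}",
		info->offset, info->length, info->width, info->version);
}
```

In this sketch the register values never leave the process; only the four metadata fields reach the client, which is the variant the question above asks about.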
  
Jie Hai Jan. 9, 2024, 2:19 a.m. UTC | #2
On 2023/12/14 20:49, Ferruh Yigit wrote:
> 
> Above code writes register data to a file.
> 
> I am not sure about this kind of usage of telemetry command, that it
> cause data to be written to a file.
> 
> My understanding is, telemetry usage is based on what telemetry client
> receives.
> What do you think just keep the 'reg_info' fields excluding data to the
> file?
> 
> .

Hi, Ferruh,

I tried to put all the register information into the telemetry data,
but gave up because some drivers (e.g. ixgbe) have too many registers
to carry. Therefore, I chose the approach of writing the data to a file.
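
To put rough numbers on "too many registers to carry", the arithmetic can be sketched as below. The helper names are hypothetical, and the per-reply entry limit is a parameter on purpose: the real limit depends on the telemetry library version, so any value passed in is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Total bytes of raw register data: number of registers times register width. */
static uint64_t
reg_dump_bytes(uint32_t length, uint32_t width)
{
	return (uint64_t)length * width;
}

/* Number of replies needed if one reply can carry at most max_entries values. */
static uint32_t
replies_needed(uint32_t length, uint32_t max_entries)
{
	assert(max_entries > 0);
	/* Ceiling division written to avoid overflow near UINT32_MAX. */
	return length / max_entries + (length % max_entries != 0);
}
```

With the values from the example output (regs_length 3192, regs_width 4), reg_dump_bytes(3192, 4) is 12768 bytes. If a single reply were limited to, say, 512 entries (an illustrative figure only), replies_needed(3192, 512) would be 7.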

When we query registers, the register content is what matters.
Information such as the width and length is only auxiliary. If the
register data cannot be obtained, the auxiliary information is of
little use on its own. So I don't think the register data should be removed.

In my opinion, writing to a file is the more appropriate way to do it,
but I wonder if there is a better way.

Best regards,
Thanks
  
Jie Hai Jan. 9, 2024, 2:41 a.m. UTC | #3
On 2024/1/9 10:19, Jie Hai wrote:
> I tried to write all register information to telemetry data,
> but gave up because some drivers had too many registers (eg.ixgbe)
Sorry, a correction: it is i40e that has the most registers.
> to carry. Therefore, the writing data to file approach is selected.
  
Ferruh Yigit Jan. 9, 2024, 6:06 p.m. UTC | #4
On 1/9/2024 2:19 AM, Jie Hai wrote:
> I tried to write all register information to telemetry data,
> but gave up because some drivers had too many registers (eg.ixgbe)
> to carry. Therefore, the writing data to file approach is selected.
> 
> When we query a register, the register content is the key.
> The information such as the width and length is only auxiliary
> information. If the register data cannot be obtained, the auxiliary
> information is optional. So I don't think register data should be removed.
> 
> In my opinion, writing a file is a more appropriate way to do it.
> I wonder if there's a better way.
> 
> 

Is there a use case for getting register information from the telemetry interface?
  
fengchengwen Jan. 10, 2024, 1:38 a.m. UTC | #5
Hi Ferruh,

On 2024/1/10 2:06, Ferruh Yigit wrote:
> Is there a usecase to get register information from telemetry interface?

Among the available tools:
1. ethtool/proc-info: must use the multi-process mechanism to connect to the main process.
2. telemetry: easier and lighter, since it does not need to re-probe the ethdev in a secondary process,
              which also costs more resources such as hugepages and cores.

Our users prefer the second option, telemetry, so I think we should move
more status query points to telemetry.

As for this question, I think it's okay to get register info from telemetry.



Another question: we have some internal registers that are:
1. Not suitable to expose through xstats, because they may include configuration.
2. Not suitable to expose through dumps, because a dump is hard to understand (it only contains values).

So we plan to add some telemetry points in the driver itself, so that we can display them like xstats:
"xxxx" : 0x1234
"yyyy" : 0x100

Will the community accept this kind of telemetry point, which is limited to one driver?

Thanks.

> 
> .
>
  
Ferruh Yigit Jan. 10, 2024, 12:15 p.m. UTC | #6
On 1/10/2024 1:38 AM, fengchengwen wrote:
> Hi Ferruh,
> 
> On 2024/1/10 2:06, Ferruh Yigit wrote:
>> Is there a usecase to get register information from telemetry interface?
> 
> Among the available tools:
> 1, ethtool/proc-info: should use multi-process mechanism to connect to the main process
> 2, telemetry: easier, lighter load, and it don't need re-probe the ethdev in the secondary process,
>               and also cost more resource, like hugepage, cores.
> 
> From our users, they prefer use the second 'telemetry', so I think we should move
> more status-query-points to telemetry.
> 
> As for this question, I think it's okay to get register info from telemetry.
> 
> 
>
> Another question, we have some internal registers, which:
> 1. Is not suitable expose by xstats, because they may includes configuration
> 2. Is not suitable expose by dumps, because this dumps is hard to understand (because it only has value).
> 
> So we plan to add some telemetry points in the driver itself, so we could display them like xstats:
> "xxxx" : 0x1234
> "yyyy" : 0x100
> 
> Will the community accept this kind of telemetry points which limit one driver ?
> 

Hi Chengwen,

I see there is a use case/requirement.

With this patch, even when using a file, only register values are dumped;
isn't it hard to find the value of a specific register?

The ("xxxx" : 0x1234) approach looks better, but instead of adding this
telemetry support for a specific driver, what about doing it in two steps?

First, add a new dev_ops (or update the existing one) to get registers in
"name: value" format (in a way that allows an empty name), or perhaps even
"name: offset, value" format.
Second, add telemetry support around it.
(The name being optional lets us wrap the existing 'get_reg' dev_ops with the new one.)

When adding the dev_ops, it may take an additional 'filter' parameter to get
only a subset of regs, like "mac*" to get regs whose names start with "mac".
This may help with the case you mentioned, where there are too many registers.

Anyway, we can discuss the design further, but what do you think
about first having a dev_ops for this?
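
One possible shape for the "name: offset, value" entries and the 'filter' parameter is sketched below. This is only an illustration under stated assumptions: struct named_reg, REG_NAME_SIZE and filter_regs() are hypothetical names, nothing here is an existing ethdev API, and the matching uses POSIX fnmatch(), so it assumes a POSIX platform.

```c
#include <assert.h>
#include <fnmatch.h>
#include <stddef.h>
#include <stdint.h>

#define REG_NAME_SIZE 64 /* hypothetical; not a DPDK constant */

struct named_reg {
	char name[REG_NAME_SIZE]; /* may be empty if the driver has no name */
	uint32_t offset;
	uint32_t value;
};

/*
 * Copy the registers whose name matches the shell-style pattern into out.
 * A NULL filter selects everything, including unnamed registers.
 * Returns the number of entries copied (at most out_cap).
 */
static size_t
filter_regs(const struct named_reg *regs, size_t n, const char *filter,
	    struct named_reg *out, size_t out_cap)
{
	size_t count = 0;

	assert(regs != NULL && out != NULL);
	for (size_t i = 0; i < n && count < out_cap; i++) {
		if (filter != NULL && fnmatch(filter, regs[i].name, 0) != 0)
			continue;
		out[count++] = regs[i];
	}
	return count;
}
```

With a table holding "mac_addr_low", "mac_addr_high", "rx_ctrl" and one unnamed register, the filter "mac*" selects the first two entries, while a NULL filter returns all four. Unnamed registers never match a non-NULL pattern in this sketch, which is one design choice among several.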
  
Thomas Monjalon Jan. 10, 2024, 2:09 p.m. UTC | #7
10/01/2024 13:15, Ferruh Yigit:
> Hi Chengwen,
> 
> I see there is a usecase/requirement.
> 
> With this patch, even using file, only register values are dumped and
> isn't it hard to find value of specific register?
> 
> ("xxxx" : 0x1234) approach looks better, but instead of making this
> telemetry support for specific driver, what about making it in two steps.
> 
> First add new dev_ops, (or update existing one), to get registers with
> "name: value" format, (in a way to allow empty name), or even perhaps
> "name: offset, value" format.

I'm OK to add an API for dumping registers, and guess what?
We already have it: rte_eth_dev_get_reg_info().
We may extend it to query a subset of registers.

> And in second stage add telemetry support around it.
> (Name being optional lets us wrap exiting 'get_reg' dev_ops with new one)

I am against overloading telemetry for debug purposes.

> When adding dev_ops, it may get an additional 'filter' parameter, to get
> only subset of regs, like "mac*" to get regs name staring with "mac",
> this may help for the cases there are too many registers you mentioned.
> 
> Anyway, we can discuss more about its design, but what do you think
> about first having a dev_ops for this?
  
Ferruh Yigit Jan. 10, 2024, 3:48 p.m. UTC | #8
On 1/10/2024 2:09 PM, Thomas Monjalon wrote:
> 10/01/2024 13:15, Ferruh Yigit:
>> Hi Chengwen,
>>
>> I see there is a usecase/requirement.
>>
>> With this patch, even using file, only register values are dumped and
>> isn't it hard to find value of specific register?
>>
>> ("xxxx" : 0x1234) approach looks better, but instead of making this
>> telemetry support for specific driver, what about making it in two steps.
>>
>> First add new dev_ops, (or update existing one), to get registers with
>> "name: value" format, (in a way to allow empty name), or even perhaps
>> "name: offset, value" format.
> 
> I'm OK to add an API for dumping registers, and guess what?
> We already have it: rte_eth_dev_get_reg_info().
> We may extend it to query a subset of registers.
> 

This patch already uses 'rte_eth_dev_get_reg_info()'; the issue is how
it is used: it takes a filename from telemetry and dumps the registers to
that file.

>> And in second stage add telemetry support around it.
>> (Name being optional lets us wrap exiting 'get_reg' dev_ops with new one)
> 
> I am against overloading telemetry for debug purpose.
> 

Reading some registers can be debugging or monitoring; I believe it is
in a gray area.

>> When adding dev_ops, it may get an additional 'filter' parameter, to get
>> only subset of regs, like "mac*" to get regs name staring with "mac",
>> this may help for the cases there are too many registers you mentioned.
>>
>> Anyway, we can discuss more about its design, but what do you think
>> about first having a dev_ops for this?
> 
> 
>
  
fengchengwen Jan. 11, 2024, 1:55 a.m. UTC | #9
Hi Ferruh,

On 2024/1/10 20:15, Ferruh Yigit wrote:
> Hi Chengwen,
> 
> I see there is a usecase/requirement.
> 
> With this patch, even using file, only register values are dumped and
> isn't it hard to find value of specific register?
> 
> ("xxxx" : 0x1234) approach looks better, but instead of making this
> telemetry support for specific driver, what about making it in two steps.
> 
> First add new dev_ops, (or update existing one), to get registers with
> "name: value" format, (in a way to allow empty name), or even perhaps
> "name: offset, value" format.
> And in second stage add telemetry support around it.
> (Name being optional lets us wrap exiting 'get_reg' dev_ops with new one)
> 
> When adding dev_ops, it may get an additional 'filter' parameter, to get
> only subset of regs, like "mac*" to get regs name staring with "mac",
> this may help for the cases there are too many registers you mentioned.
> 
> Anyway, we can discuss more about its design, but what do you think
> about first having a dev_ops for this?

I prefer to extend struct rte_dev_reg_info, like this:

struct rte_eth_reg_name {
	char name[RTE_ETH_REG_NAME_SIZE];
};

struct rte_dev_reg_info {
	void *data; /**< Buffer for return registers */
	uint32_t offset; /**< Start register table location for access */
	uint32_t length; /**< Number of registers to fetch */
	uint32_t width; /**< Size of device register */
	uint32_t version; /**< Device version */
	/* Note: the two fields below are newly added. */
	char *filter; /**< Filter for the target subset of registers. This field can affect register selection for data/length/names. */
	struct rte_eth_reg_name *names; /**< Buffer for register names. */
};

For drivers that do not recognize the new filter and names fields:
  1. .get_reg returns the values of all registers.
  2. The driver does not touch the names field.
  3. rte_eth_dev_get_reg_info() can detect that the names were not filled in, and then fill them with default names, e.g. offset-1/offset-2/...

For drivers that recognize the new filter and names fields:
  1. rte_eth_dev_get_reg_info() returns the values of the filtered registers along with their names.

This way, callers of rte_eth_dev_get_reg_info() can additionally prepare a names buffer, and a call to the same API returns both data and names.
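
The fallback for drivers that ignore the names field could look roughly like the sketch below. It is a standalone illustration, not ethdev code: struct reg_name, the get_reg_fn callback type, legacy_get_reg() and get_reg_info_named() are all hypothetical, and REG_NAME_SIZE is an assumed value standing in for RTE_ETH_REG_NAME_SIZE.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define REG_NAME_SIZE 64 /* assumed value for RTE_ETH_REG_NAME_SIZE */

struct reg_name {
	char name[REG_NAME_SIZE];
};

/* Hypothetical driver callback: fills values, and names if it supports them. */
typedef int (*get_reg_fn)(uint32_t *values, struct reg_name *names, uint32_t n);

/* Legacy-style driver: returns values only and leaves names untouched. */
static int
legacy_get_reg(uint32_t *values, struct reg_name *names, uint32_t n)
{
	(void)names;
	for (uint32_t i = 0; i < n; i++)
		values[i] = 0x100 * (i + 1);
	return 0;
}

/* Wrapper: clear names, call the driver, then name whatever is still unnamed. */
static int
get_reg_info_named(get_reg_fn drv, uint32_t *values, struct reg_name *names,
		   uint32_t n)
{
	int ret;

	assert(drv != NULL && values != NULL && names != NULL);
	memset(names, 0, sizeof(*names) * n);
	ret = drv(values, names, n);
	if (ret != 0)
		return ret;

	for (uint32_t i = 0; i < n; i++) {
		if (names[i].name[0] == '\0')
			snprintf(names[i].name, sizeof(names[i].name),
				 "offset-%u", (unsigned int)(i + 1));
	}
	return 0;
}
```

Because the wrapper clears the buffer first, an empty first byte is enough to tell that the driver did not provide a name for that register.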


Adding a new .get_reg_name op and a corresponding API such as rte_eth_dev_get_reg_name() would also be feasible.
But I think the name rte_eth_dev_get_reg_info() is already broad: the "info" could include both the value and its name.
So I prefer not to add a new op.


Another question: what are the supported values of the filter?
I prefer reporting them through the dev_info op, as something like a NULL-terminated string array.
Users could query them through the rte_eth_dev_info_get() API.

Thanks.

> 
> .
>
  
Ferruh Yigit Jan. 11, 2024, 11:11 a.m. UTC | #10
On 1/11/2024 1:55 AM, fengchengwen wrote:
> Hi Ferruh,
> 
> On 2024/1/10 20:15, Ferruh Yigit wrote:
>> Hi Chengwen,
>>
>> I see there is a usecase/requirement.
>>
>> With this patch, even using file, only register values are dumped and
>> isn't it hard to find value of specific register?
>>
>> ("xxxx" : 0x1234) approach looks better, but instead of making this
>> telemetry support for specific driver, what about making it in two steps.
>>
>> First add new dev_ops, (or update existing one), to get registers with
>> "name: value" format, (in a way to allow empty name), or even perhaps
>> "name: offset, value" format.
>> And in second stage add telemetry support around it.
>> (Name being optional lets us wrap existing 'get_reg' dev_ops with new one)
>>
>> When adding dev_ops, it may get an additional 'filter' parameter, to get
>> only subset of regs, like "mac*" to get regs name starting with "mac",
>> this may help for the cases there are too many registers you mentioned.
>>
>> Anyway, we can discuss more about its design, but what do you think
>> about first having a dev_ops for this?
> 
> I prefer to extend struct rte_dev_reg_info, like this:
> 
> struct rte_eth_reg_name {
> 	char name[RTE_ETH_REG_NAME_SIZE];
> };
> 
> struct rte_dev_reg_info {
> 	void *data; /**< Buffer for return registers */
> 	uint32_t offset; /**< Start register table location for access */
> 	uint32_t length; /**< Number of registers to fetch */
> 	uint32_t width; /**< Size of device register */
> 	uint32_t version; /**< Device version */
> /* Note: the two fields below are newly added. */
> 	char *filter; /**< Filter for target subset of registers. This field could affect register selection for data/length/name. */
> 	struct rte_eth_reg_name *names; /**< Registers name saver. */
> };
> 

ack

> For drivers which don't identify the new filter and names fields:
>   1. .get_reg returns all register values.
>

ack


>   2. and the driver will not touch the names field.
>   3. rte_eth_dev_get_reg_info() could detect that the name fields are not filled, and then fill them with default names, e.g. offset-1/offset-2/...
> 

Is there a benefit to providing default names? The API can clear the 'names'
buffer, and the driver may or may not fill it. If names are not filled, the API
behaves like the existing one: it will just provide register values.


> For drivers which identify the new filter and names fields:
>   1. rte_eth_dev_get_reg_info() will return the filtered registers' values and also their names.
> 

ack

> So callers of rte_eth_dev_get_reg_info() could additionally prepare a names buffer, and the same API call will return both data and names.
> 
> 
> Adding a new .get_reg_name op and a corresponding API like rte_eth_dev_get_reg_name() is also feasible.
> But I think the name rte_eth_dev_get_reg_info() is broad enough that the info could include both the value and its name.
> So I prefer not to add a new op.
> 

ack

> 
> Another question: what are the supported values of filters?
> I prefer reporting them via the dev_info ops, something like a string array ending with NULL.
> Users could query them through the rte_eth_dev_info_get API.
> 

I don't think there is a need to populate predefined filter list, it can
be free text with simple '*' and '.' wildcard support and ',' to support
list of text.

User may get the full list first, and later filter the ones they are interested in.
Like: "*mac*,*rss*" can match all register names that have 'mac' or
'rss' in them.
  
fengchengwen Jan. 11, 2024, 12:43 p.m. UTC | #11
Hi Ferruh,

On 2024/1/11 19:11, Ferruh Yigit wrote:
> On 1/11/2024 1:55 AM, fengchengwen wrote:
>> Hi Ferruh,
>>
>> On 2024/1/10 20:15, Ferruh Yigit wrote:
>>> On 1/10/2024 1:38 AM, fengchengwen wrote:
>>>> Hi Ferruh,
>>>>
>>>> On 2024/1/10 2:06, Ferruh Yigit wrote:
>>>>> On 1/9/2024 2:19 AM, Jie Hai wrote:
>>>>>> On 2023/12/14 20:49, Ferruh Yigit wrote:
>>>>>>> On 12/14/2023 1:56 AM, Jie Hai wrote:
>>>>>>>> The ethdev library now registers a telemetry command for
>>>>>>>> dump regs.
>>>>>>>>
>>>>>>>> An example usage is shown below:
>>>>>>>> --> /ethdev/regs,test
>>>>>>>> {
>>>>>>>>    "/ethdev/regs": {
>>>>>>>>      "regs_offset": 0,
>>>>>>>>      "regs_length": 3192,
>>>>>>>>      "regs_width": 4,
>>>>>>>>      "device_version": "0x1080f00",
>>>>>>>>      "regs_file": "port_0_regs_test"
>>>>>>>>    }
>>>>>>>> }
>>>>>>
>>>>>>>
>>>>>>> Above code writes register data to a file.
>>>>>>>
>>>>>>> I am not sure about this kind of usage of telemetry command, that it
>>>>>>> cause data to be written to a file.
>>>>>>>
>>>>>>> My understanding is, telemetry usage is based on what telemetry client
>>>>>>> receives.
>>>>>>> What do you think just keep the 'reg_info' fields excluding data to the
>>>>>>> file?
>>>>>>>
>>>>>>> .Hi, Ferruh
>>>>>>
>>>>>> I tried to write all register information to telemetry data,
>>>>>> but gave up because some drivers had too many registers (eg.ixgbe)
>>>>>> to carry. Therefore, the writing data to file approach is selected.
>>>>>>
>>>>>> When we query a register, the register content is the key.
>>>>>> The information such as the width and length is only auxiliary
>>>>>> information. If the register data cannot be obtained, the auxiliary
>>>>>> information is optional. So I don't think register data should be removed.
>>>>>>
>>>>>> In my opinion, writing a file is a more appropriate way to do it.
>>>>>> I wonder if there's a better way.
>>>>>>
>>>>>>
>>>>>
>>>>> Is there a usecase to get register information from telemetry interface?
>>>>
>>>> Among the available tools:
>>>> 1, ethtool/proc-info: should use multi-process mechanism to connect to the main process
>>>> 2, telemetry: easier, lighter load, and it doesn't need to re-probe the ethdev in a secondary process,
>>>>               which also costs more resources, like hugepages and cores.
>>>>
>>>> Our users prefer the second one, 'telemetry', so I think we should move
>>>> more status-query-points to telemetry.
>>>>
>>>> As for this question, I think it's okay to get register info from telemetry.
>>>>
>>>>
>>>>
>>>> Another question, we have some internal registers, which:
>>>> 1. Are not suitable to expose via xstats, because they may include configuration
>>>> 2. Are not suitable to expose via dumps, because the dump is hard to understand (it only has values).
>>>>
>>>> So we plan to add some telemetry points in the driver itself, so we could display them like xstats:
>>>> "xxxx" : 0x1234
>>>> "yyyy" : 0x100
>>>>
>>>> Will the community accept this kind of telemetry point, which is limited to one driver?
>>>>
>>>
>>> Hi Chengwen,
>>>
>>> I see there is a usecase/requirement.
>>>
>>> With this patch, even using file, only register values are dumped and
>>> isn't it hard to find value of specific register?
>>>
>>> ("xxxx" : 0x1234) approach looks better, but instead of making this
>>> telemetry support for specific driver, what about making it in two steps.
>>>
>>> First add new dev_ops, (or update existing one), to get registers with
>>> "name: value" format, (in a way to allow empty name), or even perhaps
>>> "name: offset, value" format.
>>> And in second stage add telemetry support around it.
>>> (Name being optional lets us wrap existing 'get_reg' dev_ops with new one)
>>>
>>> When adding dev_ops, it may get an additional 'filter' parameter, to get
>>> only subset of regs, like "mac*" to get regs name starting with "mac",
>>> this may help for the cases there are too many registers you mentioned.
>>>
>>> Anyway, we can discuss more about its design, but what do you think
>>> about first having a dev_ops for this?
>>
>> I prefer to extend struct rte_dev_reg_info, like this:
>>
>> struct rte_eth_reg_name {
>> 	char name[RTE_ETH_REG_NAME_SIZE];
>> };
>>
>> struct rte_dev_reg_info {
>> 	void *data; /**< Buffer for return registers */
>> 	uint32_t offset; /**< Start register table location for access */
>> 	uint32_t length; /**< Number of registers to fetch */
>> 	uint32_t width; /**< Size of device register */
>> 	uint32_t version; /**< Device version */
>> /* Note: the two fields below are newly added. */
>> 	char *filter; /**< Filter for target subset of registers. This field could affect register selection for data/length/name. */
>> 	struct rte_eth_reg_name *names; /**< Registers name saver. */
>> };
>>
> 
> ack
> 
>> For drivers which don't identify the new filter and names fields:
>>   1. .get_reg returns all register values.
>>
> 
> ack
> 
> 
>>   2. and the driver will not touch the names field.
>>   3. rte_eth_dev_get_reg_info() could detect that the name fields are not filled, and then fill them with default names, e.g. offset-1/offset-2/...
>>
> 
> Is there a benefit to providing default names? The API can clear the 'names'
> buffer, and the driver may or may not fill it. If names are not filled, the API
> behaves like the existing one: it will just provide register values.

ok

> 
> 
>> For drivers which identify the new filter and names fields:
>>   1. rte_eth_dev_get_reg_info() will return the filtered registers' values and also their names.
>>
> 
> ack
> 
>> So callers of rte_eth_dev_get_reg_info() could additionally prepare a names buffer, and the same API call will return both data and names.
>>
>>
>> Adding a new .get_reg_name op and a corresponding API like rte_eth_dev_get_reg_name() is also feasible.
>> But I think the name rte_eth_dev_get_reg_info() is broad enough that the info could include both the value and its name.
>> So I prefer not to add a new op.
>>
> 
> ack
> 
>>
>> Another question: what are the supported values of filters?
>> I prefer reporting them via the dev_info ops, something like a string array ending with NULL.
>> Users could query them through the rte_eth_dev_info_get API.
>>
> 
> I don't think there is a need to populate predefined filter list, it can
> be free text with simple '*' and '.' wildcard support and ',' to support
> list of text.
> 
> User may get the full list first, and later filter the ones they are interested in.
> Like: "*mac*,*rss*" can match all register names that have 'mac' or
> 'rss' in them.

ok.

Our team will send v1 ASAP.

Thanks.

> 
> .
>
  

Patch

diff --git a/lib/ethdev/rte_ethdev_telemetry.c b/lib/ethdev/rte_ethdev_telemetry.c
index b01028ce9b60..33ec4739aa9b 100644
--- a/lib/ethdev/rte_ethdev_telemetry.c
+++ b/lib/ethdev/rte_ethdev_telemetry.c
@@ -5,6 +5,7 @@ 
 #include <ctype.h>
 #include <stdlib.h>
 
+#include <rte_malloc.h>
 #include <rte_kvargs.h>
 #include <rte_telemetry.h>
 
@@ -1395,6 +1396,96 @@  eth_dev_handle_port_tm_node_caps(const char *cmd __rte_unused,
 	return ret;
 }
 
+static int
+eth_dev_get_port_regs(uint16_t port_id, struct rte_dev_reg_info *reg_info,
+		      const char *file_name)
+{
+	size_t buf_size;
+	size_t nr_written;
+	void *data;
+	FILE *fp;
+	int ret;
+
+	ret = rte_eth_dev_get_reg_info(port_id, reg_info);
+	if (ret != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Error getting device reg info: %d\n", ret);
+		return ret;
+	}
+
+	buf_size = reg_info->length * reg_info->width;
+	data = rte_zmalloc(NULL, buf_size, 0);
+	if (!data) {
+		RTE_ETHDEV_LOG(ERR,
+			"Error allocating %zu bytes buffer\n", buf_size);
+		return -ENOMEM;
+	}
+
+	reg_info->data = data;
+	ret = rte_eth_dev_get_reg_info(port_id, reg_info);
+	if (ret != 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Error getting regs from device: %d\n", ret);
+		goto out;
+	}
+
+	fp = fopen(file_name, "wb");
+	if (fp == NULL) {
+		printf("Error during opening '%s' for writing: %s\n",
+			file_name, strerror(errno));
+		ret = -EINVAL;
+	} else {
+		nr_written = fwrite(reg_info->data, 1, buf_size, fp);
+		if (nr_written != buf_size)
+			printf("Error during writing %s: %s\n",
+				file_name, strerror(errno));
+		fclose(fp);
+	}
+
+out:
+	rte_free(data);
+	reg_info->data = NULL;
+	return ret;
+}
+
+static int
+eth_dev_handle_port_regs(const char *cmd __rte_unused,
+		const char *params,
+		struct rte_tel_data *d)
+{
+	struct rte_dev_reg_info reg_info = {0};
+	char file_name[RTE_TEL_MAX_STRING_LEN];
+	const char *suffix;
+	uint16_t port_id;
+	char *end_param;
+	int ret;
+
+	ret = eth_dev_parse_port_params(params, &port_id, &end_param, true);
+	if (ret != 0)
+		return ret;
+
+	suffix = strtok_r(end_param, ",", &end_param);
+	if (!suffix || strlen(suffix) == 0) {
+		RTE_ETHDEV_LOG(ERR,
+			"Please pass suffix parameters ethdev telemetry command\n");
+		return -EINVAL;
+	}
+	snprintf(file_name, RTE_TEL_MAX_STRING_LEN, "port_%u_regs_%s",
+		 port_id, suffix);
+	ret = eth_dev_get_port_regs(port_id, &reg_info, file_name);
+	if (ret != 0)
+		return ret;
+
+	rte_tel_data_start_dict(d);
+	rte_tel_data_add_dict_uint(d, "regs_offset", reg_info.offset);
+	rte_tel_data_add_dict_uint(d, "regs_length", reg_info.length);
+	rte_tel_data_add_dict_uint(d, "regs_width", reg_info.width);
+	rte_tel_data_add_dict_uint_hex(d, "device_version", reg_info.version, 0);
+	rte_tel_data_add_dict_string(d, "regs_file", file_name);
+
+	return 0;
+}
+
 RTE_INIT(ethdev_init_telemetry)
 {
 	rte_telemetry_register_cmd("/ethdev/list", eth_dev_handle_port_list,
@@ -1436,4 +1527,6 @@  RTE_INIT(ethdev_init_telemetry)
 			"Returns TM Level Capabilities info for a port. Parameters: int port_id, int level_id (see tm_capability for the max)");
 	rte_telemetry_register_cmd("/ethdev/tm_node_capability", eth_dev_handle_port_tm_node_caps,
 			"Returns TM Node Capabilities info for a port. Parameters: int port_id, int node_id (see tm_capability for the max)");
+	rte_telemetry_register_cmd("/ethdev/regs", eth_dev_handle_port_regs,
+			"Returns regs for a port. Parameters: int port_id, char *suffix");
 }