mbox series

[v2,00/13] telemetry JSON escaping and other enhancements

Message ID 20220725163543.875775-1-bruce.richardson@intel.com (mailing list archive)
Headers
Series telemetry JSON escaping and other enhancements |

Message

Bruce Richardson July 25, 2022, 4:35 p.m. UTC
  This patchset contains fixes for the problem of handling characters
returned by telemetry callbacks which require escaping when encoded in
JSON format. It also includes unit tests to validate the correct
encoding in such scenarios and a number of smaller enhancements to
telemetry and telemetry testing.

RFC->V2:
* limited characters allowed in dictionary element names and command
  names to side-step the encoding problems there.
* added support for proper escaping of dictionary string values
* added more testing and test cases
* added other misc telemetry cleanups and refactoring

Bruce Richardson (13):
  test/telemetry_json: print success or failure per subtest
  telemetry: fix escaping of invalid json characters
  test/telemetry_json: add test for string character escaping
  telemetry: add escaping of strings in arrays
  test/telemetry-json: add test for escaping strings in arrays
  telemetry: limit characters allowed in dictionary names
  telemetry: add escaping of strings in dicts
  test/telemetry_json: add test for string escaping in objects
  telemetry: limit command characters
  test/telemetry_data: refactor for maintainability
  test/telemetry_data: add test cases for character escaping
  telemetry: eliminate duplicate code for json output
  telemetry: make help command more helpful

 app/test/test_telemetry_data.c       | 138 +++++++++++++++++++--------
 app/test/test_telemetry_json.c       |  98 +++++++++++++++++--
 doc/guides/rel_notes/deprecation.rst |   8 --
 lib/telemetry/rte_telemetry.h        |   8 ++
 lib/telemetry/telemetry.c            |  51 +++++-----
 lib/telemetry/telemetry_data.c       |  32 +++++++
 lib/telemetry/telemetry_json.h       |  72 ++++++++++++--
 7 files changed, 318 insertions(+), 89 deletions(-)

--
2.34.1
  

Comments

Morten Brørup July 26, 2022, 2:36 p.m. UTC | #1
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Monday, 25 July 2022 18.35
> 
> This patchset contains fixes for the problem of handling characters
> returned by telemetry callbacks which require escaping when encoded in
> JSON format. It also includes unit tests to validate the correct
> encoding in such scenarios and a number of smaller enhancements to
> telemetry and telemetry testing.
> 
> RFC->V2:
> * limited characters allowed in dictionary element names and command
>   names to side-step the encoding problems there.
> * added support for proper escaping of dictionary string values
> * added more testing and test cases
> * added other misc telemetry cleanups and refactoring
> 

Good job, Bruce!

Series-Acked-by: Morten Brørup <mb@smartsharesystems.com>
  
fengchengwen July 27, 2022, 1:51 a.m. UTC | #2
Hi Bruce,

I think escape the string at begin (following function) seem more simple:
	rte_tel_data_string
	rte_tel_data_add_array_string
	rte_tel_data_add_dict_string

int
rte_tel_data_string(struct rte_tel_data *d, const char *str)
{
	d->type = RTE_TEL_STRING;
	d->data_len = strlcpy(d->data.str, str, sizeof(d->data.str));
		// e.g. do escape here!
	if (d->data_len >= RTE_TEL_MAX_SINGLE_STRING_LEN) {
		d->data_len = RTE_TEL_MAX_SINGLE_STRING_LEN - 1;
		return E2BIG; /* not necessarily and error, just truncation */
	}
	return 0;
}

Advantages:
1. simpler implementation
2. application are directly visible the result (by judge API retval) without waiting for JSON encapsulation.

Disadvantages:
1. not friend for new output format, but currently telemetry deep depend on json, so I think it's OK for it.


On 2022/7/26 0:35, Bruce Richardson wrote:
> This patchset contains fixes for the problem of handling characters
> returned by telemetry callbacks which require escaping when encoded in
> JSON format. It also includes unit tests to validate the correct
> encoding in such scenarios and a number of smaller enhancements to
> telemetry and telemetry testing.
> 
> RFC->V2:
> * limited characters allowed in dictionary element names and command
>   names to side-step the encoding problems there.
> * added support for proper escaping of dictionary string values
> * added more testing and test cases
> * added other misc telemetry cleanups and refactoring
> 
> Bruce Richardson (13):
>   test/telemetry_json: print success or failure per subtest
>   telemetry: fix escaping of invalid json characters
>   test/telemetry_json: add test for string character escaping
>   telemetry: add escaping of strings in arrays
>   test/telemetry-json: add test for escaping strings in arrays
>   telemetry: limit characters allowed in dictionary names
>   telemetry: add escaping of strings in dicts
>   test/telemetry_json: add test for string escaping in objects
>   telemetry: limit command characters
>   test/telemetry_data: refactor for maintainability
>   test/telemetry_data: add test cases for character escaping
>   telemetry: eliminate duplicate code for json output
>   telemetry: make help command more helpful
> 
>  app/test/test_telemetry_data.c       | 138 +++++++++++++++++++--------
>  app/test/test_telemetry_json.c       |  98 +++++++++++++++++--
>  doc/guides/rel_notes/deprecation.rst |   8 --
>  lib/telemetry/rte_telemetry.h        |   8 ++
>  lib/telemetry/telemetry.c            |  51 +++++-----
>  lib/telemetry/telemetry_data.c       |  32 +++++++
>  lib/telemetry/telemetry_json.h       |  72 ++++++++++++--
>  7 files changed, 318 insertions(+), 89 deletions(-)
> 
> --
> 2.34.1
> 
> 
> .
>
  
Bruce Richardson July 27, 2022, 9:12 a.m. UTC | #3
On Wed, Jul 27, 2022 at 09:51:04AM +0800, fengchengwen wrote:
> Hi Bruce,
> 
> I think escape the string at begin (following function) seem more simple:
> 	rte_tel_data_string
> 	rte_tel_data_add_array_string
> 	rte_tel_data_add_dict_string
> 
> int
> rte_tel_data_string(struct rte_tel_data *d, const char *str)
> {
> 	d->type = RTE_TEL_STRING;
> 	d->data_len = strlcpy(d->data.str, str, sizeof(d->data.str));
> 		// e.g. do escape here!
> 	if (d->data_len >= RTE_TEL_MAX_SINGLE_STRING_LEN) {
> 		d->data_len = RTE_TEL_MAX_SINGLE_STRING_LEN - 1;
> 		return E2BIG; /* not necessarily and error, just truncation */
> 	}
> 	return 0;
> }
> 
> Advantages:
> 1. simpler implementation
> 2. application are directly visible the result (by judge API retval) without waiting for JSON encapsulation.
> 
> Disadvantages:
> 1. not friend for new output format, but currently telemetry deep depend on json, so I think it's OK for it.
>
I'm quite happy to implement things in a simpler way, however, in the past
there was a great concern to keep things flexible enough for future changes
to add other output formats. By that logic, keeping the escaping in the
json layer is the correct design choice.

However, adding escaping on addition to the return data structure may not
be that much of an inconvience to other output formats too, so perhaps it's
acceptable.

Again, looking for more input and consensus from the community. I am happy
to go with either approach for escaping - putting it in the data return
layer or the json one.

/Bruce
  
Morten Brørup July 27, 2022, 9:49 a.m. UTC | #4
> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Wednesday, 27 July 2022 11.13
> 
> On Wed, Jul 27, 2022 at 09:51:04AM +0800, fengchengwen wrote:
> > Hi Bruce,
> >
> > I think escape the string at begin (following function) seem more
> simple:
> > 	rte_tel_data_string
> > 	rte_tel_data_add_array_string
> > 	rte_tel_data_add_dict_string
> >
> > int
> > rte_tel_data_string(struct rte_tel_data *d, const char *str)
> > {
> > 	d->type = RTE_TEL_STRING;
> > 	d->data_len = strlcpy(d->data.str, str, sizeof(d->data.str));
> > 		// e.g. do escape here!
> > 	if (d->data_len >= RTE_TEL_MAX_SINGLE_STRING_LEN) {
> > 		d->data_len = RTE_TEL_MAX_SINGLE_STRING_LEN - 1;
> > 		return E2BIG; /* not necessarily and error, just truncation
> */
> > 	}
> > 	return 0;
> > }
> >
> > Advantages:
> > 1. simpler implementation
> > 2. application are directly visible the result (by judge API retval)
> without waiting for JSON encapsulation.
> >
> > Disadvantages:
> > 1. not friend for new output format, but currently telemetry deep
> depend on json, so I think it's OK for it.

The telemetry library currently only implements JSON as the output format. I understand why this gives the impression that it depends on JSON. But it does not.

When the DPDK telemetry library was initially discussed, JSON was the only target. Before that settled in too deeply, I argued for using generic data types internally, so it could support other formats (e.g. SNMP), which resulted in the current layering of the telemetry library - a both elegant and practical solution.

> >
> I'm quite happy to implement things in a simpler way, however, in the
> past
> there was a great concern to keep things flexible enough for future
> changes
> to add other output formats. By that logic, keeping the escaping in the
> json layer is the correct design choice.
> 
> However, adding escaping on addition to the return data structure may
> not
> be that much of an inconvience to other output formats too, so perhaps
> it's
> acceptable.
> 
> Again, looking for more input and consensus from the community. I am
> happy
> to go with either approach for escaping - putting it in the data return
> layer or the json one.
> 
> /Bruce

JSON encoding belongs in the JSON output layer.

Don't make assumptions about other output formats! The char '\' might not be legal in all output formats. And I certainly don't want any newlines escaped in SNMP text objects.
  
Power, Ciara Aug. 23, 2022, 12:35 p.m. UTC | #5
> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday 25 July 2022 17:35
> To: dev@dpdk.org
> Cc: Richardson, Bruce <bruce.richardson@intel.com>
> Subject: [PATCH v2 00/13] telemetry JSON escaping and other
> enhancements
> 
> This patchset contains fixes for the problem of handling characters returned
> by telemetry callbacks which require escaping when encoded in JSON
> format. It also includes unit tests to validate the correct encoding in such
> scenarios and a number of smaller enhancements to telemetry and
> telemetry testing.
> 
> RFC->V2:
> * limited characters allowed in dictionary element names and command
>   names to side-step the encoding problems there.
> * added support for proper escaping of dictionary string values
> * added more testing and test cases
> * added other misc telemetry cleanups and refactoring
> 
> Bruce Richardson (13):
>   test/telemetry_json: print success or failure per subtest
>   telemetry: fix escaping of invalid json characters
>   test/telemetry_json: add test for string character escaping
>   telemetry: add escaping of strings in arrays
>   test/telemetry-json: add test for escaping strings in arrays
>   telemetry: limit characters allowed in dictionary names
>   telemetry: add escaping of strings in dicts
>   test/telemetry_json: add test for string escaping in objects
>   telemetry: limit command characters
>   test/telemetry_data: refactor for maintainability
>   test/telemetry_data: add test cases for character escaping
>   telemetry: eliminate duplicate code for json output
>   telemetry: make help command more helpful
> 
>  app/test/test_telemetry_data.c       | 138 +++++++++++++++++++--------
>  app/test/test_telemetry_json.c       |  98 +++++++++++++++++--
>  doc/guides/rel_notes/deprecation.rst |   8 --
>  lib/telemetry/rte_telemetry.h        |   8 ++
>  lib/telemetry/telemetry.c            |  51 +++++-----
>  lib/telemetry/telemetry_data.c       |  32 +++++++
>  lib/telemetry/telemetry_json.h       |  72 ++++++++++++--
>  7 files changed, 318 insertions(+), 89 deletions(-)
> 
> --
> 2.34.1
 
Looks great, thanks Bruce.

Series-Acked-by: Ciara Power <ciara.power@intel.com>