eal: add telemetry callbacks for memory info

Message ID 20210915095336.105635-1-hkalra@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: David Marchand
Headers
Series eal: add telemetry callbacks for memory info |

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/github-robot: build success github build: passed
ci/iol-x86_64-unit-testing success Testing PASS
ci/iol-x86_64-compile-testing fail Testing issues
ci/iol-mellanox-Performance success Performance Testing PASS

Commit Message

Harman Kalra Sept. 15, 2021, 9:53 a.m. UTC
  Registering new telemetry callbacks to dump named (memzones)
and unnamed (malloc) memory information to a file provided as
an argument.

Example:
Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
{"version": "DPDK 21.08.0", "pid": 34075, "max_output_len": 16384}
Connected to application: "dpdk-testpmd"
--> /eal/malloc_dump,/tmp/malloc_dump
{"/eal/malloc_dump": {"Malloc elements file: ": "/tmp/malloc_dump"}}
-->
--> /eal/malloc_info,/tmp/info
{"/eal/malloc_info": {"Malloc stats file: ": "/tmp/info"}}
-->
-->
--> /eal/memzone_dump,/tmp/memzone_info
{"/eal/memzone_dump": {"Memzones count: ": 11, \
"Memzones info file: ": "/tmp/memzone_info"}}

Signed-off-by: Harman Kalra <hkalra@marvell.com>
---
 lib/eal/common/eal_common_memory.c | 78 ++++++++++++++++++++++++++++++
 1 file changed, 78 insertions(+)
  

Comments

Bruce Richardson Sept. 20, 2021, 3:56 p.m. UTC | #1
On Wed, Sep 15, 2021 at 03:23:36PM +0530, Harman Kalra wrote:
> Registering new telemetry callbacks to dump named (memzones)
> and unnamed (malloc) memory information to a file provided as
> an argument.
> 
> Example:
> Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
> {"version": "DPDK 21.08.0", "pid": 34075, "max_output_len": 16384}
> Connected to application: "dpdk-testpmd"
> --> /eal/malloc_dump,/tmp/malloc_dump
> {"/eal/malloc_dump": {"Malloc elements file: ": "/tmp/malloc_dump"}}
> -->
> --> /eal/malloc_info,/tmp/info
> {"/eal/malloc_info": {"Malloc stats file: ": "/tmp/info"}}
> -->
> -->
> --> /eal/memzone_dump,/tmp/memzone_info
> {"/eal/memzone_dump": {"Memzones count: ": 11, \
> "Memzones info file: ": "/tmp/memzone_info"}}
> 
> Signed-off-by: Harman Kalra <hkalra@marvell.com>
> ---

For this info, why not just send the data out as telemetry data rather than
writing files on the filesystem containing it? If the info is too large to
dump it all in a single go, a shortened form could be sent via some form of
list call, and additional calls could be used to provide more detail on
specific items in the list.

 Also, this seems more a debugging operation than a telemetry one, though I
don't have a strong objection to the info being exported as telemetry
directly (just not via filesystem).

Regards,
/Bruce
  
Harman Kalra Sept. 21, 2021, 9:05 a.m. UTC | #2
> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday, September 20, 2021 9:27 PM
> To: Harman Kalra <hkalra@marvell.com>
> Cc: dev@dpdk.org; ciara.power@intel.com; Anatoly Burakov
> <anatoly.burakov@intel.com>
> Subject: [EXT] Re: [PATCH] eal: add telemetry callbacks for memory info
> 
> External Email
> 
> ----------------------------------------------------------------------
> On Wed, Sep 15, 2021 at 03:23:36PM +0530, Harman Kalra wrote:
> > Registering new telemetry callbacks to dump named (memzones) and
> > unnamed (malloc) memory information to a file provided as an argument.
> >
> > Example:
> > Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
> > {"version": "DPDK 21.08.0", "pid": 34075, "max_output_len": 16384}
> > Connected to application: "dpdk-testpmd"
> > --> /eal/malloc_dump,/tmp/malloc_dump
> > {"/eal/malloc_dump": {"Malloc elements file: ": "/tmp/malloc_dump"}}
> > -->
> > --> /eal/malloc_info,/tmp/info
> > {"/eal/malloc_info": {"Malloc stats file: ": "/tmp/info"}}
> > -->
> > -->
> > --> /eal/memzone_dump,/tmp/memzone_info
> > {"/eal/memzone_dump": {"Memzones count: ": 11, \ "Memzones info file:
> > ": "/tmp/memzone_info"}}
> >
> > Signed-off-by: Harman Kalra <hkalra@marvell.com>
> > ---
> 
> For this info, why not just send the data out as telemetry data rather than
> writing files on the filesystem containing it? If the info is too large to dump it
> all in a single go, a shortened form could be sent via some form of list call,
> and additional calls could be used to provide more detail on specific items in
> the list.
> 
>  Also, this seems more a debugging operation than a telemetry one, though I
> don't have a strong objection to the info being exported as telemetry directly
> (just not via filesystem).
> 
> Regards,
> /Bruce


Hi Bruce,

Thanks for reviewing the patch.
I have implemented these telemetry commands as a wrapper which uses existing malloc/memzone debug APIs to
collect the debug information, these debug APIs are implemented in the way that they accept a file pointer/stdout.
to get the information.

As a solution either  I should make changes to these debug APIs to accept a buffer also? Or other way could be get
the info dumped into a file, and inside telemetry command parse and convert the info into json format and send it.
But its lot of debug information so will require multiple iterations as you suggested. But on client (peer) side one
will have to again convert json to retrieve the info. 

Just for my understanding, what drawback do you see in dumping the information to a file? Because on peer side
It is very convenient to read the information from dumped file and use it.

Thanks
Harman
  
Bruce Richardson Sept. 27, 2021, 4:37 p.m. UTC | #3
On Tue, Sep 21, 2021 at 09:05:29AM +0000, Harman Kalra wrote:
> 
> > -----Original Message-----
> > From: Bruce Richardson <bruce.richardson@intel.com>
> > Sent: Monday, September 20, 2021 9:27 PM
> > To: Harman Kalra <hkalra@marvell.com>
> > Cc: dev@dpdk.org; ciara.power@intel.com; Anatoly Burakov
> > <anatoly.burakov@intel.com>
> > Subject: [EXT] Re: [PATCH] eal: add telemetry callbacks for memory info
> > 
> > External Email
> > 
> > ----------------------------------------------------------------------
> > On Wed, Sep 15, 2021 at 03:23:36PM +0530, Harman Kalra wrote:
> > > Registering new telemetry callbacks to dump named (memzones) and
> > > unnamed (malloc) memory information to a file provided as an argument.
> > >
> > > Example:
> > > Connecting to /var/run/dpdk/rte/dpdk_telemetry.v2
> > > {"version": "DPDK 21.08.0", "pid": 34075, "max_output_len": 16384}
> > > Connected to application: "dpdk-testpmd"
> > > --> /eal/malloc_dump,/tmp/malloc_dump
> > > {"/eal/malloc_dump": {"Malloc elements file: ": "/tmp/malloc_dump"}}
> > > -->
> > > --> /eal/malloc_info,/tmp/info
> > > {"/eal/malloc_info": {"Malloc stats file: ": "/tmp/info"}}
> > > -->
> > > -->
> > > --> /eal/memzone_dump,/tmp/memzone_info
> > > {"/eal/memzone_dump": {"Memzones count: ": 11, \ "Memzones info file:
> > > ": "/tmp/memzone_info"}}
> > >
> > > Signed-off-by: Harman Kalra <hkalra@marvell.com>
> > > ---
> > 
> > For this info, why not just send the data out as telemetry data rather than
> > writing files on the filesystem containing it? If the info is too large to dump it
> > all in a single go, a shortened form could be sent via some form of list call,
> > and additional calls could be used to provide more detail on specific items in
> > the list.
> > 
> >  Also, this seems more a debugging operation than a telemetry one, though I
> > don't have a strong objection to the info being exported as telemetry directly
> > (just not via filesystem).
> > 
> > Regards,
> > /Bruce
> 
> 
> Hi Bruce,
> 
> Thanks for reviewing the patch.
> I have implemented these telemetry commands as a wrapper which uses existing malloc/memzone debug APIs to
> collect the debug information, these debug APIs are implemented in the way that they accept a file pointer/stdout.
> to get the information.
> 
> As a solution either  I should make changes to these debug APIs to accept a buffer also? Or other way could be get
> the info dumped into a file, and inside telemetry command parse and convert the info into json format and send it.
> But its lot of debug information so will require multiple iterations as you suggested. But on client (peer) side one
> will have to again convert json to retrieve the info. 
> 
> Just for my understanding, what drawback do you see in dumping the information to a file? Because on peer side
> It is very convenient to read the information from dumped file and use it.
> 

Hi 

The drawback is largely a philosophical one in that what you add here are
not read-operations for telemetry, but rather commands which cause the
application to make changes on the running system - i.e. write out files.
It's certainly something we could look to do, but I think we should only do
so with some careful thought, rather than just adding it ad-hoc.

Regards,
/Bruce
  
Harman Kalra Oct. 7, 2021, 11:01 a.m. UTC | #4
> -----Original Message-----
> From: Bruce Richardson <bruce.richardson@intel.com>
> Sent: Monday, September 27, 2021 10:08 PM
> To: Harman Kalra <hkalra@marvell.com>
> Cc: dev@dpdk.org; ciara.power@intel.com; Anatoly Burakov
> <anatoly.burakov@intel.com>
> Subject: Re: [EXT] Re: [PATCH] eal: add telemetry callbacks for memory info
> 
> On Tue, Sep 21, 2021 at 09:05:29AM +0000, Harman Kalra wrote:
> >
> > > -----Original Message-----
> > > From: Bruce Richardson <bruce.richardson@intel.com>
> > > Sent: Monday, September 20, 2021 9:27 PM
> > > To: Harman Kalra <hkalra@marvell.com>
> > > Cc: dev@dpdk.org; ciara.power@intel.com; Anatoly Burakov
> > > <anatoly.burakov@intel.com>
> > > Subject: [EXT] Re: [PATCH] eal: add telemetry callbacks for memory
> > > info
> > >
> > > External Email
> > >
> > > --------------------------------------------------------------------
> > > > Signed-off-by: Harman Kalra <hkalra@marvell.com>
> > > > ---
> > >
> > > For this info, why not just send the data out as telemetry data
> > > rather than writing files on the filesystem containing it? If the
> > > info is too large to dump it all in a single go, a shortened form
> > > could be sent via some form of list call, and additional calls could
> > > be used to provide more detail on specific items in the list.
> > >
> > >  Also, this seems more a debugging operation than a telemetry one,
> > > though I don't have a strong objection to the info being exported as
> > > telemetry directly (just not via filesystem).
> > >
> > > Regards,
> > > /Bruce
> >
> >
> > Hi Bruce,
> >
> > Thanks for reviewing the patch.
> > I have implemented these telemetry commands as a wrapper which uses
> > existing malloc/memzone debug APIs to collect the debug information,
> these debug APIs are implemented in the way that they accept a file
> pointer/stdout.
> > to get the information.
> >
> > As a solution either  I should make changes to these debug APIs to
> > accept a buffer also? Or other way could be get the info dumped into a file,
> and inside telemetry command parse and convert the info into json format
> and send it.
> > But its lot of debug information so will require multiple iterations
> > as you suggested. But on client (peer) side one will have to again convert
> json to retrieve the info.
> >
> > Just for my understanding, what drawback do you see in dumping the
> > information to a file? Because on peer side It is very convenient to read the
> information from dumped file and use it.
> >
> 
> Hi
> 
> The drawback is largely a philosophical one in that what you add here are
> not read-operations for telemetry, but rather commands which cause the
> application to make changes on the running system - i.e. write out files.
> It's certainly something we could look to do, but I think we should only do so
> with some careful thought, rather than just adding it ad-hoc.

Hi Bruce

I have started working on this, and will come up with proper implementation as suggested.

Thanks
Harman

> 
> Regards,
> /Bruce
  

Patch

diff --git a/lib/eal/common/eal_common_memory.c b/lib/eal/common/eal_common_memory.c
index f83b75092e..592b3453b6 100644
--- a/lib/eal/common/eal_common_memory.c
+++ b/lib/eal/common/eal_common_memory.c
@@ -20,6 +20,8 @@ 
 #include <rte_eal_paging.h>
 #include <rte_errno.h>
 #include <rte_log.h>
+#include <rte_string_fns.h>
+#include <rte_telemetry.h>
 
 #include "eal_memalloc.h"
 #include "eal_private.h"
@@ -38,6 +40,7 @@ 
 
 #define MEMSEG_LIST_FMT "memseg-%" PRIu64 "k-%i-%i"
 
+static int count;
 static void *next_baseaddr;
 static uint64_t system_page_sz;
 
@@ -1102,3 +1105,78 @@  rte_eal_memory_init(void)
 	rte_mcfg_mem_read_unlock();
 	return -1;
 }
+
+#define EAL_MEMZONE_DUMP_REQ	"/eal/memzone_dump"
+#define EAL_MALLOC_INFO_REQ	"/eal/malloc_info"
+#define EAL_MALLOC_DUMP_REQ	"/eal/malloc_dump"
+
+static void
+memzone_walk_clb(const struct rte_memzone *mz __rte_unused,
+		 void *arg __rte_unused)
+{
+	count++;
+}
+
+/* Callback handler for telemetry library to dump named and unnamed memory
+ * information.
+ */
+static int
+handle_eal_mem_info_request(const char *cmd, const char *params,
+		      struct rte_tel_data *d)
+{
+	char filename[PATH_MAX];
+	FILE *fp;
+
+	if (params == NULL || strlen(params) == 0)
+		return -1;
+
+	rte_strscpy(filename, params, PATH_MAX);
+
+	fp = fopen(filename, "w+");
+	if (fp == NULL) {
+		RTE_LOG(ERR, EAL, "cannot open %s", filename);
+		return -1;
+	}
+
+	rte_tel_data_start_dict(d);
+	/* Dumping memzone info. */
+	if (strcmp(cmd, EAL_MEMZONE_DUMP_REQ) == 0) {
+		count = 0;
+		/* Callback to count memzones */
+		rte_memzone_walk(memzone_walk_clb, NULL);
+		rte_tel_data_add_dict_int(d, "Memzones count: ", count);
+		rte_tel_data_add_dict_string(d, "Memzones info file: ",
+					     filename);
+		rte_memzone_dump(fp);
+	}
+
+	/* Dumping malloc statistics */
+	if (strcmp(cmd, EAL_MALLOC_INFO_REQ) == 0) {
+		rte_tel_data_add_dict_string(d, "Malloc stats file: ",
+					     filename);
+		rte_malloc_dump_stats(fp, NULL);
+	}
+
+	/* Dumping malloc elements info */
+	if (strcmp(cmd, EAL_MALLOC_DUMP_REQ) == 0) {
+		rte_tel_data_add_dict_string(d, "Malloc elements file: ",
+					     filename);
+		rte_malloc_dump_heaps(fp);
+	}
+
+	fclose(fp);
+	return 0;
+}
+
+RTE_INIT(memory_telemetry)
+{
+	rte_telemetry_register_cmd(
+			EAL_MEMZONE_DUMP_REQ, handle_eal_mem_info_request,
+			"Dumps memzones info to file. Parameters: file name");
+	rte_telemetry_register_cmd(
+			EAL_MALLOC_INFO_REQ, handle_eal_mem_info_request,
+			"Dumps malloc info to file. Parameters: file name");
+	rte_telemetry_register_cmd(
+			EAL_MALLOC_DUMP_REQ, handle_eal_mem_info_request,
+			"Dumps malloc elems to file. Parameters: file name");
+}