[24.03,v2,1/5] arg_parser: new library for command line parsing

Message ID 20231128140745.595481-2-euan.bourke@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series add new command line argument parsing library |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Euan Bourke Nov. 28, 2023, 2:07 p.m. UTC
  Add a new library to make it easier for eal and other libraries to parse
command line arguments.

The first function in this library is one to parse a corelist string
into an array of individual core ids. The function will then return the
total number of cores described in the corelist.

Signed-off-by: Euan Bourke <euan.bourke@intel.com>
---
 .mailmap                        |   1 +
 MAINTAINERS                     |   4 ++
 doc/api/doxy-api-index.md       |   3 +-
 doc/api/doxy-api.conf.in        |   1 +
 lib/arg_parser/arg_parser.c     | 113 ++++++++++++++++++++++++++++++++
 lib/arg_parser/meson.build      |   7 ++
 lib/arg_parser/rte_arg_parser.h |  65 ++++++++++++++++++
 lib/arg_parser/version.map      |  10 +++
 lib/meson.build                 |   2 +
 9 files changed, 205 insertions(+), 1 deletion(-)
 create mode 100644 lib/arg_parser/arg_parser.c
 create mode 100644 lib/arg_parser/meson.build
 create mode 100644 lib/arg_parser/rte_arg_parser.h
 create mode 100644 lib/arg_parser/version.map
  

Comments

Stephen Hemminger Nov. 29, 2023, 10:12 p.m. UTC | #1
On Tue, 28 Nov 2023 14:07:41 +0000
Euan Bourke <euan.bourke@intel.com> wrote:

> +struct core_bits {
> +	uint8_t bits[(UINT16_MAX + 1)/CHAR_BIT];
> +	uint16_t max_bit_set;
> +	uint16_t min_bit_set;
> +	uint32_t total_bits_set;
> +};

Looks like a good candidate for flex array
  
Stephen Hemminger Nov. 29, 2023, 10:12 p.m. UTC | #2
On Tue, 28 Nov 2023 14:07:41 +0000
Euan Bourke <euan.bourke@intel.com> wrote:

> +static inline bool
> +get_core_bit(struct core_bits *mask, uint16_t idx)
> +{
> +	return !!(mask->bits[idx / 8] & (1 << (idx % 8)));
> +}
> +

You used CHAR_BIT above, why not here?
  
Stephen Hemminger Nov. 29, 2023, 10:14 p.m. UTC | #3
On Tue, 28 Nov 2023 14:07:41 +0000
Euan Bourke <euan.bourke@intel.com> wrote:

> +int
> +rte_parse_corelist(const char *corelist, uint16_t *cores, uint32_t cores_len)
> +{
> +	int32_t min = -1;
> +	char *end = NULL;
> +
> +	struct core_bits *mask = malloc(sizeof(struct core_bits));
> +	if (mask == NULL)
> +		return -1;
> +	memset(mask, 0, sizeof(struct core_bits));
> +

The structure is not variable size, and small, why bother using malloc().
Just use on stack structure.
  
Bruce Richardson Nov. 30, 2023, 8:59 a.m. UTC | #4
On Wed, Nov 29, 2023 at 02:14:13PM -0800, Stephen Hemminger wrote:
> On Tue, 28 Nov 2023 14:07:41 +0000
> Euan Bourke <euan.bourke@intel.com> wrote:
> 
> > +int
> > +rte_parse_corelist(const char *corelist, uint16_t *cores, uint32_t cores_len)
> > +{
> > +	int32_t min = -1;
> > +	char *end = NULL;
> > +
> > +	struct core_bits *mask = malloc(sizeof(struct core_bits));
> > +	if (mask == NULL)
> > +		return -1;
> > +	memset(mask, 0, sizeof(struct core_bits));
> > +
> 
> The structure is not variable size, and small, why bother using malloc().
> Just use on stack structure.

Well, it's just over 4k in size, and we hit problems in the past with
alpine linux where the stack size wasn't as big as expected, leading to
issues with telemetry string manipulation. In that case I think it was due
to nested calls as well as large stack structs, though. For this core
parsing, having this struct on stack is unlikely to cause issues, but for a
non-datapath API I suggests to Euan to use malloc as a lower-risk
alternative, since performance of an arg parsing API is unlikely to be a
major concern.

However, if you feel that it should be on the stack and won't cause any
issues, it does make the code shorter and easier to work with!

/Bruce
  

Patch

diff --git a/.mailmap b/.mailmap
index ab0742a382..528bc68a30 100644
--- a/.mailmap
+++ b/.mailmap
@@ -379,6 +379,7 @@  Eric Zhang <eric.zhang@windriver.com>
 Erik Gabriel Carrillo <erik.g.carrillo@intel.com>
 Erik Ziegenbalg <eziegenb@brocade.com>
 Erlu Chen <erlu.chen@intel.com>
+Euan Bourke <euan.bourke@intel.com>
 Eugenio Pérez <eperezma@redhat.com>
 Eugeny Parshutin <eugeny.parshutin@linux.intel.com>
 Evan Swanson <evan.swanson@intel.com>
diff --git a/MAINTAINERS b/MAINTAINERS
index 0d1c8126e3..68ef5ba14b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1756,6 +1756,10 @@  M: Nithin Dabilpuram <ndabilpuram@marvell.com>
 M: Pavan Nikhilesh <pbhagavatula@marvell.com>
 F: lib/node/
 
+Argument parsing
+M: Bruce Richardson <bruce.richardson@intel.com>
+F: lib/arg_parser/
+
 
 Test Applications
 -----------------
diff --git a/doc/api/doxy-api-index.md b/doc/api/doxy-api-index.md
index a6a768bd7c..f711010140 100644
--- a/doc/api/doxy-api-index.md
+++ b/doc/api/doxy-api-index.md
@@ -221,7 +221,8 @@  The public API headers are grouped by topics:
   [config file](@ref rte_cfgfile.h),
   [key/value args](@ref rte_kvargs.h),
   [string](@ref rte_string_fns.h),
-  [thread](@ref rte_thread.h)
+  [thread](@ref rte_thread.h),
+  [argument parsing](@ref rte_arg_parser.h)
 
 - **debug**:
   [jobstats](@ref rte_jobstats.h),
diff --git a/doc/api/doxy-api.conf.in b/doc/api/doxy-api.conf.in
index e94c9e4e46..05718ba6ed 100644
--- a/doc/api/doxy-api.conf.in
+++ b/doc/api/doxy-api.conf.in
@@ -28,6 +28,7 @@  INPUT                   = @TOPDIR@/doc/api/doxy-api-index.md \
                           @TOPDIR@/lib/eal/include \
                           @TOPDIR@/lib/eal/include/generic \
                           @TOPDIR@/lib/acl \
+                          @TOPDIR@/lib/arg_parser \
                           @TOPDIR@/lib/bbdev \
                           @TOPDIR@/lib/bitratestats \
                           @TOPDIR@/lib/bpf \
diff --git a/lib/arg_parser/arg_parser.c b/lib/arg_parser/arg_parser.c
new file mode 100644
index 0000000000..4aa876b4ed
--- /dev/null
+++ b/lib/arg_parser/arg_parser.c
@@ -0,0 +1,113 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Intel Corporation
+ */
+
+#include "errno.h"
+#include "stdlib.h"
+#include "ctype.h"
+#include "string.h"
+#include "stdbool.h"
+
+#include <rte_arg_parser.h>
+#include <rte_common.h>
+
+
+struct core_bits {
+	uint8_t bits[(UINT16_MAX + 1)/CHAR_BIT];
+	uint16_t max_bit_set;
+	uint16_t min_bit_set;
+	uint32_t total_bits_set;
+};
+
+static inline bool
+get_core_bit(struct core_bits *mask, uint16_t idx)
+{
+	return !!(mask->bits[idx / 8] & (1 << (idx % 8)));
+}
+
+static inline void
+set_core_bit(struct core_bits *mask, uint16_t idx)
+{
+	if (get_core_bit(mask, idx))
+		return;
+
+	mask->bits[idx/8] |= 1 << (idx % 8);
+	/* Update min and max bit if its first time setting a bit */
+	if (++(mask->total_bits_set) == 1) {
+		mask->min_bit_set = idx;
+		mask->max_bit_set = idx;
+		return;
+	}
+
+	if (idx > mask->max_bit_set)
+		mask->max_bit_set = idx;
+
+	if (idx < mask->min_bit_set)
+		mask->min_bit_set = idx;
+}
+
+static inline uint32_t
+corebits_to_array(struct core_bits *mask, uint16_t *cores, size_t max_cores)
+{
+	uint32_t count = 0;
+	for (uint32_t i = mask->min_bit_set; i <= mask->max_bit_set && count < max_cores; i++) {
+		if (get_core_bit(mask, i))
+			cores[count++] = i;
+	}
+	return mask->total_bits_set;
+}
+
+
+int
+rte_parse_corelist(const char *corelist, uint16_t *cores, uint32_t cores_len)
+{
+	int32_t min = -1;
+	char *end = NULL;
+
+	struct core_bits *mask = malloc(sizeof(struct core_bits));
+	if (mask == NULL)
+		return -1;
+	memset(mask, 0, sizeof(struct core_bits));
+
+	min = -1;
+	do {
+		uint32_t idx;
+		int32_t max;
+
+		while (isblank(*corelist))
+			corelist++;
+		if (!isdigit(*corelist))
+			return -1;
+
+		errno = 0;
+		idx = strtol(corelist, &end, 10);
+		if (errno || end == NULL || idx > UINT16_MAX)
+			return -1;
+		while (isblank(*end))
+			end++;
+		if (*end == '-')
+			min = idx;
+
+		else if (*end == ',' || *end == '\0') {
+			max = idx;
+			if (min == -1)
+				min = idx;
+
+			/* Swap min and max if min is larger than max */
+			if (min > max)
+				RTE_SWAP(min, max);
+
+			for (; min <= max; min++)
+				set_core_bit(mask, min);
+
+			min = -1;
+		} else
+			return -1;
+		corelist = end + 1;
+	} while (*end != '\0');
+
+	uint32_t total_count = corebits_to_array(mask, cores, cores_len);
+	free(mask);
+
+	return total_count;
+}
diff --git a/lib/arg_parser/meson.build b/lib/arg_parser/meson.build
new file mode 100644
index 0000000000..6ee228bd69
--- /dev/null
+++ b/lib/arg_parser/meson.build
@@ -0,0 +1,7 @@ 
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2023 Intel Corporation
+
+sources = files('arg_parser.c')
+headers = files('rte_arg_parser.h')
+
+includes += global_inc
diff --git a/lib/arg_parser/rte_arg_parser.h b/lib/arg_parser/rte_arg_parser.h
new file mode 100644
index 0000000000..cf9291d13a
--- /dev/null
+++ b/lib/arg_parser/rte_arg_parser.h
@@ -0,0 +1,65 @@ 
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2023 Intel Corporation
+ */
+
+#ifndef _RTE_ARG_PARSER_H_
+#define _RTE_ARG_PARSER_H_
+
+/**
+ * @file
+ *
+ * RTE Argument Parsing API
+ *
+ * The argument parsing API is a collection of functions to help parse
+ * command line arguments. The API takes a string input and will return
+ * it to the user in a more usable format.
+ */
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <stdint.h>
+
+#include <rte_compat.h>
+
+
+/**
+ * Convert a string describing a list of core ids into an array of core ids.
+ *
+ * On success, the passed array is filled with the core ids present in the
+ * list up to the "cores_len", and the length of the array is returned.
+ * For example, passing a 1-3,6 "corelist" results in an array of [1, 2, 3, 6]
+ * and would return 4.
+ *
+ * Like the snprintf function for strings, if the length of the input array is
+ * insufficient to hold the number of cores in the "corelist", the input array is
+ * filled to capacity and the return value is the number of elements which would
+ * be returned if the array had been big enough.
+ * Function can also be called with a NULL array and 0 "cores_len" to find out
+ * the "cores_len" required.
+ *
+ * @param corelist
+ *   Input string describing a list of core ids.
+ * @param cores
+ *   An array where to store the core ids.
+ *   Array can be NULL if "cores_len" is 0.
+ * @param cores_len
+ *   The length of the "cores" array.
+ *   If the size is smaller than that needed to hold all cores from "corelist",
+ *   only "cores_len" elements will be written to the array.
+ * @return
+ *   n: the number of unique cores present in "corelist".
+ *   -1 if the string was invalid.
+ *   NOTE: if n > "cores_len", then only "cores_len" elements in the "cores" array are valid.
+ */
+__rte_experimental
+int
+rte_parse_corelist(const char *corelist, uint16_t *cores, uint32_t cores_len);
+
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ARG_PARSER_H_ */
diff --git a/lib/arg_parser/version.map b/lib/arg_parser/version.map
new file mode 100644
index 0000000000..588e14c7ad
--- /dev/null
+++ b/lib/arg_parser/version.map
@@ -0,0 +1,10 @@ 
+DPDK_24 {
+	local: *;
+};
+
+EXPERIMENTAL {
+	global:
+
+	# added in 24.03
+	rte_parse_corelist;
+};
diff --git a/lib/meson.build b/lib/meson.build
index 6c143ce5a6..db9e769033 100644
--- a/lib/meson.build
+++ b/lib/meson.build
@@ -11,6 +11,7 @@ 
 libraries = [
         'log',
         'kvargs', # eal depends on kvargs
+        'arg_parser',
         'telemetry', # basic info querying
         'eal', # everything depends on eal
         'ring',
@@ -72,6 +73,7 @@  if is_ms_compiler
             'log',
             'kvargs',
             'telemetry',
+            'arg_parser',
     ]
 endif