[RFC,0/1] mldev: introduce machine learning device library

Message ID: 20220803132839.2747858-1-jerinj@marvell.com
Series: mldev: introduce machine learning device library

Message

Jerin Jacob Kollanukkaran Aug. 3, 2022, 1:28 p.m. UTC
  From: Jerin Jacob <jerinj@marvell.com>

Machine learning inference library
==================================

Definition of machine learning inference
----------------------------------------
Inference in machine learning is the process of making an output prediction
based on new input data using a pre-trained machine learning model.

The scope of this RFC covers only inferencing with pre-trained machine learning models;
training and building/compiling the ML models are out of scope for this RFC and the
DPDK mldev API. Existing machine learning compiler frameworks should be used for model creation.

Motivation for the new library
------------------------------
Multiple semiconductor vendors are offering accelerator products such as DPU
(often called Smart-NIC), FPGA, GPU, etc., which have ML inferencing capabilities
integrated as part of the product. Use of ML inferencing is increasing in the domain
of packet processing for flow classification, intrusion, malware and anomaly detection.

Lack of inferencing support through DPDK APIs will involve complexities and
increased latency from moving data across frameworks (i.e., dataplane to
non-dataplane ML frameworks and vice versa). Having standardized DPDK APIs for ML
inferencing would enable dataplane solutions to harness the benefit of inline
inferencing supported by the hardware.

Contents of RFC
---------------
This RFC attempts to define standard APIs for:

1) Discovery of ML capabilities (e.g., device-specific features) in a
vendor-independent fashion.
2) Definition of functions to handle ML devices, which includes probing,
initialization and termination of the devices.
3) Definition of functions to handle ML models used to perform inference operations.
4) Definition of functions to handle quantize and dequantize operations.

Roadmap
-------
1) Address the comments for this RFC.
2) Common code for mldev
3) SW mldev driver based on TVM (https://tvm.apache.org/)
4) HW mldev driver for cn10k
5) Add app/test-mldev application similar to other device class tests

Machine learning library framework
----------------------------------

The ML framework is built on the following model:


    +-----------------+               rte_ml_[en|de]queue_burst()
    |                 |                          |
    |     Machine     o------+     +--------+    |
    |     Learning    |      |     | queue  |    |    +------+
    |     Inference   o------+-----o        |<===o===>|Core 0|
    |     Engine      |      |     | pair 0 |         +------+
    |                 o----+ |     +--------+
    |                 |    | |
    +-----------------+    | |     +--------+
             ^             | |     | queue  |         +------+
             |             | +-----o        |<=======>|Core 1|
             |             |       | pair 1 |         +------+
             |             |       +--------+
    +--------+--------+    |
    | +-------------+ |    |       +--------+
    | |   Model 0   | |    |       | queue  |         +------+
    | +-------------+ |    +-------o        |<=======>|Core N|
    | +-------------+ |            | pair N |         +------+
    | |   Model 1   | |            +--------+
    | +-------------+ |
    | +-------------+ |<------- rte_ml_model_load()
    | |   Model ..  | |-------> rte_ml_model_info()
    | +-------------+ |<------- rte_ml_model_start()
    | +-------------+ |<------- rte_ml_model_stop()
    | |   Model N   | |<------- rte_ml_model_params_update()
    | +-------------+ |<------- rte_ml_model_unload()
    +-----------------+

ML Device: A hardware- or software-based implementation of the ML device API for
running inferences using a pre-trained ML model.

ML Model: An ML model is an algorithm trained over a dataset. A model consists of the
procedure/algorithm and the data/pattern required to make predictions on live data.
Once the model is created and trained outside of the DPDK scope, it can be loaded
via rte_ml_model_load() and then started using the rte_ml_model_start() API.
rte_ml_model_params_update() can be used to update model parameters such as weights
and bias without unloading the model using rte_ml_model_unload().

ML Inference: ML inference is the process of feeding data to the model via the
rte_ml_enqueue_burst() API and using the rte_ml_dequeue_burst() API to get the
calculated outputs/predictions from the started model.

In all functions of the ML device API, the ML device is designated by an
integer >= 0 named the device identifier *dev_id*.

The functions exported by the ML device API to set up a device designated by
its device identifier must be invoked in the following order:

     - rte_ml_dev_configure()
     - rte_ml_dev_queue_pair_setup()
     - rte_ml_dev_start()

A model is required to run inference operations with user-specified inputs.
The application needs to invoke the ML model APIs in the following order before queueing
inference jobs.

     - rte_ml_model_load()
     - rte_ml_model_start()

The rte_ml_model_info() API is provided to retrieve information related to the model,
such as the shape and type of the input and output required for inference.

Data quantization and dequantization is one of the main aspects of the ML domain. It involves
conversion of input data from a higher precision to a lower precision data type and vice versa
for the output. APIs are provided to enable quantization through rte_ml_io_quantize() and
dequantization through rte_ml_io_dequantize(). These APIs can handle input
and output buffers holding data for multiple batches.
Two utility APIs, rte_ml_io_input_size_get() and rte_ml_io_output_size_get(), can be used to get
the size of quantized and dequantized multi-batch input and output buffers.
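
As an illustration, a minimal quantize/dequantize flow could look like the sketch below. The
argument order of these calls is an assumption based on this RFC proposal, and dev_id,
model_id, nb_batches and the dbuf_*/qbuf_* buffers are hypothetical:

    uint64_t qsize, dsize;

    /* Sizes of the quantized (device) and dequantized (application) input buffers */
    rte_ml_io_input_size_get(dev_id, model_id, nb_batches, &qsize, &dsize);

    /* Convert application-precision input to the model's quantized input format */
    rte_ml_io_quantize(dev_id, model_id, nb_batches, dbuf_in, qbuf_in);

    /* ... enqueue the op, dequeue the completed op ... */

    /* Convert the quantized output back to application precision */
    rte_ml_io_dequantize(dev_id, model_id, nb_batches, qbuf_out, dbuf_out);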

The user can optionally update the model parameters with rte_ml_model_params_update() after
invoking the rte_ml_model_stop() API on a given model ID, as sketched below.
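
A minimal sketch, assuming model_id is the identifier returned by rte_ml_model_load() and
new_params is a hypothetical buffer holding the updated weights/bias in the layout expected
by the device:

    /* The model must be stopped before its parameters can be refreshed */
    rte_ml_model_stop(dev_id, model_id);
    rte_ml_model_params_update(dev_id, model_id, new_params);
    rte_ml_model_start(dev_id, model_id);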

The application can invoke, in any order, the functions exported by the ML API to enqueue
inference jobs and dequeue inference responses.

If the application wants to change the device configuration (i.e., call
rte_ml_dev_configure() or rte_ml_dev_queue_pair_setup()), then the application must stop the
device using the rte_ml_dev_stop() API. Likewise, if model parameters need to be updated, then
the application must call rte_ml_model_stop() followed by the rte_ml_model_params_update() API
for the given model. The application does not need to call the rte_ml_dev_stop() API for
any model reconfiguration such as rte_ml_model_params_update(), rte_ml_model_unload(), etc.
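
For example, changing the number of queue pairs at runtime could follow a sequence like the
sketch below (dev_id, qp_id, new_nb_qps and socket_id are illustrative; config and qp_conf are
as in the example application further down; error handling omitted):

    /* The device must be stopped before it can be reconfigured */
    rte_ml_dev_stop(dev_id);

    config.nb_queue_pairs = new_nb_qps;
    rte_ml_dev_configure(dev_id, &config);
    rte_ml_dev_queue_pair_setup(dev_id, qp_id, &qp_conf, socket_id);

    rte_ml_dev_start(dev_id);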

Once the device is in the started state after invoking the rte_ml_dev_start() API and the model
is in the started state after invoking the rte_ml_model_start() API, the application can call the
rte_ml_enqueue_burst() and rte_ml_dequeue_burst() APIs on the destined device and model ID.

Finally, an application can close an ML device by invoking the rte_ml_dev_close() function.

A typical application using the ML API will follow the programming flow
below.

- rte_ml_dev_configure()
- rte_ml_dev_queue_pair_setup()
- rte_ml_model_load()
- rte_ml_model_start()
- rte_ml_model_info()
- rte_ml_dev_start()
- rte_ml_enqueue_burst()
- rte_ml_dequeue_burst()
- rte_ml_model_stop()
- rte_ml_model_unload()
- rte_ml_dev_stop()
- rte_ml_dev_close()

Regarding multi-threading, by default, all the functions of the ML device API exported by a PMD
are lock-free functions which are assumed not to be invoked in parallel on different logical cores
to work on the same target object. For instance, the dequeue function of a poll mode driver cannot
be invoked in parallel on two logical cores to operate on the same queue pair. Of course, this
function can be invoked in parallel by different logical cores on different queue pairs.
It is the responsibility of the user application to enforce this rule.
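
A minimal sketch of one way to honour this rule, assuming one queue pair per worker lcore
(dev_id, BURST_SZ, the per-lcore qp_id assignment and the done flag are illustrative):

    static int
    ml_worker(void *arg)
    {
            uint16_t qp_id = *(uint16_t *)arg;  /* unique queue pair per lcore */
            struct rte_ml_op *ops[BURST_SZ];
            uint16_t nb;

            while (!done) {
                    /* No locking needed: no other lcore touches this queue pair */
                    nb = rte_ml_dequeue_burst(dev_id, qp_id, ops, BURST_SZ);
                    /* process 'nb' completed inferences and enqueue new ops as needed */
            }
            return 0;
    }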

Example application usage for ML inferencing
--------------------------------------------
This example application is to demonstrate the programming model of ML device
library. This example omits the error checks to simplify the application. This
example also assumes that the input data received is quantized and output expected
is also quantized. In order to handle non-quantized inputs and outputs, users can
invoke rte_ml_io_quantize() or rte_ml_io_dequantize() for data type conversions.

/* Headers assumed for this example; rte_mldev.h is the header proposed by this RFC */
#include <getopt.h>
#include <limits.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <rte_common.h>
#include <rte_debug.h>
#include <rte_eal.h>
#include <rte_mempool.h>
#include <rte_memzone.h>
#include <rte_mldev.h>

#define ML_MODEL_NAME "model"
#define IO_MZ "io_mz"

struct app_ctx {
	char model_file[PATH_MAX];
	char inp_file[PATH_MAX];
	char out_file[PATH_MAX];

	struct rte_ml_model_params params;
	struct rte_ml_model_info info;
	uint16_t id;

	uint64_t input_size;
	uint64_t output_size;
	uint8_t *input_buffer;
	uint8_t *output_buffer;
} __rte_cache_aligned;

struct app_ctx ctx;

static int
parse_args(int argc, char **argv)
{
	int opt, option_index;
	static struct option lgopts[] = {{"model", required_argument, NULL, 'm'},
					 {"input", required_argument, NULL, 'i'},
					 {"output", required_argument, NULL, 'o'},
					 {NULL, 0, NULL, 0}};

	while ((opt = getopt_long(argc, argv, "m:i:o:", lgopts, &option_index)) != EOF)
		switch (opt) {
		case 'm':
			strncpy(ctx.model_file, optarg, PATH_MAX - 1);
			break;
		case 'i':
			strncpy(ctx.inp_file, optarg, PATH_MAX - 1);
			break;
		case 'o':
			strncpy(ctx.out_file, optarg, PATH_MAX - 1);
			break;
		default:
			return -1;
		}

	return 0;
}

int
main(int argc, char **argv)
{
	struct rte_ml_dev_qp_conf qp_conf;
	struct rte_ml_dev_config config;
	struct rte_ml_dev_info dev_info;
	const struct rte_memzone *mz;
	struct rte_mempool *op_pool;
	struct rte_ml_op *op_enq;
	struct rte_ml_op *op_deq;

	FILE *fp;
	int rc;

	/* Initialize EAL */
	rc = rte_eal_init(argc, argv);
	if (rc < 0)
		rte_exit(EXIT_FAILURE, "Invalid EAL arguments\n");
	argc -= rc;
	argv += rc;

	/* Parse application arguments (after the EAL args) */
	if (parse_args(argc, argv) < 0)
		rte_exit(EXIT_FAILURE, "Invalid application arguments\n");

	/* Step 1: Check for ML devices */
	if (rte_ml_dev_count() <= 0)
		rte_exit(EXIT_FAILURE, "Failed to find ML devices\n");

	/* Step 2: Get device info */
	if (rte_ml_dev_info_get(0, &dev_info) != 0)
		rte_exit(EXIT_FAILURE, "Failed to get device info\n");

	/* Step 3: Configure ML device, use device 0 */
	config.socket_id = rte_ml_dev_socket_id(0);
	config.max_nb_models = dev_info.max_models;
	config.nb_queue_pairs = dev_info.max_queue_pairs;
	if (rte_ml_dev_configure(0, &config) != 0)
		rte_exit(EXIT_FAILURE, "Device configuration failed\n");

	/* Step 4: Setup queue pairs, used qp_id = 0 */
	qp_conf.nb_desc = 1;
	if (rte_ml_dev_queue_pair_setup(0, 0, &qp_conf, config.socket_id) != 0)
		rte_exit(EXIT_FAILURE, "Queue-pair setup failed\n");

	/* Step 5: Start device */
	if (rte_ml_dev_start(0) != 0)
		rte_exit(EXIT_FAILURE, "Device start failed\n");

	/* Step 6: Read model data and update load params structure */
	fp = fopen(ctx.model_file, "r+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open model file\n");

	fseek(fp, 0, SEEK_END);
	ctx.params.size = ftell(fp);
	fseek(fp, 0, SEEK_SET);

	ctx.params.addr = malloc(ctx.params.size);
	if (fread(ctx.params.addr, 1, ctx.params.size, fp) != ctx.params.size) {
		fclose(fp);
		rte_exit(EXIT_FAILURE, "Failed to read model\n");
	}
	fclose(fp);
	strcpy(ctx.params.name, ML_MODEL_NAME);

	/* Step 7: Load the model */
	if (rte_ml_model_load(0, &ctx.params, &ctx.id) != 0)
		rte_exit(EXIT_FAILURE, "Failed to load model\n");
	free(ctx.params.addr);

	/* Step 8: Start the model */
	if (rte_ml_model_start(0, ctx.id) != 0)
		rte_exit(EXIT_FAILURE, "Failed to start model\n");

	/* Step 9: Allocate buffers for quantized input and output */

	/* Get model information */
	if (rte_ml_model_info_get(0, ctx.id, &ctx.info) != 0)
		rte_exit(EXIT_FAILURE, "Failed to get model info\n");

	/* Get the buffer size for input and output */
	rte_ml_io_input_size_get(0, ctx.id, ctx.info.batch_size, &ctx.input_size, NULL);
	rte_ml_io_output_size_get(0, ctx.id, ctx.info.batch_size, &ctx.output_size, NULL);

	mz = rte_memzone_reserve(IO_MZ, ctx.input_size + ctx.output_size, config.socket_id, 0);
	if (mz == NULL)
		rte_exit(EXIT_FAILURE, "Failed to create IO memzone\n");

	ctx.input_buffer = mz->addr;
	ctx.output_buffer = ctx.input_buffer + ctx.input_size;

	/* Step 10: Fill the input data */
	fp = fopen(ctx.inp_file, "r+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open input file\n");

	if (fread(ctx.input_buffer, 1, ctx.input_size, fp) != ctx.input_size) {
		fclose(fp);
		rte_exit(EXIT_FAILURE, "Failed to read input file\n");
	}
	fclose(fp);

	/* Step 11: Create ML op mempool */
	op_pool = rte_ml_op_pool_create("ml_op_pool", 1, 0, 0, config.socket_id);
	if (op_pool == NULL)
		rte_exit(EXIT_FAILURE, "Failed to create op pool\n");

	/* Step 12: Form an ML op */
	rte_mempool_get_bulk(op_pool, (void **)&op_enq, 1);
	op_enq->model_id = ctx.id;
	op_enq->nb_batches = ctx.info.batch_size;
	op_enq->mempool = op_pool;
	op_enq->input.addr = ctx.input_buffer;
	op_enq->input.length = ctx.input_size;
	op_enq->input.next = NULL;
	op_enq->output.addr = ctx.output_buffer;
	op_enq->output.length = ctx.output_size;
	op_enq->output.next = NULL;

	/* Step 13: Enqueue jobs */
	rte_ml_enqueue_burst(0, 0, &op_enq, 1);

	/* Step 14: Dequeue jobs and release op pool */
	while (rte_ml_dequeue_burst(0, 0, &op_deq, 1) != 1)
		;

	/* Step 15: Write output */
	fp = fopen(ctx.out_file, "w+");
	if (fp == NULL)
		rte_exit(EXIT_FAILURE, "Failed to open output file\n");
	fwrite(ctx.output_buffer, 1, ctx.output_size, fp);
	fclose(fp);

	/* Step 16: Clean up */
	/* Stop ML model */
	rte_ml_model_stop(0, ctx.id);
	/* Unload ML model */
	rte_ml_model_unload(0, ctx.id);
	/* Free input/output memory */
	rte_memzone_free(rte_memzone_lookup(IO_MZ));
	/* Free the ml op back to pool */
	rte_mempool_put_bulk(op_pool, (void **)&op_deq, 1);
	/* Free ml op pool */
	rte_mempool_free(op_pool);
	/* Stop the device */
	rte_ml_dev_stop(0);
	rte_ml_dev_close(0);
	rte_eal_cleanup();

	return 0;
}

Jerin Jacob (1):
  mldev: introduce machine learning device library

 config/rte_config.h             |    3 +
 doc/api/doxy-api-index.md       |    1 +
 doc/api/doxy-api.conf.in        |    1 +
 doc/guides/prog_guide/index.rst |    1 +
 doc/guides/prog_guide/mldev.rst |  164 +++++
 lib/eal/common/eal_common_log.c |    1 +
 lib/eal/include/rte_log.h       |    1 +
 lib/meson.build                 |    1 +
 lib/mldev/meson.build           |   12 +
 lib/mldev/rte_mldev.c           |    5 +
 lib/mldev/rte_mldev.h           | 1081 +++++++++++++++++++++++++++++++
 lib/mldev/version.map           |    5 +
 12 files changed, 1276 insertions(+)
 create mode 100644 doc/guides/prog_guide/mldev.rst
 create mode 100644 lib/mldev/meson.build
 create mode 100644 lib/mldev/rte_mldev.c
 create mode 100644 lib/mldev/rte_mldev.h
 create mode 100644 lib/mldev/version.map
  

Comments

Stephen Hemminger Aug. 3, 2022, 3:19 p.m. UTC | #1
On Wed, 3 Aug 2022 18:58:37 +0530
<jerinj@marvell.com> wrote:

> Roadmap
> -------
> 1) Address the comments for this RFC.
> 2) Common code for mldev
> 3) SW mldev driver based on TVM (https://tvm.apache.org/)

Having a SW implementation is important because then it can be covered
by tests.
  
Jerin Jacob Aug. 16, 2022, 1:13 p.m. UTC | #2
On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Wed, 3 Aug 2022 18:58:37 +0530
> <jerinj@marvell.com> wrote:
>
> > Roadmap
> > -------
> > 1) Address the comments for this RFC.
> > 2) Common code for mldev
> > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
>
> Having a SW implementation is important because then it can be covered
> by tests.

Yes. That is the reason for adding a TVM-based SW driver as item (3).

Are there any other high-level or API-level comments before proceeding
with v1 and the implementation?
Or is anyone else interested in reviewing or contributing to this new DPDK device class?
  
Morten Brørup Aug. 16, 2022, 3:45 p.m. UTC | #3
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Tuesday, 16 August 2022 15.13
> 
> On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > On Wed, 3 Aug 2022 18:58:37 +0530
> > <jerinj@marvell.com> wrote:
> >
> > > Roadmap
> > > -------
> > > 1) Address the comments for this RFC.
> > > 2) Common code for mldev
> > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> >
> > Having a SW implementation is important because then it can be
> covered
> > by tests.
> 
> Yes. That reason for adding TVM based SW driver as item (3).
> 
> Is there any other high level or API level comments before proceeding
> with v1 and implementation.

Have you seriously considered if the DPDK Project is the best home for this project? I can easily imagine the DPDK development process being a hindrance in many aspects for an evolving AI/ML library. Off the top of my head, it would probably be better off as a separate project, like SPDK.

If all this stuff can be completely omitted at build time, I have no objections.

A small note about naming (not intending to start a flame war, so please feel free to ignore!): I haven't worked seriously with ML/AI since university three decades ago, so I'm quite rusty in the domain. However, I don't see any Machine Learning functions proposed by this API. The library provides an API to an Inference Engine - but nobody says the inference model stems from Machine Learning; it might as well be a hand crafted model. Do you plan to propose APIs for training the models? If not, the name of the library could confuse some potential users.

> Or Anyone else interested to review or contribute to this new DPDK
> device class?
  
Honnappa Nagarahalli Aug. 16, 2022, 4:34 p.m. UTC | #4
<snip>

> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Tuesday, 16 August 2022 15.13
> >
> > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > <jerinj@marvell.com> wrote:
> > >
> > > > Roadmap
> > > > -------
> > > > 1) Address the comments for this RFC.
> > > > 2) Common code for mldev
> > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > >
> > > Having a SW implementation is important because then it can be
> > covered
> > > by tests.
> >
> > Yes. That reason for adding TVM based SW driver as item (3).
> >
> > Is there any other high level or API level comments before proceeding
> > with v1 and implementation.
> 
> Have you seriously considered if the DPDK Project is the best home for this
> project? I can easily imagine the DPDK development process being a hindrance
> in many aspects for an evolving AI/ML library. Off the top of my head, it would
> probably be better off as a separate project, like SPDK.
There is a lot of talk about using ML in networking workloads. Although, I am not very sure on how the use case looks like. For ex: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provide some data (what sort of data?)), look aside (does it require the packets to be sent to the inference engine or is it some other data?), what would be an end to end use case? A sample application using these APIs would be helpful.

IMO, if we need to share the packets with the inference engine, then it fits into DPDK.

As I understand, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than inventing our own?

> 
> If all this stuff can be completely omitted at build time, I have no objections.
> 
> A small note about naming (not intending to start a flame war, so please feel
> free to ignore!): I haven't worked seriously with ML/AI since university three
> decades ago, so I'm quite rusty in the domain. However, I don't see any
> Machine Learning functions proposed by this API. The library provides an API to
> an Inference Engine - but nobody says the inference model stems from
> Machine Learning; it might as well be a hand crafted model. Do you plan to
> propose APIs for training the models? If not, the name of the library could
> confuse some potential users.
I think, at least on the edge devices, we need an inference device as ML requires more cycles/power.
 
> 
> > Or Anyone else interested to review or contribute to this new DPDK
> > device class?
  
Jerin Jacob Aug. 17, 2022, 5:37 a.m. UTC | #5
On Tue, Aug 16, 2022 at 9:15 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Tuesday, 16 August 2022 15.13
> >
> > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > >
> > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > <jerinj@marvell.com> wrote:
> > >
> > > > Roadmap
> > > > -------
> > > > 1) Address the comments for this RFC.
> > > > 2) Common code for mldev
> > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > >
> > > Having a SW implementation is important because then it can be
> > covered
> > > by tests.
> >
> > Yes. That reason for adding TVM based SW driver as item (3).
> >
> > Is there any other high level or API level comments before proceeding
> > with v1 and implementation.
>
> Have you seriously considered if the DPDK Project is the best home for this project? I can easily imagine the DPDK development process being a hindrance in many aspects for an evolving AI/ML library. Off the top of my head, it would probably be better off as a separate project, like SPDK.

Yes. The reasons are following

#  AI/ML compiler libraries are more focused on model creation and
training etc. (that's where the AI/ML libraries can offer actual value
addition) and have only a minimal part for inference (it is just added for
testing the model)
# Considering the inference is the scope of the DPDK. DPDK is ideal
place for following reasons

a) Inference scope is very limited.
b) Avoid memcpy of inference data (Use directly from network or
other class of device like cryptodev, regexdev)
c) Reuse highspeed IO interface like  PCI backed driver etc
d) Integration with other DPDK subsystems like eventdev etc for job completion.
e) Also support more inline offloads by merging two device classes
like rte_security.
f) Run the inference model from different AI/ML compiler frameworks or
abstract the inference usage.
Similar concept is already applied to other DPDK device classes like
1) In Regexdev,  The compiler generates the rule database which is out
of scope of DPDK. DPDK API just loads the rule database
2) In Gpudev, the GPU kernel etc. is out of scope of DPDK. DPDK cares about
IO interface.

> If all this stuff can be completely omitted at build time, I have no objections.

Yes, It can be completely omitted at build time. Also no plan to
integrate to testpmd and other existing application. Planning to add
only app/test-mldev application.

>
> A small note about naming (not intending to start a flame war, so please feel free to ignore!): I haven't worked seriously with ML/AI since university three decades ago, so I'm quite rusty in the domain. However, I don't see any Machine Learning functions proposed by this API. The library provides an API to an Inference Engine - but nobody says the inference model stems from Machine Learning; it might as well be a hand crafted model. Do you plan to propose APIs for training the models? If not, the name of the library could confuse some potential users.

No, the scope is only inference and it is documented in the programming
guide and API header file. I am trying to keep name similar to
regexdev, gpudev etc which have similar scope. But I am open to other
shortname/name if you have something in mind.

>
> > Or Anyone else interested to review or contribute to this new DPDK
> > device class?
>
  
Morten Brørup Aug. 17, 2022, 6:58 a.m. UTC | #6
> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Wednesday, 17 August 2022 07.37
> 
> On Tue, Aug 16, 2022 at 9:15 PM Morten Brørup
> <mb@smartsharesystems.com> wrote:
> >
> > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > Sent: Tuesday, 16 August 2022 15.13
> > >
> > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:
> > > >
> > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > <jerinj@marvell.com> wrote:
> > > >
> > > > > Roadmap
> > > > > -------
> > > > > 1) Address the comments for this RFC.
> > > > > 2) Common code for mldev
> > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > >
> > > > Having a SW implementation is important because then it can be
> > > covered
> > > > by tests.
> > >
> > > Yes. That reason for adding TVM based SW driver as item (3).
> > >
> > > Is there any other high level or API level comments before
> proceeding
> > > with v1 and implementation.
> >
> > Have you seriously considered if the DPDK Project is the best home
> for this project? I can easily imagine the DPDK development process
> being a hindrance in many aspects for an evolving AI/ML library. Off
> the top of my head, it would probably be better off as a separate
> project, like SPDK.
> 
> Yes. The reasons are following
> 
> #  AI/ML compiler libraries more focused on model creation and
> training etc (Thats where actual value addition the AI/ML libraries
> can offer) and minimal part for interference(It is just added for
> testing the model)
> # Considering the inference is the scope of the DPDK. DPDK is ideal
> place for following reasons
> 
> a) Inference scope is very limited.
> b) Avoid memcpy of interference data (Use directly from network or
> other class of device like cryptodev, regexdev)
> c) Reuse highspeed IO interface like  PCI backed driver etc
> d) Integration with other DPDK subsystems like eventdev etc for job
> completion.
> e) Also support more inline offloads by merging two device classes
> like rte_secuity.
> f) Run the inference model from different AI/ML compiler frameworks or
> abstract the inference usage.
> Similar concept is already applied to other DPDK device classes like
> 1) In Regexdev,  The compiler generates the rule database which is out
> of scope of DPDK. DPDK API just loads the rule database
> 2) In Gpudev, The GPU kernel etc out of scope of DPDK.DPDK cares about
> IO interface.

Thank you for the detailed reply, Jerin.

These are good reasons for adding the new device class to the DPDK project - especially the Regexdev comparison got me convinced.

> 
> > If all this stuff can be completely omitted at build time, I have no
> objections.
> 
> Yes, It can be completely omitted at build time.

Perfect.

> Also no plan to
> integrate to testpmd and other existing application. Planning to add
> only app/test-mldev application.

+1 to that

> 
> >
> > A small note about naming (not intending to start a flame war, so
> please feel free to ignore!): I haven't worked seriously with ML/AI
> since university three decades ago, so I'm quite rusty in the domain.
> However, I don't see any Machine Learning functions proposed by this
> API. The library provides an API to an Inference Engine - but nobody
> says the inference model stems from Machine Learning; it might as well
> be a hand crafted model. Do you plan to propose APIs for training the
> models? If not, the name of the library could confuse some potential
> users.
> 
> No, scope is only inference and it is documented in the programing
> guide and API header file. I am trying to keep name similar to
> regexdev, gpudev etc which have similar scope. But I am open to other
> shortname/name if you have something in mind.

The AI(Artificial Intelligence)/ML(Machine Learning)/IE(Inference Engine) chip market still seems immature and fragmented, so I can't find any consensus on generic names for such hardware accelerator devices.

Some of the chip vendors represented on the DPDK mailing list offer AI/ML/IE accelerator chips. Perhaps their marketing department could propose alternatives to "Machine Learning Device"/"mldev" for inference engine devices (with no acceleration for training the models). If not, the initially proposed name is good enough.

So: Everyone ask your marketing departments and speak up now, or the name "mldev" will be set in stone. ;-)

I'm thinking: While "Inference Engine Device"/iedev might be technically more correct, it doesn't have same value as "Machine Learning Device"/"mldev" on a marketing scale. And we should choose a name that we expect might become industry standard consensus.

> 
> >
> > > Or Anyone else interested to review or contribute to this new DPDK
> > > device class?
> >
  
Jerin Jacob Aug. 17, 2022, 2:53 p.m. UTC | #7
On Tue, Aug 16, 2022 at 10:04 PM Honnappa Nagarahalli
<Honnappa.Nagarahalli@arm.com> wrote:
>
> <snip>
>
> > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > Sent: Tuesday, 16 August 2022 15.13
> > >
> > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > <stephen@networkplumber.org> wrote:
> > > >
> > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > <jerinj@marvell.com> wrote:
> > > >
> > > > > Roadmap
> > > > > -------
> > > > > 1) Address the comments for this RFC.
> > > > > 2) Common code for mldev
> > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > >
> > > > Having a SW implementation is important because then it can be
> > > covered
> > > > by tests.
> > >
> > > Yes. That reason for adding TVM based SW driver as item (3).
> > >
> > > Is there any other high level or API level comments before proceeding
> > > with v1 and implementation.
> >
> > Have you seriously considered if the DPDK Project is the best home for this
> > project? I can easily imagine the DPDK development process being a hindrance
> > in many aspects for an evolving AI/ML library. Off the top of my head, it would
> > probably be better off as a separate project, like SPDK.
> There is a lot of talk about using ML in networking workloads. Although, I am not very sure on how the use case looks like. For ex: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provide some data (what sort of data?)), look aside (does it require the packets to be sent to the inference engine or is it some other data?), what would be an end to end use case? A sample application using these APIs would be helpful.

Simple application for the inference usage is added in the cover letter.

Regarding the use cases, there are many like firewall, intrusion
detection etc. Most of the use cases are driven by product
requirements and SW IP vendors try to keep them to themselves as a
product differentiating factor.
That is the prime reason the DPDK scope is only inference, where IO is
involved. Model creation and training etc. will heavily vary based on the
use case, but not the inference model.

>
> IMO, if we need to share the packets with the inference engine, then it fits into DPDK.

Yes. For networking or ORAN use cases the inference data comes
over the wire and the result can go over the wire.

>
> As I understand, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than inventing our own?

#  AI/ML compiler libraries more focused on model creation and
training etc (Thats where actual value addition the AI/ML libraries
can offer) and
minimal part for inference (It is just added for testing the model)
# Considering the inference is the scope of the DPDK. DPDK is ideal
place for following reasons

a) Inference scope is very limited.
b) Avoid memcpy of inference data (Use directly from network or
other class of device like cryptodev, regexdev)
c) Reuse highspeed IO interface like  PCI backed driver etc
d) Integration with other DPDK subsystems like eventdev etc for job completion.
e) Also support more inline offloads by merging two device classes
like rte_security.
f) Run the inference model from different AI/ML compiler frameworks or
abstract the inference usage.
Similar concept is already applied to other DPDK device classes like
1) In Regexdev,  The compiler generates the rule database which is out
of scope of DPDK. DPDK API just loads the rule database
2) In Gpudev, the GPU kernel etc. is out of scope of DPDK. DPDK cares about
IO interface.

>
> >
> > If all this stuff can be completely omitted at build time, I have no objections.
> >
> > A small note about naming (not intending to start a flame war, so please feel
> > free to ignore!): I haven't worked seriously with ML/AI since university three
> > decades ago, so I'm quite rusty in the domain. However, I don't see any
> > Machine Learning functions proposed by this API. The library provides an API to
> > an Inference Engine - but nobody says the inference model stems from
> > Machine Learning; it might as well be a hand crafted model. Do you plan to
> > propose APIs for training the models? If not, the name of the library could
> > confuse some potential users.
> I think, at least on the edge devices, we need an inference device as ML requires more cycles/power.
>
> >
> > > Or Anyone else interested to review or contribute to this new DPDK
> > > device class?
>
  
Thomas Monjalon Jan. 25, 2023, 1:45 p.m. UTC | #8
17/08/2022 08:58, Morten Brørup:
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Wednesday, 17 August 2022 07.37
> > 
> > On Tue, Aug 16, 2022 at 9:15 PM Morten Brørup
> > <mb@smartsharesystems.com> wrote:
> > >
> > > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > > Sent: Tuesday, 16 August 2022 15.13
> > > >
> > > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > > <stephen@networkplumber.org> wrote:
> > > > >
> > > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > > <jerinj@marvell.com> wrote:
> > > > >
> > > > > > Roadmap
> > > > > > -------
> > > > > > 1) Address the comments for this RFC.
> > > > > > 2) Common code for mldev
> > > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > > >
> > > > > Having a SW implementation is important because then it can be
> > > > covered
> > > > > by tests.
> > > >
> > > > Yes. That reason for adding TVM based SW driver as item (3).
> > > >
> > > > Is there any other high level or API level comments before
> > proceeding
> > > > with v1 and implementation.
> > >
> > > Have you seriously considered if the DPDK Project is the best home
> > for this project? I can easily imagine the DPDK development process
> > being a hindrance in many aspects for an evolving AI/ML library. Off
> > the top of my head, it would probably be better off as a separate
> > project, like SPDK.
> > 
> > Yes. The reasons are following
> > 
> > #  AI/ML compiler libraries more focused on model creation and
> > training etc (Thats where actual value addition the AI/ML libraries
> > can offer) and minimal part for interference(It is just added for
> > testing the model)
> > # Considering the inference is the scope of the DPDK. DPDK is ideal
> > place for following reasons
> > 
> > a) Inference scope is very limited.
> > b) Avoid memcpy of interference data (Use directly from network or
> > other class of device like cryptodev, regexdev)
> > c) Reuse highspeed IO interface like  PCI backed driver etc
> > d) Integration with other DPDK subsystems like eventdev etc for job
> > completion.
> > e) Also support more inline offloads by merging two device classes
> > like rte_secuity.
> > f) Run the inference model from different AI/ML compiler frameworks or
> > abstract the inference usage.
> > Similar concept is already applied to other DPDK device classes like
> > 1) In Regexdev,  The compiler generates the rule database which is out
> > of scope of DPDK. DPDK API just loads the rule database
> > 2) In Gpudev, The GPU kernel etc out of scope of DPDK.DPDK cares about
> > IO interface.
> 
> Thank you for the detailed reply, Jerin.
> 
> These are good reasons for adding the new device class to the DPDK project - especially the Regexdev comparison got me convinced.
> 
> > 
> > > If all this stuff can be completely omitted at build time, I have no
> > objections.
> > 
> > Yes, It can be completely omitted at build time.
> 
> Perfect.
> 
> > Also no plan to
> > integrate to testpmd and other existing application. Planning to add
> > only app/test-mldev application.
> 
> +1 to that
> 
> > 
> > >
> > > A small note about naming (not intending to start a flame war, so
> > please feel free to ignore!): I haven't worked seriously with ML/AI
> > since university three decades ago, so I'm quite rusty in the domain.
> > However, I don't see any Machine Learning functions proposed by this
> > API. The library provides an API to an Inference Engine - but nobody
> > says the inference model stems from Machine Learning; it might as well
> > be a hand crafted model. Do you plan to propose APIs for training the
> > models? If not, the name of the library could confuse some potential
> > users.
> > 
> > No, scope is only inference and it is documented in the programing
> > guide and API header file. I am trying to keep name similar to
> > regexdev, gpudev etc which have similar scope. But I am open to other
> > shortname/name if you have something in mind.
> 
> The AI(Artificial Intelligence)/ML(Machine Learning)/IE(Inference Engine) chip market still seems immature and fragmented, so I can't find any consensus on generic names for such hardware accelerator devices.
> 
> Some of the chip vendors represented on the DPDK mailing list offer AI/ML/IE accelerator chips. Perhaps their marketing department could propose alternatives to "Machine Learning Device"/"mldev" for inference engine devices (with no acceleration for training the models). If not, the initially proposed name is good enough.
> 
> So: Everyone ask your marketing departments and speak up now, or the name "mldev" will be set in stone. ;-)
> 
> I'm thinking: While "Inference Engine Device"/iedev might be technically more correct, it doesn't have same value as "Machine Learning Device"/"mldev" on a marketing scale. And we should choose a name that we expect might become industry standard consensus.

I don't know why but I like mldev and dislike iedev.
I could be OK with aidev as well.
  
Thomas Monjalon Jan. 25, 2023, 1:47 p.m. UTC | #9
17/08/2022 16:53, Jerin Jacob:
> On Tue, Aug 16, 2022 at 10:04 PM Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com> wrote:
> >
> > <snip>
> >
> > > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > > Sent: Tuesday, 16 August 2022 15.13
> > > >
> > > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > > <stephen@networkplumber.org> wrote:
> > > > >
> > > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > > <jerinj@marvell.com> wrote:
> > > > >
> > > > > > Roadmap
> > > > > > -------
> > > > > > 1) Address the comments for this RFC.
> > > > > > 2) Common code for mldev
> > > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > > >
> > > > > Having a SW implementation is important because then it can be
> > > > covered
> > > > > by tests.
> > > >
> > > > Yes. That reason for adding TVM based SW driver as item (3).
> > > >
> > > > Is there any other high level or API level comments before proceeding
> > > > with v1 and implementation.
> > >
> > > Have you seriously considered if the DPDK Project is the best home for this
> > > project? I can easily imagine the DPDK development process being a hindrance
> > > in many aspects for an evolving AI/ML library. Off the top of my head, it would
> > > probably be better off as a separate project, like SPDK.
> > There is a lot of talk about using ML in networking workloads. Although, I am not very sure on how the use case looks like. For ex: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provide some data (what sort of data?)), look aside (does it require the packets to be sent to the inference engine or is it some other data?), what would be an end to end use case? A sample application using these APIs would be helpful.
> 
> Simple application for the inference usage is added in the cover letter.
> 
> Regarding the use cases, There are many like firewall, intrusion
> detection etc. Most of the use cases are driven by product
> requirements and SW IP vendors try to keep it to themselves as a
> product differentiate factor.
> That is the prime reason for DPDK scope only for inference where IO is
> involved. Model creation and training etc will heavily vary based on
> use case but not the inference model.
> 
> >
> > IMO, if we need to share the packets with the inference engine, then it fits into DPDK.
> 
> Yes. Yes for networking or ORAN use cases the interface data comes
> over wire and result can go over wire.
> 
> >
> > As I understand, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than inventing our own?
> 
> #  AI/ML compiler libraries more focused on model creation and
> training etc (Thats where actual value addition the AI/ML libraries
> can offer) and
> minimal part for inference (It is just added for testing the model)
> # Considering the inference is the scope of the DPDK. DPDK is ideal
> place for following reasons
> 
> a) Inference scope is very limited.
> b) Avoid memcpy of inference data (Use directly from network or
> other class of device like cryptodev, regexdev)
> c) Reuse highspeed IO interface like  PCI backed driver etc
> d) Integration with other DPDK subsystems like eventdev etc for job completion.
> e) Also support more inline offloads by merging two device classes
> like rte_secuity.
> f) Run the inference model from different AI/ML compiler frameworks or
> abstract the inference usage.
> Similar concept is already applied to other DPDK device classes like
> 1) In Regexdev,  The compiler generates the rule database which is out
> of scope of DPDK. DPDK API just loads the rule database
> 2) In Gpudev, The GPU kernel etc out of scope of DPDK.DPDK cares about
> IO interface.

I think Honnappa was thinking about linking an existing inference library.
What are the similar libraries?
  
Jerin Jacob Jan. 25, 2023, 1:54 p.m. UTC | #10
On Wed, Jan 25, 2023 at 7:17 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 17/08/2022 16:53, Jerin Jacob:
> > On Tue, Aug 16, 2022 at 10:04 PM Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com> wrote:
> > >
> > > <snip>
> > >
> > > > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > > > Sent: Tuesday, 16 August 2022 15.13
> > > > >
> > > > > On Wed, Aug 3, 2022 at 8:49 PM Stephen Hemminger
> > > > > <stephen@networkplumber.org> wrote:
> > > > > >
> > > > > > On Wed, 3 Aug 2022 18:58:37 +0530
> > > > > > <jerinj@marvell.com> wrote:
> > > > > >
> > > > > > > Roadmap
> > > > > > > -------
> > > > > > > 1) Address the comments for this RFC.
> > > > > > > 2) Common code for mldev
> > > > > > > 3) SW mldev driver based on TVM (https://tvm.apache.org/)
> > > > > >
> > > > > > Having a SW implementation is important because then it can be
> > > > > covered
> > > > > > by tests.
> > > > >
> > > > > Yes. That reason for adding TVM based SW driver as item (3).
> > > > >
> > > > > Is there any other high level or API level comments before proceeding
> > > > > with v1 and implementation.
> > > >
> > > > Have you seriously considered if the DPDK Project is the best home for this
> > > > project? I can easily imagine the DPDK development process being a hindrance
> > > > in many aspects for an evolving AI/ML library. Off the top of my head, it would
> > > > probably be better off as a separate project, like SPDK.
> > > There is a lot of talk about using ML in networking workloads. Although, I am not very sure on how the use case looks like. For ex: is the inference engine going to be inline (i.e. the packet goes through the inference engine before coming to the CPU and provide some data (what sort of data?)), look aside (does it require the packets to be sent to the inference engine or is it some other data?), what would be an end to end use case? A sample application using these APIs would be helpful.
> >
> > Simple application for the inference usage is added in the cover letter.
> >
> > Regarding the use cases, There are many like firewall, intrusion
> > detection etc. Most of the use cases are driven by product
> > requirements and SW IP vendors try to keep it to themselves as a
> > product differentiate factor.
> > That is the prime reason for DPDK scope only for inference where IO is
> > involved. Model creation and training etc will heavily vary based on
> > use case but not the inference model.
> >
> > >
> > > IMO, if we need to share the packets with the inference engine, then it fits into DPDK.
> >
> > Yes. Yes for networking or ORAN use cases the interface data comes
> > over wire and result can go over wire.
> >
> > >
> > > As I understand, there are many mature open source projects for ML/inference outside of DPDK. Does it make sense for DPDK to adopt those projects rather than inventing our own?
> >
> > #  AI/ML compiler libraries more focused on model creation and
> > training etc (Thats where actual value addition the AI/ML libraries
> > can offer) and
> > minimal part for inference (It is just added for testing the model)
> > # Considering the inference is the scope of the DPDK. DPDK is ideal
> > place for following reasons
> >
> > a) Inference scope is very limited.
> > b) Avoid memcpy of inference data (Use directly from network or
> > other class of device like cryptodev, regexdev)
> > c) Reuse highspeed IO interface like  PCI backed driver etc
> > d) Integration with other DPDK subsystems like eventdev etc for job completion.
> > e) Also support more inline offloads by merging two device classes
> > like rte_secuity.
> > f) Run the inference model from different AI/ML compiler frameworks or
> > abstract the inference usage.
> > Similar concept is already applied to other DPDK device classes like
> > 1) In Regexdev,  The compiler generates the rule database which is out
> > of scope of DPDK. DPDK API just loads the rule database
> > 2) In Gpudev, The GPU kernel etc out of scope of DPDK.DPDK cares about
> > IO interface.
>
> I think Honnappa was thinking about linking an existing inference library.
> What are the similar libraries?

Not sure. Honnappa can tell if there is any which meets (a) to (f).

>
>