[1/4] compressdev: add LZ4 algorithm support

Message ID 20230109074526.2507344-2-michaelba@nvidia.com (mailing list archive)
State Superseded, archived
Delegated to: akhil goyal
Series compressdev: add LZ4 support

Checks

Context                   Check    Description
ci/checkpatch             success  coding style OK
ci/loongarch-compilation  warning  apply patch failure

Commit Message

Michael Baum Jan. 9, 2023, 7:45 a.m. UTC
Add support for the LZ4 algorithm:
 - Add LZ4 param structure to XFORM structures.
 - Add capability flags for LZ4 params.
 - Add xxHash-32 checksum and capability flag.

Signed-off-by: Michael Baum <michaelba@nvidia.com>
---
 doc/guides/compressdevs/features/default.ini |  4 ++
 doc/guides/rel_notes/release_23_03.rst       |  7 +++
 lib/compressdev/rte_comp.c                   |  6 +++
 lib/compressdev/rte_comp.h                   | 51 +++++++++++++++++++-
 4 files changed, 66 insertions(+), 2 deletions(-)
  

Comments

Akhil Goyal Jan. 30, 2023, 6:35 p.m. UTC | #1
> +/**
> + * Block checksum flag.
> + * If this flag is set, each data block will be followed by a 4-bytes checksum,
> + * calculated by using the xxHash-32 algorithm on the raw (compressed) data
> + * block. The intention is to detect data corruption (storage or transmission
> + * errors) immediately, before decoding. Block checksum usage is optional.
> + */
> +#define RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM (1 << 4)
> +
> +/**
> + * Block Independence flag.
> + * If this flag is set to 1, blocks are independent.
> + * If this flag is set to 0, each block depends on previous ones (up to LZ4
> + * window size, which is 64 KB). In such case, it is necessary to decode all
> + * blocks in sequence.
> + * Block dependency improves compression ratio, especially for small blocks. On
> + * the other hand, it makes random access or multi-threaded decoding impossible.
> + */
> +#define RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE (1 << 5)

Why did you start with the 4th and 5th bits of the flags? Why not the first two bits?

++ more people for review.
  
Michael Baum Jan. 30, 2023, 8:36 p.m. UTC | #2
On Mon, Jan 30, 2023 at 20:35 PM Akhil Goyal <gakhil@marvell.com> wrote: 
> 
> > +/**
> > + * Block checksum flag.
> > + * If this flag is set, each data block will be followed by a 4-bytes checksum,
> > + * calculated by using the xxHash-32 algorithm on the raw (compressed) data
> > + * block. The intention is to detect data corruption (storage or transmission
> > + * errors) immediately, before decoding. Block checksum usage is optional.
> > + */
> > +#define RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM (1 << 4)
> > +
> > +/**
> > + * Block Independence flag.
> > + * If this flag is set to 1, blocks are independent.
> > + * If this flag is set to 0, each block depends on previous ones (up to LZ4
> > + * window size, which is 64 KB). In such case, it is necessary to decode all
> > + * blocks in sequence.
> > + * Block dependency improves compression ratio, especially for small blocks. On
> > + * the other hand, it makes random access or multi-threaded decoding impossible.
> > + */
> > +#define RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE (1 << 5)
> 
> Why did you start with 4th and 5th bit of the flags? Why not first two bits?

I didn't choose the values myself; I took them from the LZ4 standard:
https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md#frame-descriptor

> 
> ++ more people for review.
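
For reference, the bit positions cited in this reply can be sketched as below. In the LZ4 frame format, the FLG byte of the frame descriptor holds the version in bits 7-6, block independence in bit 5 and block checksum in bit 4, which is why the patch uses `(1 << 5)` and `(1 << 4)`. The `EX_LZ4_FLG_*` names and the helper `ex_lz4_flg_from_patch_flags()` are hypothetical illustrations and are not part of the patch, DPDK, or liblz4; only the two `RTE_COMP_LZ4_FLAG_*` values come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for the FLG byte layout of the LZ4 frame descriptor. */
#define EX_LZ4_FLG_VERSION          (1u << 6) /* bits 7-6 hold the version, "01" */
#define EX_LZ4_FLG_BLOCK_INDEP      (1u << 5)
#define EX_LZ4_FLG_BLOCK_CHECKSUM   (1u << 4)
#define EX_LZ4_FLG_CONTENT_SIZE     (1u << 3)
#define EX_LZ4_FLG_CONTENT_CHECKSUM (1u << 2)
#define EX_LZ4_FLG_RESERVED         (1u << 1)
#define EX_LZ4_FLG_DICT_ID          (1u << 0)

/* Values as proposed in the patch: identical to the FLG bit positions. */
#define RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM     (1 << 4)
#define RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE (1 << 5)

/*
 * Hypothetical helper: because the patch flags already sit at their FLG
 * positions, they can be OR-ed into the FLG byte without any shifting.
 * Bits other than the two flags defined by the patch are masked out.
 */
static uint8_t ex_lz4_flg_from_patch_flags(uint8_t flags)
{
	uint8_t flg = (uint8_t)EX_LZ4_FLG_VERSION;

	flg |= (uint8_t)(flags & (RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM |
				  RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE));

	/* The reserved bit must stay zero in a valid frame descriptor. */
	assert((flg & EX_LZ4_FLG_RESERVED) == 0);
	return flg;
}
```

With this layout, passing only `RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE` yields an FLG byte of `0x60` (version bits plus bit 5).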
  
Akhil Goyal Jan. 31, 2023, 6:29 a.m. UTC | #3
> -----Original Message-----
> From: Michael Baum <michaelba@nvidia.com>
> Sent: Tuesday, January 31, 2023 2:07 AM
> To: Akhil Goyal <gakhil@marvell.com>; dev@dpdk.org; Mahipal Challa
> <mchalla@marvell.com>; Fan Zhang <fanzhang.oss@gmail.com>; Ashish Gupta
> <ashishg@marvell.com>
> Cc: Matan Azrad <matan@nvidia.com>; Fiona Trahe <fiona.trahe@intel.com>;
> NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>
> Subject: RE: [EXT] [PATCH 1/4] compressdev: add LZ4 algorithm support
> 
> On Mon, Jan 30, 2023 at 20:35 PM Akhil Goyal <gakhil@marvell.com> wrote:
> >
> > > +/**
> > > + * Block checksum flag.
> > > + * If this flag is set, each data block will be followed by a 4-bytes checksum,
> > > + * calculated by using the xxHash-32 algorithm on the raw (compressed) data
> > > + * block. The intention is to detect data corruption (storage or transmission
> > > + * errors) immediately, before decoding. Block checksum usage is optional.
> > > + */
> > > +#define RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM (1 << 4)
> > > +
> > > +/**
> > > + * Block Independence flag.
> > > + * If this flag is set to 1, blocks are independent.
> > > + * If this flag is set to 0, each block depends on previous ones (up to LZ4
> > > + * window size, which is 64 KB). In such case, it is necessary to decode all
> > > + * blocks in sequence.
> > > + * Block dependency improves compression ratio, especially for small blocks. On
> > > + * the other hand, it makes random access or multi-threaded decoding impossible.
> > > + */
> > > +#define RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE (1 << 5)
> >
> > Why did you start with 4th and 5th bit of the flags? Why not first two bits?
> 
> I didn't choose the values by myself, I took them from LZ4 standard:
> https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md#frame-descriptor

OK, are we defining and building the frame descriptor inside the library?
Also, why were not all of the bits added?
I believe these are just flags which the PMD would use to build the frame descriptors.
If so, we should start with the 0th bit and let the PMD handle the bit shifting.
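
The alternative suggested in this comment could look roughly like the sketch below: the API flags are numbered from bit 0, and the driver translates them into the on-wire FLG byte. All `EX_*` names and `ex_pmd_build_flg()` are hypothetical and only illustrate the idea. They are not taken from the patch or from any PMD, and a real driver may program hardware rather than build the byte itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical API-level flags, numbered from bit 0 as the review suggests. */
#define EX_API_LZ4_FLAG_BLOCK_CHECKSUM     (1u << 0)
#define EX_API_LZ4_FLAG_BLOCK_INDEPENDENCE (1u << 1)

/* On-wire positions in the FLG byte of the LZ4 frame descriptor. */
#define EX_FLG_VERSION        (1u << 6) /* version "01" in bits 7-6 */
#define EX_FLG_BLOCK_INDEP    (1u << 5)
#define EX_FLG_BLOCK_CHECKSUM (1u << 4)

/*
 * Hypothetical PMD-side translation: map each API flag to its FLG bit,
 * so that the API numbering is decoupled from the frame format.
 */
static uint8_t ex_pmd_build_flg(uint8_t api_flags)
{
	uint8_t flg = (uint8_t)EX_FLG_VERSION;

	/* Reject flags this sketch does not know about. */
	assert((api_flags & ~(EX_API_LZ4_FLAG_BLOCK_CHECKSUM |
			      EX_API_LZ4_FLAG_BLOCK_INDEPENDENCE)) == 0);

	if (api_flags & EX_API_LZ4_FLAG_BLOCK_CHECKSUM)
		flg |= (uint8_t)EX_FLG_BLOCK_CHECKSUM;
	if (api_flags & EX_API_LZ4_FLAG_BLOCK_INDEPENDENCE)
		flg |= (uint8_t)EX_FLG_BLOCK_INDEP;

	return flg;
}
```

The trade-off is one extra mapping step in each driver in exchange for an API whose flag values do not depend on the LZ4 frame layout.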
  

Patch

diff --git a/doc/guides/compressdevs/features/default.ini b/doc/guides/compressdevs/features/default.ini
index e1419ee8db..208b8591cf 100644
--- a/doc/guides/compressdevs/features/default.ini
+++ b/doc/guides/compressdevs/features/default.ini
@@ -20,8 +20,12 @@  OOP SGL In LB  Out     =
 OOP LB  In SGL Out     =
 Deflate                =
 LZS                    =
+LZ4                    =
 Adler32                =
 Crc32                  =
 Adler32&Crc32          =
+xxHash32               =
 Fixed                  =
 Dynamic                =
+LZ4 Block Checksum     =
+LZ4 Block Independence =
diff --git a/doc/guides/rel_notes/release_23_03.rst b/doc/guides/rel_notes/release_23_03.rst
index b8c5b68d6c..3bcf638544 100644
--- a/doc/guides/rel_notes/release_23_03.rst
+++ b/doc/guides/rel_notes/release_23_03.rst
@@ -55,6 +55,13 @@  New Features
      Also, make sure to start the actual text at the margin.
      =======================================================
 
+* **Added LZ4 algorithm in Compressdev Library.**
+
+  Added new compression algorithm, including:
+
+  * Added support for ``RTE_COMP_ALGO_LZ4``.
+  * Added support for ``RTE_COMP_CHECKSUM_XXHASH32``.
+
 
 Removed Items
 -------------
diff --git a/lib/compressdev/rte_comp.c b/lib/compressdev/rte_comp.c
index 320c6dab92..2ad05dbf01 100644
--- a/lib/compressdev/rte_comp.c
+++ b/lib/compressdev/rte_comp.c
@@ -39,6 +39,12 @@  rte_comp_get_feature_name(uint64_t flag)
 		return "HUFFMAN_FIXED";
 	case RTE_COMP_FF_HUFFMAN_DYNAMIC:
 		return "HUFFMAN_DYNAMIC";
+	case RTE_COMP_FF_XXHASH32_CHECKSUM:
+		return "XXHASH32_CHECKSUM";
+	case RTE_COMP_FF_LZ4_BLOCK_INDEPENDENCE:
+		return "LZ4_BLOCK_INDEPENDENCE";
+	case RTE_COMP_FF_LZ4_BLOCK_WITH_CHECKSUM:
+		return "LZ4_BLOCK_WITH_CHECKSUM";
 	default:
 		return NULL;
 	}
diff --git a/lib/compressdev/rte_comp.h b/lib/compressdev/rte_comp.h
index a8f398b57b..ec206d17cf 100644
--- a/lib/compressdev/rte_comp.h
+++ b/lib/compressdev/rte_comp.h
@@ -67,6 +67,12 @@  extern "C" {
 /**< Fixed huffman encoding is supported */
 #define RTE_COMP_FF_HUFFMAN_DYNAMIC		(1ULL << 14)
 /**< Dynamic huffman encoding is supported */
+#define RTE_COMP_FF_XXHASH32_CHECKSUM		(1ULL << 15)
+/**< xxHash-32 Checksum is supported */
+#define RTE_COMP_FF_LZ4_BLOCK_INDEPENDENCE	(1ULL << 16)
+/**< LZ4 block independent is supported */
+#define RTE_COMP_FF_LZ4_BLOCK_WITH_CHECKSUM	(1ULL << 17)
+/**< LZ4 block with checksum is supported */
 
 /** Status of comp operation */
 enum rte_comp_op_status {
@@ -109,6 +115,10 @@  enum rte_comp_algorithm {
 	/**< LZS compression algorithm
 	 * https://tools.ietf.org/html/rfc2395
 	 */
+	RTE_COMP_ALGO_LZ4,
+	/**< LZ4 compression algorithm
+	 * https://github.com/lz4/lz4
+	 */
 	RTE_COMP_ALGO_LIST_END
 };
 
@@ -149,9 +159,12 @@  enum rte_comp_checksum_type {
 	/**< Generates both Adler-32 and CRC32 checksums, concatenated.
 	 * CRC32 is in the lower 32bits, Adler-32 in the upper 32 bits.
 	 */
+	RTE_COMP_CHECKSUM_XXHASH32,
+	/**< Generates a xxHash-32 checksum, as used by lz4.
+	 * https://github.com/Cyan4973/xxHash/blob/dev/doc/xxhash_spec.md
+	 */
 };
 
-
 /** Compression Huffman Type - used by DEFLATE algorithm */
 enum rte_comp_huffman {
 	RTE_COMP_HUFFMAN_DEFAULT,
@@ -208,13 +221,41 @@  enum rte_comp_op_type {
 	 */
 };
 
-
 /** Parameters specific to the deflate algorithm */
 struct rte_comp_deflate_params {
 	enum rte_comp_huffman huffman;
 	/**< Compression huffman encoding type */
 };
 
+/**
+ * Block checksum flag.
+ * If this flag is set, each data block will be followed by a 4-bytes checksum,
+ * calculated by using the xxHash-32 algorithm on the raw (compressed) data
+ * block. The intention is to detect data corruption (storage or transmission
+ * errors) immediately, before decoding. Block checksum usage is optional.
+ */
+#define RTE_COMP_LZ4_FLAG_BLOCK_CHECKSUM (1 << 4)
+
+/**
+ * Block Independence flag.
+ * If this flag is set to 1, blocks are independent.
+ * If this flag is set to 0, each block depends on previous ones (up to LZ4
+ * window size, which is 64 KB). In such case, it is necessary to decode all
+ * blocks in sequence.
+ * Block dependency improves compression ratio, especially for small blocks. On
+ * the other hand, it makes random access or multi-threaded decoding impossible.
+ */
+#define RTE_COMP_LZ4_FLAG_BLOCK_INDEPENDENCE (1 << 5)
+
+/** Parameters specific to the LZ4 algorithm */
+struct rte_comp_lz4_params {
+	uint8_t flags;
+	/**< Compression LZ4 parameter flags.
+	 * Based on LZ4 standard flags:
+	 * https://github.com/lz4/lz4/blob/dev/doc/lz4_Frame_format.md#frame-descriptor
+	 */
+};
+
 /** Setup Data for compression */
 struct rte_comp_compress_xform {
 	enum rte_comp_algorithm algo;
@@ -222,6 +263,8 @@  struct rte_comp_compress_xform {
 	union {
 		struct rte_comp_deflate_params deflate;
 		/**< Parameters specific to the deflate algorithm */
+		struct rte_comp_lz4_params lz4;
+		/**< Parameters specific to the LZ4 algorithm */
 	}; /**< Algorithm specific parameters */
 	int level;
 	/**< Compression level */
@@ -251,6 +294,10 @@  struct rte_comp_decompress_xform {
 	 * compressed data. If window size can't be supported by the PMD then
 	 * setup of stream or private_xform should fail.
 	 */
+	union {
+		struct rte_comp_lz4_params lz4;
+		/**< Parameters specific to the LZ4 algorithm */
+	}; /**< Algorithm specific parameters */
 	enum rte_comp_hash_algorithm hash_algo;
 	/**< Hash algorithm to be used with decompress operation. Hash is always
 	 * done on plaintext.