From patchwork Wed Jul 3 13:45:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jack Bond-Preston
X-Patchwork-Id: 142086
X-Patchwork-Delegate: gakhil@marvell.com
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id C102345561;
 Wed, 3 Jul 2024 15:46:17 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id CD6FA42E50;
 Wed, 3 Jul 2024 15:46:10 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id CEAA340265;
 Wed, 3 Jul 2024 15:46:08 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 4DAE7367;
 Wed, 3 Jul 2024 06:46:33 -0700 (PDT)
Received: from cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com
 (cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com [10.7.10.57])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6E3F23F762;
 Wed, 3 Jul 2024 06:46:07 -0700 (PDT)
From: Jack Bond-Preston
To: Kai Ji, Fan Zhang, Akhil Goyal
Cc: dev@dpdk.org, stable@dpdk.org, Wathsala Vithanage
Subject: [PATCH v5 1/5] crypto/openssl: fix GCM and CCM thread unsafe ctxs
Date: Wed, 3 Jul 2024 13:45:47 +0000
Message-Id: <20240703134552.1439633-2-jack.bond-preston@foss.arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
References: <20240603160119.1279476-1-jack.bond-preston@foss.arm.com>
 <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Commit 67ab783b5d70 ("crypto/openssl: use
local copy for session contexts") introduced a fix for concurrency bugs
which could occur when using one OpenSSL PMD session across multiple cores
simultaneously. The solution was to clone the EVP contexts per-buffer to
avoid them being used concurrently.

However, part of commit 75adf1eae44f ("crypto/openssl: update HMAC routine
with 3.0 EVP API") reverted this fix, only for combined ops (AES-GCM and
AES-CCM).

Fix the concurrency issue by cloning EVP contexts per-buffer. An extra
workaround is required for OpenSSL versions >= 3.0.0 and < 3.2.0. This is
because, prior to OpenSSL 3.2.0, EVP_CIPHER_CTX_copy() is not implemented
for AES-GCM or AES-CCM. When using these OpenSSL versions, create and
initialise the context from scratch, per-buffer.

Throughput performance uplift measurements for AES-GCM-128 encrypt on
Ampere Altra Max platform:

1 worker lcore
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          2.60 |               1.31 |   -49.5% |
|             256 |          7.69 |               4.45 |   -42.1% |
|            1024 |         15.33 |              11.30 |   -26.3% |
|            2048 |         18.74 |              15.37 |   -18.0% |
|            4096 |         21.11 |              18.80 |   -10.9% |

8 worker lcores
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |         19.94 |               2.83 |   -85.8% |
|             256 |         58.84 |              11.00 |   -81.3% |
|            1024 |        119.71 |              42.46 |   -64.5% |
|            2048 |        147.69 |              80.91 |   -45.2% |
|            4096 |        167.39 |             121.25 |   -27.6% |

Fixes: 75adf1eae44f ("crypto/openssl: update HMAC routine with 3.0 EVP API")
Cc: stable@dpdk.org

Signed-off-by: Jack Bond-Preston
Acked-by: Kai Ji
Reviewed-by: Wathsala Vithanage
---
 drivers/crypto/openssl/rte_openssl_pmd.c | 84 ++++++++++++++++++------
 1 file changed, 64 insertions(+), 20 deletions(-)

diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c b/drivers/crypto/openssl/rte_openssl_pmd.c
index e8cb09defc..c661528738 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd.c
+++ 
b/drivers/crypto/openssl/rte_openssl_pmd.c @@ -350,7 +350,8 @@ get_aead_algo(enum rte_crypto_aead_algorithm sess_algo, size_t keylen, static int openssl_set_sess_aead_enc_param(struct openssl_session *sess, enum rte_crypto_aead_algorithm algo, - uint8_t tag_len, const uint8_t *key) + uint8_t tag_len, const uint8_t *key, + EVP_CIPHER_CTX **ctx) { int iv_type = 0; unsigned int do_ccm; @@ -378,7 +379,7 @@ openssl_set_sess_aead_enc_param(struct openssl_session *sess, } sess->cipher.mode = OPENSSL_CIPHER_LIB; - sess->cipher.ctx = EVP_CIPHER_CTX_new(); + *ctx = EVP_CIPHER_CTX_new(); if (get_aead_algo(algo, sess->cipher.key.length, &sess->cipher.evp_algo) != 0) @@ -388,19 +389,19 @@ openssl_set_sess_aead_enc_param(struct openssl_session *sess, sess->chain_order = OPENSSL_CHAIN_COMBINED; - if (EVP_EncryptInit_ex(sess->cipher.ctx, sess->cipher.evp_algo, + if (EVP_EncryptInit_ex(*ctx, sess->cipher.evp_algo, NULL, NULL, NULL) <= 0) return -EINVAL; - if (EVP_CIPHER_CTX_ctrl(sess->cipher.ctx, iv_type, sess->iv.length, + if (EVP_CIPHER_CTX_ctrl(*ctx, iv_type, sess->iv.length, NULL) <= 0) return -EINVAL; if (do_ccm) - EVP_CIPHER_CTX_ctrl(sess->cipher.ctx, EVP_CTRL_CCM_SET_TAG, + EVP_CIPHER_CTX_ctrl(*ctx, EVP_CTRL_CCM_SET_TAG, tag_len, NULL); - if (EVP_EncryptInit_ex(sess->cipher.ctx, NULL, NULL, key, NULL) <= 0) + if (EVP_EncryptInit_ex(*ctx, NULL, NULL, key, NULL) <= 0) return -EINVAL; return 0; @@ -410,7 +411,8 @@ openssl_set_sess_aead_enc_param(struct openssl_session *sess, static int openssl_set_sess_aead_dec_param(struct openssl_session *sess, enum rte_crypto_aead_algorithm algo, - uint8_t tag_len, const uint8_t *key) + uint8_t tag_len, const uint8_t *key, + EVP_CIPHER_CTX **ctx) { int iv_type = 0; unsigned int do_ccm = 0; @@ -437,7 +439,7 @@ openssl_set_sess_aead_dec_param(struct openssl_session *sess, } sess->cipher.mode = OPENSSL_CIPHER_LIB; - sess->cipher.ctx = EVP_CIPHER_CTX_new(); + *ctx = EVP_CIPHER_CTX_new(); if (get_aead_algo(algo, sess->cipher.key.length, 
&sess->cipher.evp_algo) != 0) @@ -447,24 +449,54 @@ openssl_set_sess_aead_dec_param(struct openssl_session *sess, sess->chain_order = OPENSSL_CHAIN_COMBINED; - if (EVP_DecryptInit_ex(sess->cipher.ctx, sess->cipher.evp_algo, + if (EVP_DecryptInit_ex(*ctx, sess->cipher.evp_algo, NULL, NULL, NULL) <= 0) return -EINVAL; - if (EVP_CIPHER_CTX_ctrl(sess->cipher.ctx, iv_type, + if (EVP_CIPHER_CTX_ctrl(*ctx, iv_type, sess->iv.length, NULL) <= 0) return -EINVAL; if (do_ccm) - EVP_CIPHER_CTX_ctrl(sess->cipher.ctx, EVP_CTRL_CCM_SET_TAG, + EVP_CIPHER_CTX_ctrl(*ctx, EVP_CTRL_CCM_SET_TAG, tag_len, NULL); - if (EVP_DecryptInit_ex(sess->cipher.ctx, NULL, NULL, key, NULL) <= 0) + if (EVP_DecryptInit_ex(*ctx, NULL, NULL, key, NULL) <= 0) return -EINVAL; return 0; } +static int openssl_aesni_ctx_clone(EVP_CIPHER_CTX **dest, + struct openssl_session *sess) +{ +#if (OPENSSL_VERSION_NUMBER >= 0x30200000L) + *dest = EVP_CIPHER_CTX_dup(sess->ctx); + return 0; +#elif (OPENSSL_VERSION_NUMBER >= 0x30000000L) + /* OpenSSL versions 3.0.0 <= V < 3.2.0 have no dupctx() implementation + * for AES-GCM and AES-CCM. In this case, we have to create new empty + * contexts and initialise, as we did the original context. 
+ */ + if (sess->auth.algo == RTE_CRYPTO_AUTH_AES_GMAC) + sess->aead_algo = RTE_CRYPTO_AEAD_AES_GCM; + + if (sess->cipher.direction == RTE_CRYPTO_CIPHER_OP_ENCRYPT) + return openssl_set_sess_aead_enc_param(sess, sess->aead_algo, + sess->auth.digest_length, sess->cipher.key.data, + dest); + else + return openssl_set_sess_aead_dec_param(sess, sess->aead_algo, + sess->auth.digest_length, sess->cipher.key.data, + dest); +#else + *dest = EVP_CIPHER_CTX_new(); + if (EVP_CIPHER_CTX_copy(*dest, sess->cipher.ctx) != 1) + return -EINVAL; + return 0; +#endif +} + /** Set session cipher parameters */ static int openssl_set_session_cipher_parameters(struct openssl_session *sess, @@ -623,12 +655,14 @@ openssl_set_session_auth_parameters(struct openssl_session *sess, return openssl_set_sess_aead_enc_param(sess, RTE_CRYPTO_AEAD_AES_GCM, xform->auth.digest_length, - xform->auth.key.data); + xform->auth.key.data, + &sess->cipher.ctx); else return openssl_set_sess_aead_dec_param(sess, RTE_CRYPTO_AEAD_AES_GCM, xform->auth.digest_length, - xform->auth.key.data); + xform->auth.key.data, + &sess->cipher.ctx); break; case RTE_CRYPTO_AUTH_MD5: @@ -770,10 +804,12 @@ openssl_set_session_aead_parameters(struct openssl_session *sess, /* Select cipher direction */ if (xform->aead.op == RTE_CRYPTO_AEAD_OP_ENCRYPT) return openssl_set_sess_aead_enc_param(sess, xform->aead.algo, - xform->aead.digest_length, xform->aead.key.data); + xform->aead.digest_length, xform->aead.key.data, + &sess->cipher.ctx); else return openssl_set_sess_aead_dec_param(sess, xform->aead.algo, - xform->aead.digest_length, xform->aead.key.data); + xform->aead.digest_length, xform->aead.key.data, + &sess->cipher.ctx); } /** Parse crypto xform chain and set private session parameters */ @@ -1590,6 +1626,12 @@ process_openssl_combined_op return; } + EVP_CIPHER_CTX *ctx; + if (openssl_aesni_ctx_clone(&ctx, sess) != 0) { + op->status = RTE_CRYPTO_OP_STATUS_ERROR; + return; + } + iv = rte_crypto_op_ctod_offset(op, uint8_t *, 
sess->iv.offset); if (sess->auth.algo == RTE_CRYPTO_AUTH_AES_GMAC) { @@ -1623,12 +1665,12 @@ process_openssl_combined_op status = process_openssl_auth_encryption_gcm( mbuf_src, offset, srclen, aad, aadlen, iv, - dst, tag, sess->cipher.ctx); + dst, tag, ctx); else status = process_openssl_auth_encryption_ccm( mbuf_src, offset, srclen, aad, aadlen, iv, - dst, tag, taglen, sess->cipher.ctx); + dst, tag, taglen, ctx); } else { if (sess->auth.algo == RTE_CRYPTO_AUTH_AES_GMAC || @@ -1636,14 +1678,16 @@ process_openssl_combined_op status = process_openssl_auth_decryption_gcm( mbuf_src, offset, srclen, aad, aadlen, iv, - dst, tag, sess->cipher.ctx); + dst, tag, ctx); else status = process_openssl_auth_decryption_ccm( mbuf_src, offset, srclen, aad, aadlen, iv, - dst, tag, taglen, sess->cipher.ctx); + dst, tag, taglen, ctx); } + EVP_CIPHER_CTX_free(ctx); + if (status != 0) { if (status == (-EFAULT) && sess->auth.operation == From patchwork Wed Jul 3 13:45:48 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jack Bond-Preston X-Patchwork-Id: 142087 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9A3B445561; Wed, 3 Jul 2024 15:46:23 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 12C0C42E0B; Wed, 3 Jul 2024 15:46:12 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 9A38B40265 for ; Wed, 3 Jul 2024 15:46:09 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 250791007; Wed, 3 Jul 2024 06:46:34 -0700 (PDT) Received: from cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com 
(cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com [10.7.10.57])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 942173F762;
 Wed, 3 Jul 2024 06:46:08 -0700 (PDT)
From: Jack Bond-Preston
To: Kai Ji
Cc: dev@dpdk.org, Wathsala Vithanage
Subject: [PATCH v5 2/5] crypto/openssl: only init 3DES-CTR key + impl once
Date: Wed, 3 Jul 2024 13:45:48 +0000
Message-Id: <20240703134552.1439633-3-jack.bond-preston@foss.arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
References: <20240603160119.1279476-1-jack.bond-preston@foss.arm.com>
 <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Currently the 3DES-CTR cipher context is initialised for every buffer,
setting the cipher implementation and key, even though these values are
the same for every buffer in the session.

Instead, initialise the cipher context once, before any buffers are
processed.

Throughput performance uplift measurements for 3DES-CTR encrypt on
Ampere Altra Max platform:

1 worker lcore
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          0.16 |               0.21 |    35.3% |
|             256 |          0.20 |               0.22 |     9.4% |
|            1024 |          0.22 |               0.23 |     2.3% |
|            2048 |          0.22 |               0.23 |     0.9% |
|            4096 |          0.22 |               0.23 |     0.9% |

8 worker lcores
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          1.01 |               1.34 |    32.9% |
|             256 |          1.51 |               1.66 |     9.9% |
|            1024 |          1.72 |               1.77 |     2.6% |
|            2048 |          1.76 |               1.78 |     1.1% |
|            4096 |          1.79 |               1.80 |     0.6% |

Signed-off-by: Jack Bond-Preston
Acked-by: Kai Ji
Reviewed-by: Wathsala Vithanage
---
 drivers/crypto/openssl/rte_openssl_pmd.c | 21 +++++++++++----------
 1 file changed, 11 insertions(+), 10 deletions(-)

diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c b/drivers/crypto/openssl/rte_openssl_pmd.c
index c661528738..8d3deb1354 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd.c
@@ -553,6 +553,15 @@ openssl_set_session_cipher_parameters(struct openssl_session *sess, sess->cipher.key.length, sess->cipher.key.data) != 0) return -EINVAL; + + + /* We use 3DES encryption also for decryption. + * IV is not important for 3DES ECB. 
+ */ + if (EVP_EncryptInit_ex(sess->cipher.ctx, EVP_des_ede3_ecb(), + NULL, sess->cipher.key.data, NULL) != 1) + return -EINVAL; + break; case RTE_CRYPTO_CIPHER_DES_CBC: @@ -1172,8 +1181,7 @@ process_openssl_cipher_decrypt(struct rte_mbuf *mbuf_src, uint8_t *dst, /** Process cipher des 3 ctr encryption, decryption algorithm */ static int process_openssl_cipher_des3ctr(struct rte_mbuf *mbuf_src, uint8_t *dst, - int offset, uint8_t *iv, uint8_t *key, int srclen, - EVP_CIPHER_CTX *ctx) + int offset, uint8_t *iv, int srclen, EVP_CIPHER_CTX *ctx) { uint8_t ebuf[8], ctr[8]; int unused, n; @@ -1191,12 +1199,6 @@ process_openssl_cipher_des3ctr(struct rte_mbuf *mbuf_src, uint8_t *dst, src = rte_pktmbuf_mtod_offset(m, uint8_t *, offset); l = rte_pktmbuf_data_len(m) - offset; - /* We use 3DES encryption also for decryption. - * IV is not important for 3DES ecb - */ - if (EVP_EncryptInit_ex(ctx, EVP_des_ede3_ecb(), NULL, key, NULL) <= 0) - goto process_cipher_des3ctr_err; - memcpy(ctr, iv, 8); for (n = 0; n < srclen; n++) { @@ -1740,8 +1742,7 @@ process_openssl_cipher_op srclen, ctx_copy, inplace); else status = process_openssl_cipher_des3ctr(mbuf_src, dst, - op->sym->cipher.data.offset, iv, - sess->cipher.key.data, srclen, + op->sym->cipher.data.offset, iv, srclen, ctx_copy); EVP_CIPHER_CTX_free(ctx_copy); From patchwork Wed Jul 3 13:45:49 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jack Bond-Preston X-Patchwork-Id: 142088 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2615245561; Wed, 3 Jul 2024 15:46:30 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 7D6ED42E3A; Wed, 3 Jul 2024 15:46:13 +0200 (CEST) Received: from foss.arm.com (foss.arm.com 
[217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id A3FDB42E4C for ;
 Wed, 3 Jul 2024 15:46:10 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1A5A9367;
 Wed, 3 Jul 2024 06:46:35 -0700 (PDT)
Received: from cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com
 (cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com [10.7.10.57])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 6EE443F762;
 Wed, 3 Jul 2024 06:46:09 -0700 (PDT)
From: Jack Bond-Preston
To: Kai Ji
Cc: dev@dpdk.org, Wathsala Vithanage
Subject: [PATCH v5 3/5] crypto/openssl: per-qp cipher context clones
Date: Wed, 3 Jul 2024 13:45:49 +0000
Message-Id: <20240703134552.1439633-4-jack.bond-preston@foss.arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
References: <20240603160119.1279476-1-jack.bond-preston@foss.arm.com>
 <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Currently an EVP_CIPHER_CTX is allocated, populated by copying the context
held in the openssl_session, and then freed for every cipher operation
(i.e. per packet). This is very inefficient, and avoidable.

Make each openssl_session hold an array of pointers to per-queue-pair
cipher context copies. These are populated on first use by allocating a
new context and copying from the main context. These copies can then be
used in a thread-safe manner by different worker lcores simultaneously.
Consequently the cipher context allocation and copy only has to happen
once: the first time a given qp uses an openssl_session. This brings
about a large performance boost.

Throughput performance uplift measurements for AES-CBC-128 encrypt on
Ampere Altra Max platform:

1 worker lcore
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          1.51 |               2.94 |    94.4% |
|             256 |          4.90 |               8.05 |    64.3% |
|            1024 |         11.07 |              14.21 |    28.3% |
|            2048 |         14.03 |              16.28 |    16.0% |
|            4096 |         16.20 |              17.59 |     8.6% |

8 worker lcores
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          3.05 |              23.74 |   678.8% |
|             256 |         10.46 |              64.86 |   520.3% |
|            1024 |         40.97 |             113.80 |   177.7% |
|            2048 |         73.25 |             130.21 |    77.8% |
|            4096 |        103.89 |             140.62 |    35.4% |

Signed-off-by: Jack Bond-Preston
Acked-by: Kai Ji
Reviewed-by: Wathsala Vithanage
---
 drivers/crypto/openssl/openssl_pmd_private.h |  11 +-
 drivers/crypto/openssl/rte_openssl_pmd.c     | 105 ++++++++++++-------
 drivers/crypto/openssl/rte_openssl_pmd_ops.c |  34 +++++-
 3 files changed, 108 insertions(+), 42 deletions(-)

diff --git a/drivers/crypto/openssl/openssl_pmd_private.h b/drivers/crypto/openssl/openssl_pmd_private.h
index 0f038b218c..bad7dcf2f5 100644
--- a/drivers/crypto/openssl/openssl_pmd_private.h
+++ b/drivers/crypto/openssl/openssl_pmd_private.h
@@ -166,6 +166,14 @@ struct __rte_cache_aligned openssl_session { /**< digest length */ } auth; + uint16_t ctx_copies_len; + /* < number of entries in ctx_copies */ + EVP_CIPHER_CTX *qp_ctx[]; + /**< Flexible array member of per-queue-pair pointers to copies of EVP + * context structure. Cipher contexts are not safe to use from multiple + * cores simultaneously, so maintaining these copies allows avoiding + * per-buffer copying into a temporary context. 
+ */ }; /** OPENSSL crypto private asymmetric session structure */ @@ -217,7 +225,8 @@ struct __rte_cache_aligned openssl_asym_session { /** Set and validate OPENSSL crypto session parameters */ extern int openssl_set_session_parameters(struct openssl_session *sess, - const struct rte_crypto_sym_xform *xform); + const struct rte_crypto_sym_xform *xform, + uint16_t nb_queue_pairs); /** Reset OPENSSL crypto session parameters */ extern void diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c b/drivers/crypto/openssl/rte_openssl_pmd.c index 8d3deb1354..df44cc097e 100644 --- a/drivers/crypto/openssl/rte_openssl_pmd.c +++ b/drivers/crypto/openssl/rte_openssl_pmd.c @@ -467,13 +467,10 @@ openssl_set_sess_aead_dec_param(struct openssl_session *sess, return 0; } +#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30200000L) static int openssl_aesni_ctx_clone(EVP_CIPHER_CTX **dest, struct openssl_session *sess) { -#if (OPENSSL_VERSION_NUMBER >= 0x30200000L) - *dest = EVP_CIPHER_CTX_dup(sess->ctx); - return 0; -#elif (OPENSSL_VERSION_NUMBER >= 0x30000000L) /* OpenSSL versions 3.0.0 <= V < 3.2.0 have no dupctx() implementation * for AES-GCM and AES-CCM. In this case, we have to create new empty * contexts and initialise, as we did the original context. 
@@ -489,13 +486,8 @@ static int openssl_aesni_ctx_clone(EVP_CIPHER_CTX **dest, return openssl_set_sess_aead_dec_param(sess, sess->aead_algo, sess->auth.digest_length, sess->cipher.key.data, dest); -#else - *dest = EVP_CIPHER_CTX_new(); - if (EVP_CIPHER_CTX_copy(*dest, sess->cipher.ctx) != 1) - return -EINVAL; - return 0; -#endif } +#endif /** Set session cipher parameters */ static int @@ -824,7 +816,8 @@ openssl_set_session_aead_parameters(struct openssl_session *sess, /** Parse crypto xform chain and set private session parameters */ int openssl_set_session_parameters(struct openssl_session *sess, - const struct rte_crypto_sym_xform *xform) + const struct rte_crypto_sym_xform *xform, + uint16_t nb_queue_pairs) { const struct rte_crypto_sym_xform *cipher_xform = NULL; const struct rte_crypto_sym_xform *auth_xform = NULL; @@ -886,6 +879,12 @@ openssl_set_session_parameters(struct openssl_session *sess, } } + /* + * With only one queue pair, the array of copies is not needed. + * Otherwise, one entry per queue pair is required. + */ + sess->ctx_copies_len = nb_queue_pairs > 1 ? 
nb_queue_pairs : 0; + return 0; } @@ -893,6 +892,13 @@ openssl_set_session_parameters(struct openssl_session *sess, void openssl_reset_session(struct openssl_session *sess) { + for (uint16_t i = 0; i < sess->ctx_copies_len; i++) { + if (sess->qp_ctx[i] != NULL) { + EVP_CIPHER_CTX_free(sess->qp_ctx[i]); + sess->qp_ctx[i] = NULL; + } + } + EVP_CIPHER_CTX_free(sess->cipher.ctx); if (sess->chain_order == OPENSSL_CHAIN_CIPHER_BPI) @@ -959,7 +965,7 @@ get_session(struct openssl_qp *qp, struct rte_crypto_op *op) sess = (struct openssl_session *)_sess->driver_priv_data; if (unlikely(openssl_set_session_parameters(sess, - op->sym->xform) != 0)) { + op->sym->xform, 1) != 0)) { rte_mempool_put(qp->sess_mp, _sess); sess = NULL; } @@ -1607,11 +1613,45 @@ process_openssl_auth_cmac(struct rte_mbuf *mbuf_src, uint8_t *dst, int offset, # endif /*----------------------------------------------------------------------------*/ +static inline EVP_CIPHER_CTX * +get_local_cipher_ctx(struct openssl_session *sess, struct openssl_qp *qp) +{ + /* If the array is not being used, just return the main context. */ + if (sess->ctx_copies_len == 0) + return sess->cipher.ctx; + + EVP_CIPHER_CTX **lctx = &sess->qp_ctx[qp->id]; + + if (unlikely(*lctx == NULL)) { +#if OPENSSL_VERSION_NUMBER >= 0x30200000L + /* EVP_CIPHER_CTX_dup() added in OSSL 3.2 */ + *lctx = EVP_CIPHER_CTX_dup(sess->cipher.ctx); + return *lctx; +#elif OPENSSL_VERSION_NUMBER >= 0x30000000L + if (sess->chain_order == OPENSSL_CHAIN_COMBINED) { + /* AESNI special-cased to use openssl_aesni_ctx_clone() + * to allow for working around lack of + * EVP_CIPHER_CTX_copy support for 3.0.0 <= OSSL Version + * < 3.2.0. 
+ */ + if (openssl_aesni_ctx_clone(lctx, sess) != 0) + *lctx = NULL; + return *lctx; + } +#endif + + *lctx = EVP_CIPHER_CTX_new(); + EVP_CIPHER_CTX_copy(*lctx, sess->cipher.ctx); + } + + return *lctx; +} + /** Process auth/cipher combined operation */ static void -process_openssl_combined_op - (struct rte_crypto_op *op, struct openssl_session *sess, - struct rte_mbuf *mbuf_src, struct rte_mbuf *mbuf_dst) +process_openssl_combined_op(struct openssl_qp *qp, struct rte_crypto_op *op, + struct openssl_session *sess, struct rte_mbuf *mbuf_src, + struct rte_mbuf *mbuf_dst) { /* cipher */ uint8_t *dst = NULL, *iv, *tag, *aad; @@ -1628,11 +1668,7 @@ process_openssl_combined_op return; } - EVP_CIPHER_CTX *ctx; - if (openssl_aesni_ctx_clone(&ctx, sess) != 0) { - op->status = RTE_CRYPTO_OP_STATUS_ERROR; - return; - } + EVP_CIPHER_CTX *ctx = get_local_cipher_ctx(sess, qp); iv = rte_crypto_op_ctod_offset(op, uint8_t *, sess->iv.offset); @@ -1688,8 +1724,6 @@ process_openssl_combined_op dst, tag, taglen, ctx); } - EVP_CIPHER_CTX_free(ctx); - if (status != 0) { if (status == (-EFAULT) && sess->auth.operation == @@ -1702,14 +1736,13 @@ process_openssl_combined_op /** Process cipher operation */ static void -process_openssl_cipher_op - (struct rte_crypto_op *op, struct openssl_session *sess, - struct rte_mbuf *mbuf_src, struct rte_mbuf *mbuf_dst) +process_openssl_cipher_op(struct openssl_qp *qp, struct rte_crypto_op *op, + struct openssl_session *sess, struct rte_mbuf *mbuf_src, + struct rte_mbuf *mbuf_dst) { uint8_t *dst, *iv; int srclen, status; uint8_t inplace = (mbuf_src == mbuf_dst) ? 
1 : 0; - EVP_CIPHER_CTX *ctx_copy; /* * Segmented OOP destination buffer is not supported for encryption/ @@ -1728,24 +1761,22 @@ process_openssl_cipher_op iv = rte_crypto_op_ctod_offset(op, uint8_t *, sess->iv.offset); - ctx_copy = EVP_CIPHER_CTX_new(); - EVP_CIPHER_CTX_copy(ctx_copy, sess->cipher.ctx); + + EVP_CIPHER_CTX *ctx = get_local_cipher_ctx(sess, qp); if (sess->cipher.mode == OPENSSL_CIPHER_LIB) if (sess->cipher.direction == RTE_CRYPTO_CIPHER_OP_ENCRYPT) status = process_openssl_cipher_encrypt(mbuf_src, dst, op->sym->cipher.data.offset, iv, - srclen, ctx_copy, inplace); + srclen, ctx, inplace); else status = process_openssl_cipher_decrypt(mbuf_src, dst, op->sym->cipher.data.offset, iv, - srclen, ctx_copy, inplace); + srclen, ctx, inplace); else status = process_openssl_cipher_des3ctr(mbuf_src, dst, - op->sym->cipher.data.offset, iv, srclen, - ctx_copy); + op->sym->cipher.data.offset, iv, srclen, ctx); - EVP_CIPHER_CTX_free(ctx_copy); if (status != 0) op->status = RTE_CRYPTO_OP_STATUS_ERROR; } @@ -3150,13 +3181,13 @@ process_op(struct openssl_qp *qp, struct rte_crypto_op *op, switch (sess->chain_order) { case OPENSSL_CHAIN_ONLY_CIPHER: - process_openssl_cipher_op(op, sess, msrc, mdst); + process_openssl_cipher_op(qp, op, sess, msrc, mdst); break; case OPENSSL_CHAIN_ONLY_AUTH: process_openssl_auth_op(qp, op, sess, msrc, mdst); break; case OPENSSL_CHAIN_CIPHER_AUTH: - process_openssl_cipher_op(op, sess, msrc, mdst); + process_openssl_cipher_op(qp, op, sess, msrc, mdst); /* OOP */ if (msrc != mdst) copy_plaintext(msrc, mdst, op); @@ -3164,10 +3195,10 @@ process_op(struct openssl_qp *qp, struct rte_crypto_op *op, break; case OPENSSL_CHAIN_AUTH_CIPHER: process_openssl_auth_op(qp, op, sess, msrc, mdst); - process_openssl_cipher_op(op, sess, msrc, mdst); + process_openssl_cipher_op(qp, op, sess, msrc, mdst); break; case OPENSSL_CHAIN_COMBINED: - process_openssl_combined_op(op, sess, msrc, mdst); + process_openssl_combined_op(qp, op, sess, msrc, mdst); break; case 
OPENSSL_CHAIN_CIPHER_BPI: process_openssl_docsis_bpi_op(op, sess, msrc, mdst); diff --git a/drivers/crypto/openssl/rte_openssl_pmd_ops.c b/drivers/crypto/openssl/rte_openssl_pmd_ops.c index b16baaa08f..4209c6ab6f 100644 --- a/drivers/crypto/openssl/rte_openssl_pmd_ops.c +++ b/drivers/crypto/openssl/rte_openssl_pmd_ops.c @@ -794,9 +794,34 @@ openssl_pmd_qp_setup(struct rte_cryptodev *dev, uint16_t qp_id, /** Returns the size of the symmetric session structure */ static unsigned -openssl_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused) +openssl_pmd_sym_session_get_size(struct rte_cryptodev *dev) { - return sizeof(struct openssl_session); + /* + * For 0 qps, return the max size of the session - this is necessary if + * the user calls into this function to create the session mempool, + * without first configuring the number of qps for the cryptodev. + */ + if (dev->data->nb_queue_pairs == 0) { + unsigned int max_nb_qps = ((struct openssl_private *) + dev->data->dev_private)->max_nb_qpairs; + return sizeof(struct openssl_session) + + (sizeof(void *) * max_nb_qps); + } + + /* + * With only one queue pair, the thread safety of multiple context + * copies is not necessary, so don't allocate extra memory for the + * array. + */ + if (dev->data->nb_queue_pairs == 1) + return sizeof(struct openssl_session); + + /* + * Otherwise, the size of the flexible array member should be enough to + * fit pointers to per-qp contexts. 
+ */ + return sizeof(struct openssl_session) + + (sizeof(void *) * dev->data->nb_queue_pairs); } /** Returns the size of the asymmetric session structure */ @@ -808,7 +833,7 @@ openssl_pmd_asym_session_get_size(struct rte_cryptodev *dev __rte_unused) /** Configure the session from a crypto xform chain */ static int -openssl_pmd_sym_session_configure(struct rte_cryptodev *dev __rte_unused, +openssl_pmd_sym_session_configure(struct rte_cryptodev *dev, struct rte_crypto_sym_xform *xform, struct rte_cryptodev_sym_session *sess) { @@ -820,7 +845,8 @@ openssl_pmd_sym_session_configure(struct rte_cryptodev *dev __rte_unused, return -EINVAL; } - ret = openssl_set_session_parameters(sess_private_data, xform); + ret = openssl_set_session_parameters(sess_private_data, xform, + dev->data->nb_queue_pairs); if (ret != 0) { OPENSSL_LOG(ERR, "failed configure session parameters"); From patchwork Wed Jul 3 13:45:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jack Bond-Preston X-Patchwork-Id: 142089 X-Patchwork-Delegate: gakhil@marvell.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 9803945561; Wed, 3 Jul 2024 15:46:38 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 1206742E63; Wed, 3 Jul 2024 15:46:15 +0200 (CEST) Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by mails.dpdk.org (Postfix) with ESMTP id 71C7142E52 for ; Wed, 3 Jul 2024 15:46:11 +0200 (CEST) Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id EB3F81007; Wed, 3 Jul 2024 06:46:35 -0700 (PDT) Received: from cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com (cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com [10.7.10.57]) by 
usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 65BC13F762;
 Wed, 3 Jul 2024 06:46:10 -0700 (PDT)
From: Jack Bond-Preston
To: Kai Ji
Cc: dev@dpdk.org, Wathsala Vithanage
Subject: [PATCH v5 4/5] crypto/openssl: per-qp auth context clones
Date: Wed, 3 Jul 2024 13:45:50 +0000
Message-Id: <20240703134552.1439633-5-jack.bond-preston@foss.arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
References: <20240603160119.1279476-1-jack.bond-preston@foss.arm.com>
 <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Currently an EVP auth context (e.g. EVP_MD_CTX, EVP_MAC_CTX) is allocated,
populated by copying the context held in the openssl_session, and then
freed for every auth operation (i.e. per packet). This is very
inefficient, and avoidable.

Make each openssl_session hold an array of structures, each containing
pointers to per-queue-pair cipher and auth context copies. These are
populated on first use by allocating a new context and copying from the
main context. These copies can then be used in a thread-safe manner by
different worker lcores simultaneously. Consequently the auth context
allocation and copy only has to happen once: the first time a given qp
uses an openssl_session. This brings about a large performance boost.

Throughput performance uplift measurements for HMAC-SHA1 generate on
Ampere Altra Max platform:

1 worker lcore
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          0.63 |               1.42 |   123.5% |
|             256 |          2.24 |               4.40 |    96.4% |
|            1024 |          6.15 |               9.26 |    50.6% |
|            2048 |          8.68 |              11.38 |    31.1% |
|            4096 |         10.92 |              12.84 |    17.6% |

8 worker lcores
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          0.93 |              11.35 |  1122.5% |
|             256 |          3.70 |              35.30 |   853.7% |
|            1024 |         15.22 |              74.27 |   387.8% |
|            2048 |         30.20 |              91.08 |   201.6% |
|            4096 |         56.92 |             102.76 |    80.5% |

Signed-off-by: Jack Bond-Preston
Acked-by: Kai Ji
Reviewed-by: Wathsala Vithanage
---
 drivers/crypto/openssl/compat.h              |  26 +++
 drivers/crypto/openssl/openssl_pmd_private.h |  25 ++-
 drivers/crypto/openssl/rte_openssl_pmd.c     | 176 +++++++++++++++----
 drivers/crypto/openssl/rte_openssl_pmd_ops.c |   7 +-
 4 files changed, 193 insertions(+), 41 deletions(-)

diff --git a/drivers/crypto/openssl/compat.h b/drivers/crypto/openssl/compat.h
index 9f9167c4f1..e1814fea8c 100644
--- a/drivers/crypto/openssl/compat.h
+++ b/drivers/crypto/openssl/compat.h
@@ -5,6 +5,32 @@ #ifndef __RTA_COMPAT_H__ #define __RTA_COMPAT_H__ +#if OPENSSL_VERSION_NUMBER >= 0x30000000L +static __rte_always_inline void +free_hmac_ctx(EVP_MAC_CTX *ctx) +{ + EVP_MAC_CTX_free(ctx); +} + +static __rte_always_inline void +free_cmac_ctx(EVP_MAC_CTX *ctx) +{ + EVP_MAC_CTX_free(ctx); +} +#else +static __rte_always_inline void +free_hmac_ctx(HMAC_CTX *ctx) +{ + HMAC_CTX_free(ctx); +} + +static __rte_always_inline void +free_cmac_ctx(CMAC_CTX *ctx) +{ + CMAC_CTX_free(ctx); +} +#endif + #if (OPENSSL_VERSION_NUMBER < 0x10100000L) static __rte_always_inline int
diff --git a/drivers/crypto/openssl/openssl_pmd_private.h b/drivers/crypto/openssl/openssl_pmd_private.h
index bad7dcf2f5..a50e4d4918 100644
--- 
a/drivers/crypto/openssl/openssl_pmd_private.h +++ b/drivers/crypto/openssl/openssl_pmd_private.h @@ -80,6 +80,20 @@ struct __rte_cache_aligned openssl_qp { */ }; +struct evp_ctx_pair { + EVP_CIPHER_CTX *cipher; + union { + EVP_MD_CTX *auth; +#if OPENSSL_VERSION_NUMBER >= 0x30000000L + EVP_MAC_CTX *hmac; + EVP_MAC_CTX *cmac; +#else + HMAC_CTX *hmac; + CMAC_CTX *cmac; +#endif + }; +}; + /** OPENSSL crypto private session structure */ struct __rte_cache_aligned openssl_session { enum openssl_chain_order chain_order; @@ -168,11 +182,12 @@ struct __rte_cache_aligned openssl_session { uint16_t ctx_copies_len; /* < number of entries in ctx_copies */ - EVP_CIPHER_CTX *qp_ctx[]; - /**< Flexible array member of per-queue-pair pointers to copies of EVP - * context structure. Cipher contexts are not safe to use from multiple - * cores simultaneously, so maintaining these copies allows avoiding - * per-buffer copying into a temporary context. + struct evp_ctx_pair qp_ctx[]; + /**< Flexible array member of per-queue-pair structures, each containing + * pointers to copies of the cipher and auth EVP contexts. Cipher + * contexts are not safe to use from multiple cores simultaneously, so + * maintaining these copies allows avoiding per-buffer copying into a + * temporary context. */ }; diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c b/drivers/crypto/openssl/rte_openssl_pmd.c index df44cc097e..7e2e505222 100644 --- a/drivers/crypto/openssl/rte_openssl_pmd.c +++ b/drivers/crypto/openssl/rte_openssl_pmd.c @@ -892,40 +892,45 @@ openssl_set_session_parameters(struct openssl_session *sess, void openssl_reset_session(struct openssl_session *sess) { + /* Free all the qp_ctx entries. 
*/ for (uint16_t i = 0; i < sess->ctx_copies_len; i++) { - if (sess->qp_ctx[i] != NULL) { - EVP_CIPHER_CTX_free(sess->qp_ctx[i]); - sess->qp_ctx[i] = NULL; + if (sess->qp_ctx[i].cipher != NULL) { + EVP_CIPHER_CTX_free(sess->qp_ctx[i].cipher); + sess->qp_ctx[i].cipher = NULL; + } + + switch (sess->auth.mode) { + case OPENSSL_AUTH_AS_AUTH: + EVP_MD_CTX_destroy(sess->qp_ctx[i].auth); + sess->qp_ctx[i].auth = NULL; + break; + case OPENSSL_AUTH_AS_HMAC: + free_hmac_ctx(sess->qp_ctx[i].hmac); + sess->qp_ctx[i].hmac = NULL; + break; + case OPENSSL_AUTH_AS_CMAC: + free_cmac_ctx(sess->qp_ctx[i].cmac); + sess->qp_ctx[i].cmac = NULL; + break; } } EVP_CIPHER_CTX_free(sess->cipher.ctx); - if (sess->chain_order == OPENSSL_CHAIN_CIPHER_BPI) - EVP_CIPHER_CTX_free(sess->cipher.bpi_ctx); - switch (sess->auth.mode) { case OPENSSL_AUTH_AS_AUTH: EVP_MD_CTX_destroy(sess->auth.auth.ctx); break; case OPENSSL_AUTH_AS_HMAC: - EVP_PKEY_free(sess->auth.hmac.pkey); -# if OPENSSL_VERSION_NUMBER >= 0x30000000L - EVP_MAC_CTX_free(sess->auth.hmac.ctx); -# else - HMAC_CTX_free(sess->auth.hmac.ctx); -# endif + free_hmac_ctx(sess->auth.hmac.ctx); break; case OPENSSL_AUTH_AS_CMAC: -# if OPENSSL_VERSION_NUMBER >= 0x30000000L - EVP_MAC_CTX_free(sess->auth.cmac.ctx); -# else - CMAC_CTX_free(sess->auth.cmac.ctx); -# endif - break; - default: + free_cmac_ctx(sess->auth.cmac.ctx); break; } + + if (sess->chain_order == OPENSSL_CHAIN_CIPHER_BPI) + EVP_CIPHER_CTX_free(sess->cipher.bpi_ctx); } /** Provide session for operation */ @@ -1471,6 +1476,9 @@ process_openssl_auth_mac(struct rte_mbuf *mbuf_src, uint8_t *dst, int offset, if (m == 0) goto process_auth_err; + if (EVP_MAC_init(ctx, NULL, 0, NULL) <= 0) + goto process_auth_err; + src = rte_pktmbuf_mtod_offset(m, uint8_t *, offset); l = rte_pktmbuf_data_len(m) - offset; @@ -1497,11 +1505,9 @@ process_openssl_auth_mac(struct rte_mbuf *mbuf_src, uint8_t *dst, int offset, if (EVP_MAC_final(ctx, dst, &dstlen, DIGEST_LENGTH_MAX) != 1) goto process_auth_err; - 
EVP_MAC_CTX_free(ctx); return 0; process_auth_err: - EVP_MAC_CTX_free(ctx); OPENSSL_LOG(ERR, "Process openssl auth failed"); return -EINVAL; } @@ -1620,7 +1626,7 @@ get_local_cipher_ctx(struct openssl_session *sess, struct openssl_qp *qp) if (sess->ctx_copies_len == 0) return sess->cipher.ctx; - EVP_CIPHER_CTX **lctx = &sess->qp_ctx[qp->id]; + EVP_CIPHER_CTX **lctx = &sess->qp_ctx[qp->id].cipher; if (unlikely(*lctx == NULL)) { #if OPENSSL_VERSION_NUMBER >= 0x30200000L @@ -1647,6 +1653,112 @@ get_local_cipher_ctx(struct openssl_session *sess, struct openssl_qp *qp) return *lctx; } +static inline EVP_MD_CTX * +get_local_auth_ctx(struct openssl_session *sess, struct openssl_qp *qp) +{ + /* If the array is not being used, just return the main context. */ + if (sess->ctx_copies_len == 0) + return sess->auth.auth.ctx; + + EVP_MD_CTX **lctx = &sess->qp_ctx[qp->id].auth; + + if (unlikely(*lctx == NULL)) { +#if OPENSSL_VERSION_NUMBER >= 0x30100000L + /* EVP_MD_CTX_dup() added in OSSL 3.1 */ + *lctx = EVP_MD_CTX_dup(sess->auth.auth.ctx); +#else + *lctx = EVP_MD_CTX_new(); + EVP_MD_CTX_copy(*lctx, sess->auth.auth.ctx); +#endif + } + + return *lctx; +} + +#if OPENSSL_VERSION_NUMBER >= 0x30000000L +static inline EVP_MAC_CTX * +#else +static inline HMAC_CTX * +#endif +get_local_hmac_ctx(struct openssl_session *sess, struct openssl_qp *qp) +{ +#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L) + /* For OpenSSL versions 3.0.0 <= v < 3.0.3, re-initing of + * EVP_MAC_CTXs is broken, and doesn't actually reset their + * state. This was fixed in OSSL commit c9ddc5af5199 ("Avoid + * undefined behavior of provided macs on EVP_MAC + * reinitialization"). In cases where the fix is not present, + * fall back to duplicating the context every buffer as a + * workaround, at the cost of performance. 
+	 */
+	RTE_SET_USED(qp);
+	return EVP_MAC_CTX_dup(sess->auth.hmac.ctx);
+#else
+	if (sess->ctx_copies_len == 0)
+		return sess->auth.hmac.ctx;
+
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+	EVP_MAC_CTX **lctx =
+#else
+	HMAC_CTX **lctx =
+#endif
+		&sess->qp_ctx[qp->id].hmac;
+
+	if (unlikely(*lctx == NULL)) {
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+		*lctx = EVP_MAC_CTX_dup(sess->auth.hmac.ctx);
+#else
+		*lctx = HMAC_CTX_new();
+		HMAC_CTX_copy(*lctx, sess->auth.hmac.ctx);
+#endif
+	}
+
+	return *lctx;
+#endif
+}
+
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+static inline EVP_MAC_CTX *
+#else
+static inline CMAC_CTX *
+#endif
+get_local_cmac_ctx(struct openssl_session *sess, struct openssl_qp *qp)
+{
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+	/* For OpenSSL versions 3.0.0 <= v < 3.0.3, re-initing of
+	 * EVP_MAC_CTXs is broken, and doesn't actually reset their
+	 * state. This was fixed in OSSL commit c9ddc5af5199 ("Avoid
+	 * undefined behavior of provided macs on EVP_MAC
+	 * reinitialization"). In cases where the fix is not present,
+	 * fall back to duplicating the context every buffer as a
+	 * workaround, at the cost of performance.
+	 */
+	RTE_SET_USED(qp);
+	return EVP_MAC_CTX_dup(sess->auth.cmac.ctx);
+#else
+	if (sess->ctx_copies_len == 0)
+		return sess->auth.cmac.ctx;
+
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+	EVP_MAC_CTX **lctx =
+#else
+	CMAC_CTX **lctx =
+#endif
+		&sess->qp_ctx[qp->id].cmac;
+
+	if (unlikely(*lctx == NULL)) {
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+		*lctx = EVP_MAC_CTX_dup(sess->auth.cmac.ctx);
+#else
+		*lctx = CMAC_CTX_new();
+		CMAC_CTX_copy(*lctx, sess->auth.cmac.ctx);
+#endif
+	}
+
+	return *lctx;
+#endif
+}
+
 /** Process auth/cipher combined operation */
 static void
 process_openssl_combined_op(struct openssl_qp *qp, struct rte_crypto_op *op,
@@ -1895,42 +2007,40 @@ process_openssl_auth_op(struct openssl_qp *qp, struct rte_crypto_op *op,

 	switch (sess->auth.mode) {
 	case OPENSSL_AUTH_AS_AUTH:
-		ctx_a = EVP_MD_CTX_create();
-		EVP_MD_CTX_copy_ex(ctx_a, sess->auth.auth.ctx);
+		ctx_a = get_local_auth_ctx(sess, qp);
 		status = process_openssl_auth(mbuf_src, dst,
 				op->sym->auth.data.offset, NULL, NULL, srclen,
 				ctx_a, sess->auth.auth.evp_algo);
-		EVP_MD_CTX_destroy(ctx_a);
 		break;
 	case OPENSSL_AUTH_AS_HMAC:
+		ctx_h = get_local_hmac_ctx(sess, qp);
 # if OPENSSL_VERSION_NUMBER >= 0x30000000L
-		ctx_h = EVP_MAC_CTX_dup(sess->auth.hmac.ctx);
 		status = process_openssl_auth_mac(mbuf_src, dst,
 				op->sym->auth.data.offset, srclen,
 				ctx_h);
 # else
-		ctx_h = HMAC_CTX_new();
-		HMAC_CTX_copy(ctx_h, sess->auth.hmac.ctx);
 		status = process_openssl_auth_hmac(mbuf_src, dst,
 				op->sym->auth.data.offset, srclen,
 				ctx_h);
-		HMAC_CTX_free(ctx_h);
 # endif
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+		EVP_MAC_CTX_free(ctx_h);
+#endif
 		break;
 	case OPENSSL_AUTH_AS_CMAC:
+		ctx_c = get_local_cmac_ctx(sess, qp);
 # if OPENSSL_VERSION_NUMBER >= 0x30000000L
-		ctx_c = EVP_MAC_CTX_dup(sess->auth.cmac.ctx);
 		status = process_openssl_auth_mac(mbuf_src, dst,
 				op->sym->auth.data.offset, srclen,
 				ctx_c);
 # else
-		ctx_c = CMAC_CTX_new();
-		CMAC_CTX_copy(ctx_c, sess->auth.cmac.ctx);
 		status = process_openssl_auth_cmac(mbuf_src, dst,
 				op->sym->auth.data.offset, srclen,
 				ctx_c);
-		CMAC_CTX_free(ctx_c);
 # endif
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+		EVP_MAC_CTX_free(ctx_c);
+#endif
 		break;
 	default:
 		status = -1;
diff --git a/drivers/crypto/openssl/rte_openssl_pmd_ops.c b/drivers/crypto/openssl/rte_openssl_pmd_ops.c
index 4209c6ab6f..1bbb855a59 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd_ops.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd_ops.c
@@ -805,7 +805,7 @@ openssl_pmd_sym_session_get_size(struct rte_cryptodev *dev)
 		unsigned int max_nb_qps = ((struct openssl_private *)
 				dev->data->dev_private)->max_nb_qpairs;
 		return sizeof(struct openssl_session) +
-				(sizeof(void *) * max_nb_qps);
+				(sizeof(struct evp_ctx_pair) * max_nb_qps);
 	}

 	/*
@@ -818,10 +818,11 @@ openssl_pmd_sym_session_get_size(struct rte_cryptodev *dev)

 	/*
 	 * Otherwise, the size of the flexible array member should be enough to
-	 * fit pointers to per-qp contexts.
+	 * fit pointers to per-qp contexts. This is twice the number of queue
+	 * pairs, to allow for auth and cipher contexts.
 	 */
 	return sizeof(struct openssl_session) +
-		(sizeof(void *) * dev->data->nb_queue_pairs);
+		(sizeof(struct evp_ctx_pair) * dev->data->nb_queue_pairs);
 }

 /** Returns the size of the asymmetric session structure */

From patchwork Wed Jul 3 13:45:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jack Bond-Preston
X-Patchwork-Id: 142090
X-Patchwork-Delegate: gakhil@marvell.com
Return-Path:
X-Original-To: patchwork@inbox.dpdk.org
Delivered-To: patchwork@inbox.dpdk.org
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
 by inbox.dpdk.org (Postfix) with ESMTP id 255F545561;
 Wed, 3 Jul 2024 15:46:45 +0200 (CEST)
Received: from mails.dpdk.org (localhost [127.0.0.1])
 by mails.dpdk.org (Postfix) with ESMTP id 4B8DC42E5C;
 Wed, 3 Jul 2024 15:46:16 +0200 (CEST)
Received: from foss.arm.com (foss.arm.com [217.140.110.172])
 by mails.dpdk.org (Postfix) with ESMTP id 441A142E55
 for ; Wed, 3 Jul 2024 15:46:12 +0200 (CEST)
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C288B367;
 Wed, 3 Jul 2024 06:46:36 -0700 (PDT)
Received: from cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com
 (cesw-amp-gbt-1s-m12830-01.lab.cambridge.arm.com [10.7.10.57])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3D8773F762;
 Wed, 3 Jul 2024 06:46:11 -0700 (PDT)
From: Jack Bond-Preston
To: Kai Ji
Cc: dev@dpdk.org, Wathsala Vithanage
Subject: [PATCH v5 5/5] crypto/openssl: only set cipher padding once
Date: Wed, 3 Jul 2024 13:45:51 +0000
Message-Id: <20240703134552.1439633-6-jack.bond-preston@foss.arm.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
References: <20240603160119.1279476-1-jack.bond-preston@foss.arm.com>
 <20240703134552.1439633-1-jack.bond-preston@foss.arm.com>
MIME-Version: 1.0
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: dev-bounces@dpdk.org

Setting the cipher padding has a noticeable performance footprint,
and it doesn't need to be done for every call to
process_openssl_cipher_{en,de}crypt(). Setting it causes OpenSSL to set
it on every future context re-init. Thus, for every buffer after the
first one, the padding is being set twice.

Instead, just set the cipher padding once - when configuring the session
parameters - avoiding the unnecessary double setting behaviour. This is
skipped for AEAD ciphers, where disabling padding is not necessary.

Throughput performance uplift measurements for AES-CBC-128 encrypt on
Ampere Altra Max platform:
1 worker lcore
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          2.97 |               3.72 |    25.2% |
|             256 |          8.10 |               9.42 |    16.3% |
|            1024 |         14.22 |              15.18 |     6.8% |
|            2048 |         16.28 |              16.93 |     4.0% |
|            4096 |         17.58 |              17.97 |     2.2% |

8 worker lcores
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |         21.27 |              29.85 |    40.3% |
|             256 |         60.05 |              75.53 |    25.8% |
|            1024 |        110.11 |             121.56 |    10.4% |
|            2048 |        128.05 |             135.40 |     5.7% |
|            4096 |        139.45 |             143.76 |     3.1% |

Signed-off-by: Jack Bond-Preston
Acked-by: Kai Ji
Reviewed-by: Wathsala Vithanage
---
 drivers/crypto/openssl/rte_openssl_pmd.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c b/drivers/crypto/openssl/rte_openssl_pmd.c
index 7e2e505222..101111e85b 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd.c
@@ -619,6 +619,8 @@ openssl_set_session_cipher_parameters(struct openssl_session *sess,
 		return -ENOTSUP;
 	}

+	EVP_CIPHER_CTX_set_padding(sess->cipher.ctx, 0);
+
 	return 0;
 }

@@ -1124,8 +1126,6 @@ process_openssl_cipher_encrypt(struct rte_mbuf *mbuf_src, uint8_t *dst,
 	if (EVP_EncryptInit_ex(ctx, NULL, NULL, NULL, iv) <= 0)
 		goto process_cipher_encrypt_err;

-	EVP_CIPHER_CTX_set_padding(ctx, 0);
-
 	if (process_openssl_encryption_update(mbuf_src, offset, &dst,
 			srclen, ctx, inplace))
 		goto process_cipher_encrypt_err;
@@ -1174,8 +1174,6 @@ process_openssl_cipher_decrypt(struct rte_mbuf *mbuf_src, uint8_t *dst,
 	if (EVP_DecryptInit_ex(ctx, NULL, NULL, NULL, iv) <= 0)
 		goto process_cipher_decrypt_err;

-	EVP_CIPHER_CTX_set_padding(ctx, 0);
-
 	if (process_openssl_decryption_update(mbuf_src, offset, &dst,
 			srclen, ctx, inplace))
 		goto process_cipher_decrypt_err;
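
The effect of the change above, moving an invariant setting out of the
per-buffer path and into one-time session setup, can be modelled with a
small toy. This is a sketch under stated assumptions rather than OpenSSL
code: `toy_cipher_ctx` and every function below are hypothetical, and
the model simply assumes (as the commit message describes) that a
padding choice survives an IV-only re-init of the context.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for EVP_CIPHER_CTX; not the OpenSSL type. */
struct toy_cipher_ctx {
	bool padding;           /* on by default in this model */
	unsigned int pad_calls; /* times the padding setter has run */
	unsigned char iv[16];
};

static void
toy_ctx_init(struct toy_cipher_ctx *ctx)
{
	memset(ctx, 0, sizeof(*ctx));
	ctx->padding = true;
}

static void
toy_set_padding(struct toy_cipher_ctx *ctx, bool on)
{
	ctx->padding = on;
	ctx->pad_calls++;
}

/* IV-only re-init: the padding flag chosen earlier is left untouched. */
static void
toy_reinit_iv(struct toy_cipher_ctx *ctx, const unsigned char iv[16])
{
	memcpy(ctx->iv, iv, sizeof(ctx->iv));
}

/* Old shape: padding is disabled again for every buffer. */
static void
process_buffer_old(struct toy_cipher_ctx *ctx, const unsigned char iv[16])
{
	toy_reinit_iv(ctx, iv);
	toy_set_padding(ctx, false);
}

/* New shape: padding is disabled once, when the session is configured. */
static void
configure_session(struct toy_cipher_ctx *ctx)
{
	toy_ctx_init(ctx);
	toy_set_padding(ctx, false);
}

/* New shape: the per-buffer path only re-inits the IV. */
static void
process_buffer_new(struct toy_cipher_ctx *ctx, const unsigned char iv[16])
{
	toy_reinit_iv(ctx, iv);
	assert(!ctx->padding); /* still disabled from session setup */
}
```

In this model the old path calls the setter once per buffer, while the
new path calls it once per session and leaves the per-buffer work to the
IV re-init alone.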