
[v4,2/6] crypto/scheduler: use unified session

Message ID 20220926191424.1069668-3-gakhil@marvell.com (mailing list archive)
State Superseded
Delegated to: akhil goyal
Series crypto/security session framework rework

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Akhil Goyal Sept. 26, 2022, 7:14 p.m. UTC
From: Fan Zhang <roy.fan.zhang@intel.com>

This patch updates the scheduler PMD to use the unified session
data structure. Previously, thanks to the private session
array in the cryptodev sym session, the scheduler PMD needed
no changes other than to the way ops are enqueued/dequeued.
The patch carries the design of the original session data
structure over to the scheduler PMD, so that the cryptodev sym
session can be a linear buffer holding both the session header
and the driver private data.

The change has an unavoidable extra cost in both memory
(64 bytes per session per driver type) and cycle count (setting
the correct session for each crypto op based on the worker
before enqueue, and retrieving the original session after
dequeue).

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
Signed-off-by: Akhil Goyal <gakhil@marvell.com>
Acked-by: Kai Ji <kai.ji@intel.com>
Tested-by: Gagandeep Singh <g.singh@nxp.com>
Tested-by: David Coyle <david.coyle@intel.com>
Tested-by: Kevin O'Sullivan <kevin.osullivan@intel.com>
---
 drivers/crypto/scheduler/scheduler_failover.c |  19 ++-
 .../crypto/scheduler/scheduler_multicore.c    |  17 +++
 .../scheduler/scheduler_pkt_size_distr.c      |  84 +++++++++---
 drivers/crypto/scheduler/scheduler_pmd_ops.c  | 107 +++++++++++++++-
 .../crypto/scheduler/scheduler_pmd_private.h  | 120 +++++++++++++++++-
 .../crypto/scheduler/scheduler_roundrobin.c   |  11 +-
 6 files changed, 318 insertions(+), 40 deletions(-)
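
The per-op session swap described in the commit message can be sketched in plain C. This is a minimal illustration under stated assumptions, not the patch's implementation: `sym_session`, `sched_session_ctx`, `crypto_op`, the `priv` field and `MAX_WORKERS` are simplified, hypothetical stand-ins for the DPDK types. The patch itself reaches the worker sessions through `driver_priv_data` and only swaps ops whose `sess_type` is `RTE_CRYPTO_OP_WITH_SESSION`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for RTE_CRYPTODEV_SCHEDULER_MAX_NB_WORKERS. */
#define MAX_WORKERS 8

/* Simplified stand-ins for the DPDK types; not the real definitions. */
struct sym_session {
	uint64_t opaque_data; /* worker session: back-pointer to the scheduler session */
	void *priv;           /* scheduler session: points at a sched_session_ctx */
};

struct sched_session_ctx {
	struct sym_session *worker_sess[MAX_WORKERS];
};

struct crypto_op {
	struct sym_session *session;
};

/* Before enqueue: point each op at the session of the chosen worker. */
static void
set_worker_session(struct crypto_op **ops, uint16_t nb_ops, uint8_t worker_idx)
{
	assert(ops != NULL && worker_idx < MAX_WORKERS);
	for (uint16_t i = 0; i < nb_ops; i++) {
		struct sched_session_ctx *ctx = ops[i]->session->priv;

		ops[i]->session = ctx->worker_sess[worker_idx];
	}
}

/* After dequeue: restore the scheduler session saved in opaque_data. */
static void
retrieve_session(struct crypto_op **ops, uint16_t nb_ops)
{
	assert(ops != NULL);
	for (uint16_t i = 0; i < nb_ops; i++)
		ops[i]->session = (struct sym_session *)(uintptr_t)
			ops[i]->session->opaque_data;
}
```

The back-pointer mirrors the `worker_sess->opaque_data = (uint64_t)sess` assignment made when the worker session is created, which is what lets the dequeue path recover the application's session without a lookup.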

Comments

Kai Ji Sept. 27, 2022, 11:03 a.m. UTC | #1
Have the sym session changes been tested with the dpdk-test-crypto-perf tool?

root@silpixa00401033:build# ./app/dpdk-test-crypto-perf -l 3,4 --socket-mem 4096,0 -a 0000:33:01.0,qat_sym_cipher_crc_enable=1 --vdev crypto_aesni_mb1 --vdev "crypto_scheduler,worker=crypto_aesni_mb1,worker=0000:33:01.0_qat_sym,mode=packet-size-distr,ordering=disable,mode_param=threshold:64" -n 6 --force-max-simd-bitwidth=512 -- --ptest throughput --silent --total-ops 3000000 --burst-sz 32 --buffer-sz 105,277,1301  --imix 15,10,75 --devtype crypto_scheduler --optype cipher-only --cipher-algo aes-docsisbpi --cipher-iv-sz 16 --cipher-op encrypt --cipher-key-sz 16 --docsis-hdr-sz 17
EAL: Detected CPU lcores: 128
EAL: Detected NUMA nodes: 2
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: VFIO support initialized
EAL: Using IOMMU type 1 (Type 1)
EAL: Probe PCI driver: qat (8086:37c9) device: 0000:33:01.0 (socket 0)
CRYPTODEV: Creating cryptodev 0000:33:01.0_qat_sym

CRYPTODEV: Initialisation parameters - name: 0000:33:01.0_qat_sym,socket id: 0, max queue pairs: 0
CRYPTODEV: Creating cryptodev 0000:33:01.0_qat_asym

CRYPTODEV: Initialisation parameters - name: 0000:33:01.0_qat_asym,socket id: 0, max queue pairs: 0
CRYPTODEV: Creating cryptodev crypto_aesni_mb1

CRYPTODEV: Initialisation parameters - name: crypto_aesni_mb1,socket id: 0, max queue pairs: 8
ipsec_mb_create() line 152: IPSec Multi-buffer library version used: 1.2.0

CRYPTODEV: Creating cryptodev crypto_scheduler

CRYPTODEV: Initialisation parameters - name: crypto_scheduler,socket id: 0, max queue pairs: 8
cryptodev_scheduler_create() line 138:   Scheduling mode = packet-size-distr
PMD:   Sched mode param (threshold = 64)
cryptodev_scheduler_create() line 193:   Packet ordering = disable
scheduler_attach_init_worker() line 45: Scheduler crypto_scheduler attached worker 0000:33:01.0_qat_sym
scheduler_attach_init_worker() line 45: Scheduler crypto_scheduler attached worker crypto_aesni_mb1
Allocated pool "sess_mp_0" on socket 0
USER1: Test run constructor failed 
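
As context for the `packet-size-distr` mode and `threshold = 64` shown in the log: the patch selects a worker per op with `target = !(job_len & psd_qp_ctx->threshold)`. The sketch below illustrates that selection in isolation. It assumes the stored threshold is a mask of the form `~(t - 1)` for a power-of-two `t`; that detail is not visible in this patch and is an assumption here, and `threshold_to_mask` and `select_target` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: turn a power-of-two threshold into a mask.
 * Assumption: the PMD keeps the threshold in this masked form. */
static uint32_t
threshold_to_mask(uint32_t threshold)
{
	assert(threshold != 0 && (threshold & (threshold - 1)) == 0);
	return ~(threshold - 1);
}

/* Mirrors the expressions in the patch: the job length is the cipher
 * data length, or the auth data length when the cipher length is 0. */
static uint8_t
select_target(uint32_t cipher_len, uint32_t auth_len, uint32_t mask)
{
	uint32_t job_len = cipher_len;

	job_len += (cipher_len == 0) * auth_len;
	/* 0 when job_len >= threshold, 1 when it is smaller. */
	return !(job_len & mask);
}
```

Under that assumption, a 105-byte job with a threshold of 64 yields target 0 and a 32-byte job yields target 1; the result indexes both `enq_ops[]` and `worker_sess[]` in the patch.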

Akhil Goyal Sept. 27, 2022, 7:25 p.m. UTC | #2
> Have the sym sessions changes been tested with the dpdk-test-crypto-perf tool ?

I have not tested for scheduler pmd. Can you root cause the issue and send a fix? 
Fan may have tested it.

Akhil Goyal Sept. 28, 2022, 12:56 p.m. UTC | #3
> Subject: RE: [PATCH v4 2/6] crypto/scheduler: use unified session
> 
> > Have the sym sessions changes been tested with the dpdk-test-crypto-perf
> tool ?
> 
> I have not tested for scheduler pmd. Can you root cause the issue and send a
> fix?
> Fan may have tested it.

Can this fix be taken up as a follow-up patch (if it is not ready) so that we can merge this series in RC1?
It is a big change and we cannot delay it beyond RC1.

Kai Ji Sept. 28, 2022, 3:29 p.m. UTC | #4
> > > Have the sym sessions changes been tested with the
> > > dpdk-test-crypto-perf
> > tool ?
> >
> > I have not tested for scheduler pmd. Can you root cause the issue and
> > send a fix?
> > Fan may have tested it.
> 
> Can this fix be taken up as a follow up patch (if not ready) and we merge this
> series in RC1, As it is a big change and we cannot delay it beyond RC1.
That's fine with me. I will try to root-cause this today, but I have no objection to fixing it after RC1.


Patch

diff --git a/drivers/crypto/scheduler/scheduler_failover.c b/drivers/crypto/scheduler/scheduler_failover.c
index 2a0e29fa72..7fadcf66d0 100644
--- a/drivers/crypto/scheduler/scheduler_failover.c
+++ b/drivers/crypto/scheduler/scheduler_failover.c
@@ -16,18 +16,19 @@ 
 struct fo_scheduler_qp_ctx {
 	struct scheduler_worker primary_worker;
 	struct scheduler_worker secondary_worker;
+	uint8_t primary_worker_index;
+	uint8_t secondary_worker_index;
 
 	uint8_t deq_idx;
 };
 
 static __rte_always_inline uint16_t
 failover_worker_enqueue(struct scheduler_worker *worker,
-		struct rte_crypto_op **ops, uint16_t nb_ops)
+		struct rte_crypto_op **ops, uint16_t nb_ops, uint8_t index)
 {
-	uint16_t i, processed_ops;
+	uint16_t processed_ops;
 
-	for (i = 0; i < nb_ops && i < 4; i++)
-		rte_prefetch0(ops[i]->sym->session);
+	scheduler_set_worker_session(ops, nb_ops, index);
 
 	processed_ops = rte_cryptodev_enqueue_burst(worker->dev_id,
 			worker->qp_id, ops, nb_ops);
@@ -47,13 +48,14 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 		return 0;
 
 	enqueued_ops = failover_worker_enqueue(&qp_ctx->primary_worker,
-			ops, nb_ops);
+			ops, nb_ops, PRIMARY_WORKER_IDX);
 
 	if (enqueued_ops < nb_ops)
 		enqueued_ops += failover_worker_enqueue(
 				&qp_ctx->secondary_worker,
 				&ops[enqueued_ops],
-				nb_ops - enqueued_ops);
+				nb_ops - enqueued_ops,
+				SECONDARY_WORKER_IDX);
 
 	return enqueued_ops;
 }
@@ -94,7 +96,7 @@  schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 	qp_ctx->deq_idx = (~qp_ctx->deq_idx) & WORKER_SWITCH_MASK;
 
 	if (nb_deq_ops == nb_ops)
-		return nb_deq_ops;
+		goto retrieve_session;
 
 	worker = workers[qp_ctx->deq_idx];
 
@@ -104,6 +106,9 @@  schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 		worker->nb_inflight_cops -= nb_deq_ops2;
 	}
 
+retrieve_session:
+	scheduler_retrieve_session(ops, nb_deq_ops + nb_deq_ops2);
+
 	return nb_deq_ops + nb_deq_ops2;
 }
 
diff --git a/drivers/crypto/scheduler/scheduler_multicore.c b/drivers/crypto/scheduler/scheduler_multicore.c
index 900ab4049d..3dea850661 100644
--- a/drivers/crypto/scheduler/scheduler_multicore.c
+++ b/drivers/crypto/scheduler/scheduler_multicore.c
@@ -183,11 +183,19 @@  mc_scheduler_worker(struct rte_cryptodev *dev)
 
 	while (!mc_ctx->stop_signal) {
 		if (pending_enq_ops) {
+			scheduler_set_worker_session(
+				&enq_ops[pending_enq_ops_idx], pending_enq_ops,
+				worker_idx);
 			processed_ops =
 				rte_cryptodev_enqueue_burst(worker->dev_id,
 					worker->qp_id,
 					&enq_ops[pending_enq_ops_idx],
 					pending_enq_ops);
+			if (processed_ops < pending_enq_ops)
+				scheduler_retrieve_session(
+					&enq_ops[pending_enq_ops_idx +
+						processed_ops],
+					pending_enq_ops - processed_ops);
 			pending_enq_ops -= processed_ops;
 			pending_enq_ops_idx += processed_ops;
 			inflight_ops += processed_ops;
@@ -195,9 +203,16 @@  mc_scheduler_worker(struct rte_cryptodev *dev)
 			processed_ops = rte_ring_dequeue_burst(enq_ring, (void *)enq_ops,
 							MC_SCHED_BUFFER_SIZE, NULL);
 			if (processed_ops) {
+				scheduler_set_worker_session(enq_ops,
+					processed_ops, worker_idx);
 				pending_enq_ops_idx = rte_cryptodev_enqueue_burst(
 						worker->dev_id, worker->qp_id,
 						enq_ops, processed_ops);
+				if (pending_enq_ops_idx < processed_ops)
+					scheduler_retrieve_session(
+						enq_ops + pending_enq_ops_idx,
+						processed_ops -
+						pending_enq_ops_idx);
 				pending_enq_ops = processed_ops - pending_enq_ops_idx;
 				inflight_ops += pending_enq_ops_idx;
 			}
@@ -214,6 +229,8 @@  mc_scheduler_worker(struct rte_cryptodev *dev)
 					worker->dev_id, worker->qp_id, deq_ops,
 					MC_SCHED_BUFFER_SIZE);
 			if (processed_ops) {
+				scheduler_retrieve_session(deq_ops,
+					processed_ops);
 				inflight_ops -= processed_ops;
 				if (reordering_enabled) {
 					uint16_t j;
diff --git a/drivers/crypto/scheduler/scheduler_pkt_size_distr.c b/drivers/crypto/scheduler/scheduler_pkt_size_distr.c
index 933a5c6978..9204f6f608 100644
--- a/drivers/crypto/scheduler/scheduler_pkt_size_distr.c
+++ b/drivers/crypto/scheduler/scheduler_pkt_size_distr.c
@@ -48,34 +48,54 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 	};
 	struct psd_schedule_op *p_enq_op;
 	uint16_t i, processed_ops_pri = 0, processed_ops_sec = 0;
-	uint32_t job_len;
 
 	if (unlikely(nb_ops == 0))
 		return 0;
 
 	for (i = 0; i < nb_ops && i < 4; i++) {
 		rte_prefetch0(ops[i]->sym);
-		rte_prefetch0(ops[i]->sym->session);
+		rte_prefetch0((uint8_t *)ops[i]->sym->session +
+			sizeof(struct rte_cryptodev_sym_session));
 	}
 
 	for (i = 0; (i < (nb_ops - 8)) && (nb_ops > 8); i += 4) {
+		struct scheduler_session_ctx *sess_ctx[4];
+		uint8_t target[4];
+		uint32_t job_len[4];
+
 		rte_prefetch0(ops[i + 4]->sym);
-		rte_prefetch0(ops[i + 4]->sym->session);
+		rte_prefetch0((uint8_t *)ops[i + 4]->sym->session +
+			sizeof(struct rte_cryptodev_sym_session));
 		rte_prefetch0(ops[i + 5]->sym);
-		rte_prefetch0(ops[i + 5]->sym->session);
+		rte_prefetch0((uint8_t *)ops[i + 5]->sym->session +
+			sizeof(struct rte_cryptodev_sym_session));
 		rte_prefetch0(ops[i + 6]->sym);
-		rte_prefetch0(ops[i + 6]->sym->session);
+		rte_prefetch0((uint8_t *)ops[i + 6]->sym->session +
+			sizeof(struct rte_cryptodev_sym_session));
 		rte_prefetch0(ops[i + 7]->sym);
-		rte_prefetch0(ops[i + 7]->sym->session);
+		rte_prefetch0((uint8_t *)ops[i + 7]->sym->session +
+			sizeof(struct rte_cryptodev_sym_session));
+
+		sess_ctx[0] = (void *)ops[i]->sym->session->driver_priv_data;
+		sess_ctx[1] =
+			(void *)ops[i + 1]->sym->session->driver_priv_data;
+		sess_ctx[2] =
+			(void *)ops[i + 2]->sym->session->driver_priv_data;
+		sess_ctx[3] =
+			(void *)ops[i + 3]->sym->session->driver_priv_data;
 
 		/* job_len is initialized as cipher data length, once
 		 * it is 0, equals to auth data length
 		 */
-		job_len = ops[i]->sym->cipher.data.length;
-		job_len += (ops[i]->sym->cipher.data.length == 0) *
+		job_len[0] = ops[i]->sym->cipher.data.length;
+		job_len[0] += (ops[i]->sym->cipher.data.length == 0) *
 				ops[i]->sym->auth.data.length;
 		/* decide the target op based on the job length */
-		p_enq_op = &enq_ops[!(job_len & psd_qp_ctx->threshold)];
+		target[0] = !(job_len[0] & psd_qp_ctx->threshold);
+		if (ops[i]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			ops[i]->sym->session =
+				sess_ctx[0]->worker_sess[target[0]];
+		p_enq_op = &enq_ops[target[0]];
 
 		/* stop schedule cops before the queue is full, this shall
 		 * prevent the failed enqueue
@@ -89,10 +109,14 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 		sched_ops[p_enq_op->worker_idx][p_enq_op->pos] = ops[i];
 		p_enq_op->pos++;
 
-		job_len = ops[i+1]->sym->cipher.data.length;
-		job_len += (ops[i+1]->sym->cipher.data.length == 0) *
+		job_len[1] = ops[i + 1]->sym->cipher.data.length;
+		job_len[1] += (ops[i + 1]->sym->cipher.data.length == 0) *
 				ops[i+1]->sym->auth.data.length;
-		p_enq_op = &enq_ops[!(job_len & psd_qp_ctx->threshold)];
+		target[1] = !(job_len[1] & psd_qp_ctx->threshold);
+		if (ops[i + 1]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			ops[i + 1]->sym->session =
+				sess_ctx[1]->worker_sess[target[1]];
+		p_enq_op = &enq_ops[target[1]];
 
 		if (p_enq_op->pos + in_flight_ops[p_enq_op->worker_idx] ==
 				qp_ctx->max_nb_objs) {
@@ -103,10 +127,14 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 		sched_ops[p_enq_op->worker_idx][p_enq_op->pos] = ops[i+1];
 		p_enq_op->pos++;
 
-		job_len = ops[i+2]->sym->cipher.data.length;
-		job_len += (ops[i+2]->sym->cipher.data.length == 0) *
-				ops[i+2]->sym->auth.data.length;
-		p_enq_op = &enq_ops[!(job_len & psd_qp_ctx->threshold)];
+		job_len[2] = ops[i + 2]->sym->cipher.data.length;
+		job_len[2] += (ops[i + 2]->sym->cipher.data.length == 0) *
+				ops[i + 2]->sym->auth.data.length;
+		target[2] = !(job_len[2] & psd_qp_ctx->threshold);
+		if (ops[i + 2]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			ops[i + 2]->sym->session =
+				sess_ctx[2]->worker_sess[target[2]];
+		p_enq_op = &enq_ops[target[2]];
 
 		if (p_enq_op->pos + in_flight_ops[p_enq_op->worker_idx] ==
 				qp_ctx->max_nb_objs) {
@@ -117,10 +145,14 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 		sched_ops[p_enq_op->worker_idx][p_enq_op->pos] = ops[i+2];
 		p_enq_op->pos++;
 
-		job_len = ops[i+3]->sym->cipher.data.length;
-		job_len += (ops[i+3]->sym->cipher.data.length == 0) *
-				ops[i+3]->sym->auth.data.length;
-		p_enq_op = &enq_ops[!(job_len & psd_qp_ctx->threshold)];
+		job_len[3] = ops[i + 3]->sym->cipher.data.length;
+		job_len[3] += (ops[i + 3]->sym->cipher.data.length == 0) *
+				ops[i + 3]->sym->auth.data.length;
+		target[3] = !(job_len[3] & psd_qp_ctx->threshold);
+		if (ops[i + 3]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			ops[i + 3]->sym->session =
+				sess_ctx[3]->worker_sess[target[3]];
+		p_enq_op = &enq_ops[target[3]];
 
 		if (p_enq_op->pos + in_flight_ops[p_enq_op->worker_idx] ==
 				qp_ctx->max_nb_objs) {
@@ -133,10 +165,18 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 	}
 
 	for (; i < nb_ops; i++) {
+		struct scheduler_session_ctx *sess_ctx =
+			(void *)ops[i]->sym->session->driver_priv_data;
+		uint32_t job_len;
+		uint8_t target;
+
 		job_len = ops[i]->sym->cipher.data.length;
 		job_len += (ops[i]->sym->cipher.data.length == 0) *
 				ops[i]->sym->auth.data.length;
-		p_enq_op = &enq_ops[!(job_len & psd_qp_ctx->threshold)];
+		target = !(job_len & psd_qp_ctx->threshold);
+		if (ops[i]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			ops[i]->sym->session = sess_ctx->worker_sess[target];
+		p_enq_op = &enq_ops[target];
 
 		if (p_enq_op->pos + in_flight_ops[p_enq_op->worker_idx] ==
 				qp_ctx->max_nb_objs) {
@@ -199,6 +239,7 @@  schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 	if (worker->nb_inflight_cops) {
 		nb_deq_ops_pri = rte_cryptodev_dequeue_burst(worker->dev_id,
 			worker->qp_id, ops, nb_ops);
+		scheduler_retrieve_session(ops, nb_deq_ops_pri);
 		worker->nb_inflight_cops -= nb_deq_ops_pri;
 	}
 
@@ -213,6 +254,7 @@  schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 		nb_deq_ops_sec = rte_cryptodev_dequeue_burst(worker->dev_id,
 				worker->qp_id, &ops[nb_deq_ops_pri],
 				nb_ops - nb_deq_ops_pri);
+		scheduler_retrieve_session(&ops[nb_deq_ops_pri], nb_deq_ops_sec);
 		worker->nb_inflight_cops -= nb_deq_ops_sec;
 
 		if (!worker->nb_inflight_cops)
diff --git a/drivers/crypto/scheduler/scheduler_pmd_ops.c b/drivers/crypto/scheduler/scheduler_pmd_ops.c
index b93821783b..2bc3f5dd27 100644
--- a/drivers/crypto/scheduler/scheduler_pmd_ops.c
+++ b/drivers/crypto/scheduler/scheduler_pmd_ops.c
@@ -9,6 +9,7 @@ 
 #include <rte_cryptodev.h>
 #include <cryptodev_pmd.h>
 #include <rte_reorder.h>
+#include <rte_errno.h>
 
 #include "scheduler_pmd_private.h"
 
@@ -467,19 +468,113 @@  scheduler_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused)
 	return max_priv_sess_size;
 }
 
+struct scheduler_configured_sess_info {
+	uint8_t dev_id;
+	uint8_t driver_id;
+	struct rte_cryptodev_sym_session *sess;
+};
+
 static int
-scheduler_pmd_sym_session_configure(struct rte_cryptodev *dev __rte_unused,
-	struct rte_crypto_sym_xform *xform __rte_unused,
-	struct rte_cryptodev_sym_session *sess __rte_unused)
+scheduler_pmd_sym_session_configure(struct rte_cryptodev *dev,
+	struct rte_crypto_sym_xform *xform,
+	struct rte_cryptodev_sym_session *sess)
 {
+	struct scheduler_ctx *sched_ctx = dev->data->dev_private;
+	struct rte_mempool *mp = rte_mempool_from_obj(sess);
+	struct scheduler_session_ctx *sess_ctx = (void *)sess->driver_priv_data;
+	struct scheduler_configured_sess_info configured_sess[
+			RTE_CRYPTODEV_SCHEDULER_MAX_NB_WORKERS] = { 0 };
+	uint32_t i, j, n_configured_sess = 0;
+	int ret = 0;
+
+	if (mp == NULL)
+		return -EINVAL;
+
+	for (i = 0; i < sched_ctx->nb_workers; i++) {
+		struct scheduler_worker *worker = &sched_ctx->workers[i];
+		struct rte_cryptodev_sym_session *worker_sess;
+		uint8_t next_worker = 0;
+
+		for (j = 0; j < n_configured_sess; j++) {
+			if (configured_sess[j].driver_id ==
+					worker->driver_id) {
+				sess_ctx->worker_sess[i] =
+					configured_sess[j].sess;
+				next_worker = 1;
+				break;
+			}
+		}
+		if (next_worker)
+			continue;
+
+		if (rte_mempool_avail_count(mp) == 0) {
+			ret = -ENOMEM;
+			goto error_exit;
+		}
+
+		worker_sess = rte_cryptodev_sym_session_create(worker->dev_id,
+			xform, mp);
+		if (worker_sess == NULL) {
+			ret = -rte_errno;
+			goto error_exit;
+		}
+
+		worker_sess->opaque_data = (uint64_t)sess;
+		sess_ctx->worker_sess[i] = worker_sess;
+		configured_sess[n_configured_sess].driver_id =
+			worker->driver_id;
+		configured_sess[n_configured_sess].dev_id = worker->dev_id;
+		configured_sess[n_configured_sess].sess = worker_sess;
+		n_configured_sess++;
+	}
+
 	return 0;
+error_exit:
+	sess_ctx->ref_cnt = sched_ctx->ref_cnt;
+	for (i = 0; i < n_configured_sess; i++)
+		rte_cryptodev_sym_session_free(configured_sess[i].dev_id,
+			configured_sess[i].sess);
+	return ret;
 }
 
 /** Clear the memory of session so it doesn't leave key material behind */
 static void
-scheduler_pmd_sym_session_clear(struct rte_cryptodev *dev __rte_unused,
-		struct rte_cryptodev_sym_session *sess __rte_unused)
-{}
+scheduler_pmd_sym_session_clear(struct rte_cryptodev *dev,
+		struct rte_cryptodev_sym_session *sess)
+{
+	struct scheduler_ctx *sched_ctx = dev->data->dev_private;
+	struct scheduler_session_ctx *sess_ctx = (void *)sess->driver_priv_data;
+	struct scheduler_configured_sess_info deleted_sess[
+			RTE_CRYPTODEV_SCHEDULER_MAX_NB_WORKERS] = { 0 };
+	uint32_t i, j, n_deleted_sess = 0;
+
+	if (sched_ctx->ref_cnt != sess_ctx->ref_cnt) {
+		CR_SCHED_LOG(WARNING,
+			"Worker updated between session creation/deletion. "
+			"The session may not be freed fully.");
+	}
+
+	for (i = 0; i < sched_ctx->nb_workers; i++) {
+		struct scheduler_worker *worker = &sched_ctx->workers[i];
+		uint8_t next_worker = 0;
+
+		for (j = 0; j < n_deleted_sess; j++) {
+			if (deleted_sess[j].driver_id == worker->driver_id) {
+				sess_ctx->worker_sess[i] = NULL;
+				next_worker = 1;
+				break;
+			}
+		}
+		if (next_worker)
+			continue;
+
+		rte_cryptodev_sym_session_free(worker->dev_id,
+			sess_ctx->worker_sess[i]);
+
+		deleted_sess[n_deleted_sess++].driver_id = worker->driver_id;
+		sess_ctx->worker_sess[i] = NULL;
+	}
+}
 
 static struct rte_cryptodev_ops scheduler_pmd_ops = {
 		.dev_configure		= scheduler_pmd_config,
diff --git a/drivers/crypto/scheduler/scheduler_pmd_private.h b/drivers/crypto/scheduler/scheduler_pmd_private.h
index 4d33b9ab44..0e508727a4 100644
--- a/drivers/crypto/scheduler/scheduler_pmd_private.h
+++ b/drivers/crypto/scheduler/scheduler_pmd_private.h
@@ -22,7 +22,6 @@  struct scheduler_worker {
 	uint8_t dev_id;
 	uint16_t qp_id;
 	uint32_t nb_inflight_cops;
-
 	uint8_t driver_id;
 };
 
@@ -37,6 +36,8 @@  struct scheduler_ctx {
 
 	struct scheduler_worker workers[RTE_CRYPTODEV_SCHEDULER_MAX_NB_WORKERS];
 	uint32_t nb_workers;
+	/* counter updated whenever a worker is added or removed */
+	uint32_t ref_cnt;
 
 	enum rte_cryptodev_scheduler_mode mode;
 
@@ -61,6 +62,11 @@  struct scheduler_qp_ctx {
 	struct rte_ring *order_ring;
 } __rte_cache_aligned;
 
+struct scheduler_session_ctx {
+	uint32_t ref_cnt;
+	struct rte_cryptodev_sym_session *worker_sess[
+		RTE_CRYPTODEV_SCHEDULER_MAX_NB_WORKERS];
+};
 
 extern uint8_t cryptodev_scheduler_driver_id;
 
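@@ -101,6 +107,124 @@  scheduler_order_drain(struct rte_ring *order_ring,
 	return nb_ops_to_deq;
 }
 
+/*
+ * scheduler_set_worker_session() replaces each op's scheduler session with
+ * the session of the chosen worker before enqueue;
+ * scheduler_retrieve_session() restores the scheduler session, stored in
+ * the worker session's opaque_data, after dequeue or a failed enqueue.
+ */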
+static __rte_always_inline void
+scheduler_set_worker_session(struct rte_crypto_op **ops, uint16_t nb_ops,
+		uint8_t worker_index)
+{
+	struct rte_crypto_op **op = ops;
+	uint16_t n = nb_ops;
+
+	if (n >= 4) {
+		rte_prefetch0(op[0]->sym->session);
+		rte_prefetch0(op[1]->sym->session);
+		rte_prefetch0(op[2]->sym->session);
+		rte_prefetch0(op[3]->sym->session);
+	}
+
+	while (n >= 4) {
+		if (n >= 8) {
+			rte_prefetch0(op[4]->sym->session);
+			rte_prefetch0(op[5]->sym->session);
+			rte_prefetch0(op[6]->sym->session);
+			rte_prefetch0(op[7]->sym->session);
+		}
+
+		if (op[0]->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
+			struct scheduler_session_ctx *sess_ctx =
+				(void *)op[0]->sym->session->driver_priv_data;
+			op[0]->sym->session =
+				sess_ctx->worker_sess[worker_index];
+		}
+
+		if (op[1]->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
+			struct scheduler_session_ctx *sess_ctx =
+				(void *)op[1]->sym->session->driver_priv_data;
+			op[1]->sym->session =
+				sess_ctx->worker_sess[worker_index];
+		}
+
+		if (op[2]->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
+			struct scheduler_session_ctx *sess_ctx =
+				(void *)op[2]->sym->session->driver_priv_data;
+			op[2]->sym->session =
+				sess_ctx->worker_sess[worker_index];
+		}
+
+		if (op[3]->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
+			struct scheduler_session_ctx *sess_ctx =
+				(void *)op[3]->sym->session->driver_priv_data;
+			op[3]->sym->session =
+				sess_ctx->worker_sess[worker_index];
+		}
+
+		op += 4;
+		n -= 4;
+	}
+
+	while (n--) {
+		if (op[0]->sess_type == RTE_CRYPTO_OP_WITH_SESSION) {
+			struct scheduler_session_ctx *sess_ctx =
+				(void *)op[0]->sym->session->driver_priv_data;
+
+			op[0]->sym->session =
+				sess_ctx->worker_sess[worker_index];
+		}
+		op++;
+	}
+}
+
+static __rte_always_inline void
+scheduler_retrieve_session(struct rte_crypto_op **ops, uint16_t nb_ops)
+{
+	uint16_t n = nb_ops;
+	struct rte_crypto_op **op = ops;
+
+	if (n >= 4) {
+		rte_prefetch0(op[0]->sym->session);
+		rte_prefetch0(op[1]->sym->session);
+		rte_prefetch0(op[2]->sym->session);
+		rte_prefetch0(op[3]->sym->session);
+	}
+
+	while (n >= 4) {
+		if (n >= 8) {
+			rte_prefetch0(op[4]->sym->session);
+			rte_prefetch0(op[5]->sym->session);
+			rte_prefetch0(op[6]->sym->session);
+			rte_prefetch0(op[7]->sym->session);
+		}
+
+		if (op[0]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			op[0]->sym->session =
+				(void *)op[0]->sym->session->opaque_data;
+		if (op[1]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			op[1]->sym->session =
+				(void *)op[1]->sym->session->opaque_data;
+		if (op[2]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			op[2]->sym->session =
+				(void *)op[2]->sym->session->opaque_data;
+		if (op[3]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			op[3]->sym->session =
+				(void *)op[3]->sym->session->opaque_data;
+
+		op += 4;
+		n -= 4;
+	}
+
+	while (n--) {
+		if (op[0]->sess_type == RTE_CRYPTO_OP_WITH_SESSION)
+			op[0]->sym->session =
+				(void *)op[0]->sym->session->opaque_data;
+		op++;
+	}
+}
+
 /** device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_crypto_scheduler_pmd_ops;
 
diff --git a/drivers/crypto/scheduler/scheduler_roundrobin.c b/drivers/crypto/scheduler/scheduler_roundrobin.c
index ace2dec2ec..ad3f8b842a 100644
--- a/drivers/crypto/scheduler/scheduler_roundrobin.c
+++ b/drivers/crypto/scheduler/scheduler_roundrobin.c
@@ -23,16 +23,17 @@  schedule_enqueue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 			((struct scheduler_qp_ctx *)qp)->private_qp_ctx;
 	uint32_t worker_idx = rr_qp_ctx->last_enq_worker_idx;
 	struct scheduler_worker *worker = &rr_qp_ctx->workers[worker_idx];
-	uint16_t i, processed_ops;
+	uint16_t processed_ops;
 
 	if (unlikely(nb_ops == 0))
 		return 0;
 
-	for (i = 0; i < nb_ops && i < 4; i++)
-		rte_prefetch0(ops[i]->sym->session);
-
+	scheduler_set_worker_session(ops, nb_ops, worker_idx);
 	processed_ops = rte_cryptodev_enqueue_burst(worker->dev_id,
 			worker->qp_id, ops, nb_ops);
+	if (processed_ops < nb_ops)
+		scheduler_retrieve_session(ops + processed_ops,
+			nb_ops - processed_ops);
 
 	worker->nb_inflight_cops += processed_ops;
 
@@ -86,7 +87,7 @@  schedule_dequeue(void *qp, struct rte_crypto_op **ops, uint16_t nb_ops)
 
 	nb_deq_ops = rte_cryptodev_dequeue_burst(worker->dev_id,
 			worker->qp_id, ops, nb_ops);
-
+	scheduler_retrieve_session(ops, nb_deq_ops);
 	last_worker_idx += 1;
 	last_worker_idx %= rr_qp_ctx->nb_workers;