[v3,11/14] eventdev: move timer adapters memory to hugepage

Message ID 20211006065012.16508-11-pbhagavatula@marvell.com (mailing list archive)
State Superseded, archived
Delegated to: Jerin Jacob
Headers
Series [v3,01/14] eventdev: make driver interface as internal |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Pavan Nikhilesh Bhagavatula Oct. 6, 2021, 6:50 a.m. UTC
  From: Pavan Nikhilesh <pbhagavatula@marvell.com>

Move memory used by timer adapters to hugepage.
Allocate memory on the first adapter create or lookup to address
both primary and secondary process usecases.
This will prevent TLB misses if any and aligns to memory structure
of other subsystems.

Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
---
 lib/eventdev/rte_event_timer_adapter.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)
  

Comments

Carrillo, Erik G Oct. 7, 2021, 8:49 p.m. UTC | #1
Hi Pavan,

Some comments below:

> -----Original Message-----
> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> Sent: Wednesday, October 6, 2021 1:50 AM
> To: jerinj@marvell.com; Carrillo, Erik G <erik.g.carrillo@intel.com>
> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> Subject: [dpdk-dev] [PATCH v3 11/14] eventdev: move timer adapters
> memory to hugepage
> 
> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> 
> Move memory used by timer adapters to hugepage.
> Allocate memory on the first adapter create or lookup to address both
> primary and secondary process usecases.
> This will prevent TLB misses if any and aligns to memory structure of other
> subsystems.
> 
> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> ---
>  lib/eventdev/rte_event_timer_adapter.c | 24
> +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/eventdev/rte_event_timer_adapter.c
> b/lib/eventdev/rte_event_timer_adapter.c
> index ae55407042..c4dc7a5fd4 100644
> --- a/lib/eventdev/rte_event_timer_adapter.c
> +++ b/lib/eventdev/rte_event_timer_adapter.c
> @@ -33,7 +33,7 @@ RTE_LOG_REGISTER_SUFFIX(evtim_logtype,
> adapter.timer, NOTICE);
> RTE_LOG_REGISTER_SUFFIX(evtim_buffer_logtype, adapter.timer, NOTICE);
> RTE_LOG_REGISTER_SUFFIX(evtim_svc_logtype, adapter.timer.svc,
> NOTICE);
> 
> -static struct rte_event_timer_adapter
> adapters[RTE_EVENT_TIMER_ADAPTER_NUM_MAX];
> +static struct rte_event_timer_adapter *adapters;
> 
>  static const struct event_timer_adapter_ops swtim_ops;
> 
> @@ -138,6 +138,17 @@ rte_event_timer_adapter_create_ext(
>  	int n, ret;
>  	struct rte_eventdev *dev;
> 
> +	if (adapters == NULL) {
> +		adapters = rte_zmalloc("Eventdev",
> +				       sizeof(struct rte_event_timer_adapter) *
> +
> RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
> +				       RTE_CACHE_LINE_SIZE);
> +		if (adapters == NULL) {
> +			rte_errno = ENOMEM;
> +			return NULL;
> +		}
> +	}
> +
>  	if (conf == NULL) {
>  		rte_errno = EINVAL;
>  		return NULL;
> @@ -312,6 +323,17 @@ rte_event_timer_adapter_lookup(uint16_t
> adapter_id)
>  	int ret;
>  	struct rte_eventdev *dev;
> 
> +	if (adapters == NULL) {
> +		adapters = rte_zmalloc("Eventdev",
> +				       sizeof(struct rte_event_timer_adapter) *
> +
> RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
> +				       RTE_CACHE_LINE_SIZE);
> +		if (adapters == NULL) {
> +			rte_errno = ENOMEM;
> +			return NULL;
> +		}
> +	}
> +
>  	if (adapters[adapter_id].allocated)
>  		return &adapters[adapter_id]; /* Adapter is already loaded
> */
> 
> --
> 2.17.1

The rte_event_timer_adapter struct has several fields that have per-process values.  

For example, there are three fast path function pointers and each will be assigned distinct addresses for each process in a multi-process scenario.  The "allocated" field is also per-process.  With the changes above, if a secondary process did a lookup() after a primary process did a create(),  the secondary would get a reference to an object with function pointers that are invalid in the secondary process.

To fully move the adapter object table into shared hugepage memory, those "per-process" members would need to be collected into a per-process data structure that could be independently allocated for each process.  However, that would add one more pointer dereference to get to the fast path functions, and avoiding that was the original reason to put those pointers there.  This is similar to the rte_eventdev struct.

Thanks,
Erik
  
Pavan Nikhilesh Bhagavatula Oct. 8, 2021, 5:38 a.m. UTC | #2
Hi Erik,

>Hi Pavan,
>
>Some comments below:
>
>> -----Original Message-----
>> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
>> Sent: Wednesday, October 6, 2021 1:50 AM
>> To: jerinj@marvell.com; Carrillo, Erik G <erik.g.carrillo@intel.com>
>> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
>> Subject: [dpdk-dev] [PATCH v3 11/14] eventdev: move timer adapters
>> memory to hugepage
>>
>> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
>>
>> Move memory used by timer adapters to hugepage.
>> Allocate memory on the first adapter create or lookup to address both
>> primary and secondary process usecases.
>> This will prevent TLB misses if any and aligns to memory structure of
>other
>> subsystems.
>>
>> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
>> ---
>>  lib/eventdev/rte_event_timer_adapter.c | 24
>> +++++++++++++++++++++++-
>>  1 file changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/eventdev/rte_event_timer_adapter.c
>> b/lib/eventdev/rte_event_timer_adapter.c
>> index ae55407042..c4dc7a5fd4 100644
>> --- a/lib/eventdev/rte_event_timer_adapter.c
>> +++ b/lib/eventdev/rte_event_timer_adapter.c
>> @@ -33,7 +33,7 @@ RTE_LOG_REGISTER_SUFFIX(evtim_logtype,
>> adapter.timer, NOTICE);
>> RTE_LOG_REGISTER_SUFFIX(evtim_buffer_logtype, adapter.timer,
>NOTICE);
>> RTE_LOG_REGISTER_SUFFIX(evtim_svc_logtype, adapter.timer.svc,
>> NOTICE);
>>
>> -static struct rte_event_timer_adapter
>> adapters[RTE_EVENT_TIMER_ADAPTER_NUM_MAX];
>> +static struct rte_event_timer_adapter *adapters;
>>
>>  static const struct event_timer_adapter_ops swtim_ops;
>>
>> @@ -138,6 +138,17 @@ rte_event_timer_adapter_create_ext(
>>  	int n, ret;
>>  	struct rte_eventdev *dev;
>>
>> +	if (adapters == NULL) {
>> +		adapters = rte_zmalloc("Eventdev",
>> +				       sizeof(struct
>rte_event_timer_adapter) *
>> +
>> RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
>> +				       RTE_CACHE_LINE_SIZE);
>> +		if (adapters == NULL) {
>> +			rte_errno = ENOMEM;
>> +			return NULL;
>> +		}
>> +	}
>> +
>>  	if (conf == NULL) {
>>  		rte_errno = EINVAL;
>>  		return NULL;
>> @@ -312,6 +323,17 @@ rte_event_timer_adapter_lookup(uint16_t
>> adapter_id)
>>  	int ret;
>>  	struct rte_eventdev *dev;
>>
>> +	if (adapters == NULL) {
>> +		adapters = rte_zmalloc("Eventdev",
>> +				       sizeof(struct
>rte_event_timer_adapter) *
>> +
>> RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
>> +				       RTE_CACHE_LINE_SIZE);
>> +		if (adapters == NULL) {
>> +			rte_errno = ENOMEM;
>> +			return NULL;
>> +		}
>> +	}
>> +
>>  	if (adapters[adapter_id].allocated)
>>  		return &adapters[adapter_id]; /* Adapter is already
>loaded
>> */
>>
>> --
>> 2.17.1
>
>The rte_event_timer_adapter struct has several fields that have per-
>process values.
>
>For example, there are three fast path function pointers and each will
>be assigned distinct addresses for each process in a multi-process
>scenario.  The "allocated" field is also per-process.  With the changes
>above, if a secondary process did a lookup() after a primary process did
>a create(),  the secondary would get a reference to an object with
>function pointers that are invalid in the secondary process.
>

I understand, the current patch doesn't unify the memory between processes.
Instead, we zmalloc the per-process 'array' that holds the adapter objects when 
ever the process calls either create or lookup and initialize the per-process data structure.

The pointer to the adapter array is static, so when ever a process is initialized it will be NULL.


>To fully move the adapter object table into shared hugepage memory,
>those "per-process" members would need to be collected into a per-
>process data structure that could be independently allocated for each
>process.  However, that would add one more pointer dereference to
>get to the fast path functions, and avoiding that was the original reason
>to put those pointers there.  This is similar to the rte_eventdev struct.
>
>Thanks,
>Erik
  
Carrillo, Erik G Oct. 8, 2021, 3:57 p.m. UTC | #3
> -----Original Message-----
> From: Pavan Nikhilesh Bhagavatula <pbhagavatula@marvell.com>
> Sent: Friday, October 8, 2021 12:38 AM
> To: Carrillo, Erik G <erik.g.carrillo@intel.com>; Jerin Jacob Kollanukkaran
> <jerinj@marvell.com>
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH v3 11/14] eventdev: move timer adapters
> memory to hugepage
> 
> Hi Erik,
> 
> >Hi Pavan,
> >
> >Some comments below:
> >
> >> -----Original Message-----
> >> From: pbhagavatula@marvell.com <pbhagavatula@marvell.com>
> >> Sent: Wednesday, October 6, 2021 1:50 AM
> >> To: jerinj@marvell.com; Carrillo, Erik G <erik.g.carrillo@intel.com>
> >> Cc: dev@dpdk.org; Pavan Nikhilesh <pbhagavatula@marvell.com>
> >> Subject: [dpdk-dev] [PATCH v3 11/14] eventdev: move timer adapters
> >> memory to hugepage
> >>
> >> From: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >>
> >> Move memory used by timer adapters to hugepage.
> >> Allocate memory on the first adapter create or lookup to address both
> >> primary and secondary process usecases.
> >> This will prevent TLB misses if any and aligns to memory structure of
> >other
> >> subsystems.
> >>
> >> Signed-off-by: Pavan Nikhilesh <pbhagavatula@marvell.com>
> >> ---
> >>  lib/eventdev/rte_event_timer_adapter.c | 24
> >> +++++++++++++++++++++++-
> >>  1 file changed, 23 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/lib/eventdev/rte_event_timer_adapter.c
> >> b/lib/eventdev/rte_event_timer_adapter.c
> >> index ae55407042..c4dc7a5fd4 100644
> >> --- a/lib/eventdev/rte_event_timer_adapter.c
> >> +++ b/lib/eventdev/rte_event_timer_adapter.c
> >> @@ -33,7 +33,7 @@ RTE_LOG_REGISTER_SUFFIX(evtim_logtype,
> >> adapter.timer, NOTICE);
> >> RTE_LOG_REGISTER_SUFFIX(evtim_buffer_logtype, adapter.timer,
> >NOTICE);
> >> RTE_LOG_REGISTER_SUFFIX(evtim_svc_logtype, adapter.timer.svc,
> >> NOTICE);
> >>
> >> -static struct rte_event_timer_adapter
> >> adapters[RTE_EVENT_TIMER_ADAPTER_NUM_MAX];
> >> +static struct rte_event_timer_adapter *adapters;
> >>
> >>  static const struct event_timer_adapter_ops swtim_ops;
> >>
> >> @@ -138,6 +138,17 @@ rte_event_timer_adapter_create_ext(
> >>  	int n, ret;
> >>  	struct rte_eventdev *dev;
> >>
> >> +	if (adapters == NULL) {
> >> +		adapters = rte_zmalloc("Eventdev",
> >> +				       sizeof(struct
> >rte_event_timer_adapter) *
> >> +
> >> RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
> >> +				       RTE_CACHE_LINE_SIZE);
> >> +		if (adapters == NULL) {
> >> +			rte_errno = ENOMEM;
> >> +			return NULL;
> >> +		}
> >> +	}
> >> +
> >>  	if (conf == NULL) {
> >>  		rte_errno = EINVAL;
> >>  		return NULL;
> >> @@ -312,6 +323,17 @@ rte_event_timer_adapter_lookup(uint16_t
> >> adapter_id)
> >>  	int ret;
> >>  	struct rte_eventdev *dev;
> >>
> >> +	if (adapters == NULL) {
> >> +		adapters = rte_zmalloc("Eventdev",
> >> +				       sizeof(struct
> >rte_event_timer_adapter) *
> >> +
> >> RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
> >> +				       RTE_CACHE_LINE_SIZE);
> >> +		if (adapters == NULL) {
> >> +			rte_errno = ENOMEM;
> >> +			return NULL;
> >> +		}
> >> +	}
> >> +
> >>  	if (adapters[adapter_id].allocated)
> >>  		return &adapters[adapter_id]; /* Adapter is already
> >loaded
> >> */
> >>
> >> --
> >> 2.17.1
> >
> >The rte_event_timer_adapter struct has several fields that have per-
> >process values.
> >
> >For example, there are three fast path function pointers and each will
> >be assigned distinct addresses for each process in a multi-process
> >scenario.  The "allocated" field is also per-process.  With the changes
> >above, if a secondary process did a lookup() after a primary process
> >did a create(),  the secondary would get a reference to an object with
> >function pointers that are invalid in the secondary process.
> >
> 
> I understand, the current patch doesn't unify the memory between
> processes.
> Instead, we zmalloc the per-process 'array' that holds the adapter objects
> when ever the process calls either create or lookup and initialize the per-
> process data structure.
> 
> The pointer to the adapter array is static, so when ever a process is initialized
> it will be NULL.
> 

Ah, right - I missed that.  This looks OK to me now.

One other thing: we never do a free of the array we zmalloc'd, but it looks like we could in rte_event_timer_adapter_free(), if we were freeing the last adapter instance.  

> 
> >To fully move the adapter object table into shared hugepage memory,
> >those "per-process" members would need to be collected into a per-
> >process data structure that could be independently allocated for each
> >process.  However, that would add one more pointer dereference to get
> >to the fast path functions, and avoiding that was the original reason
> >to put those pointers there.  This is similar to the rte_eventdev struct.
> >
> >Thanks,
> >Erik
  

Patch

diff --git a/lib/eventdev/rte_event_timer_adapter.c b/lib/eventdev/rte_event_timer_adapter.c
index ae55407042..c4dc7a5fd4 100644
--- a/lib/eventdev/rte_event_timer_adapter.c
+++ b/lib/eventdev/rte_event_timer_adapter.c
@@ -33,7 +33,7 @@  RTE_LOG_REGISTER_SUFFIX(evtim_logtype, adapter.timer, NOTICE);
 RTE_LOG_REGISTER_SUFFIX(evtim_buffer_logtype, adapter.timer, NOTICE);
 RTE_LOG_REGISTER_SUFFIX(evtim_svc_logtype, adapter.timer.svc, NOTICE);
 
-static struct rte_event_timer_adapter adapters[RTE_EVENT_TIMER_ADAPTER_NUM_MAX];
+static struct rte_event_timer_adapter *adapters;
 
 static const struct event_timer_adapter_ops swtim_ops;
 
@@ -138,6 +138,17 @@  rte_event_timer_adapter_create_ext(
 	int n, ret;
 	struct rte_eventdev *dev;
 
+	if (adapters == NULL) {
+		adapters = rte_zmalloc("Eventdev",
+				       sizeof(struct rte_event_timer_adapter) *
+					       RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
+				       RTE_CACHE_LINE_SIZE);
+		if (adapters == NULL) {
+			rte_errno = ENOMEM;
+			return NULL;
+		}
+	}
+
 	if (conf == NULL) {
 		rte_errno = EINVAL;
 		return NULL;
@@ -312,6 +323,17 @@  rte_event_timer_adapter_lookup(uint16_t adapter_id)
 	int ret;
 	struct rte_eventdev *dev;
 
+	if (adapters == NULL) {
+		adapters = rte_zmalloc("Eventdev",
+				       sizeof(struct rte_event_timer_adapter) *
+					       RTE_EVENT_TIMER_ADAPTER_NUM_MAX,
+				       RTE_CACHE_LINE_SIZE);
+		if (adapters == NULL) {
+			rte_errno = ENOMEM;
+			return NULL;
+		}
+	}
+
 	if (adapters[adapter_id].allocated)
 		return &adapters[adapter_id]; /* Adapter is already loaded */