[3/4] app/flow-perf: change clock measurement functions
Commit Message
Using the clock() function is not good practice on multiple
cores/threads, since it measures the CPU time consumed by the
process rather than wall-clock time; when running on multiple
cores/threads simultaneously, the process burns through CPU time
much faster than real time elapses.

As a result this commit changes the measurement to use
rte_rdtsc(), with the results divided by the TSC frequency.
Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
---
app/test-flow-perf/main.c | 16 ++++++++--------
1 file changed, 8 insertions(+), 8 deletions(-)
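For reference, the measurement pattern the commit adopts looks like the
following minimal sketch. This is illustrative only, not part of the
patch; it assumes a standard DPDK environment where rte_eal_init() has
run, since EAL init calibrates the TSC frequency:

#include <stdio.h>
#include <rte_eal.h>
#include <rte_cycles.h>

int
main(int argc, char **argv)
{
	uint64_t start, end;
	double seconds;

	/* EAL init calibrates the TSC frequency returned by
	 * rte_get_tsc_hz(); without it the divisor would be 0. */
	if (rte_eal_init(argc, argv) < 0)
		return -1;

	start = rte_rdtsc();
	/* ... workload under measurement ... */
	end = rte_rdtsc();

	/* Wall-clock seconds: TSC delta divided by the TSC frequency. */
	seconds = (double)(end - start) / rte_get_tsc_hz();
	printf("elapsed: %f sec\n", seconds);
	return 0;
}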
Comments
26/11/2020 12:15, Wisam Jaddo:
> [...]
> -	start_batch = clock();
> +	start_batch = rte_rdtsc();
Please could you try the generic wrapper rte_get_timer_cycles()?
It should be the same (inline wrapper) when HPET is disabled.
rdtsc refers to an x86 instruction, so I prefer a more generic API.
This can be a separate patch.
While at it, I believe more apps could be converted.
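A sketch of the suggested alternative, with a hypothetical helper name:
with HPET disabled (the default build), the generic calls are inline
wrappers around their TSC counterparts, so behavior on x86 is unchanged:

#include <rte_cycles.h>

/* Hypothetical helper using the generic timer API instead of the
 * x86-specific rte_rdtsc(): returns wall-clock seconds spent in fn. */
static double
time_seconds(void (*fn)(void *), void *arg)
{
	uint64_t start = rte_get_timer_cycles();

	fn(arg);
	return (double)(rte_get_timer_cycles() - start) /
			rte_get_timer_hz();
}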
@@ -889,7 +889,7 @@ destroy_flows(int port_id, uint8_t core_id, struct rte_flow **flows_list)
 	rules_count_per_core = rules_count / mc_pool.cores_count;
-	start_batch = clock();
+	start_batch = rte_rdtsc();
 	for (i = 0; i < (uint32_t) rules_count_per_core; i++) {
 		if (flows_list[i] == 0)
 			break;
@@ -907,12 +907,12 @@ destroy_flows(int port_id, uint8_t core_id, struct rte_flow **flows_list)
 		 * for this batch.
 		 */
 		if (!((i + 1) % rules_batch)) {
-			end_batch = clock();
+			end_batch = rte_rdtsc();
 			delta = (double) (end_batch - start_batch);
 			rules_batch_idx = ((i + 1) / rules_batch) - 1;
-			cpu_time_per_batch[rules_batch_idx] = delta / CLOCKS_PER_SEC;
+			cpu_time_per_batch[rules_batch_idx] = delta / rte_get_tsc_hz();
 			cpu_time_used += cpu_time_per_batch[rules_batch_idx];
-			start_batch = clock();
+			start_batch = rte_rdtsc();
 		}
 	}
@@ -985,7 +985,7 @@ insert_flows(int port_id, uint8_t core_id)
 		flows_list[flow_index++] = flow;
 	}
-	start_batch = clock();
+	start_batch = rte_rdtsc();
 	for (counter = start_counter; counter < end_counter; counter++) {
 		flow = generate_flow(port_id, flow_group,
 			flow_attrs, flow_items, flow_actions,
@@ -1011,12 +1011,12 @@ insert_flows(int port_id, uint8_t core_id)
 		 * for this batch.
 		 */
 		if (!((counter + 1) % rules_batch)) {
-			end_batch = clock();
+			end_batch = rte_rdtsc();
 			delta = (double) (end_batch - start_batch);
 			rules_batch_idx = ((counter + 1) / rules_batch) - 1;
-			cpu_time_per_batch[rules_batch_idx] = delta / CLOCKS_PER_SEC;
+			cpu_time_per_batch[rules_batch_idx] = delta / rte_get_tsc_hz();
 			cpu_time_used += cpu_time_per_batch[rules_batch_idx];
-			start_batch = clock();
+			start_batch = rte_rdtsc();
 		}
 	}
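Both hunks apply the same per-batch pattern; condensed into one sketch
with illustrative names mirroring the patched loops:

#include <rte_cycles.h>

/* Condensed form of the patched batch timing (names illustrative).
 * Every rules_batch iterations, store the elapsed wall-clock seconds
 * for that batch and restart the timer for the next one. */
static void
time_batches(uint32_t rules_count, uint32_t rules_batch,
	     double *cpu_time_per_batch, void (*handle_rule)(uint32_t))
{
	uint64_t start_batch = rte_rdtsc();
	uint32_t i;

	for (i = 0; i < rules_count; i++) {
		handle_rule(i);
		if (!((i + 1) % rules_batch)) {
			uint64_t end_batch = rte_rdtsc();

			cpu_time_per_batch[((i + 1) / rules_batch) - 1] =
				(double)(end_batch - start_batch) /
				rte_get_tsc_hz();
			start_batch = rte_rdtsc();
		}
	}
}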