[3/4] app/flow-perf: change clock measurement functions

Message ID: 20201126111543.16928-4-wisamm@nvidia.com (mailing list archive)
State: Accepted, archived
Delegated to: Thomas Monjalon
Series: app/flow-perf: add multi threaded support

Checks

Context        Check    Description
ci/checkpatch  success  coding style OK

Commit Message

Wisam Monther Nov. 26, 2020, 11:15 a.m. UTC
The clock() function is a poor fit for measurements across multiple
cores/threads: it reports the CPU time consumed by the whole process
rather than wall-clock time, and when several cores/threads run
simultaneously the process burns through CPU time much faster than
wall-clock time elapses.

As a result, this commit changes the measurement to use rte_rdtsc(),
and the resulting cycle counts are divided by the TSC frequency
returned by rte_get_tsc_hz() to obtain seconds.

Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
---
 app/test-flow-perf/main.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
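
The effect described in the commit message can be reproduced outside DPDK. The sketch below is an illustration only, not part of the patch: it assumes a POSIX system with pthreads, and the helper names (spin, wall_now, compare_clocks) and the MAX_THREADS limit are hypothetical. It runs several busy threads and reports the same interval once with clock() and once with a monotonic wall clock.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdint.h>
#include <time.h>

#define MAX_THREADS 16 /* hypothetical limit for this sketch */

/* Busy loop so each thread consumes CPU time; volatile keeps the loop alive. */
static void *
spin(void *arg)
{
	volatile uint64_t sink = 0;
	uint64_t iters = *(uint64_t *)arg;

	for (uint64_t i = 0; i < iters; i++)
		sink += i;
	return NULL;
}

/* Wall-clock time in seconds from a monotonic clock. */
static double
wall_now(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/*
 * Run nb_threads spinning threads and report the interval twice:
 * cpu_sec via clock() (process CPU time, summed over all threads) and
 * wall_sec via CLOCK_MONOTONIC.
 * Returns 0 on success, -1 on a bad argument or thread creation failure.
 */
static int
compare_clocks(unsigned int nb_threads, uint64_t iters,
	       double *cpu_sec, double *wall_sec)
{
	pthread_t tid[MAX_THREADS];
	unsigned int i, started = 0;
	int ret = 0;

	if (nb_threads == 0 || nb_threads > MAX_THREADS)
		return -1;

	clock_t cpu_start = clock();
	double wall_start = wall_now();

	for (i = 0; i < nb_threads; i++) {
		if (pthread_create(&tid[i], NULL, spin, &iters) != 0) {
			ret = -1;
			break;
		}
		started++;
	}
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);

	*cpu_sec = (double)(clock() - cpu_start) / CLOCKS_PER_SEC;
	*wall_sec = wall_now() - wall_start;
	return ret;
}
```

On a machine with at least as many free cores as threads, cpu_sec would be expected to approach nb_threads times wall_sec, which is why per-batch figures derived from clock() look inflated once the application runs on several cores.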
  

Comments

Thomas Monjalon Jan. 7, 2021, 2:49 p.m. UTC | #1
26/11/2020 12:15, Wisam Jaddo:
> The clock() function is not good practice to use for multiple
> cores/threads, since it measures the CPU time used by the process
> and not the wall clock time, while when running through multiple
> cores/threads simultaneously, we can burn through CPU time much
> faster.
> 
> As a result this commit will change the way of measurement to use
> rd_tsc, and the results will be divided by the processor frequency.
> 
> Signed-off-by: Wisam Jaddo <wisamm@nvidia.com>
> Reviewed-by: Alexander Kozyrev <akozyrev@nvidia.com>
> Reviewed-by: Suanming Mou <suanmingm@nvidia.com>
> ---
> -	start_batch = clock();
> +	start_batch = rte_rdtsc();

Could you please try the generic wrapper rte_get_timer_cycles()?
It should be the same (inline wrapper) when HPET is disabled.
rdtsc refers to an x86 instruction, so I would prefer a more generic API.

This can be a separate patch.
While at it, I believe more apps could be converted.
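
A minimal sketch of what the suggested conversion might look like, assuming the generic calls rte_get_timer_cycles() and rte_get_timer_hz() from rte_cycles.h. The HAVE_DPDK macro, the wrapper names (timer_cycles, timer_hz, elapsed_seconds) and the clock_gettime() fallback are hypothetical, added only so the snippet also builds without DPDK; they are not part of this patch or of DPDK.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdint.h>

#ifdef HAVE_DPDK /* hypothetical build macro, not defined by DPDK itself */
#include <rte_cycles.h>

/* Generic DPDK timer API instead of the x86-named rte_rdtsc(). */
static inline uint64_t
timer_cycles(void)
{
	return rte_get_timer_cycles();
}

static inline uint64_t
timer_hz(void)
{
	return rte_get_timer_hz();
}
#else
#include <time.h>

/* Stand-in for builds without DPDK: nanoseconds from a monotonic clock. */
static inline uint64_t
timer_cycles(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* The stand-in counter ticks once per nanosecond. */
static inline uint64_t
timer_hz(void)
{
	return 1000000000ULL;
}
#endif

/* Convert a pair of counter readings to elapsed seconds. */
static inline double
elapsed_seconds(uint64_t start, uint64_t end)
{
	return (double)(end - start) / (double)timer_hz();
}
```

With wrappers like these, the measurement sites would read start_batch = timer_cycles() and divide the delta by timer_hz(), keeping the counter and its frequency paired regardless of which timer source backs them.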
  

Patch

diff --git a/app/test-flow-perf/main.c b/app/test-flow-perf/main.c
index 663b2e9bae..3a0e4c1951 100644
--- a/app/test-flow-perf/main.c
+++ b/app/test-flow-perf/main.c
@@ -889,7 +889,7 @@  destroy_flows(int port_id, uint8_t core_id, struct rte_flow **flows_list)
 
 	rules_count_per_core = rules_count / mc_pool.cores_count;
 
-	start_batch = clock();
+	start_batch = rte_rdtsc();
 	for (i = 0; i < (uint32_t) rules_count_per_core; i++) {
 		if (flows_list[i] == 0)
 			break;
@@ -907,12 +907,12 @@  destroy_flows(int port_id, uint8_t core_id, struct rte_flow **flows_list)
 		 * for this batch.
 		 */
 		if (!((i + 1) % rules_batch)) {
-			end_batch = clock();
+			end_batch = rte_rdtsc();
 			delta = (double) (end_batch - start_batch);
 			rules_batch_idx = ((i + 1) / rules_batch) - 1;
-			cpu_time_per_batch[rules_batch_idx] = delta / CLOCKS_PER_SEC;
+			cpu_time_per_batch[rules_batch_idx] = delta / rte_get_tsc_hz();
 			cpu_time_used += cpu_time_per_batch[rules_batch_idx];
-			start_batch = clock();
+			start_batch = rte_rdtsc();
 		}
 	}
 
@@ -985,7 +985,7 @@  insert_flows(int port_id, uint8_t core_id)
 		flows_list[flow_index++] = flow;
 	}
 
-	start_batch = clock();
+	start_batch = rte_rdtsc();
 	for (counter = start_counter; counter < end_counter; counter++) {
 		flow = generate_flow(port_id, flow_group,
 			flow_attrs, flow_items, flow_actions,
@@ -1011,12 +1011,12 @@  insert_flows(int port_id, uint8_t core_id)
 		 * for this batch.
 		 */
 		if (!((counter + 1) % rules_batch)) {
-			end_batch = clock();
+			end_batch = rte_rdtsc();
 			delta = (double) (end_batch - start_batch);
 			rules_batch_idx = ((counter + 1) / rules_batch) - 1;
-			cpu_time_per_batch[rules_batch_idx] = delta / CLOCKS_PER_SEC;
+			cpu_time_per_batch[rules_batch_idx] = delta / rte_get_tsc_hz();
 			cpu_time_used += cpu_time_per_batch[rules_batch_idx];
-			start_batch = clock();
+			start_batch = rte_rdtsc();
 		}
 	}
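
The per-batch accounting touched by these hunks can be sketched in isolation. This is an illustration under assumptions, not the application's code: time_batches, tick_fn, op_fn and the demo_* helpers are hypothetical names, and the counter is injected as a callback so the arithmetic can be checked without a real TSC.

```c
#include <stdint.h>

/* Hypothetical callbacks: a tick counter and the operation being timed. */
typedef uint64_t (*tick_fn)(void);
typedef void (*op_fn)(uint32_t);

/*
 * Run `total` operations and record the elapsed seconds of every `batch`
 * operations in per_batch[], which must hold total / batch entries.
 * Ticks are converted to seconds by dividing by the counter frequency hz.
 * Returns the summed batch time, or -1.0 if batch or hz is zero.
 */
static double
time_batches(uint32_t total, uint32_t batch, tick_fn now, uint64_t hz,
	     op_fn op, double *per_batch)
{
	double used = 0.0;
	uint64_t start, end;
	uint32_t i, idx;

	if (batch == 0 || hz == 0)
		return -1.0;

	start = now();
	for (i = 0; i < total; i++) {
		op(i);
		/* A batch is complete: close the interval and restart it. */
		if (!((i + 1) % batch)) {
			end = now();
			idx = ((i + 1) / batch) - 1;
			per_batch[idx] = (double)(end - start) / (double)hz;
			used += per_batch[idx];
			start = now();
		}
	}
	return used;
}

/* Demo counter: each operation advances a fake clock by 5 ticks. */
static uint64_t demo_ticks;

static uint64_t
demo_now(void)
{
	return demo_ticks;
}

static void
demo_op(uint32_t i)
{
	(void)i;
	demo_ticks += 5;
}
```

For example, calling time_batches(4, 2, demo_now, 10, demo_op, per) with demo_ticks reset to 0 gives two batches of 10 ticks each; at a 10 Hz counter that is 1.0 second per batch and 2.0 seconds in total.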