[v2,0/3] reimplement rwlock and add relevant perf test case

Message ID 1547557979-153169-1-git-send-email-joyce.kong@arm.com

Joyce Kong Jan. 15, 2019, 1:12 p.m.
v2:
	Rebase and modify the rwlock test case to address the comments in v1.

v1:
	Reimplement rwlock with __atomic builtins, and add a rwlock perf test
	that runs on all available cores to benchmark the improvement.
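
To make the idea concrete, here is a minimal sketch of a counter-based
reader-writer lock built on the GCC/Clang __atomic builtins. It is not the
patch itself: the type and function names (sketch_rwlock_t, sketch_*) are
hypothetical, and the memory orderings shown (acquire on lock, release on
unlock) are an assumption about what such a reimplementation would
typically choose.

```c
#include <stdint.h>

/*
 * Hypothetical counter-based reader-writer lock (names are illustrative):
 *   cnt == 0  -> unlocked
 *   cnt  > 0  -> number of readers holding the lock
 *   cnt == -1 -> one writer holds the lock
 */
typedef struct {
	volatile int32_t cnt;
} sketch_rwlock_t;

static inline void
sketch_read_lock(sketch_rwlock_t *rwl)
{
	int32_t x;
	int success = 0;

	while (!success) {
		x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
		if (x < 0)
			continue; /* a writer holds the lock; keep spinning */
		/* Acquire on success orders the critical section after the lock. */
		success = __atomic_compare_exchange_n(&rwl->cnt, &x, x + 1,
				1 /* weak */, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	}
}

static inline void
sketch_read_unlock(sketch_rwlock_t *rwl)
{
	/* Release publishes the reader's accesses before the count drops. */
	__atomic_fetch_sub(&rwl->cnt, 1, __ATOMIC_RELEASE);
}

static inline void
sketch_write_lock(sketch_rwlock_t *rwl)
{
	int32_t x;
	int success = 0;

	while (!success) {
		x = __atomic_load_n(&rwl->cnt, __ATOMIC_RELAXED);
		if (x != 0)
			continue; /* readers or another writer are active */
		success = __atomic_compare_exchange_n(&rwl->cnt, &x, -1,
				1 /* weak */, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
	}
}

static inline void
sketch_write_unlock(sketch_rwlock_t *rwl)
{
	/* Release makes the writer's stores visible to the next lock holder. */
	__atomic_store_n(&rwl->cnt, 0, __ATOMIC_RELEASE);
}
```

The weak compare-exchange may fail spuriously, which is acceptable here
because each lock function already retries in a loop.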

	We tested the patches on three arm64 platforms: ThunderX2 gained 20%
	in performance, Qualcomm gained 36%, and the 4-core Cortex-A72 Marvell
	MACCHIATObin gained 19.6%.
	
	Below is the detailed test result on ThunderX2:

	*** rwlock_autotest without __atomic builtins ***
	Rwlock Perf Test on 128 cores...
	Core [0] count = 281
	Core [1] count = 252
	Core [2] count = 290
	Core [3] count = 259
	Core [4] count = 287
	...
	Core [209] count = 3
	Core [210] count = 31
	Core [211] count = 120
	Total count = 18537

	*** rwlock_autotest with __atomic builtins ***
	Rwlock Perf Test on 128 cores...
	Core [0] count = 346
	Core [1] count = 355
	Core [2] count = 259
	Core [3] count = 285
	Core [4] count = 320
	...
	Core [209] count = 2
	Core [210] count = 23
	Core [211] count = 63
	Total count = 22194
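
As a cross-check of the ThunderX2 figure, the relative gain can be
recomputed from the two totals above. This assumes the quoted percentage is
derived from the "Total count" lines, which the letter does not state
explicitly; gain_percent is a hypothetical helper used only for
illustration.

```c
/* Relative gain, in percent, of `after` over `before`. */
static double
gain_percent(unsigned long before, unsigned long after)
{
	return ((double)after - (double)before) * 100.0 / (double)before;
}
```

With the totals listed here, gain_percent(18537, 22194) evaluates to
roughly 19.7, in line with the 20% quoted for ThunderX2.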

Gavin Hu (1):
  rwlock: reimplement with __atomic builtins

Joyce Kong (2):
  test/rwlock: add perf test case
  test/rwlock: amortize the cost of getting time

 lib/librte_eal/common/include/generic/rte_rwlock.h | 16 ++---
 test/test/test_rwlock.c                            | 75 ++++++++++++++++++++++
 2 files changed, 83 insertions(+), 8 deletions(-)
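
For readers unfamiliar with what "amortize the cost of getting time" means
in a perf loop, the sketch below shows one common way to do it: run the
measured operation in batches and read the clock only once per batch. This
illustrates the general technique and is not the code in the patch. The
names now_ns, run_timed and bump are hypothetical, and C11 timespec_get
stands in for whatever timer the real test uses.

```c
#include <stdint.h>
#include <time.h>

/* Wall-clock time in nanoseconds (stand-in for a cheap cycle counter). */
static uint64_t
now_ns(void)
{
	struct timespec ts;

	timespec_get(&ts, TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

/* Example operation: increments the counter passed through arg. */
static void
bump(void *arg)
{
	++*(uint64_t *)arg;
}

/*
 * Run op(arg) repeatedly for at least duration_ms, reading the clock only
 * once per `batch` iterations. Returns the number of completed iterations.
 */
static uint64_t
run_timed(unsigned int duration_ms, unsigned int batch,
	  void (*op)(void *), void *arg)
{
	const uint64_t deadline = now_ns() + (uint64_t)duration_ms * 1000000ULL;
	uint64_t count = 0;
	unsigned int i;

	do {
		for (i = 0; i < batch; i++)
			op(arg);
		count += batch;
	} while (now_ns() < deadline);

	return count;
}
```

A larger batch lowers the timing overhead per iteration, at the price of
overshooting the deadline by up to one batch.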