From patchwork Thu Nov 19 13:51:15 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jerin Jacob
X-Patchwork-Id: 8996
From: Jerin Jacob
Date: Thu, 19 Nov 2015 19:21:15 +0530
Message-ID: <1447941075-27268-1-git-send-email-jerin.jacob@caviumnetworks.com>
X-Mailer: git-send-email 2.1.0
Subject: [dpdk-dev] [PATCH] hash: replace libc memcmp with optimized memory compare functions for arm64
List-Id: patches and discussions about DPDK

The following measurements show the improvement over the default
libc memcmp function:

Length(B)	Improvement over libc memcmp
16		149.57%
32		122.7%
48		104.96%
64		98.21%
80		93.75%
96		90.55%
112		110.48%
128		137.24%

Signed-off-by: Jerin Jacob
---
 lib/librte_hash/rte_cmp_arm64.h   | 114 ++++++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_cuckoo_hash.c |   7 ++-
 2 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_hash/rte_cmp_arm64.h

diff --git a/lib/librte_hash/rte_cmp_arm64.h b/lib/librte_hash/rte_cmp_arm64.h
new file mode 100644
index 0000000..6fd937b
--- /dev/null
+++ b/lib/librte_hash/rte_cmp_arm64.h
@@ -0,0 +1,114 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Cavium networks. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
+static int
+rte_hash_k16_cmp_eq(const void *key1, const void *key2,
+		    size_t key_len __rte_unused)
+{
+	uint64_t x0, x1, y0, y1;
+
+	asm volatile(
+		"ldp %x[x1], %x[x0], [%x[p1]]"
+		: [x1]"=r"(x1), [x0]"=r"(x0)
+		: [p1]"r"(key1)
+		);
+	asm volatile(
+		"ldp %x[y1], %x[y0], [%x[p2]]"
+		: [y1]"=r"(y1), [y0]"=r"(y0)
+		: [p2]"r"(key2)
+		);
+	x0 ^= y0;
+	x1 ^= y1;
+	return !(x0 == 0 && x1 == 0);
+}
+
+static int
+rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 16,
+				(const char *) key2 + 16, key_len);
+}
+
+static int
+rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 16,
+				(const char *) key2 + 16, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 32,
+				(const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 32,
+				(const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 96,
+				(const char *) key2 + 96, key_len);
+}
+
+static int
+rte_hash_k128_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k64_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 1e970de..e6520dd 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -63,6 +63,10 @@
 #include "rte_cmp_x86.h"
 #endif
 
+#if defined(RTE_ARCH_ARM64)
+#include "rte_cmp_arm64.h"
+#endif
+
 TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
 
 static struct rte_tailq_elem rte_hash_tailq = {
@@ -280,7 +284,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
 	 * If x86 architecture is used, select appropriate compare function,
 	 * which may use x86 instrinsics, otherwise use memcmp
 	 */
-#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_X32)
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) ||\
+	 defined(RTE_ARCH_X86_X32) || defined(RTE_ARCH_ARM64)
 	/* Select function to compare keys */
 	switch (params->key_len) {
 	case 16: