From patchwork Mon Nov 23 18:45:35 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jerin Jacob
X-Patchwork-Id: 9061
From: Jerin Jacob
Date: Tue, 24 Nov 2015 00:15:35 +0530
Message-ID: <1448304338-22767-2-git-send-email-jerin.jacob@caviumnetworks.com>
X-Mailer: git-send-email 2.1.0
In-Reply-To: <1448304338-22767-1-git-send-email-jerin.jacob@caviumnetworks.com>
References: <1448304338-22767-1-git-send-email-jerin.jacob@caviumnetworks.com>
Subject: [dpdk-dev] [PATCH 1/4] hash: replace libc memcmp with optimized memory compare functions for arm64
List-Id: patches and discussions about DPDK

The following measurements show the improvement over the default libc
memcmp function:

Length (B)    Improvement over libc memcmp
16            149.57%
32            122.7%
48            104.96%
64            98.21%
80            93.75%
96            90.55%
112           110.48%
128           137.24%

Signed-off-by: Jerin Jacob
---
 lib/librte_hash/rte_cmp_arm64.h   | 114 ++++++++++++++++++++++++++++++++++++++
 lib/librte_hash/rte_cuckoo_hash.c |   7 ++-
 2 files changed, 120 insertions(+), 1 deletion(-)
 create mode 100644 lib/librte_hash/rte_cmp_arm64.h

diff --git a/lib/librte_hash/rte_cmp_arm64.h b/lib/librte_hash/rte_cmp_arm64.h
new file mode 100644
index 0000000..6fd937b
--- /dev/null
+++ b/lib/librte_hash/rte_cmp_arm64.h
@@ -0,0 +1,114 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2015 Cavium networks. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Cavium networks nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+/* Functions to compare multiple of 16 byte keys (up to 128 bytes) */
+static int
+rte_hash_k16_cmp_eq(const void *key1, const void *key2,
+		    size_t key_len __rte_unused)
+{
+	uint64_t x0, x1, y0, y1;
+
+	asm volatile(
+		"ldp %x[x1], %x[x0], [%x[p1]]"
+		: [x1]"=r"(x1), [x0]"=r"(x0)
+		: [p1]"r"(key1)
+		);
+	asm volatile(
+		"ldp %x[y1], %x[y0], [%x[p2]]"
+		: [y1]"=r"(y1), [y0]"=r"(y0)
+		: [p2]"r"(key2)
+		);
+	x0 ^= y0;
+	x1 ^= y1;
+	return !(x0 == 0 && x1 == 0);
+}
+
+static int
+rte_hash_k32_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 16,
+				(const char *) key2 + 16, key_len);
+}
+
+static int
+rte_hash_k48_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k16_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 16,
+				(const char *) key2 + 16, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 32,
+				(const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k64_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k32_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 32,
+				(const char *) key2 + 32, key_len);
+}
+
+static int
+rte_hash_k80_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k96_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
+
+static int
+rte_hash_k112_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k32_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len) ||
+		rte_hash_k16_cmp_eq((const char *) key1 + 96,
+				(const char *) key2 + 96, key_len);
+}
+
+static int
+rte_hash_k128_cmp_eq(const void *key1, const void *key2, size_t key_len)
+{
+	return rte_hash_k64_cmp_eq(key1, key2, key_len) ||
+		rte_hash_k64_cmp_eq((const char *) key1 + 64,
+				(const char *) key2 + 64, key_len);
+}
diff --git a/lib/librte_hash/rte_cuckoo_hash.c b/lib/librte_hash/rte_cuckoo_hash.c
index 1e970de..e6520dd 100644
--- a/lib/librte_hash/rte_cuckoo_hash.c
+++ b/lib/librte_hash/rte_cuckoo_hash.c
@@ -63,6 +63,10 @@
 #include "rte_cmp_x86.h"
 #endif
 
+#if defined(RTE_ARCH_ARM64)
+#include "rte_cmp_arm64.h"
+#endif
+
 TAILQ_HEAD(rte_hash_list, rte_tailq_entry);
 
 static struct rte_tailq_elem rte_hash_tailq = {
@@ -280,7 +284,8 @@ rte_hash_create(const struct rte_hash_parameters *params)
 	 * If x86 architecture is used, select appropriate compare function,
 	 * which may use x86 instrinsics, otherwise use memcmp
 	 */
-#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) || defined(RTE_ARCH_X86_X32)
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) ||\
+	 defined(RTE_ARCH_X86_X32) || defined(RTE_ARCH_ARM64)
 	/* Select function to compare keys */
 	switch (params->key_len) {
 	case 16: