From patchwork Wed Jul 24 07:53:57 2024
X-Patchwork-Submitter: Mattias Rönnblom
X-Patchwork-Id: 142679
X-Patchwork-Delegate: thomas@monjalon.net
From: Mattias Rönnblom
To:
CC: Mattias Rönnblom, Morten Brørup, Stephen Hemminger, David Marchand,
 Pavan Nikhilesh, Bruce Richardson, Mattias Rönnblom
Subject: [PATCH v5 6/6] vhost: optimize memcpy routines when cc memcpy is used
Date: Wed, 24 Jul 2024 09:53:57 +0200
Message-ID: <20240724075357.546248-7-mattias.ronnblom@ericsson.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240724075357.546248-1-mattias.ronnblom@ericsson.com>
References: <20240620175731.420639-2-mattias.ronnblom@ericsson.com>
 <20240724075357.546248-1-mattias.ronnblom@ericsson.com>
List-Id: DPDK patches and discussions

In builds where use_cc_memcpy is set to true, the vhost user PMD
suffers a large performance drop on Intel P-cores for small packets,
at least when built by GCC and (to a much lesser extent) clang.

This patch addresses that issue by using a custom virtio
memcpy()-based packet copying routine.

Performance results from a Raptor Lake @ 3.2 GHz:

GCC 12.3.0
64-byte packets
Core  Mode              Mpps
E     RTE memcpy         9.5
E     cc memcpy          9.7
E     cc memcpy+pktcpy   9.0

P     RTE memcpy        16.4
P     cc memcpy         13.5
P     cc memcpy+pktcpy  16.2

GCC 12.3.0
1500-byte packets
Core  Mode              Mpps
P     RTE memcpy         5.8
P     cc memcpy          5.9
P     cc memcpy+pktcpy   5.9

clang 15.0.7
64-byte packets
Core  Mode              Mpps
P     RTE memcpy        13.3
P     cc memcpy         12.9
P     cc memcpy+pktcpy  13.9

"RTE memcpy" is use_cc_memcpy=false, "cc memcpy" is use_cc_memcpy=true
and "pktcpy" is when this patch is applied.

Signed-off-by: Mattias Rönnblom
Acked-by: Morten Brørup
---
 lib/vhost/virtio_net.c | 37 +++++++++++++++++++++++++++++++++++--
 1 file changed, 35 insertions(+), 2 deletions(-)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 370402d849..63571587a8 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -231,6 +231,39 @@ vhost_async_dma_check_completed(struct virtio_net *dev, int16_t dma_id, uint16_t
 	return nr_copies;
 }
 
+/* The code generated by GCC (and to a lesser extent, clang) with just
+ * a straight memcpy() to copy packets is less than optimal on Intel
+ * P-cores, for small packets. Thus the need of this specialized
+ * memcpy() in builds where use_cc_memcpy is set to true.
+ */
+#if defined(RTE_USE_CC_MEMCPY) && defined(RTE_ARCH_X86_64)
+static __rte_always_inline void
+pktcpy(void *restrict in_dst, const void *restrict in_src, size_t len)
+{
+	void *dst = __builtin_assume_aligned(in_dst, 16);
+	const void *src = __builtin_assume_aligned(in_src, 16);
+
+	if (len <= 256) {
+		size_t left;
+
+		for (left = len; left >= 32; left -= 32) {
+			memcpy(dst, src, 32);
+			dst = RTE_PTR_ADD(dst, 32);
+			src = RTE_PTR_ADD(src, 32);
+		}
+
+		memcpy(dst, src, left);
+	} else
+		memcpy(dst, src, len);
+}
+#else
+static __rte_always_inline void
+pktcpy(void *dst, const void *src, size_t len)
+{
+	rte_memcpy(dst, src, len);
+}
+#endif
+
 static inline void
 do_data_copy_enqueue(struct virtio_net *dev, struct vhost_virtqueue *vq)
 	__rte_shared_locks_required(&vq->iotlb_lock)
@@ -240,7 +273,7 @@ do_data_copy_enqueue(struct virtio_net *dev, struct vhost_virtqueue *vq)
 	int i;
 
 	for (i = 0; i < count; i++) {
-		rte_memcpy(elem[i].dst, elem[i].src, elem[i].len);
+		pktcpy(elem[i].dst, elem[i].src, elem[i].len);
 		vhost_log_cache_write_iova(dev, vq, elem[i].log_addr,
 					   elem[i].len);
 		PRINT_PACKET(dev, (uintptr_t)elem[i].dst, elem[i].len, 0);
@@ -257,7 +290,7 @@ do_data_copy_dequeue(struct vhost_virtqueue *vq)
 	int i;
 
 	for (i = 0; i < count; i++)
-		rte_memcpy(elem[i].dst, elem[i].src, elem[i].len);
+		pktcpy(elem[i].dst, elem[i].src, elem[i].len);
 
 	vq->batch_copy_nb_elems = 0;
 }
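
For readers who want to experiment with the chunked-copy idea outside of
DPDK, here is a minimal stand-alone sketch. It is not part of the patch:
the names pktcpy_sketch() and pktcpy_sketch_selfcheck() are hypothetical,
RTE_PTR_ADD() is replaced by plain byte-pointer arithmetic, and the
16-byte alignment hint (__builtin_assume_aligned) is deliberately left
out so the routine is safe on arbitrarily aligned buffers. The 32-byte
chunk size and the 256-byte threshold are taken from the patch above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone variant of the patch's pktcpy(); it makes
 * no alignment assumption about either buffer. */
static inline void
pktcpy_sketch(void *restrict in_dst, const void *restrict in_src, size_t len)
{
	unsigned char *dst = in_dst;
	const unsigned char *src = in_src;

	if (len <= 256) {
		size_t left;

		/* Constant-size copies give the compiler the chance to
		 * expand each memcpy() inline rather than emit a generic
		 * variable-length copy. */
		for (left = len; left >= 32; left -= 32) {
			memcpy(dst, src, 32);
			dst += 32;
			src += 32;
		}

		/* Tail of 0..31 bytes. */
		memcpy(dst, src, left);
	} else
		memcpy(dst, src, len);
}

/* Returns 1 if, for every length 0..300, the copy matches the source
 * and the byte just past the copied region is left untouched. */
static int
pktcpy_sketch_selfcheck(void)
{
	unsigned char src[300], dst[301];
	size_t len, i;

	for (i = 0; i < sizeof(src); i++)
		src[i] = (unsigned char)(i * 7 + 1);

	for (len = 0; len <= sizeof(src); len++) {
		memset(dst, 0xAA, sizeof(dst));
		pktcpy_sketch(dst, src, len);
		if (memcmp(dst, src, len) != 0 || dst[len] != 0xAA)
			return 0;
	}

	return 1;
}
```

Calling assert(pktcpy_sketch_selfcheck() == 1) from a test program checks
functional equivalence with memcpy() on both sides of the 256-byte
threshold. It says nothing about performance; whether the chunked form
helps depends on compiler, flags and core type, as the figures in the
commit message suggest.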