From patchwork Thu Jul 14 07:52:02 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: abhimanyu.saini@xilinx.com
X-Patchwork-Id: 113962
Envelope-to: dev@dpdk.org, chenbo.xia@intel.com, maxime.coquelin@redhat.com, andrew.rybchenko@oktetlabs.ru, absaini@amd.com
CC: Abhimanyu Saini
Subject: [PATCH 5/5] vdpa/sfc: Add support for SW assisted live migration
Date: Thu, 14 Jul 2022 13:22:02 +0530
Message-ID: <20220714075202.31826-5-asaini@xilinx.com>
X-Mailer: git-send-email 2.25.0
In-Reply-To: <20220714075202.31826-1-asaini@xilinx.com>
References: <20220708080135.31254-1-asaini@xilinx.com> <20220714075202.31826-1-asaini@xilinx.com>
X-BeenThere: dev@dpdk.org
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

From: Abhimanyu Saini

In SW assisted live migration, the vDPA driver stops all virtqueues
and sets up SW vrings to relay the communication between the virtio
driver and the vDPA device using an event-driven relay thread.
This allows the vDPA driver to help with guest dirty page logging
for live migration.
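The relay thread described above has to tell kick events from call events and recover the queue and file descriptor for each one. A minimal sketch of one way to pack those three fields into the 64-bit epoll user data, assuming the layout this patch appears to use (bit 0 for the event type, bits 1..31 for the queue id, the upper 32 bits for the fd). The `relay_ev_*` names are hypothetical and are not part of the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not the driver's helpers. */
enum relay_ev_type {
	RELAY_EV_KICK = 0,	/* guest kicked a queue: ring the HW doorbell */
	RELAY_EV_CALL = 1	/* device signalled a queue: relay the used ring */
};

/* Assumed layout: bit 0 = type, bits 1..31 = queue id, bits 32..63 = fd. */
static inline uint64_t
relay_ev_encode(enum relay_ev_type type, uint32_t qid, int fd)
{
	/* The fd must be valid and the queue id must fit in 31 bits. */
	assert(fd >= 0);
	assert(qid <= UINT32_MAX / 2);

	return (uint64_t)type | ((uint64_t)qid << 1) |
	       ((uint64_t)(uint32_t)fd << 32);
}

static inline int
relay_ev_fd(uint64_t data)
{
	return (int)(data >> 32);
}

static inline uint32_t
relay_ev_qid(uint64_t data)
{
	/* Truncate to the low 32 bits first, then drop the type bit. */
	return (uint32_t)data >> 1;
}

static inline enum relay_ev_type
relay_ev_type_of(uint64_t data)
{
	return (enum relay_ev_type)(data & 1);
}
```

Decoding from the single 64-bit value, rather than from separate 32-bit and 64-bit views of the same union, keeps the result independent of host byte order.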

Signed-off-by: Abhimanyu Saini
---
 drivers/vdpa/sfc/sfc_vdpa.h     |   1 +
 drivers/vdpa/sfc/sfc_vdpa_ops.c | 336 +++++++++++++++++++++++++++++++++++++---
 drivers/vdpa/sfc/sfc_vdpa_ops.h |  15 +-
 3 files changed, 329 insertions(+), 23 deletions(-)

diff --git a/drivers/vdpa/sfc/sfc_vdpa.h b/drivers/vdpa/sfc/sfc_vdpa.h
index daeb27d..ae522ca 100644
--- a/drivers/vdpa/sfc/sfc_vdpa.h
+++ b/drivers/vdpa/sfc/sfc_vdpa.h
@@ -18,6 +18,7 @@
 
 #define SFC_VDPA_MAC_ADDR			"mac"
 #define SFC_VDPA_DEFAULT_MCDI_IOVA		0x200000000000
+#define SFC_SW_VRING_IOVA			0x300000000000
 
 /* Broadcast & Unicast MAC filters are supported */
 #define SFC_MAX_SUPPORTED_FILTERS		3
diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.c b/drivers/vdpa/sfc/sfc_vdpa_ops.c
index 426c7ac..daf1db0 100644
--- a/drivers/vdpa/sfc/sfc_vdpa_ops.c
+++ b/drivers/vdpa/sfc/sfc_vdpa_ops.c
@@ -4,10 +4,13 @@
 
 #include
 #include
+#include
 #include
 
+#include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -33,7 +36,9 @@
  */
 #define SFC_VDPA_DEFAULT_FEATURES \
 		((1ULL << VHOST_USER_F_PROTOCOL_FEATURES) | \
-		 (1ULL << VIRTIO_NET_F_MQ))
+		 (1ULL << VIRTIO_NET_F_MQ) | \
+		 (1ULL << VHOST_F_LOG_ALL) | \
+		 (1ULL << VIRTIO_NET_F_GUEST_ANNOUNCE))
 
 #define SFC_VDPA_MSIX_IRQ_SET_BUF_LEN \
 		(sizeof(struct vfio_irq_set) + \
@@ -42,6 +47,142 @@
 /* It will be used for target VF when calling function is not PF */
 #define SFC_VDPA_VF_NULL		0xFFFF
 
+#define SFC_VDPA_DECODE_FD(data)	(data.u64 >> 32)
+#define SFC_VDPA_DECODE_QID(data)	(data.u32 >> 1)
+#define SFC_VDPA_DECODE_EV_TYPE(data)	(data.u32 & 1)
+
+/*
+ * Create q_num number of epoll events for kickfd interrupts
+ * and q_num/2 events for callfd interrupts. Round up the
+ * total to (q_num * 2) number of events.
+ */
+#define SFC_VDPA_SW_RELAY_EVENT_NUM(q_num)	((q_num) * 2)
+
+static inline uint64_t
+sfc_vdpa_encode_ev_data(int type, uint32_t qid, int fd)
+{
+	SFC_VDPA_ASSERT(fd >= 0 && qid <= UINT32_MAX / 2);
+	return type | (qid << 1) | (uint64_t)fd << 32;
+}
+
+static inline void
+sfc_vdpa_queue_relay(struct sfc_vdpa_ops_data *ops_data, uint32_t qid)
+{
+	rte_vdpa_relay_vring_used(ops_data->vid, qid, &ops_data->sw_vq[qid]);
+	rte_vhost_vring_call(ops_data->vid, qid);
+}
+
+static void *
+sfc_vdpa_sw_relay(void *data)
+{
+	uint64_t buf;
+	uint32_t qid, q_num;
+	struct epoll_event ev;
+	struct rte_vhost_vring vring;
+	int nbytes, i, ret, fd, epfd, nfds = 0;
+	struct epoll_event events[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
+	struct sfc_vdpa_ops_data *ops_data = (struct sfc_vdpa_ops_data *)data;
+
+	q_num = rte_vhost_get_vring_num(ops_data->vid);
+	epfd = epoll_create(SFC_VDPA_SW_RELAY_EVENT_NUM(q_num));
+	if (epfd < 0) {
+		sfc_vdpa_log_init(ops_data->dev_handle,
+				  "failed to create epoll instance");
+		goto fail_epoll;
+	}
+	ops_data->epfd = epfd;
+
+	vring.kickfd = -1;
+	for (qid = 0; qid < q_num; qid++) {
+		ev.events = EPOLLIN | EPOLLPRI;
+		ret = rte_vhost_get_vhost_vring(ops_data->vid, qid, &vring);
+		if (ret != 0) {
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "rte_vhost_get_vhost_vring error %s",
+					  strerror(errno));
+			goto fail_vring;
+		}
+
+		ev.data.u64 = sfc_vdpa_encode_ev_data(0, qid, vring.kickfd);
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, vring.kickfd, &ev) < 0) {
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "epoll add error: %s",
+					  strerror(errno));
+			goto fail_epoll_add;
+		}
+	}
+
+	/*
+	 * Register intr_fd created by vDPA driver in lieu of qemu's callfd
+	 * to intercept rx queue notifications, so that we can monitor rx
+	 * notifications and issue rte_vdpa_relay_vring_used().
+	 */
+	for (qid = 0; qid < q_num; qid += 2) {
+		fd = ops_data->intr_fd[qid];
+		ev.events = EPOLLIN | EPOLLPRI;
+		ev.data.u64 = sfc_vdpa_encode_ev_data(1, qid, fd);
+		if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "epoll add error: %s",
+					  strerror(errno));
+			goto fail_epoll_add;
+		}
+		sfc_vdpa_queue_relay(ops_data, qid);
+	}
+
+	/*
+	 * The virtio driver in the VM kept sending queue notifications
+	 * while we were setting up software vrings, so the HW missed
+	 * these doorbell notifications. Since it is safe to send a
+	 * duplicate doorbell, send another doorbell from the vDPA driver.
+	 */
+	for (qid = 0; qid < q_num; qid++)
+		rte_write16(qid, ops_data->vq_cxt[qid].doorbell);
+
+	for (;;) {
+		nfds = epoll_wait(epfd, events,
+				  SFC_VDPA_SW_RELAY_EVENT_NUM(q_num), -1);
+		if (nfds < 0) {
+			if (errno == EINTR)
+				continue;
+			sfc_vdpa_log_init(ops_data->dev_handle,
+					  "epoll_wait failed");
+			goto fail_epoll_wait;
+		}
+
+		for (i = 0; i < nfds; i++) {
+			fd = SFC_VDPA_DECODE_FD(events[i].data);
+			/* Ensure kickfd is not busy before proceeding */
+			for (;;) {
+				nbytes = read(fd, &buf, 8);
+				if (nbytes < 0) {
+					if (errno == EINTR ||
+					    errno == EWOULDBLOCK ||
+					    errno == EAGAIN)
+						continue;
+				}
+				break;
+			}
+
+			qid = SFC_VDPA_DECODE_QID(events[i].data);
+			if (SFC_VDPA_DECODE_EV_TYPE(events[i].data))
+				sfc_vdpa_queue_relay(ops_data, qid);
+			else
+				rte_write16(qid, ops_data->vq_cxt[qid].doorbell);
+		}
+	}
+
+	return NULL;
+
+fail_epoll:
+fail_vring:
+fail_epoll_add:
+fail_epoll_wait:
+	close(epfd);
+	ops_data->epfd = -1;
+	return NULL;
+}
+
 static int
 sfc_vdpa_get_device_features(struct sfc_vdpa_ops_data *ops_data)
 {
@@ -99,7 +240,7 @@
 static int
 sfc_vdpa_enable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 {
-	int rc;
+	int rc, fd;
 	int *irq_fd_ptr;
 	int vfio_dev_fd;
 	uint32_t i, num_vring;
@@ -131,6 +272,17 @@
 			return -1;
 
 		irq_fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = vring.callfd;
+		if (ops_data->sw_fallback_mode && !(i & 1)) {
+			fd = eventfd(0, EFD_NONBLOCK | EFD_CLOEXEC);
+			if (fd < 0) {
+				sfc_vdpa_err(ops_data->dev_handle,
+					     "failed to create eventfd");
+				goto fail_eventfd;
+			}
+			ops_data->intr_fd[i] = fd;
+			irq_fd_ptr[RTE_INTR_VEC_RXTX_OFFSET + i] = fd;
+		} else
+			ops_data->intr_fd[i] = -1;
 	}
 
 	rc = ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, irq_set);
@@ -138,16 +290,26 @@
 		sfc_vdpa_err(ops_data->dev_handle,
 			     "error enabling MSI-X interrupts: %s",
 			     strerror(errno));
-		return -1;
+		goto fail_ioctl;
 	}
 
 	return 0;
+
+fail_ioctl:
+fail_eventfd:
+	for (i = 0; i < num_vring; i++) {
+		if (ops_data->intr_fd[i] != -1) {
+			close(ops_data->intr_fd[i]);
+			ops_data->intr_fd[i] = -1;
+		}
+	}
+	return -1;
 }
 
 static int
 sfc_vdpa_disable_vfio_intr(struct sfc_vdpa_ops_data *ops_data)
 {
-	int rc;
+	int rc, i;
 	int vfio_dev_fd;
 	struct vfio_irq_set irq_set;
 	void *dev;
@@ -161,6 +323,12 @@
 	irq_set.index = VFIO_PCI_MSIX_IRQ_INDEX;
 	irq_set.start = 0;
 
+	for (i = 0; i < ops_data->vq_count; i++) {
+		if (ops_data->intr_fd[i] >= 0)
+			close(ops_data->intr_fd[i]);
+		ops_data->intr_fd[i] = -1;
+	}
+
 	rc = ioctl(vfio_dev_fd, VFIO_DEVICE_SET_IRQS, &irq_set);
 	if (rc) {
 		sfc_vdpa_err(ops_data->dev_handle,
@@ -223,12 +391,15 @@
 static int
 sfc_vdpa_virtq_start(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 {
-	int rc;
+	int rc, fd;
+	uint64_t size;
 	uint32_t doorbell;
 	efx_virtio_vq_t *vq;
+	void *vring_buf, *dev;
 	struct sfc_vdpa_vring_info vring;
 	efx_virtio_vq_cfg_t vq_cfg;
 	efx_virtio_vq_dyncfg_t vq_dyncfg;
+	uint64_t sw_vq_iova = ops_data->sw_vq_iova;
 
 	vq = ops_data->vq_cxt[vq_num].vq;
 	if (vq == NULL)
@@ -241,6 +412,33 @@
 		goto fail_vring_info;
 	}
 
+	if (ops_data->sw_fallback_mode) {
+		size = vring_size(vring.size, rte_mem_page_size());
+		size = RTE_ALIGN_CEIL(size, rte_mem_page_size());
+		vring_buf = rte_zmalloc("vdpa", size, rte_mem_page_size());
+		vring_init(&ops_data->sw_vq[vq_num], vring.size, vring_buf,
+			   rte_mem_page_size());
+
+		dev = ops_data->dev_handle;
+		fd = sfc_vdpa_adapter_by_dev_handle(dev)->vfio_container_fd;
+		rc = rte_vfio_container_dma_map(fd,
+						(uint64_t)(uintptr_t)vring_buf,
+						sw_vq_iova, size);
+
+		/* Direct I/O for Tx queue, relay for Rx queue */
+		if (!(vq_num & 1))
+			vring.used = sw_vq_iova +
+				(char *)ops_data->sw_vq[vq_num].used -
+				(char *)ops_data->sw_vq[vq_num].desc;
+
+		ops_data->sw_vq[vq_num].used->idx = vring.last_used_idx;
+		ops_data->sw_vq[vq_num].avail->idx = vring.last_avail_idx;
+
+		ops_data->vq_cxt[vq_num].sw_vq_iova = sw_vq_iova;
+		ops_data->vq_cxt[vq_num].sw_vq_size = size;
+		ops_data->sw_vq_iova += size;
+	}
+
 	vq_cfg.evvc_target_vf = SFC_VDPA_VF_NULL;
 
 	/* even virtqueue for RX and odd for TX */
@@ -309,9 +507,12 @@
 static int
 sfc_vdpa_virtq_stop(struct sfc_vdpa_ops_data *ops_data, int vq_num)
 {
-	int rc;
+	int rc, fd;
+	void *dev, *buf;
+	uint64_t size, len, iova;
 	efx_virtio_vq_dyncfg_t vq_idx;
 	efx_virtio_vq_t *vq;
+	struct rte_vhost_vring vring;
 
 	if (ops_data->vq_cxt[vq_num].enable != B_TRUE)
 		return -1;
@@ -319,13 +520,34 @@
 	vq = ops_data->vq_cxt[vq_num].vq;
 	if (vq == NULL)
 		return -1;
+	if (ops_data->sw_fallback_mode) {
+		dev = ops_data->dev_handle;
+		fd = sfc_vdpa_adapter_by_dev_handle(dev)->vfio_container_fd;
+		/* synchronize remaining new used entries if any */
+		if (!(vq_num & 1))
+			sfc_vdpa_queue_relay(ops_data, vq_num);
+
+		rte_vhost_get_vhost_vring(ops_data->vid, vq_num, &vring);
+		len = SFC_VDPA_USED_RING_LEN(vring.size);
+		rte_vhost_log_used_vring(ops_data->vid, vq_num, 0, len);
+
+		buf = ops_data->sw_vq[vq_num].desc;
+		size = ops_data->vq_cxt[vq_num].sw_vq_size;
+		iova = ops_data->vq_cxt[vq_num].sw_vq_iova;
+		rte_vfio_container_dma_unmap(fd, (uint64_t)(uintptr_t)buf,
+					     iova, size);
+	}
 
 	/* stop the vq */
 	rc = efx_virtio_qstop(vq, &vq_idx);
 	if (rc == 0) {
-		ops_data->vq_cxt[vq_num].cidx = vq_idx.evvd_vq_cidx;
-		ops_data->vq_cxt[vq_num].pidx = vq_idx.evvd_vq_pidx;
+		if (ops_data->sw_fallback_mode)
+			vq_idx.evvd_vq_avail_idx = vq_idx.evvd_vq_used_idx;
+		rte_vhost_set_vring_base(ops_data->vid, vq_num,
+					 vq_idx.evvd_vq_avail_idx,
+					 vq_idx.evvd_vq_used_idx);
 	}
+
 	ops_data->vq_cxt[vq_num].enable = B_FALSE;
 
 	return rc;
@@ -450,7 +672,11 @@
 
 	SFC_EFX_ASSERT(ops_data->state == SFC_VDPA_STATE_CONFIGURED);
 
-	sfc_vdpa_log_init(ops_data->dev_handle, "entry");
+	if (ops_data->sw_fallback_mode) {
+		sfc_vdpa_log_init(ops_data->dev_handle,
+				  "Trying to start VDPA with SW I/O relay");
+		ops_data->sw_vq_iova = SFC_SW_VRING_IOVA;
+	}
 
 	ops_data->state = SFC_VDPA_STATE_STARTING;
 
@@ -675,6 +901,7 @@
 sfc_vdpa_dev_close(int vid)
 {
 	int ret;
+	void *status;
 	struct rte_vdpa_device *vdpa_dev;
 	struct sfc_vdpa_ops_data *ops_data;
 
@@ -707,7 +934,23 @@
 	}
 	ops_data->is_notify_thread_started = false;
 
+	if (ops_data->sw_fallback_mode) {
+		ret = pthread_cancel(ops_data->sw_relay_thread_id);
+		if (ret != 0)
+			sfc_vdpa_err(ops_data->dev_handle,
+				     "failed to cancel LM relay thread: %s",
+				     rte_strerror(ret));
+
+		ret = pthread_join(ops_data->sw_relay_thread_id, &status);
+		if (ret != 0)
+			sfc_vdpa_err(ops_data->dev_handle,
+				     "failed to join LM relay thread: %s",
+				     rte_strerror(ret));
+	}
+
 	sfc_vdpa_stop(ops_data);
+	ops_data->sw_fallback_mode = false;
+
 	sfc_vdpa_close(ops_data);
 
 	sfc_vdpa_adapter_unlock(ops_data->dev_handle);
@@ -774,9 +1017,49 @@
 static int
 sfc_vdpa_set_features(int vid)
 {
-	RTE_SET_USED(vid);
+	int ret;
+	uint64_t features = 0;
+	struct rte_vdpa_device *vdpa_dev;
+	struct sfc_vdpa_ops_data *ops_data;
 
-	return -1;
+	vdpa_dev = rte_vhost_get_vdpa_device(vid);
+	ops_data = sfc_vdpa_get_data_by_dev(vdpa_dev);
+	if (ops_data == NULL)
+		return -1;
+
+	rte_vhost_get_negotiated_features(vid, &features);
+
+	if (!RTE_VHOST_NEED_LOG(features))
+		return -1;
+
+	sfc_vdpa_info(ops_data->dev_handle, "live-migration triggered");
+
+	sfc_vdpa_adapter_lock(ops_data->dev_handle);
+
+	/* Stop HW Offload and unset host notifier */
+	sfc_vdpa_stop(ops_data);
+	if (rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, false) != 0)
+		sfc_vdpa_info(ops_data->dev_handle,
+			      "vDPA (%s): Failed to clear host notifier",
+			      ops_data->vdpa_dev->device->name);
+
+	/* Restart vDPA with SW relay on RX queue */
+	ops_data->sw_fallback_mode = true;
+	sfc_vdpa_start(ops_data);
+	ret = pthread_create(&ops_data->sw_relay_thread_id, NULL,
+			     sfc_vdpa_sw_relay, (void *)ops_data);
+	if (ret != 0)
+		sfc_vdpa_err(ops_data->dev_handle,
+			     "failed to create rx_relay thread: %s",
+			     rte_strerror(ret));
+
+	if (rte_vhost_host_notifier_ctrl(vid, RTE_VHOST_QUEUE_ALL, true) != 0)
+		sfc_vdpa_info(ops_data->dev_handle, "notifier setup failed!");
+
+	sfc_vdpa_adapter_unlock(ops_data->dev_handle);
+	sfc_vdpa_info(ops_data->dev_handle, "SW fallback setup done!");
+
+	return 0;
 }
 
 static int
@@ -860,17 +1143,28 @@
 	sfc_vdpa_info(dev, "vDPA ops get_notify_area :: offset : 0x%" PRIx64,
 		      *offset);
 
-	pci_dev = sfc_vdpa_adapter_by_dev_handle(dev)->pdev;
-	doorbell = (uint8_t *)pci_dev->mem_resource[reg.index].addr + *offset;
+	if (!ops_data->sw_fallback_mode) {
+		pci_dev = sfc_vdpa_adapter_by_dev_handle(dev)->pdev;
+		doorbell = (uint8_t *)pci_dev->mem_resource[reg.index].addr +
+			*offset;
+		/*
+		 * virtio-net driver in VM sends queue notifications before
+		 * vDPA has a chance to setup the queues and notification area,
+		 * and hence the HW misses these doorbell notifications.
+		 * Since, it is safe to send duplicate doorbell, send another
+		 * doorbell from vDPA driver as workaround for this timing issue
+		 */
+		rte_write16(qid, doorbell);
+
+		/*
+		 * Update doorbell address, it will come in handy during
+		 * live-migration.
+		 */
+		ops_data->vq_cxt[qid].doorbell = doorbell;
+	}
 
-	/*
-	 * virtio-net driver in VM sends queue notifications before
-	 * vDPA has a chance to setup the queues and notification area,
-	 * and hence the HW misses these doorbell notifications.
-	 * Since, it is safe to send duplicate doorbell, send another
-	 * doorbell from vDPA driver as workaround for this timing issue.
-	 */
-	rte_write16(qid, doorbell);
+	sfc_vdpa_info(dev, "vDPA ops get_notify_area :: offset : 0x%" PRIx64,
+		      *offset);
 
 	return 0;
 }
diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.h b/drivers/vdpa/sfc/sfc_vdpa_ops.h
index 5c8e352..dd301ba 100644
--- a/drivers/vdpa/sfc/sfc_vdpa_ops.h
+++ b/drivers/vdpa/sfc/sfc_vdpa_ops.h
@@ -6,8 +6,11 @@
 #define _SFC_VDPA_OPS_H
 
 #include
+#include
 
 #define SFC_VDPA_MAX_QUEUE_PAIRS		8
+#define SFC_VDPA_USED_RING_LEN(size) \
+	((size) * sizeof(struct vring_used_elem) + sizeof(uint16_t) * 3)
 
 enum sfc_vdpa_context {
 	SFC_VDPA_AS_VF
@@ -37,9 +40,10 @@ struct sfc_vdpa_vring_info {
 typedef struct sfc_vdpa_vq_context_s {
 	volatile void *doorbell;
 	uint8_t enable;
-	uint32_t pidx;
-	uint32_t cidx;
 	efx_virtio_vq_t *vq;
+
+	uint64_t sw_vq_iova;
+	uint64_t sw_vq_size;
 } sfc_vdpa_vq_context_t;
 
 struct sfc_vdpa_ops_data {
@@ -57,6 +61,13 @@ struct sfc_vdpa_ops_data {
 
 	uint16_t vq_count;
 	struct sfc_vdpa_vq_context_s vq_cxt[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
+
+	int epfd;
+	uint64_t sw_vq_iova;
+	bool sw_fallback_mode;
+	pthread_t sw_relay_thread_id;
+	struct vring sw_vq[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
+	int intr_fd[SFC_VDPA_MAX_QUEUE_PAIRS * 2];
 };
 
 struct sfc_vdpa_ops_data *