From patchwork Mon Jun 27 11:54:07 2016
X-Patchwork-Submitter: Jerin Jacob
X-Patchwork-Id: 14413
X-Patchwork-Delegate: yuanhan.liu@linux.intel.com
From: Jerin Jacob
Date: Mon, 27 Jun 2016 17:24:07 +0530
Message-ID: <1467028448-8914-4-git-send-email-jerin.jacob@caviumnetworks.com>
X-Mailer: git-send-email 2.5.5
In-Reply-To: <1467028448-8914-1-git-send-email-jerin.jacob@caviumnetworks.com>
References: <1467028448-8914-1-git-send-email-jerin.jacob@caviumnetworks.com>
MIME-Version: 1.0
Subject: [dpdk-dev] [PATCH 3/4] virtio: move SSE based Rx implementation to separate file
X-BeenThere: dev@dpdk.org
List-Id: patches and discussions about DPDK
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

split out SSE instruction based virtio simple rx implementation to
a separate file

Signed-off-by: Jerin Jacob
---
 drivers/net/virtio/virtio_rxtx_simple.c     | 166 +-------------------
 drivers/net/virtio/virtio_rxtx_simple_sse.h | 225 ++++++++++++++++++++++++++++
 2 files changed, 226 insertions(+), 165 deletions(-)
 create mode 100644 drivers/net/virtio/virtio_rxtx_simple_sse.h

diff --git a/drivers/net/virtio/virtio_rxtx_simple.c b/drivers/net/virtio/virtio_rxtx_simple.c
index 67430da..ca87605 100644
--- a/drivers/net/virtio/virtio_rxtx_simple.c
+++ b/drivers/net/virtio/virtio_rxtx_simple.c
@@ -130,171 +130,7 @@ virtio_rxq_rearm_vec(struct virtnet_rx *rxvq)
 }
 
 #ifdef RTE_MACHINE_CPUFLAG_SSSE3
-
-#include
-
-/* virtio vPMD receive routine, only accept(nb_pkts >= RTE_VIRTIO_DESC_PER_LOOP)
- *
- * This routine is for non-mergeable RX, one desc for each guest buffer.
- * This routine is based on the RX ring layout optimization. Each entry in the
- * avail ring points to the desc with the same index in the desc ring and this
- * will never be changed in the driver.
- *
- * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
- */
-uint16_t
-virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
-	uint16_t nb_pkts)
-{
-	struct virtnet_rx *rxvq = rx_queue;
-	struct virtqueue *vq = rxvq->vq;
-	uint16_t nb_used;
-	uint16_t desc_idx;
-	struct vring_used_elem *rused;
-	struct rte_mbuf **sw_ring;
-	struct rte_mbuf **sw_ring_end;
-	uint16_t nb_pkts_received;
-	__m128i shuf_msk1, shuf_msk2, len_adjust;
-
-	shuf_msk1 = _mm_set_epi8(
-		0xFF, 0xFF, 0xFF, 0xFF,
-		0xFF, 0xFF,		/* vlan tci */
-		5, 4,			/* dat len */
-		0xFF, 0xFF, 5, 4,	/* pkt len */
-		0xFF, 0xFF, 0xFF, 0xFF	/* packet type */
-
-	);
-
-	shuf_msk2 = _mm_set_epi8(
-		0xFF, 0xFF, 0xFF, 0xFF,
-		0xFF, 0xFF,		/* vlan tci */
-		13, 12,			/* dat len */
-		0xFF, 0xFF, 13, 12,	/* pkt len */
-		0xFF, 0xFF, 0xFF, 0xFF	/* packet type */
-	);
-
-	/* Subtract the header length.
-	 * In which case do we need the header length in used->len ?
-	 */
-	len_adjust = _mm_set_epi16(
-		0, 0,
-		0,
-		(uint16_t)-vq->hw->vtnet_hdr_size,
-		0, (uint16_t)-vq->hw->vtnet_hdr_size,
-		0, 0);
-
-	if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP))
-		return 0;
-
-	nb_used = VIRTQUEUE_NUSED(vq);
-
-	rte_compiler_barrier();
-
-	if (unlikely(nb_used == 0))
-		return 0;
-
-	nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_VIRTIO_DESC_PER_LOOP);
-	nb_used = RTE_MIN(nb_used, nb_pkts);
-
-	desc_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
-	rused = &vq->vq_ring.used->ring[desc_idx];
-	sw_ring = &vq->sw_ring[desc_idx];
-	sw_ring_end = &vq->sw_ring[vq->vq_nentries];
-
-	_mm_prefetch((const void *)rused, _MM_HINT_T0);
-
-	if (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) {
-		virtio_rxq_rearm_vec(rxvq);
-		if (unlikely(virtqueue_kick_prepare(vq)))
-			virtqueue_notify(vq);
-	}
-
-	for (nb_pkts_received = 0;
-		nb_pkts_received < nb_used;) {
-		__m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2];
-		__m128i mbp[RTE_VIRTIO_DESC_PER_LOOP / 2];
-		__m128i pkt_mb[RTE_VIRTIO_DESC_PER_LOOP];
-
-		mbp[0] = _mm_loadu_si128((__m128i *)(sw_ring + 0));
-		desc[0] = _mm_loadu_si128((__m128i *)(rused + 0));
-		_mm_storeu_si128((__m128i *)&rx_pkts[0], mbp[0]);
-
-		mbp[1] = _mm_loadu_si128((__m128i *)(sw_ring + 2));
-		desc[1] = _mm_loadu_si128((__m128i *)(rused + 2));
-		_mm_storeu_si128((__m128i *)&rx_pkts[2], mbp[1]);
-
-		mbp[2] = _mm_loadu_si128((__m128i *)(sw_ring + 4));
-		desc[2] = _mm_loadu_si128((__m128i *)(rused + 4));
-		_mm_storeu_si128((__m128i *)&rx_pkts[4], mbp[2]);
-
-		mbp[3] = _mm_loadu_si128((__m128i *)(sw_ring + 6));
-		desc[3] = _mm_loadu_si128((__m128i *)(rused + 6));
-		_mm_storeu_si128((__m128i *)&rx_pkts[6], mbp[3]);
-
-		pkt_mb[1] = _mm_shuffle_epi8(desc[0], shuf_msk2);
-		pkt_mb[0] = _mm_shuffle_epi8(desc[0], shuf_msk1);
-		pkt_mb[1] = _mm_add_epi16(pkt_mb[1], len_adjust);
-		pkt_mb[0] = _mm_add_epi16(pkt_mb[0], len_adjust);
-		_mm_storeu_si128((void *)&rx_pkts[1]->rx_descriptor_fields1,
-			pkt_mb[1]);
-		_mm_storeu_si128((void *)&rx_pkts[0]->rx_descriptor_fields1,
-			pkt_mb[0]);
-
-		pkt_mb[3] = _mm_shuffle_epi8(desc[1], shuf_msk2);
-		pkt_mb[2] = _mm_shuffle_epi8(desc[1], shuf_msk1);
-		pkt_mb[3] = _mm_add_epi16(pkt_mb[3], len_adjust);
-		pkt_mb[2] = _mm_add_epi16(pkt_mb[2], len_adjust);
-		_mm_storeu_si128((void *)&rx_pkts[3]->rx_descriptor_fields1,
-			pkt_mb[3]);
-		_mm_storeu_si128((void *)&rx_pkts[2]->rx_descriptor_fields1,
-			pkt_mb[2]);
-
-		pkt_mb[5] = _mm_shuffle_epi8(desc[2], shuf_msk2);
-		pkt_mb[4] = _mm_shuffle_epi8(desc[2], shuf_msk1);
-		pkt_mb[5] = _mm_add_epi16(pkt_mb[5], len_adjust);
-		pkt_mb[4] = _mm_add_epi16(pkt_mb[4], len_adjust);
-		_mm_storeu_si128((void *)&rx_pkts[5]->rx_descriptor_fields1,
-			pkt_mb[5]);
-		_mm_storeu_si128((void *)&rx_pkts[4]->rx_descriptor_fields1,
-			pkt_mb[4]);
-
-		pkt_mb[7] = _mm_shuffle_epi8(desc[3], shuf_msk2);
-		pkt_mb[6] = _mm_shuffle_epi8(desc[3], shuf_msk1);
-		pkt_mb[7] = _mm_add_epi16(pkt_mb[7], len_adjust);
-		pkt_mb[6] = _mm_add_epi16(pkt_mb[6], len_adjust);
-		_mm_storeu_si128((void *)&rx_pkts[7]->rx_descriptor_fields1,
-			pkt_mb[7]);
-		_mm_storeu_si128((void *)&rx_pkts[6]->rx_descriptor_fields1,
-			pkt_mb[6]);
-
-		if (unlikely(nb_used <= RTE_VIRTIO_DESC_PER_LOOP)) {
-			if (sw_ring + nb_used <= sw_ring_end)
-				nb_pkts_received += nb_used;
-			else
-				nb_pkts_received += sw_ring_end - sw_ring;
-			break;
-		} else {
-			if (unlikely(sw_ring + RTE_VIRTIO_DESC_PER_LOOP >=
-				sw_ring_end)) {
-				nb_pkts_received += sw_ring_end - sw_ring;
-				break;
-			} else {
-				nb_pkts_received += RTE_VIRTIO_DESC_PER_LOOP;
-
-				rx_pkts += RTE_VIRTIO_DESC_PER_LOOP;
-				sw_ring += RTE_VIRTIO_DESC_PER_LOOP;
-				rused += RTE_VIRTIO_DESC_PER_LOOP;
-				nb_used -= RTE_VIRTIO_DESC_PER_LOOP;
-			}
-		}
-	}
-
-	vq->vq_used_cons_idx += nb_pkts_received;
-	vq->vq_free_cnt += nb_pkts_received;
-	rxvq->stats.packets += nb_pkts_received;
-	return nb_pkts_received;
-}
-
+#include "virtio_rxtx_simple_sse.h"
 #endif
 
 #define VIRTIO_TX_FREE_THRESH 32

diff --git a/drivers/net/virtio/virtio_rxtx_simple_sse.h b/drivers/net/virtio/virtio_rxtx_simple_sse.h
new file mode 100644
index 0000000..4e8728f
--- /dev/null
+++ b/drivers/net/virtio/virtio_rxtx_simple_sse.h
@@ -0,0 +1,225 @@
+/*-
+ *   BSD LICENSE
+ *
+ *   Copyright(c) 2010-2015 Intel Corporation. All rights reserved.
+ *   All rights reserved.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of Intel Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+#include
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+#include "virtio_logs.h"
+#include "virtio_ethdev.h"
+#include "virtqueue.h"
+#include "virtio_rxtx.h"
+
+#define RTE_VIRTIO_VPMD_RX_BURST 32
+#define RTE_VIRTIO_DESC_PER_LOOP 8
+#define RTE_VIRTIO_VPMD_RX_REARM_THRESH RTE_VIRTIO_VPMD_RX_BURST
+
+/* virtio vPMD receive routine, only accept(nb_pkts >= RTE_VIRTIO_DESC_PER_LOOP)
+ *
+ * This routine is for non-mergeable RX, one desc for each guest buffer.
+ * This routine is based on the RX ring layout optimization. Each entry in the
+ * avail ring points to the desc with the same index in the desc ring and this
+ * will never be changed in the driver.
+ *
+ * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
+ */
+uint16_t
+virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
+	uint16_t nb_pkts)
+{
+	struct virtnet_rx *rxvq = rx_queue;
+	struct virtqueue *vq = rxvq->vq;
+	uint16_t nb_used;
+	uint16_t desc_idx;
+	struct vring_used_elem *rused;
+	struct rte_mbuf **sw_ring;
+	struct rte_mbuf **sw_ring_end;
+	uint16_t nb_pkts_received;
+	__m128i shuf_msk1, shuf_msk2, len_adjust;
+
+	shuf_msk1 = _mm_set_epi8(
+		0xFF, 0xFF, 0xFF, 0xFF,
+		0xFF, 0xFF,		/* vlan tci */
+		5, 4,			/* dat len */
+		0xFF, 0xFF, 5, 4,	/* pkt len */
+		0xFF, 0xFF, 0xFF, 0xFF	/* packet type */
+
+	);
+
+	shuf_msk2 = _mm_set_epi8(
+		0xFF, 0xFF, 0xFF, 0xFF,
+		0xFF, 0xFF,		/* vlan tci */
+		13, 12,			/* dat len */
+		0xFF, 0xFF, 13, 12,	/* pkt len */
+		0xFF, 0xFF, 0xFF, 0xFF	/* packet type */
+	);
+
+	/* Subtract the header length.
+	 * In which case do we need the header length in used->len ?
+	 */
+	len_adjust = _mm_set_epi16(
+		0, 0,
+		0,
+		(uint16_t)-vq->hw->vtnet_hdr_size,
+		0, (uint16_t)-vq->hw->vtnet_hdr_size,
+		0, 0);
+
+	if (unlikely(nb_pkts < RTE_VIRTIO_DESC_PER_LOOP))
+		return 0;
+
+	nb_used = VIRTQUEUE_NUSED(vq);
+
+	rte_compiler_barrier();
+
+	if (unlikely(nb_used == 0))
+		return 0;
+
+	nb_pkts = RTE_ALIGN_FLOOR(nb_pkts, RTE_VIRTIO_DESC_PER_LOOP);
+	nb_used = RTE_MIN(nb_used, nb_pkts);
+
+	desc_idx = (uint16_t)(vq->vq_used_cons_idx & (vq->vq_nentries - 1));
+	rused = &vq->vq_ring.used->ring[desc_idx];
+	sw_ring = &vq->sw_ring[desc_idx];
+	sw_ring_end = &vq->sw_ring[vq->vq_nentries];
+
+	_mm_prefetch((const void *)rused, _MM_HINT_T0);
+
+	if (vq->vq_free_cnt >= RTE_VIRTIO_VPMD_RX_REARM_THRESH) {
+		virtio_rxq_rearm_vec(rxvq);
+		if (unlikely(virtqueue_kick_prepare(vq)))
+			virtqueue_notify(vq);
+	}
+
+	for (nb_pkts_received = 0;
+		nb_pkts_received < nb_used;) {
+		__m128i desc[RTE_VIRTIO_DESC_PER_LOOP / 2];
+		__m128i mbp[RTE_VIRTIO_DESC_PER_LOOP / 2];
+		__m128i pkt_mb[RTE_VIRTIO_DESC_PER_LOOP];
+
+		mbp[0] = _mm_loadu_si128((__m128i *)(sw_ring + 0));
+		desc[0] = _mm_loadu_si128((__m128i *)(rused + 0));
+		_mm_storeu_si128((__m128i *)&rx_pkts[0], mbp[0]);
+
+		mbp[1] = _mm_loadu_si128((__m128i *)(sw_ring + 2));
+		desc[1] = _mm_loadu_si128((__m128i *)(rused + 2));
+		_mm_storeu_si128((__m128i *)&rx_pkts[2], mbp[1]);
+
+		mbp[2] = _mm_loadu_si128((__m128i *)(sw_ring + 4));
+		desc[2] = _mm_loadu_si128((__m128i *)(rused + 4));
+		_mm_storeu_si128((__m128i *)&rx_pkts[4], mbp[2]);
+
+		mbp[3] = _mm_loadu_si128((__m128i *)(sw_ring + 6));
+		desc[3] = _mm_loadu_si128((__m128i *)(rused + 6));
+		_mm_storeu_si128((__m128i *)&rx_pkts[6], mbp[3]);
+
+		pkt_mb[1] = _mm_shuffle_epi8(desc[0], shuf_msk2);
+		pkt_mb[0] = _mm_shuffle_epi8(desc[0], shuf_msk1);
+		pkt_mb[1] = _mm_add_epi16(pkt_mb[1], len_adjust);
+		pkt_mb[0] = _mm_add_epi16(pkt_mb[0], len_adjust);
+		_mm_storeu_si128((void *)&rx_pkts[1]->rx_descriptor_fields1,
+			pkt_mb[1]);
+		_mm_storeu_si128((void *)&rx_pkts[0]->rx_descriptor_fields1,
+			pkt_mb[0]);
+
+		pkt_mb[3] = _mm_shuffle_epi8(desc[1], shuf_msk2);
+		pkt_mb[2] = _mm_shuffle_epi8(desc[1], shuf_msk1);
+		pkt_mb[3] = _mm_add_epi16(pkt_mb[3], len_adjust);
+		pkt_mb[2] = _mm_add_epi16(pkt_mb[2], len_adjust);
+		_mm_storeu_si128((void *)&rx_pkts[3]->rx_descriptor_fields1,
+			pkt_mb[3]);
+		_mm_storeu_si128((void *)&rx_pkts[2]->rx_descriptor_fields1,
+			pkt_mb[2]);
+
+		pkt_mb[5] = _mm_shuffle_epi8(desc[2], shuf_msk2);
+		pkt_mb[4] = _mm_shuffle_epi8(desc[2], shuf_msk1);
+		pkt_mb[5] = _mm_add_epi16(pkt_mb[5], len_adjust);
+		pkt_mb[4] = _mm_add_epi16(pkt_mb[4], len_adjust);
+		_mm_storeu_si128((void *)&rx_pkts[5]->rx_descriptor_fields1,
+			pkt_mb[5]);
+		_mm_storeu_si128((void *)&rx_pkts[4]->rx_descriptor_fields1,
+			pkt_mb[4]);
+
+		pkt_mb[7] = _mm_shuffle_epi8(desc[3], shuf_msk2);
+		pkt_mb[6] = _mm_shuffle_epi8(desc[3], shuf_msk1);
+		pkt_mb[7] = _mm_add_epi16(pkt_mb[7], len_adjust);
+		pkt_mb[6] = _mm_add_epi16(pkt_mb[6], len_adjust);
+		_mm_storeu_si128((void *)&rx_pkts[7]->rx_descriptor_fields1,
+			pkt_mb[7]);
+		_mm_storeu_si128((void *)&rx_pkts[6]->rx_descriptor_fields1,
+			pkt_mb[6]);
+
+		if (unlikely(nb_used <= RTE_VIRTIO_DESC_PER_LOOP)) {
+			if (sw_ring + nb_used <= sw_ring_end)
+				nb_pkts_received += nb_used;
+			else
+				nb_pkts_received += sw_ring_end - sw_ring;
+			break;
+		} else {
+			if (unlikely(sw_ring + RTE_VIRTIO_DESC_PER_LOOP >=
+				sw_ring_end)) {
+				nb_pkts_received += sw_ring_end - sw_ring;
+				break;
+			} else {
+				nb_pkts_received += RTE_VIRTIO_DESC_PER_LOOP;
+
+				rx_pkts += RTE_VIRTIO_DESC_PER_LOOP;
+				sw_ring += RTE_VIRTIO_DESC_PER_LOOP;
+				rused += RTE_VIRTIO_DESC_PER_LOOP;
+				nb_used -= RTE_VIRTIO_DESC_PER_LOOP;
+			}
+		}
+	}
+
+	vq->vq_used_cons_idx += nb_pkts_received;
+	vq->vq_free_cnt += nb_pkts_received;
+	rxvq->stats.packets += nb_pkts_received;
+	return nb_pkts_received;
+}