From patchwork Fri Jul 28 07:58:48 2017
X-Patchwork-Submitter: chenchanghu
X-Patchwork-Id: 27248
From: chenchanghu
To: "dev@dpdk.org", "adrien.mazarguil@6wind.com", "nelio.laranjeiro@6wind.com"
CC: Zhoujingbin, "Zhoulei (G)", Deng Kairong, Chenrujie, cuiyayun, "Chengwei (Titus)", "Lixuan (Alex)"
Date: Fri, 28 Jul 2017 07:58:48 +0000
Message-ID: <859E1CB9FBF08C4B839DCF451B09C5032D62BFA3@dggeml505-mbx.china.huawei.com>
Subject: [dpdk-dev] [discussion] mlx4 driver MLX4_PMD_TX_MP_CACHE default value

Hi,

When I used the mlx4 PMD, I ran into a problem with the MLX4_PMD_TX_MP_CACHE value, which sizes the per-TX-queue cache used for memory pool (mempool) to memory region (MR) translation. The test is described in detail below.

1. Test environment information:
a. Linux distribution: CentOS
b. DPDK version: dpdk-16.04
c. Ethernet device: mlx4 VF
d. PMD info: mlx4 poll-mode driver

2. Test diagram:

+-------------------+   +-------------------+       +-------------------+
|      client1      |   |      client2      | ..... |      clientN      |
+---------+---------+   +---------+---------+       +---------+---------+
          |                       |                           |
          |                       |                           |
+---------v-----------------------v---------------------------v---------+
|                          share memory queue                           |
+-----------------------------------+-----------------------------------+
                                    |
                                    |
+-----------------------------------v-----------------------------------+
|                                server                                 |
+-----------------------------------+-----------------------------------+
                                    |
                                    |
+-----------------------------------v-----------------------------------+
|                         dpdk rte_eth_tx_burst                         |
+-----------------------------------+-----------------------------------+
                                    |
+-----------------------------------v-----------------------------------+
|                            mlx4 pmd driver                            |
+-----------------------------------------------------------------------+

a. Every client has its own mempool; all clients send messages to the server queue in shared memory.
b. The server is a single thread, and the mlx4 PMD uses one TX queue.
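For context: in dpdk-16.04, each mlx4 TX queue keeps a small array (sized by MLX4_PMD_TX_MP_CACHE) mapping mempools to verbs memory regions. A lookup miss registers a new MR with ibv_reg_mr(), and once the array is full the oldest entry is evicted. When more mempools are active on one queue than the cache holds, entries thrash and the expensive ibv_reg_mr() call lands on the datapath. The following is a minimal standalone sketch of that caching behaviour as we understand it; txq_mp2mr_model() and all other names here are illustrative stand-ins, not the actual driver code.

#include <stdio.h>
#include <string.h>

/*
 * Simplified model of the mlx4 TX queue's mempool->MR cache
 * (inspired by txq->mp2mr[] in drivers/net/mlx4/mlx4.c, dpdk-16.04).
 * Illustrative only, not the real driver code.
 */
#define TX_MP_CACHE 8           /* MLX4_PMD_TX_MP_CACHE default */
#define NB_MEMPOOLS 30          /* one mempool per client in our test */

static int cache[TX_MP_CACHE];  /* cached mempool ids, oldest first */
static int cache_len;
static unsigned long nb_reg_mr; /* stand-in counter for ibv_reg_mr() calls */

/* Look up a mempool; "register" an MR on a cache miss. */
static int
txq_mp2mr_model(int mp)
{
	int i;

	for (i = 0; i < cache_len; i++)
		if (cache[i] == mp)
			return mp;      /* hit: no registration needed */
	nb_reg_mr++;                    /* miss: expensive ibv_reg_mr() */
	if (cache_len == TX_MP_CACHE) {
		/* Cache full: evict the oldest entry (ibv_dereg_mr()). */
		memmove(&cache[0], &cache[1],
			(TX_MP_CACHE - 1) * sizeof(cache[0]));
		cache_len--;
	}
	cache[cache_len++] = mp;
	return mp;
}

int
main(void)
{
	int round, mp;

	/* Each round, every client's mempool shows up on the TX queue. */
	for (round = 0; round < 100; round++)
		for (mp = 0; mp < NB_MEMPOOLS; mp++)
			txq_mp2mr_model(mp);
	printf("cache=%d mempools=%d -> %lu MR registrations\n",
	       TX_MP_CACHE, NB_MEMPOOLS, nb_reg_mr);
	return 0;
}

With a cache of 8 and 30 mempools visited round-robin, every lookup misses, so the model reports 3000 registrations; changing TX_MP_CACHE to 32 drops that to 30, one per mempool, which matches the latency difference we measured.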
3. Test steps:
a. We start 30 clients, so the total number of mempools reaches 30. Every client sends 20 packets per second, and every packet is 10 KB long. The server segments these large packets before passing them to rte_eth_tx_burst().
b. With the default MLX4_PMD_TX_MP_CACHE value of 8, we found that rte_eth_tx_burst() took about 40 ms, most of which was spent in ibv_reg_mr().
c. After changing MLX4_PMD_TX_MP_CACHE to 32, by setting CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE in the config/common_base file, rte_eth_tx_burst() took less than 5 ms.

Would the community consider changing the default CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE value to 32 to handle scenarios like the one described above, avoiding the slowdown that occurs when the number of mempools used on one TX queue exceeds CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE? Please send your reply to chenchanghu@huawei.com; any suggestion is gratefully appreciated.

4. Patch:

diff --git a/config/common_base b/config/common_base
index a0580d1..af6ba47 100644
--- a/config/common_base
+++ b/config/common_base
@@ -207,7 +207,7 @@ CONFIG_RTE_LIBRTE_MLX4_PMD=y
 CONFIG_RTE_LIBRTE_MLX4_DEBUG=n
 CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N=4
 CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE=0
-CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
+CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=32
 CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS=1
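For reference, the 40 ms / 5 ms figures above were obtained by timestamping individual burst calls. The helper below is a hypothetical sketch of such a measurement; tx_burst_timed() is our own name, not a DPDK API, while rte_rdtsc(), rte_get_tsc_hz() and the dpdk-16.04 rte_eth_tx_burst() signature are real.

#include <rte_cycles.h>
#include <rte_ethdev.h>
#include <rte_mbuf.h>

/*
 * Hypothetical helper: send one burst and report how long the call took.
 * In dpdk-16.04 the port_id argument of rte_eth_tx_burst() is a uint8_t.
 */
static uint16_t
tx_burst_timed(uint8_t port_id, uint16_t queue_id,
	       struct rte_mbuf **pkts, uint16_t nb_pkts, double *ms)
{
	uint64_t start = rte_rdtsc();
	uint16_t sent = rte_eth_tx_burst(port_id, queue_id, pkts, nb_pkts);

	/* Convert elapsed TSC cycles to milliseconds. */
	*ms = (double)(rte_rdtsc() - start) * 1e3 / rte_get_tsc_hz();
	return sent;
}

With the default cache of 8 and 30 active mempools, bursts that fall back to ibv_reg_mr() report tens of milliseconds this way; after raising the cache to 32, the same bursts stay under 5 ms.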