Message ID | 546579A3.3010804@igel.co.jp (mailing list archive) |
---|---|
State | Rejected, archived |
Headers |
From: Tetsuya Mukawa <mukawa@igel.co.jp> To: Linhaifeng <haifeng.lin@huawei.com>, "Xie, Huawei" <huawei.xie@intel.com> Cc: "dev@dpdk.org" <dev@dpdk.org> Date: Fri, 14 Nov 2014 12:40:19 +0900 Message-ID: <546579A3.3010804@igel.co.jp> In-Reply-To: <54657365.7090504@huawei.com> Subject: Re: [dpdk-dev] vhost-user technical isssues |
Commit Message
Tetsuya Mukawa
Nov. 14, 2014, 3:40 a.m. UTC
Hi Lin,

(2014/11/14 12:13), Linhaifeng wrote:
>
> size should be same as mmap and
> guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));
>

Thanks. It should be.
How about the following patch?

-------------------------------------------------------
diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 75fedf0..be4b171 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -37,7 +37,7 @@
 #endif

 #define QEMU_CMD_ACCEL " -machine accel=tcg"
-#define QEMU_CMD_MEM " -m 512 -object memory-backend-file,id=mem,size=512M,"\
+#define QEMU_CMD_MEM " -m 6000 -object memory-backend-file,id=mem,size=6000M,"\
     "mem-path=%s,share=on -numa node,memdev=mem"
 #define QEMU_CMD_CHR " -chardev socket,id=chr0,path=%s"
 #define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
@@ -221,13 +221,16 @@ static void read_guest_mem(void)

     /* check for sanity */
     g_assert_cmpint(fds_num, >, 0);
-    g_assert_cmpint(fds_num, ==, memory.nregions);
+    //g_assert_cmpint(fds_num, ==, memory.nregions);

+    fprintf(stderr, "%s(%d)\n", __func__, __LINE__);
     /* iterate all regions */
     for (i = 0; i < fds_num; i++) {
+        int ret = 0;

         /* We'll check only the region statring at 0x0*/
-        if (memory.regions[i].guest_phys_addr != 0x0) {
+        if (memory.regions[i].guest_phys_addr == 0x0) {
+            close(fds[i]);
             continue;
         }

@@ -237,6 +240,7 @@ static void read_guest_mem(void)

         guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
                          MAP_SHARED, fds[i], 0);
+        fprintf(stderr, "region=%d, mmap=%p, size=%lu\n", i, guest_mem, size);

         g_assert(guest_mem != MAP_FAILED);
         guest_mem += (memory.regions[i].mmap_offset / sizeof(*guest_mem));
@@ -247,8 +251,10 @@ static void read_guest_mem(void)

         g_assert_cmpint(a, ==, b);
     }
-
-    munmap(guest_mem, memory.regions[i].memory_size);
+    guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));
+    ret = munmap(guest_mem, memory.regions[i].memory_size);
+    fprintf(stderr, "region=%d, munmap=%p, size=%lu, ret=%d\n",
+            i, guest_mem, size, ret);
 }

 g_assert_cmpint(1, ==, 1);
-------------------------------------------------------
I am using 1GB hugepage size.

$ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
region=0, mmap=0x2aaac0000000, size=6291456000
region=0, munmap=0x2aaac0000000, size=6291456000, ret=-1  << failed

6291456000 is not aligned to 1GB.
When I specify 4096MB as the guest memory size, munmap() doesn't return
an error, like the following:

$ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
region=0, mmap=0x2aaac0000000, size=4294967296
region=0, munmap=0x2aaac0000000, size=4294967296, ret=0

Thanks,
Tetsuya
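The exchange above comes down to keeping the mmap()/munmap() calls paired. A minimal standalone sketch of that bookkeeping, not the actual test code (the fd, memory_size, and mmap_offset parameters are hypothetical stand-ins for the vhost-user memory-table fields):
---------------------------------------------
#include <stdint.h>
#include <sys/mman.h>

static int read_region(int fd, uint64_t memory_size, uint64_t mmap_offset)
{
    /* Map the whole file from offset 0; the region's data starts
     * mmap_offset bytes into the mapping. */
    uint64_t size = memory_size + mmap_offset;
    uint64_t *map = mmap(0, size, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);

    if (map == MAP_FAILED)
        return -1;

    uint64_t *guest_mem = map + mmap_offset / sizeof(*map);
    (void)guest_mem[0];        /* ... access guest memory here ... */

    /* Unmap with the pointer and length given to mmap(), not with
     * the offset-adjusted guest_mem pointer. */
    return munmap(map, size);
}
---------------------------------------------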
Comments
(2014/11/14 12:40), Tetsuya Mukawa wrote:
> I am using 1GB hugepage size.
>
> $ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
> region=0, mmap=0x2aaac0000000, size=6291456000
> region=0, munmap=0x2aaac0000000, size=6291456000, ret=-1  << failed
>
> 6291456000 is not aligned to 1GB.
> When I specify 4096MB as the guest memory size, munmap() doesn't return
> an error, like the following:
>
> $ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
> region=0, mmap=0x2aaac0000000, size=4294967296
> region=0, munmap=0x2aaac0000000, size=4294967296, ret=0
>

Also, I've checked the mmap2 and munmap implementation of the current
Linux kernel. When a file on hugetlbfs is mapped, 'size' will be aligned
to the hugepage size in some cases. But when munmap is called, 'size'
will be aligned to PAGE_SIZE. It means we cannot use the same 'size'
value for mmap and munmap in some cases. I guess this implementation
(or specification) causes the munmap issue.

Thanks,
Tetsuya
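A hedged illustration of the asymmetry just described: since mmap() on hugetlbfs rounds the length up to a whole number of huge pages but munmap() does not, rounding the length up explicitly makes the unmap succeed. The 1GB hugepage size is an assumption taken from the setup used in this thread:
---------------------------------------------
#include <stdint.h>
#include <sys/mman.h>

#define HUGEPAGE_SIZE (1ULL << 30)                     /* assumed: 1GB pages */
#define ALIGN_UP(x, a) (((x) + (a) - 1) & ~((a) - 1))  /* a: power of two */

static int munmap_huge(void *addr, uint64_t len)
{
    /* Unmap whole huge pages even when the requested length is not a
     * hugepage multiple (e.g. 6291456000 rounds up to 6442450944). */
    return munmap(addr, ALIGN_UP(len, HUGEPAGE_SIZE));
}
---------------------------------------------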
On 2014/11/14 11:40, Tetsuya Mukawa wrote:
> Hi Lin,
>
> (2014/11/14 12:13), Linhaifeng wrote:
>>
>> size should be same as mmap and
>> guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));
>>
>
> Thanks. It should be.
> How about the following patch?
>
> -------------------------------------------------------
> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
> index 75fedf0..be4b171 100644
> --- a/tests/vhost-user-test.c
> +++ b/tests/vhost-user-test.c
> @@ -37,7 +37,7 @@
>  #endif
>
>  #define QEMU_CMD_ACCEL " -machine accel=tcg"
> -#define QEMU_CMD_MEM " -m 512 -object memory-backend-file,id=mem,size=512M,"\
> +#define QEMU_CMD_MEM " -m 6000 -object memory-backend-file,id=mem,size=6000M,"\
>      "mem-path=%s,share=on -numa node,memdev=mem"
>  #define QEMU_CMD_CHR " -chardev socket,id=chr0,path=%s"
>  #define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
> @@ -221,13 +221,16 @@ static void read_guest_mem(void)
>
>      /* check for sanity */
>      g_assert_cmpint(fds_num, >, 0);
> -    g_assert_cmpint(fds_num, ==, memory.nregions);
> +    //g_assert_cmpint(fds_num, ==, memory.nregions);
>
> +    fprintf(stderr, "%s(%d)\n", __func__, __LINE__);
>      /* iterate all regions */
>      for (i = 0; i < fds_num; i++) {
> +        int ret = 0;
>
>          /* We'll check only the region statring at 0x0*/
> -        if (memory.regions[i].guest_phys_addr != 0x0) {
> +        if (memory.regions[i].guest_phys_addr == 0x0) {
> +            close(fds[i]);
>              continue;
>          }
>
> @@ -237,6 +240,7 @@ static void read_guest_mem(void)
>
>          guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,

How big is 'size'? mmap_size + mmap_offset?

>                           MAP_SHARED, fds[i], 0);
> +        fprintf(stderr, "region=%d, mmap=%p, size=%lu\n", i, guest_mem, size);
>
>          g_assert(guest_mem != MAP_FAILED);
>          guest_mem += (memory.regions[i].mmap_offset / sizeof(*guest_mem));
> @@ -247,8 +251,10 @@ static void read_guest_mem(void)
>
>          g_assert_cmpint(a, ==, b);
>      }
> -
> -    munmap(guest_mem, memory.regions[i].memory_size);
> +    guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));
> +    ret = munmap(guest_mem, memory.regions[i].memory_size);

memory.regions[i].memory_size --> memory.regions[i].memory_size + memory.regions[i].memory_offset

Check that you have applied QEMU's patch: [PATCH] vhost-user: fix mmap offset calculation

> +    fprintf(stderr, "region=%d, munmap=%p, size=%lu, ret=%d\n",
> +            i, guest_mem, size, ret);
>  }
>
>  g_assert_cmpint(1, ==, 1);
> -------------------------------------------------------
> I am using 1GB hugepage size.
>
> $ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
> region=0, mmap=0x2aaac0000000, size=6291456000
> region=0, munmap=0x2aaac0000000, size=6291456000, ret=-1  << failed
>
> 6291456000 is not aligned to 1GB.
> When I specify 4096MB as the guest memory size, munmap() doesn't return
> an error, like the following:
>
> $ sudo QTEST_HUGETLBFS_PATH=/mnt/huge make check
> region=0, mmap=0x2aaac0000000, size=4294967296
> region=0, munmap=0x2aaac0000000, size=4294967296, ret=0
>
> Thanks,
> Tetsuya
>
> .
>
Hi Lin,

(2014/11/14 13:42), Linhaifeng wrote:
>
> On 2014/11/14 11:40, Tetsuya Mukawa wrote:
>> Hi Lin,
>>
>> (2014/11/14 12:13), Linhaifeng wrote:
>>> size should be same as mmap and
>>> guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));
>>>
>> Thanks. It should be.
>> How about the following patch?
>>
>> -------------------------------------------------------
>> diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
>> index 75fedf0..be4b171 100644
>> --- a/tests/vhost-user-test.c
>> +++ b/tests/vhost-user-test.c
>> @@ -37,7 +37,7 @@
>>  #endif
>>
>>  #define QEMU_CMD_ACCEL " -machine accel=tcg"
>> -#define QEMU_CMD_MEM " -m 512 -object memory-backend-file,id=mem,size=512M,"\
>> +#define QEMU_CMD_MEM " -m 6000 -object memory-backend-file,id=mem,size=6000M,"\
>>      "mem-path=%s,share=on -numa node,memdev=mem"
>>  #define QEMU_CMD_CHR " -chardev socket,id=chr0,path=%s"
>>  #define QEMU_CMD_NETDEV " -netdev vhost-user,id=net0,chardev=chr0,vhostforce"
>> @@ -221,13 +221,16 @@ static void read_guest_mem(void)
>>
>>      /* check for sanity */
>>      g_assert_cmpint(fds_num, >, 0);
>> -    g_assert_cmpint(fds_num, ==, memory.nregions);
>> +    //g_assert_cmpint(fds_num, ==, memory.nregions);
>>
>> +    fprintf(stderr, "%s(%d)\n", __func__, __LINE__);
>>      /* iterate all regions */
>>      for (i = 0; i < fds_num; i++) {
>> +        int ret = 0;
>>
>>          /* We'll check only the region statring at 0x0*/
>> -        if (memory.regions[i].guest_phys_addr != 0x0) {
>> +        if (memory.regions[i].guest_phys_addr == 0x0) {
>> +            close(fds[i]);
>>              continue;
>>          }
>>
>> @@ -237,6 +240,7 @@ static void read_guest_mem(void)
>>
>>          guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
>
> How big is 'size'? mmap_size + mmap_offset?

In this case, the guest memory length is the size.
I included the messages from this program in my last email.
Could you please also check them?

>
>>                           MAP_SHARED, fds[i], 0);
>> +        fprintf(stderr, "region=%d, mmap=%p, size=%lu\n", i, guest_mem, size);
>>
>>          g_assert(guest_mem != MAP_FAILED);
>>          guest_mem += (memory.regions[i].mmap_offset / sizeof(*guest_mem));
>> @@ -247,8 +251,10 @@ static void read_guest_mem(void)
>>
>>          g_assert_cmpint(a, ==, b);
>>      }
>> -
>> -    munmap(guest_mem, memory.regions[i].memory_size);
>> +    guest_mem -= (memory.regions[i].mmap_offset / sizeof(*guest_mem));
>> +    ret = munmap(guest_mem, memory.regions[i].memory_size);
> memory.regions[i].memory_size --> memory.regions[i].memory_size + memory.regions[i].memory_offset
>
> Check that you have applied QEMU's patch: [PATCH] vhost-user: fix mmap offset calculation

I checked it using the latest QEMU code, so the patch you mentioned is
included. I guess you can munmap the file because your 'size' is aligned
to the hugepage size, e.g. 2GB. Could you please try another value like
6000MB?

Thanks,
Tetsuya
On 2014/11/14 13:12, Tetsuya Mukawa wrote:
> Could you please try another value like 6000MB?
I have tried the value 6000MB. I can munmap successfully.
If you mmap with size "memory_size + memory_offset", you should also munmap with that size.
Hi Lin,

(2014/11/14 14:30), Linhaifeng wrote:
>
> On 2014/11/14 13:12, Tetsuya Mukawa wrote:
>> Could you please try another value like 6000MB?
> I have tried the value 6000MB. I can munmap successfully.
>
> If you mmap with size "memory_size + memory_offset", you should also
> munmap with that size.
>
I appreciate your testing and suggestions. :)
I am not sure what the difference is between your environment and my
environment.

Here is my code and the messages from it.
---------------------------------------------
[code]
---------------------------------------------
size = memory.regions[i].memory_size + memory.regions[i].mmap_offset;

guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
                 MAP_SHARED, fds[i], 0);

fprintf(stderr, "region=%d, mmap=%p, size=%lu\n", i, guest_mem, size);

g_assert(guest_mem != MAP_FAILED);

ret = munmap(guest_mem, size);

fprintf(stderr, "region=%d, munmap=%p, size=%lu, ret=%d\n",
        i, guest_mem, size, ret);

---------------------------------------------
[messages]
---------------------------------------------
region=0, mmap=0x2aaac0000000, size=6291456000
region=0, munmap=0x2aaac0000000, size=6291456000, ret=-1

In your environment, 'ret' will be 0.
In my environment, 'size' needs to be aligned to avoid the error.
Anyway, it would be nice to implement this more simply.
If a munmap failure occurs, let's think about it again.

Thanks,
Tetsuya
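Rather than hard-coding a 1GB alignment, the hugepage size could be queried from the mapped file itself: on hugetlbfs, the filesystem block size reported by fstatfs() equals the hugepage size. A small sketch under that assumption (hugepage_size_of is a hypothetical helper name, not part of the code above):
---------------------------------------------
#include <stdint.h>
#include <sys/vfs.h>    /* fstatfs */

/* Returns the hugepage size backing fd, or 0 on error. Assumes fd
 * refers to a file on hugetlbfs, where f_bsize is the hugepage size. */
static uint64_t hugepage_size_of(int fd)
{
    struct statfs fs;

    if (fstatfs(fd, &fs) < 0)
        return 0;
    return (uint64_t)fs.f_bsize;
}
---------------------------------------------
The munmap length in the code above could then be rounded up to a multiple of hugepage_size_of(fds[i]) instead of a fixed 1GB.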
I tested with the latest QEMU (with the offset fix) in the vhost app (not with the test case); unmap succeeds only when the size is aligned to 1GB (the hugepage size).

Another important thing: could we do mmap(0, region[i].memory_size, PROT_XX, mmap_offset) rather than mapping with offset 0? With a region above 4GB, we would waste 4GB of address space. Or we at least need to round the offset down to the nearest 1GB and round the memory size up to the next 1GB, to reduce the wasted address space.

Anyway, this is ugly. The kernel doesn't take care of this for us by doing the alignment automatically.

> -----Original Message-----
> From: Tetsuya Mukawa [mailto:mukawa@igel.co.jp]
> Sent: Thursday, November 13, 2014 11:57 PM
> To: Linhaifeng; Xie, Huawei
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] vhost-user technical isssues
>
> Hi Lin,
> (2014/11/14 14:30), Linhaifeng wrote:
> >
> > On 2014/11/14 13:12, Tetsuya Mukawa wrote:
> >> Could you please try another value like 6000MB?
> > I have tried the value 6000MB. I can munmap successfully.
> >
> > If you mmap with size "memory_size + memory_offset", you should also
> > munmap with that size.
> >
> I appreciate your testing and suggestions. :)
> I am not sure what the difference is between your environment and my
> environment.
>
> Here is my code and the messages from it.
> ---------------------------------------------
> [code]
> ---------------------------------------------
> size = memory.regions[i].memory_size + memory.regions[i].mmap_offset;
>
> guest_mem = mmap(0, size, PROT_READ | PROT_WRITE,
>                  MAP_SHARED, fds[i], 0);
>
> fprintf(stderr, "region=%d, mmap=%p, size=%lu\n", i, guest_mem, size);
>
> g_assert(guest_mem != MAP_FAILED);
>
> ret = munmap(guest_mem, size);
>
> fprintf(stderr, "region=%d, munmap=%p, size=%lu, ret=%d\n",
>         i, guest_mem, size, ret);
>
> ---------------------------------------------
> [messages]
> ---------------------------------------------
> region=0, mmap=0x2aaac0000000, size=6291456000
> region=0, munmap=0x2aaac0000000, size=6291456000, ret=-1
>
> In your environment, 'ret' will be 0.
> In my environment, 'size' needs to be aligned to avoid the error.
> Anyway, it would be nice to implement this more simply.
> If a munmap failure occurs, let's think about it again.
>
> Thanks,
> Tetsuya
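A sketch of the address-space-saving mapping suggested above: round the file offset down to a hugepage boundary, map from there instead of from 0, and step the returned pointer past the remainder. This illustrates the idea rather than any actual vhost library code; huge_sz is assumed to be a known power of two (e.g. queried as in the earlier sketch):
---------------------------------------------
#include <stdint.h>
#include <sys/mman.h>

static void *map_region(int fd, uint64_t memory_size,
                        uint64_t mmap_offset, uint64_t huge_sz)
{
    uint64_t off_down = mmap_offset & ~(huge_sz - 1);  /* round down */
    uint64_t pad      = mmap_offset - off_down;
    uint64_t len      = memory_size + pad;

    /* Mapping from the aligned offset instead of 0 means a region
     * above 4GB no longer costs an extra 4GB of address space. */
    uint8_t *map = mmap(0, len, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, (off_t)off_down);

    if (map == MAP_FAILED)
        return NULL;
    return map + pad;    /* start of the region's guest memory */
}
---------------------------------------------
As discussed earlier in the thread, the later munmap() would still need the original map pointer (not map + pad) and a length rounded up to a multiple of huge_sz.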
Hi Xie,

(2014/11/14 19:59), Xie, Huawei wrote:
> I tested with the latest QEMU (with the offset fix) in the vhost app
> (not with the test case); unmap succeeds only when the size is aligned
> to 1GB (the hugepage size).
I appreciate your testing.

> Another important thing: could we do mmap(0, region[i].memory_size,
> PROT_XX, mmap_offset) rather than mapping with offset 0? With a region
> above 4GB, we would waste 4GB of address space. Or we at least need to
> round the offset down to the nearest 1GB and round the memory size up
> to the next 1GB, to reduce the wasted address space.
>
> Anyway, this is ugly. The kernel doesn't take care of this for us by
> doing the alignment automatically.
>
It seems 'offset' should also be aligned to the hugepage size.
But that might be part of the mmap specification. The mmap manpage says
'offset' should be a multiple of sysconf(_SC_PAGE_SIZE). If the target
file is on hugetlbfs, I guess the hugepage size is used as the alignment
size instead.

Thanks,
Tetsuya
diff --git a/tests/vhost-user-test.c b/tests/vhost-user-test.c
index 75fedf0..be4b171 100644
--- a/tests/vhost-user-test.c
+++ b/tests/vhost-user-test.c
@@ -37,7 +37,7 @@
 #endif

 #define QEMU_CMD_ACCEL " -machine accel=tcg"
-#define QEMU_CMD_MEM " -m 512 -object memory-backend-file,id=mem,size=512M,"\
+#define QEMU_CMD_MEM " -m 6000 -object memory-backend-file,id=mem,size=6000M,"\
     "mem-path=%s,share=on -numa node,memdev=mem"