[v3] vhost: exclude VM hugepages from coredumps
Commit Message
Currently if an application wants to include shared hugepages in
coredumps in conjunction with the vhost library, the coredump will be
larger than expected and include unneeded virtual machine memory.
This patch will mark all vhost huge pages as DONTDUMP, except for some
select pages used by DPDK.
Signed-off-by: Mike Pattrick <mkp@redhat.com>
---
v2:
* Removed warning on unsupported platforms
v3:
* Removed pointer warning on 32bit platforms
---
lib/vhost/iotlb.c | 5 +++++
lib/vhost/vhost.h | 12 ++++++++++++
lib/vhost/vhost_user.c | 10 ++++++++++
3 files changed, 27 insertions(+)
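To make the mechanism concrete, here is a minimal standalone sketch of the approach the commit message describes: exclude a whole mapping from coredumps with MADV_DONTDUMP, then re-enable dumping for a selected sub-range with MADV_DODUMP. It uses an anonymous mapping of regular pages rather than vhost hugepages, and the names set_dump and dump_policy_demo are hypothetical, not part of the patch.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the patch's mem_set_dump): returns 0 on success,
 * -1 on failure with errno set, and 0 when MADV_DONTDUMP is unavailable.
 */
static int
set_dump(void *ptr, size_t size, bool enable)
{
#ifdef MADV_DONTDUMP
	return madvise(ptr, size, enable ? MADV_DODUMP : MADV_DONTDUMP);
#else
	(void)ptr;
	(void)size;
	(void)enable;
	return 0;
#endif
}

/*
 * Map `pages` anonymous pages, exclude all of them from coredumps, then
 * re-include only the first page. Returns 0 if every step succeeded.
 */
static int
dump_policy_demo(size_t pages)
{
	assert(pages > 0);

	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = pages * page;
	void *region = mmap(NULL, len, PROT_READ | PROT_WRITE,
			    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (region == MAP_FAILED)
		return -1;

	/* Whole region: keep it out of coredumps. */
	int rc = set_dump(region, len, false);
	/* First page only: include it in coredumps again. */
	if (rc == 0)
		rc = set_dump(region, page, true);

	munmap(region, len);
	return rc;
}
```

Calling dump_policy_demo(4) is expected to return 0 on a Linux system that accepts both advice values for regular anonymous memory. Hugepage-backed mappings may behave differently, so this sketch does not stand in for testing against real vhost memory.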
Comments
Hi Mike,
On 12/7/22 17:54, Mike Pattrick wrote:
> Currently if an application wants to include shared hugepages in
> coredumps in conjunction with the vhost library, the coredump will be
> larger than expected and include unneeded virtual machine memory.
>
> This patch will mark all vhost huge pages as DONTDUMP, except for some
> select pages used by DPDK.
>
> Signed-off-by: Mike Pattrick <mkp@redhat.com>
>
> ---
> v2:
> * Removed warning on unsupported platforms
>
> v3:
> * Removed pointer warning on 32bit platforms
>
> ---
> lib/vhost/iotlb.c | 5 +++++
> lib/vhost/vhost.h | 12 ++++++++++++
> lib/vhost/vhost_user.c | 10 ++++++++++
> 3 files changed, 27 insertions(+)
>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Thanks,
Maxime
On 12/7/22 17:54, Mike Pattrick wrote:
> Currently if an application wants to include shared hugepages in
> coredumps in conjunction with the vhost library, the coredump will be
> larger than expected and include unneeded virtual machine memory.
>
> This patch will mark all vhost huge pages as DONTDUMP, except for some
> select pages used by DPDK.
>
> Signed-off-by: Mike Pattrick <mkp@redhat.com>
>
> ---
> v2:
> * Removed warning on unsupported platforms
>
> v3:
> * Removed pointer warning on 32bit platforms
>
> ---
> lib/vhost/iotlb.c | 5 +++++
> lib/vhost/vhost.h | 12 ++++++++++++
> lib/vhost/vhost_user.c | 10 ++++++++++
> 3 files changed, 27 insertions(+)
>
Applied to dpdk-next-virtio/main.
Thanks,
Maxime
Hello Mike,
On Wed, Dec 7, 2022 at 5:54 PM Mike Pattrick <mkp@redhat.com> wrote:
>
> Currently if an application wants to include shared hugepages in
> coredumps in conjunction with the vhost library, the coredump will be
> larger than expected and include unneeded virtual machine memory.
>
> This patch will mark all vhost huge pages as DONTDUMP, except for some
> select pages used by DPDK.
>
> Signed-off-by: Mike Pattrick <mkp@redhat.com>
I noticed the following warnings today on my f37 kernel, while running
a vhost-user/virtio-user testpmd setup on next-virtio branch.
Linux dmarchan 6.1.9-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb 2
00:21:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
My system has 2M hugepages, only.
$ rm vhost-net; strace -e trace=madvise -f \
    ./build-clang/app/dpdk-testpmd --in-memory --no-pci \
    --vdev=net_vhost0,iface=./vhost-net,client=1 -- -i
$ ./build-clang/app/dpdk-testpmd --in-memory --single-file-segment \
    --no-pci --vdev \
    'net_virtio_user0,mac=00:01:02:03:04:05,path=./vhost-net,server=1' -- \
    -i
Then, on the "vhost side" testpmd:
...
VHOST_CONFIG: (./vhost-net) read message VHOST_USER_SET_VRING_NUM
VHOST_CONFIG: (./vhost-net) read message VHOST_USER_SET_VRING_BASE
VHOST_CONFIG: (./vhost-net) vring base idx:0 last_used_idx:0 last_avail_idx:0.
VHOST_CONFIG: (./vhost-net) read message VHOST_USER_SET_VRING_ADDR
VHOST_CONFIG: (./vhost-net) read message VHOST_USER_SET_VRING_KICK
VHOST_CONFIG: (./vhost-net) vring kick idx:0 file:391
[pid 59565] madvise(0x7fa6d8da4000, 2052, MADV_DODUMP) = -1 EINVAL (Invalid argument)
VHOST_CONFIG: could not set coredump preference (Invalid argument).
[pid 59565] madvise(0x7fa6d8da5000, 2052, MADV_DODUMP) = -1 EINVAL (Invalid argument)
VHOST_CONFIG: could not set coredump preference (Invalid argument).
[pid 59565] madvise(0x7fa6d8da6000, 2052, MADV_DODUMP) = -1 EINVAL (Invalid argument)
VHOST_CONFIG: could not set coredump preference (Invalid argument).
VHOST_CONFIG: (./vhost-net) read message VHOST_USER_SET_VRING_NUM
VHOST_CONFIG: (./vhost-net) read message VHOST_USER_SET_VRING_BASE
Looking at the whole trace, only madvise calls with MADV_DODUMP (with
all of them for a 2052 size) fail.
I did not investigate further.
Could you have a look please?
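The trace shows 2052-byte MADV_DODUMP requests at 4 KiB-aligned addresses on a system that only has 2M hugepages. The thread has not established a root cause at this point. One hypothesis worth testing (an assumption here, not a finding from the report) is that the kernel rejects ranges that are not aligned to the page size of the backing memory. The sketch below only shows the arithmetic for widening a range to a given alignment; align_range is a hypothetical helper and does not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: widen [addr, addr + len) so that both ends fall on
 * `align` boundaries. `align` must be a non-zero power of two.
 */
static void
align_range(uint64_t addr, uint64_t len, uint64_t align,
	    uint64_t *start, uint64_t *aligned_len)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	uint64_t end = addr + len;

	/* Round the start down and the end up to the alignment. */
	*start = addr & ~(align - 1);
	end = (end + align - 1) & ~(align - 1);
	*aligned_len = end - *start;
}
```

For the first failing call in the trace (address 0x7fa6d8da4000, size 2052) and a 2 MiB alignment, this yields a start of 0x7fa6d8c00000 and a length of 0x200000. Note the trade-off: widening the range also changes the dump setting for whatever else lives in the same aligned block, which may or may not be acceptable for vhost.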
Thanks David for the report.
On 2/10/23 16:53, David Marchand wrote:
> Hello Mike,
>
> On Wed, Dec 7, 2022 at 5:54 PM Mike Pattrick <mkp@redhat.com> wrote:
>>
>> Currently if an application wants to include shared hugepages in
>> coredumps in conjunction with the vhost library, the coredump will be
>> larger than expected and include unneeded virtual machine memory.
>>
>> This patch will mark all vhost huge pages as DONTDUMP, except for some
>> select pages used by DPDK.
>>
>> Signed-off-by: Mike Pattrick <mkp@redhat.com>
>
> I noticed the following warnings today on my f37 kernel, while running
> a vhost-user/virtio-user testpmd setup on next-virtio branch.
> Linux dmarchan 6.1.9-200.fc37.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb 2
> 00:21:48 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
> My system has 2M hugepages, only.
FYI, I don't see it on my system, which also uses 2MB hugepages:
Linux max-t490s 6.1.9-100.fc36.x86_64 #1 SMP PREEMPT_DYNAMIC Thu Feb 2 03:27:13 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Almost the same kernel, BTW.
Maxime
On Fri, Feb 10, 2023 at 10:53 AM David Marchand
<david.marchand@redhat.com> wrote:
> Looking at the whole trace, only madvise calls with MADV_DODUMP (with
> all of them for a 2052 size) fail.
>
> I did not investigate further.
> Could you have a look please?
>
I tried it on that exact kernel and also ran into this issue. I'll
check it out in more depth.
-M
Hi Mike,
On 2/10/23 22:12, Mike Pattrick wrote:
> I tried it on that exact kernel and also ran into this issue. I'll
> check it out in more depth.
Gentle reminder, have you found the root cause for this issue?
Thanks,
Maxime
On Tue, Feb 21, 2023 at 11:26 AM Maxime Coquelin
<maxime.coquelin@redhat.com> wrote:
> > I tried it on that exact kernel and also ran into this issue. I'll
> > check it out in more depth.
>
> Gentle reminder, have you found the root cause for this issue?
Yes, thanks for the ping. I should have a patch soon.
Cheers,
M
--- a/lib/vhost/iotlb.c
+++ b/lib/vhost/iotlb.c
@@ -149,6 +149,7 @@ vhost_user_iotlb_cache_remove_all(struct vhost_virtqueue *vq)
 	rte_rwlock_write_lock(&vq->iotlb_lock);
 	RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) {
+		mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, true);
 		TAILQ_REMOVE(&vq->iotlb_list, node, next);
 		vhost_user_iotlb_pool_put(vq, node);
 	}
@@ -170,6 +171,7 @@ vhost_user_iotlb_cache_random_evict(struct vhost_virtqueue *vq)
 	RTE_TAILQ_FOREACH_SAFE(node, &vq->iotlb_list, next, temp_node) {
 		if (!entry_idx) {
+			mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, true);
 			TAILQ_REMOVE(&vq->iotlb_list, node, next);
 			vhost_user_iotlb_pool_put(vq, node);
 			vq->iotlb_cache_nr--;
@@ -222,12 +224,14 @@ vhost_user_iotlb_cache_insert(struct virtio_net *dev, struct vhost_virtqueue *vq
 			vhost_user_iotlb_pool_put(vq, new_node);
 			goto unlock;
 		} else if (node->iova > new_node->iova) {
+			mem_set_dump((void *)(uintptr_t)new_node->uaddr, new_node->size, true);
 			TAILQ_INSERT_BEFORE(node, new_node, next);
 			vq->iotlb_cache_nr++;
 			goto unlock;
 		}
 	}
+	mem_set_dump((void *)(uintptr_t)new_node->uaddr, new_node->size, true);
 	TAILQ_INSERT_TAIL(&vq->iotlb_list, new_node, next);
 	vq->iotlb_cache_nr++;
@@ -255,6 +259,7 @@ vhost_user_iotlb_cache_remove(struct vhost_virtqueue *vq,
 			break;
 		if (iova < node->iova + node->size) {
+			mem_set_dump((void *)(uintptr_t)node->uaddr, node->size, true);
 			TAILQ_REMOVE(&vq->iotlb_list, node, next);
 			vhost_user_iotlb_pool_put(vq, node);
 			vq->iotlb_cache_nr--;
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -13,6 +13,7 @@
 #include <linux/virtio_net.h>
 #include <sys/socket.h>
 #include <linux/if.h>
+#include <sys/mman.h>
 #include <rte_log.h>
 #include <rte_ether.h>
@@ -987,4 +988,15 @@ mbuf_is_consumed(struct rte_mbuf *m)
 	return true;
 }
+
+static __rte_always_inline void
+mem_set_dump(__rte_unused void *ptr, __rte_unused size_t size, __rte_unused bool enable)
+{
+#ifdef MADV_DONTDUMP
+	if (madvise(ptr, size, enable ? MADV_DODUMP : MADV_DONTDUMP) == -1) {
+		rte_log(RTE_LOG_INFO, vhost_config_log_level,
+			"VHOST_CONFIG: could not set coredump preference (%s).\n", strerror(errno));
+	}
+#endif
+}
 #endif /* _VHOST_NET_CDEV_H_ */
--- a/lib/vhost/vhost_user.c
+++ b/lib/vhost/vhost_user.c
@@ -793,6 +793,9 @@ translate_ring_addresses(struct virtio_net **pdev, struct vhost_virtqueue **pvq)
 			return;
 		}
+		mem_set_dump(vq->desc_packed, len, true);
+		mem_set_dump(vq->driver_event, len, true);
+		mem_set_dump(vq->device_event, len, true);
 		vq->access_ok = true;
 		return;
 	}
@@ -846,6 +849,9 @@ translate_ring_addresses(struct virtio_net **pdev, struct vhost_virtqueue **pvq)
 			"some packets maybe resent for Tx and dropped for Rx\n");
 	}
+	mem_set_dump(vq->desc, len, true);
+	mem_set_dump(vq->avail, len, true);
+	mem_set_dump(vq->used, len, true);
 	vq->access_ok = true;
 	VHOST_LOG_CONFIG(dev->ifname, DEBUG, "mapped address desc: %p\n", vq->desc);
@@ -1224,6 +1230,7 @@ vhost_user_mmap_region(struct virtio_net *dev,
 	region->mmap_addr = mmap_addr;
 	region->mmap_size = mmap_size;
 	region->host_user_addr = (uint64_t)(uintptr_t)mmap_addr + mmap_offset;
+	mem_set_dump(mmap_addr, mmap_size, false);
 	if (dev->async_copy) {
 		if (add_guest_pages(dev, region, alignment) < 0) {
@@ -1528,6 +1535,7 @@ inflight_mem_alloc(struct virtio_net *dev, const char *name, size_t size, int *f
 		return NULL;
 	}
+	mem_set_dump(ptr, size, false);
 	*fd = mfd;
 	return ptr;
 }
@@ -1736,6 +1744,7 @@ vhost_user_set_inflight_fd(struct virtio_net **pdev,
 		dev->inflight_info->fd = -1;
 	}
+	mem_set_dump(addr, mmap_size, false);
 	dev->inflight_info->fd = fd;
 	dev->inflight_info->addr = addr;
 	dev->inflight_info->size = mmap_size;
@@ -2283,6 +2292,7 @@ vhost_user_set_log_base(struct virtio_net **pdev,
 	dev->log_addr = (uint64_t)(uintptr_t)addr;
 	dev->log_base = dev->log_addr + off;
 	dev->log_size = size;
+	mem_set_dump(addr, size, false);
 	for (i = 0; i < dev->nr_vring; i++) {
 		struct vhost_virtqueue *vq = dev->virtqueue[i];