[dpdk-dev,v3,01/19] Revert "vhost: workaround MQ fails to startup"

Message ID 20171005083627.27828-2-maxime.coquelin@redhat.com (mailing list archive)
State Accepted, archived
Delegated to: Yuanhan Liu
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Maxime Coquelin Oct. 5, 2017, 8:36 a.m. UTC
  This reverts commit 04d81227960b5c1cf2f11f492100979ead20c526.

As agreed when this workaround was introduced, it can be reverted
as Qemu v2.10 that fixes the issue is now out.

The reply-ack feature is required for vhost-user IOMMU support.

Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
 lib/librte_vhost/vhost_user.h | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)
  

Comments

Mark Kavanagh Nov. 1, 2017, 5:11 p.m. UTC | #1
Hi Maxime,

First off, apologies for the lateness of this reply - I realize that this patch has already been upstreamed.

Unfortunately, during OvS-DPDK regression testing for DPDK v17.11-rc2 just today, a regression involving vHost multiq was detected, and pinpointed to this patch.

Version info for the components involved during the aforementioned testing is as follows:
DPDK:	v17.11-rc2
OvS:	af2e40c ("sparse: eliminate "duplicate initialization") + DPDK v17.11 upgrade patch
QEMU:	v2.7.0

The regression may be reproduced as follows:
- Set up OvS-DPDK as normal, and add one vHostUser client port:
	$OVS_DIR/utilities/ovs-vsctl add-port br0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient
 
- Start QEMU, specifying 2 ports for the guest interface"
	$QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
    -cpu host -enable-kvm -m 4096M \
    -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
    -numa node,memdev=mem -mem-prealloc \
    -drive file=$VM_IMAGE \
    -chardev socket,id=char0,path=/tmp/sock0,server \
    -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=2 \
    -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off,mq=on,vectors=6 \
    -nographic"
 
- The guest subsequently starts as normal, but then hangs completely, rendering it completely unusable.
  The last lines of ovs-vswitchd.log read as follows:
	|00051|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
	|00052|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
	|00053|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_FEATURES

Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.

One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?

Thanks in advance,
Mark 


>From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Maxime Coquelin
>Sent: Thursday, October 5, 2017 9:36 AM
>To: dev@dpdk.org; Horton, Remy <remy.horton@intel.com>; Bie, Tiwei
><tiwei.bie@intel.com>; yliu@fridaylinux.org
>Cc: mst@redhat.com; jfreiman@redhat.com; vkaplans@redhat.com;
>jasowang@redhat.com; Maxime Coquelin <maxime.coquelin@redhat.com>
>Subject: [dpdk-dev] [PATCH v3 01/19] Revert "vhost: workaround MQ fails to
>startup"
>
>This reverts commit 04d81227960b5c1cf2f11f492100979ead20c526.
>
>As agreed when this workaround was introduced, it can be reverted
>as Qemu v2.10 that fixes the issue is now out.
>
>The reply-ack feature is required for vhost-user IOMMU support.
>
>Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>---
> lib/librte_vhost/vhost_user.h | 6 +-----
> 1 file changed, 1 insertion(+), 5 deletions(-)
>
>diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
>index 35ebd7190..2ba22dbb0 100644
>--- a/lib/librte_vhost/vhost_user.h
>+++ b/lib/librte_vhost/vhost_user.h
>@@ -49,14 +49,10 @@
> #define VHOST_USER_PROTOCOL_F_REPLY_ACK	3
> #define VHOST_USER_PROTOCOL_F_NET_MTU 4
>
>-/*
>- * disable REPLY_ACK feature to workaround the buggy QEMU implementation.
>- * Proved buggy QEMU includes v2.7 - v2.9.
>- */
> #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
> 					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
> 					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
>-					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
>+					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
> 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
>
> typedef enum VhostUserRequest {
>--
>2.13.6
  
Maxime Coquelin Nov. 2, 2017, 9:40 a.m. UTC | #2
Hi Mark,

On 11/01/2017 06:11 PM, Kavanagh, Mark B wrote:
> Hi Maxime,
> 
> First off, apologies for the lateness of this reply - I realize that this patch has already been upstreamed.

No worries, great to see DPDK integration being tested before v17.11 is
released. Is the v17.11 upgrade patch available somewhere?

> Unfortunately, during OvS-DPDK regression testing for DPDK v17.11-rc2 just today, a regression involving vHost multiq was detected, and pinpointed to this patch.
> 
> Version info for the components involved during the aforementioned testing is as follows:
> DPDK:	v17.11-rc2
> OvS:	af2e40c ("sparse: eliminate "duplicate initialization") + DPDK v17.11 upgrade patch
> QEMU:	v2.7.0
> 
> The regression may be reproduced as follows:
> - Set up OvS-DPDK as normal, and add one vHostUser client port:
> 	$OVS_DIR/utilities/ovs-vsctl add-port br0 vhost-user0 -- set Interface vhost-user0 type=dpdkvhostuserclient
>   
> - Start QEMU, specifying 2 ports for the guest interface"
> 	$QEMU_DIR/x86_64-softmmu/qemu-system-x86_64 \
>      -cpu host -enable-kvm -m 4096M \
>      -object memory-backend-file,id=mem,size=4096M,mem-path=/dev/hugepages,share=on \
>      -numa node,memdev=mem -mem-prealloc \
>      -drive file=$VM_IMAGE \
>      -chardev socket,id=char0,path=/tmp/sock0,server \
>      -netdev type=vhost-user,id=mynet1,chardev=char0,vhostforce,queues=2 \
>      -device virtio-net-pci,mac=00:00:00:00:00:01,netdev=mynet1,mrg_rxbuf=off,mq=on,vectors=6 \
>      -nographic"
>   
> - The guest subsequently starts as normal, but then hangs completely, rendering it completely unusable.
>    The last lines of ovs-vswitchd.log read as follows:
> 	|00051|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_VRING_ENABLE
> 	|00052|dpdk|INFO|VHOST_CONFIG: set queue enable: 1 to qp idx: 1
> 	|00053|dpdk|INFO|VHOST_CONFIG: read message VHOST_USER_SET_FEATURES
> 
> Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.

Do you mean that upstream Qemu v2.7.0 is used in production?
I would expect the customers to use a distro Qemu which should contain
relevant fixes, or follow upstream's stable branches.

FYI, Qemu v2.9.1 contains a backport of the fix.

> One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?

Yes, that's one option, but:
1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
tested.

Yuanhan, what do you think?

Regards,
Maxime

> Thanks in advance,
> Mark
> 
> 
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Maxime Coquelin
>> Sent: Thursday, October 5, 2017 9:36 AM
>> To: dev@dpdk.org; Horton, Remy <remy.horton@intel.com>; Bie, Tiwei
>> <tiwei.bie@intel.com>; yliu@fridaylinux.org
>> Cc: mst@redhat.com; jfreiman@redhat.com; vkaplans@redhat.com;
>> jasowang@redhat.com; Maxime Coquelin <maxime.coquelin@redhat.com>
>> Subject: [dpdk-dev] [PATCH v3 01/19] Revert "vhost: workaround MQ fails to
>> startup"
>>
>> This reverts commit 04d81227960b5c1cf2f11f492100979ead20c526.
>>
>> As agreed when this workaround was introduced, it can be reverted
>> as Qemu v2.10 that fixes the issue is now out.
>>
>> The reply-ack feature is required for vhost-user IOMMU support.
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> ---
>> lib/librte_vhost/vhost_user.h | 6 +-----
>> 1 file changed, 1 insertion(+), 5 deletions(-)
>>
>> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
>> index 35ebd7190..2ba22dbb0 100644
>> --- a/lib/librte_vhost/vhost_user.h
>> +++ b/lib/librte_vhost/vhost_user.h
>> @@ -49,14 +49,10 @@
>> #define VHOST_USER_PROTOCOL_F_REPLY_ACK	3
>> #define VHOST_USER_PROTOCOL_F_NET_MTU 4
>>
>> -/*
>> - * disable REPLY_ACK feature to workaround the buggy QEMU implementation.
>> - * Proved buggy QEMU includes v2.7 - v2.9.
>> - */
>> #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
>> 					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
>> 					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
>> -					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
>> +					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
>> 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
>>
>> typedef enum VhostUserRequest {
>> --
>> 2.13.6
>
  
Yuanhan Liu Nov. 3, 2017, 1:05 p.m. UTC | #3
On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
> >Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
> 
> Do you mean that upstream Qemu v2.7.0 is used in production?
> I would expect the customers to use a distro Qemu which should contain
> relevant fixes, or follow upstream's stable branches.
> 
> FYI, Qemu v2.9.1 contains a backport of the fix.
> 
> >One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
> 
> Yes, that's one option, but:
> 1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
> 2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
> tested.
> 
> Yuanhan, what do you think?

My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
is a pretty big range, that I think quite many people would hit this issue.

	--yliu
  
Maxime Coquelin Nov. 3, 2017, 2:28 p.m. UTC | #4
On 11/03/2017 02:05 PM, Yuanhan Liu wrote:
> On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
>>> Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
>>
>> Do you mean that upstream Qemu v2.7.0 is used in production?
>> I would expect the customers to use a distro Qemu which should contain
>> relevant fixes, or follow upstream's stable branches.
>>
>> FYI, Qemu v2.9.1 contains a backport of the fix.
>>
>>> One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
>>
>> Yes, that's one option, but:
>> 1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
>> 2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
>> tested.
>>
>> Yuanhan, what do you think?
> 
> My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
> is a pretty big range, that I think quite many people would hit this issue
Ok, then what about adding a new flag to rte_vhost_driver_register(), as
done for tx zero copy to enable IOMMU feature?
If flag is unset, then we mask out both IOMMU virtio feature flag and
REPLY_ACK protocol feature flag.

For a while this flag will be unset by default, not to break these
deprecated and unmaintained Qemu versions. But I think at some point
we should make it enabled by default, as it would be sad not to benefit
from this security feature.

This change will have an impact on OVS, as it will need a new vhost-user
port option to enable IOMMU feature. Thing that is transparent to OVS
currently.

Mark, Yuanhan, does that sound good to you?

Maxime
> 	--yliu
>
  
Thomas Monjalon Nov. 3, 2017, 3:34 p.m. UTC | #5
02/11/2017 10:40, Maxime Coquelin:
> Hi Mark,
> 
> On 11/01/2017 06:11 PM, Kavanagh, Mark B wrote:
> > Hi Maxime,
> > 
> > First off, apologies for the lateness of this reply - I realize that this patch has already been upstreamed.
> 
> No worries, great to see DPDK integration being tested before v17.11 is
> released. Is the v17.11 upgrade patch available somewhere?
> 
> > Unfortunately, during OvS-DPDK regression testing for DPDK v17.11-rc2 just today, a regression involving vHost multiq was detected, and pinpointed to this patch.
> > 
> > Version info for the components involved during the aforementioned testing is as follows:
> > DPDK:	v17.11-rc2
> > OvS:	af2e40c ("sparse: eliminate "duplicate initialization") + DPDK v17.11 upgrade patch
> > QEMU:	v2.7.0
[...]
> > Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
> 
> Do you mean that upstream Qemu v2.7.0 is used in production?
> I would expect the customers to use a distro Qemu which should contain
> relevant fixes, or follow upstream's stable branches.

Me too, I would expect they integrate the fixes.

> FYI, Qemu v2.9.1 contains a backport of the fix.

But you know, some users do not want to upgrade anything in production,
as in the old time of hardware networking equipments.
Curiously, the case considered here seems to be users sticked to old Qemu
while willing to consider the switch as an upgradable software.
It is really really strange, but let's consider such constraint.

If I remember well, we have integrated the vhost multiqueue feature
as soon as a Qemu release was almost ready.
For the record, it was Qemu 2.5.0 (released in Dec 2015):
	https://lists.nongnu.org/archive/html/qemu-devel/2015-12/msg02731.html
	https://wiki.qemu.org/ChangeLog/2.5#virtio
And it was supported in DPDK 2.2.0 (released one day before):
	http://dpdk.org/ml/archives/announce/2015-December/000073.html
	http://dpdk.org/doc/guides/rel_notes/release_2_2.html

Nowadays, Qemu 2.10 is ready to let us enable IOMMU support.
But you ask to wait more. How much time should we wait?
Is there a policy to agree or are we just waiting for someone to approve?
  
Mark Kavanagh Nov. 3, 2017, 4:31 p.m. UTC | #6
>From: Thomas Monjalon [mailto:thomas@monjalon.net]
>Sent: Friday, November 3, 2017 3:35 PM
>To: Kavanagh, Mark B <mark.b.kavanagh@intel.com>; yliu@fridaylinux.org
>Cc: dev@dpdk.org; Maxime Coquelin <maxime.coquelin@redhat.com>; Horton, Remy
><remy.horton@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>; mst@redhat.com;
>jfreiman@redhat.com; vkaplans@redhat.com; jasowang@redhat.com; Mcnamara, John
><john.mcnamara@intel.com>; Loftus, Ciara <ciara.loftus@intel.com>; Stokes, Ian
><ian.stokes@intel.com>
>Subject: Re: [dpdk-dev] [PATCH v3 01/19] Revert "vhost: workaround MQ fails to
>startup"
>
>02/11/2017 10:40, Maxime Coquelin:
>> Hi Mark,
>>
>> On 11/01/2017 06:11 PM, Kavanagh, Mark B wrote:
>> > Hi Maxime,
>> >
>> > First off, apologies for the lateness of this reply - I realize that this
>patch has already been upstreamed.
>>
>> No worries, great to see DPDK integration being tested before v17.11 is
>> released. Is the v17.11 upgrade patch available somewhere?
>>
>> > Unfortunately, during OvS-DPDK regression testing for DPDK v17.11-rc2 just
>today, a regression involving vHost multiq was detected, and pinpointed to
>this patch.
>> >
>> > Version info for the components involved during the aforementioned testing
>is as follows:
>> > DPDK:	v17.11-rc2
>> > OvS:	af2e40c ("sparse: eliminate "duplicate initialization") + DPDK
>v17.11 upgrade patch
>> > QEMU:	v2.7.0
>[...]
>> > Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein
>lies the issue: QEMU v2.10.0 was only released in August of this year;
>anecdotally, we know that many OvS-DPDK customers use older versions of QEMU
>(typically, v2.7.0), and are likely un[able|willing] to move. With this patch,
>a hard dependency on QEMU v2.10 is created for users who want to use the vHU
>multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0),
>which IMO will likely be unacceptable for many.
>>
>> Do you mean that upstream Qemu v2.7.0 is used in production?
>> I would expect the customers to use a distro Qemu which should contain
>> relevant fixes, or follow upstream's stable branches.
>

To be honest, I don't have hard data to back this up, apart from anecdotal reports that "some customers use 'older' versions of QEMU".
I understand that this is not the most solid foundation upon which to build an argument.

>Me too, I would expect they integrate the fixes.
>
>> FYI, Qemu v2.9.1 contains a backport of the fix.
>
>But you know, some users do not want to upgrade anything in production,
>as in the old time of hardware networking equipments.
>Curiously, the case considered here seems to be users sticked to old Qemu
>while willing to consider the switch as an upgradable software.
>It is really really strange, but let's consider such constraint.
>
>If I remember well, we have integrated the vhost multiqueue feature
>as soon as a Qemu release was almost ready.
>For the record, it was Qemu 2.5.0 (released in Dec 2015):
>	https://lists.nongnu.org/archive/html/qemu-devel/2015-12/msg02731.html
>	https://wiki.qemu.org/ChangeLog/2.5#virtio
>And it was supported in DPDK 2.2.0 (released one day before):
>	http://dpdk.org/ml/archives/announce/2015-December/000073.html
>	http://dpdk.org/doc/guides/rel_notes/release_2_2.html

Hmm, that's an interesting data point.

>
>Nowadays, Qemu 2.10 is ready to let us enable IOMMU support.
>But you ask to wait more. How much time should we wait?
>Is there a policy to agree or are we just waiting for someone to approve?

I'm afraid that I don't have an answer for this. 
My concern here was purely proactive - however, without concrete data to back it up, and in the light of Thomas' point regarding 2.5.0/DPDK 2.2.0, perhaps my concerns are unfounded.

It may be sufficient therefore, to simply document compatible versions of QEMU as part of the OvS documentation.

Thanks to all involved for your input,
Mark
  
Yuanhan Liu Nov. 6, 2017, noon UTC | #7
On Fri, Nov 03, 2017 at 03:28:36PM +0100, Maxime Coquelin wrote:
> 
> 
> On 11/03/2017 02:05 PM, Yuanhan Liu wrote:
> >On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
> >>>Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
> >>
> >>Do you mean that upstream Qemu v2.7.0 is used in production?
> >>I would expect the customers to use a distro Qemu which should contain
> >>relevant fixes, or follow upstream's stable branches.
> >>
> >>FYI, Qemu v2.9.1 contains a backport of the fix.
> >>
> >>>One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
> >>
> >>Yes, that's one option, but:
> >>1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
> >>2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
> >>tested.
> >>
> >>Yuanhan, what do you think?
> >
> >My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
> >is a pretty big range, that I think quite many people would hit this issue
> Ok, then what about adding a new flag to rte_vhost_driver_register(), as
> done for tx zero copy to enable IOMMU feature?
> If flag is unset, then we mask out both IOMMU virtio feature flag and
> REPLY_ACK protocol feature flag.
> 
> For a while this flag will be unset by default, not to break these
> deprecated and unmaintained Qemu versions. But I think at some point
> we should make it enabled by default, as it would be sad not to benefit
> from this security feature.

This sounds good to me.

	--yliu
> 
> This change will have an impact on OVS, as it will need a new vhost-user
> port option to enable IOMMU feature. Thing that is transparent to OVS
> currently.
> 
> Mark, Yuanhan, does that sound good to you?
> 
> Maxime
> >	--yliu
> >
  
Maxime Coquelin Nov. 6, 2017, 12:07 p.m. UTC | #8
On 11/06/2017 01:00 PM, Yuanhan Liu wrote:
> On Fri, Nov 03, 2017 at 03:28:36PM +0100, Maxime Coquelin wrote:
>>
>>
>> On 11/03/2017 02:05 PM, Yuanhan Liu wrote:
>>> On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
>>>>> Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
>>>>
>>>> Do you mean that upstream Qemu v2.7.0 is used in production?
>>>> I would expect the customers to use a distro Qemu which should contain
>>>> relevant fixes, or follow upstream's stable branches.
>>>>
>>>> FYI, Qemu v2.9.1 contains a backport of the fix.
>>>>
>>>>> One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
>>>>
>>>> Yes, that's one option, but:
>>>> 1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
>>>> 2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
>>>> tested.
>>>>
>>>> Yuanhan, what do you think?
>>>
>>> My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
>>> is a pretty big range, that I think quite many people would hit this issue
>> Ok, then what about adding a new flag to rte_vhost_driver_register(), as
>> done for tx zero copy to enable IOMMU feature?
>> If flag is unset, then we mask out both IOMMU virtio feature flag and
>> REPLY_ACK protocol feature flag.
>>
>> For a while this flag will be unset by default, not to break these
>> deprecated and unmaintained Qemu versions. But I think at some point
>> we should make it enabled by default, as it would be sad not to benefit
>> from this security feature.
> 
> This sounds good to me.

Actually, I have posted a different patch, so that we don't have API
change for this. Upstream OVS can disable IOMMU feature, which will in
turn disable REPLY-ACK protocol feature if they want to.

Thanks,
Maxime

> 	--yliu
>>
>> This change will have an impact on OVS, as it will need a new vhost-user
>> port option to enable IOMMU feature. Thing that is transparent to OVS
>> currently.
>>
>> Mark, Yuanhan, does that sound good to you?
>>
>> Maxime
>>> 	--yliu
>>>
  
Yuanhan Liu Nov. 6, 2017, 12:24 p.m. UTC | #9
On Mon, Nov 06, 2017 at 01:07:15PM +0100, Maxime Coquelin wrote:
> 
> 
> On 11/06/2017 01:00 PM, Yuanhan Liu wrote:
> >On Fri, Nov 03, 2017 at 03:28:36PM +0100, Maxime Coquelin wrote:
> >>
> >>
> >>On 11/03/2017 02:05 PM, Yuanhan Liu wrote:
> >>>On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
> >>>>>Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
> >>>>
> >>>>Do you mean that upstream Qemu v2.7.0 is used in production?
> >>>>I would expect the customers to use a distro Qemu which should contain
> >>>>relevant fixes, or follow upstream's stable branches.
> >>>>
> >>>>FYI, Qemu v2.9.1 contains a backport of the fix.
> >>>>
> >>>>>One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
> >>>>
> >>>>Yes, that's one option, but:
> >>>>1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
> >>>>2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
> >>>>tested.
> >>>>
> >>>>Yuanhan, what do you think?
> >>>
> >>>My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
> >>>is a pretty big range, that I think quite many people would hit this issue
> >>Ok, then what about adding a new flag to rte_vhost_driver_register(), as
> >>done for tx zero copy to enable IOMMU feature?
> >>If flag is unset, then we mask out both IOMMU virtio feature flag and
> >>REPLY_ACK protocol feature flag.
> >>
> >>For a while this flag will be unset by default, not to break these
> >>deprecated and unmaintained Qemu versions. But I think at some point
> >>we should make it enabled by default, as it would be sad not to benefit
> >>from this security feature.
> >
> >This sounds good to me.
> 
> Actually, I have posted a different patch, so that we don't have API
> change for this. Upstream OVS can disable IOMMU feature, which will in
> turn disable REPLY-ACK protocol feature if they want to.

Sorry I missed that. So the REPLY-ACK will still be enabled by default and
you leave the choice to the users to disable it, explicitly? This doesn't
sound the best to me. We now know that it breaks OVS, but other users may
hit the same issue again without any awareness.

Also, I know this feature brings good benefits on security. But IIRC, you
mentioned that it became barely un-usable with Linux kernel virtio-net
driver.

From the two points, I think let's make it be disable by default now?

	--yliu
  
Maxime Coquelin Nov. 6, 2017, 12:50 p.m. UTC | #10
On 11/06/2017 01:24 PM, Yuanhan Liu wrote:
> On Mon, Nov 06, 2017 at 01:07:15PM +0100, Maxime Coquelin wrote:
>>
>>
>> On 11/06/2017 01:00 PM, Yuanhan Liu wrote:
>>> On Fri, Nov 03, 2017 at 03:28:36PM +0100, Maxime Coquelin wrote:
>>>>
>>>>
>>>> On 11/03/2017 02:05 PM, Yuanhan Liu wrote:
>>>>> On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
>>>>>>> Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
>>>>>>
>>>>>> Do you mean that upstream Qemu v2.7.0 is used in production?
>>>>>> I would expect the customers to use a distro Qemu which should contain
>>>>>> relevant fixes, or follow upstream's stable branches.
>>>>>>
>>>>>> FYI, Qemu v2.9.1 contains a backport of the fix.
>>>>>>
>>>>>>> One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
>>>>>>
>>>>>> Yes, that's one option, but:
>>>>>> 1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
>>>>>> 2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
>>>>>> tested.
>>>>>>
>>>>>> Yuanhan, what do you think?
>>>>>
>>>>> My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
>>>>> is a pretty big range, that I think quite many people would hit this issue
>>>> Ok, then what about adding a new flag to rte_vhost_driver_register(), as
>>>> done for tx zero copy to enable IOMMU feature?
>>>> If flag is unset, then we mask out both IOMMU virtio feature flag and
>>>> REPLY_ACK protocol feature flag.
>>>>
>>>> For a while this flag will be unset by default, not to break these
>>>> deprecated and unmaintained Qemu versions. But I think at some point
>>>> we should make it enabled by default, as it would be sad not to benefit
>>> >from this security feature.
>>>
>>> This sounds good to me.
>>
>> Actually, I have posted a different patch, so that we don't have API
>> change for this. Upstream OVS can disable IOMMU feature, which will in
>> turn disable REPLY-ACK protocol feature if they want to.
> 
> Sorry I missed that. So the REPLY-ACK will still be enabled by default and
> you leave the choice to the users to disable it, explicitly? This doesn't
> sound the best to me. We now know that it breaks OVS, but other users may
> hit the same issue again without any awareness.
 >
> Also, I know this feature brings good benefits on security. But IIRC, you
> mentioned that it became barely un-usable with Linux kernel virtio-net
> driver.
> 
>  From the two points, I think let's make it be disable by default now?

What concerns me is that hasn't been replied yet is when will we 
consider Qemu 2.7.0-Qemu v2.9.0 (Qemu v2.9.1 being fixed) old enough
to enable it by default? Knowing that Qemu 2.7.x/2.8.x are already
end of life uptream.

Maxime
> 	--yliu
>
  
Yuanhan Liu Nov. 6, 2017, 1:36 p.m. UTC | #11
On Mon, Nov 06, 2017 at 01:50:35PM +0100, Maxime Coquelin wrote:
> 
> 
> On 11/06/2017 01:24 PM, Yuanhan Liu wrote:
> >On Mon, Nov 06, 2017 at 01:07:15PM +0100, Maxime Coquelin wrote:
> >>
> >>
> >>On 11/06/2017 01:00 PM, Yuanhan Liu wrote:
> >>>On Fri, Nov 03, 2017 at 03:28:36PM +0100, Maxime Coquelin wrote:
> >>>>
> >>>>
> >>>>On 11/03/2017 02:05 PM, Yuanhan Liu wrote:
> >>>>>On Thu, Nov 02, 2017 at 10:40:26AM +0100, Maxime Coquelin wrote:
> >>>>>>>Moving from QEMU v2.7.0 to v2.10.0 resolves the issue. However, herein lies the issue: QEMU v2.10.0 was only released in August of this year; anecdotally, we know that many OvS-DPDK customers use older versions of QEMU (typically, v2.7.0), and are likely un[able|willing] to move. With this patch, a hard dependency on QEMU v2.10 is created for users who want to use the vHU multiq feature in DPDK v17.11 (and subsequently, the upcoming OvS v2.9.0), which IMO will likely be unacceptable for many.
> >>>>>>
> >>>>>>Do you mean that upstream Qemu v2.7.0 is used in production?
> >>>>>>I would expect the customers to use a distro Qemu which should contain
> >>>>>>relevant fixes, or follow upstream's stable branches.
> >>>>>>
> >>>>>>FYI, Qemu v2.9.1 contains a backport of the fix.
> >>>>>>
> >>>>>>>One potential solution to this problem is to introduce a compile-time option that would allow the user to [dis|en]able the VHOST_USER_PROTOCOL_F_REPLY_ACK feature - is that something that would be acceptable to you Maxime?
> >>>>>>
> >>>>>>Yes, that's one option, but:
> >>>>>>1. VHOST_USER_PROTOCOL_F_REPLY_ACK enabled should be the default
> >>>>>>2. VHOST_USER_PROTOCOL_F_REPLY_ACK disabled will be less extensively
> >>>>>>tested.
> >>>>>>
> >>>>>>Yuanhan, what do you think?
> >>>>>
> >>>>>My suggestion is to still disable it by default. Qemu 2.7 - 2.9 (inclusive)
> >>>>>is a pretty big range, that I think quite many people would hit this issue
> >>>>Ok, then what about adding a new flag to rte_vhost_driver_register(), as
> >>>>done for tx zero copy to enable IOMMU feature?
> >>>>If flag is unset, then we mask out both IOMMU virtio feature flag and
> >>>>REPLY_ACK protocol feature flag.
> >>>>
> >>>>For a while this flag will be unset by default, not to break these
> >>>>deprecated and unmaintained Qemu versions. But I think at some point
> >>>>we should make it enabled by default, as it would be sad not to benefit
> >>>>from this security feature.
> >>>
> >>>This sounds good to me.
> >>
> >>Actually, I have posted a different patch, so that we don't have API
> >>change for this. Upstream OVS can disable IOMMU feature, which will in
> >>turn disable REPLY-ACK protocol feature if they want to.
> >
> >Sorry I missed that. So the REPLY-ACK will still be enabled by default and
> >you leave the choice to the users to disable it, explicitly? This doesn't
> >sound the best to me. We now know that it breaks OVS, but other users may
> >hit the same issue again without any awareness.
> >
> >Also, I know this feature brings good benefits on security. But IIRC, you
> >mentioned that it became barely un-usable with Linux kernel virtio-net
> >driver.
> >
> > From the two points, I think let's make it be disable by default now?
> 
> What concerns me is that hasn't been replied yet is when will we consider
> Qemu 2.7.0-Qemu v2.9.0 (Qemu v2.9.1 being fixed) old enough
> to enable it by default? Knowing that Qemu 2.7.x/2.8.x are already
> end of life uptream.

I can't tell. But there are probably something we could do. For example, we
could introduce a vhost pmd option, to enable the IOMMU feature. If the user
concerns about the security, he could use such option. By default, let's still
disable it. Meanwhile, OVS may could also add such an option.

	--yliu
  

Patch

diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
index 35ebd7190..2ba22dbb0 100644
--- a/lib/librte_vhost/vhost_user.h
+++ b/lib/librte_vhost/vhost_user.h
@@ -49,14 +49,10 @@ 
 #define VHOST_USER_PROTOCOL_F_REPLY_ACK	3
 #define VHOST_USER_PROTOCOL_F_NET_MTU 4
 
-/*
- * disable REPLY_ACK feature to workaround the buggy QEMU implementation.
- * Proved buggy QEMU includes v2.7 - v2.9.
- */
 #define VHOST_USER_PROTOCOL_FEATURES	((1ULL << VHOST_USER_PROTOCOL_F_MQ) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD) |\
 					 (1ULL << VHOST_USER_PROTOCOL_F_RARP) | \
-					 (0ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
+					 (1ULL << VHOST_USER_PROTOCOL_F_REPLY_ACK) | \
 					 (1ULL << VHOST_USER_PROTOCOL_F_NET_MTU))
 
 typedef enum VhostUserRequest {