[v2,1/4] vhost: enforce avail index and desc read ordering
Commit Message
A read barrier is required to ensure the ordering between
available index and the descriptor reads is enforced.
1. read avail_head = avail->idx
2. read cur_idx = last_avail_idx
   if (cur_idx != avail_head) {
3.     read idx = avail->ring[cur_idx]
4.     read desc[idx]
   }
There is a control dependency between step 1 and steps 3 & 4, but a
control dependency does not order loads: step 3 could be speculatively
executed before step 1, which could result in a stale 'idx' being read.
Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
Cc: stable@dpdk.org
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Ilya Maximets <i.maximets@samsung.com>
---
lib/librte_vhost/virtio_net.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
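The ordering issue described in the commit message can be sketched outside of DPDK with C11 atomics. This is only an illustration under stated assumptions, not the vhost code: `struct ring`, `ring_publish()`, `ring_consume()` and `RING_SIZE` are hypothetical names, and `atomic_thread_fence(memory_order_acquire)` stands in for `rte_smp_rmb()` (the two are similar in intent but not identical). The consumer follows steps 1 to 4, with the fence placed between the index read and the ring/descriptor reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size, must be a power of two */

struct desc {
	uint32_t len;
};

/* Hypothetical, heavily simplified split ring (not the virtio layout). */
struct ring {
	_Atomic uint16_t avail_idx;     /* written by the producer */
	uint16_t avail_ring[RING_SIZE]; /* descriptor indexes */
	struct desc desc[RING_SIZE];
	uint16_t last_avail_idx;        /* private to the consumer */
};

/* Producer side: fill the descriptor and ring slot, then publish the index. */
static void ring_publish(struct ring *r, uint16_t desc_idx, uint32_t len)
{
	uint16_t head = atomic_load_explicit(&r->avail_idx,
					     memory_order_relaxed);

	assert(desc_idx < RING_SIZE);
	r->desc[desc_idx].len = len;
	r->avail_ring[head & (RING_SIZE - 1)] = desc_idx;

	/* Write barrier: the stores above must be visible before the index. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&r->avail_idx, (uint16_t)(head + 1),
			      memory_order_relaxed);
}

/* Consumer side: returns 1 and sets *len if an entry was available. */
static int ring_consume(struct ring *r, uint32_t *len)
{
	/* Step 1: read the available index. */
	uint16_t avail_head = atomic_load_explicit(&r->avail_idx,
						   memory_order_relaxed);
	/* Step 2: read our own position. */
	uint16_t cur_idx = r->last_avail_idx;

	if (cur_idx == avail_head)
		return 0;

	/*
	 * Read barrier: without it, the loads in steps 3 and 4 may be
	 * performed before the load in step 1 and observe stale data.
	 */
	atomic_thread_fence(memory_order_acquire);

	/* Step 3: read the descriptor index from the ring. */
	uint16_t idx = r->avail_ring[cur_idx & (RING_SIZE - 1)];
	/* Step 4: read the descriptor itself. */
	*len = r->desc[idx].len;

	r->last_avail_idx = (uint16_t)(cur_idx + 1);
	return 1;
}
```

In a single thread the fence has no observable effect. It matters when `ring_publish()` and `ring_consume()` run on different cores of a weakly ordered CPU, which is the case the patch addresses.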
Comments
On Wed, Dec 19, 2018 at 09:21:10AM +0100, Maxime Coquelin wrote:
> A read barrier is required to ensure the ordering between
> available index and the descriptor reads is enforced.
>
> 1. read avail_head = avail->idx
> 2. read cur_idx = last_avail_idx
> if (cur_idx != avail_head) {
> 3. read idx = avail->ring[cur_idx]
> 4. read desc[idx]
> }
>
> There is a control dependency between step 1 and steps 3 & 4,
> 3 could be speculatively executed before 1, which could result
> in 'idx' to not being updated yet.
>
> Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
> Cc: stable@dpdk.org
>
> Reported-by: Jason Wang <jasowang@redhat.com>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> Acked-by: Ilya Maximets <i.maximets@samsung.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
BTW Ilya do you see a performance degradation from RMBs with these patches?
> ---
> lib/librte_vhost/virtio_net.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 8c657a101..7f37bbbed 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -752,6 +752,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
> rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
> avail_head = *((volatile uint16_t *)&vq->avail->idx);
>
> + /*
> + * The ordering between avail index and
> + * desc reads needs to be enforced.
> + */
> + rte_smp_rmb();
> +
I'd guess you want to put the RMB before the prefetch, no?
Otherwise I think you are either stalling until the prefetch
completes or discarding it ...
> for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
> uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
> uint16_t nr_vec = 0;
> @@ -1334,6 +1340,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
> if (free_entries == 0)
> return 0;
>
> + /*
> + * The ordering between avail index and
> + * desc reads needs to be enforced.
> + */
> + rte_smp_rmb();
> +
> VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
>
> count = RTE_MIN(count, MAX_PKT_BURST);
> --
> 2.17.2
On 12/19/18 4:47 PM, Michael S. Tsirkin wrote:
> On Wed, Dec 19, 2018 at 09:21:10AM +0100, Maxime Coquelin wrote:
>> A read barrier is required to ensure the ordering between
>> available index and the descriptor reads is enforced.
>>
>> 1. read avail_head = avail->idx
>> 2. read cur_idx = last_avail_idx
>> if (cur_idx != avail_head) {
>> 3. read idx = avail->ring[cur_idx]
>> 4. read desc[idx]
>> }
>>
>> There is a control dependency between step 1 and steps 3 & 4,
>> 3 could be speculatively executed before 1, which could result
>> in 'idx' to not being updated yet.
>>
>> Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
>> Cc: stable@dpdk.org
>>
>> Reported-by: Jason Wang <jasowang@redhat.com>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Acked-by: Ilya Maximets <i.maximets@samsung.com>
>
> Acked-by: Michael S. Tsirkin <mst@redhat.com>
>
> BTW Ilya do you see a performance degradation from RMBs with these patches?
>
>
>> ---
>> lib/librte_vhost/virtio_net.c | 12 ++++++++++++
>> 1 file changed, 12 insertions(+)
>>
>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>> index 8c657a101..7f37bbbed 100644
>> --- a/lib/librte_vhost/virtio_net.c
>> +++ b/lib/librte_vhost/virtio_net.c
>> @@ -752,6 +752,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>> rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
>> avail_head = *((volatile uint16_t *)&vq->avail->idx);
>>
>> + /*
>> + * The ordering between avail index and
>> + * desc reads needs to be enforced.
>> + */
>> + rte_smp_rmb();
>> +
>
> I'd guess you want to put the RMB before the prefetch. No?
> Otherwise I think you are either stalling until that completes
> or discard the prefetch ...
Yeah, this is actually done in patch 3.
Thanks,
Maxime
>> for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
>> uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
>> uint16_t nr_vec = 0;
>> @@ -1334,6 +1340,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>> if (free_entries == 0)
>> return 0;
>>
>> + /*
>> + * The ordering between avail index and
>> + * desc reads needs to be enforced.
>> + */
>> + rte_smp_rmb();
>> +
>> VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
>>
>> count = RTE_MIN(count, MAX_PKT_BURST);
>> --
>> 2.17.2
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 8c657a101..7f37bbbed 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -752,6 +752,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
 	avail_head = *((volatile uint16_t *)&vq->avail->idx);
 
+	/*
+	 * The ordering between avail index and
+	 * desc reads needs to be enforced.
+	 */
+	rte_smp_rmb();
+
 	for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
 		uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
 		uint16_t nr_vec = 0;
@@ -1334,6 +1340,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	if (free_entries == 0)
 		return 0;
 
+	/*
+	 * The ordering between avail index and
+	 * desc reads needs to be enforced.
+	 */
+	rte_smp_rmb();
+
 	VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
 
 	count = RTE_MIN(count, MAX_PKT_BURST);