[v7,2/2] bus/pci: support MMIO in PCI ioport accessors
Commit Message
From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
With an I/O BAR, we get a PIO (programmed I/O) address.
With an MMIO BAR, we get a mapped virtual address.
We distinguish PIO from MMIO (memory-mapped I/O) by address, as the kernel does.
ioread8/16/32 and iowrite8/16/32 are provided to access either PIO or MMIO.
Note that for virtio on architectures other than x86, the BAR flag indicates PIO but the BAR is memory-mapped.
Signed-off-by: huawei xie <huawei.xhw@alibaba-inc.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
drivers/bus/pci/linux/pci.c | 4 --
drivers/bus/pci/linux/pci_uio.c | 156 +++++++++++++++++++++++++++++-----------
2 files changed, 113 insertions(+), 47 deletions(-)
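The commit message hinges on telling an I/O BAR from an MMIO BAR. As a minimal sketch of that step, assuming the usual sysfs `resource` line layout of three hexadecimal fields (start, end, flags), the classification could look like the following. The names `bar_kind` and `classify_bar_line` are hypothetical and not part of the patch; the flag values are the ones from Linux include/linux/ioport.h.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Resource type bits as defined in Linux include/linux/ioport.h. */
#define IORESOURCE_IO  0x00000100
#define IORESOURCE_MEM 0x00000200

/* Hypothetical names for this sketch; not part of the patch. */
enum bar_kind { BAR_UNKNOWN, BAR_PIO, BAR_MMIO };

/*
 * Parse one line of a sysfs PCI "resource" file
 * ("<start> <end> <flags>", all hexadecimal) and classify the BAR.
 * Returns BAR_UNKNOWN on a malformed line or unrecognised flags.
 */
enum bar_kind
classify_bar_line(const char *line, uint64_t *start, uint64_t *end)
{
	uint64_t flags;

	assert(line != NULL && start != NULL && end != NULL);
	if (sscanf(line, "%" SCNx64 " %" SCNx64 " %" SCNx64,
		   start, end, &flags) != 3)
		return BAR_UNKNOWN;
	if (flags & IORESOURCE_IO)
		return BAR_PIO;   /* start is a port number */
	if (flags & IORESOURCE_MEM)
		return BAR_MMIO;  /* start is a physical address to be mapped */
	return BAR_UNKNOWN;
}
```

The I/O flag is tested first, matching the order of the checks in the patch.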
Comments
On 2/22/2021 5:15 PM, 谢华伟(此时此刻) wrote:
> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>
> With IO BAR, we get PIO(programmed IO) address.
> With MMIO BAR, we get mapped virtual address.
> We distinguish PIO(Programmed IO) and MMIO(memory mapped IO) by their address like how kernel does.
> ioread/write8/16/32 is provided to access PIO/MMIO.
> By the way, for virtio on arch other than x86, BAR flag indicates PIO but is mapped.
>
> Signed-off-by: huawei xie <huawei.xhw@alibaba-inc.com>
> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
<...>
> +
> +static inline void iowrite8(uint8_t val, void *addr)
> +{
> + (uint64_t)(uintptr_t)addr >= PIO_MAX ?
> + *(volatile uint8_t *)addr = val :
> + outb(val, (unsigned long)addr);
Copying the question from the previous version:
Is the 'outb_p' to 'outb' conversion intentional? And if so, why?
Same for all of 'outb_p', 'outw_p', 'outl_p'.
On 2021/2/23 1:25, Ferruh Yigit wrote:
> On 2/22/2021 5:15 PM, 谢华伟(此时此刻) wrote:
>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>
>> With IO BAR, we get PIO(programmed IO) address.
>> With MMIO BAR, we get mapped virtual address.
>> We distinguish PIO(Programmed IO) and MMIO(memory mapped IO) by their
>> address like how kernel does.
>> ioread/write8/16/32 is provided to access PIO/MMIO.
>> By the way, for virtio on arch other than x86, BAR flag indicates PIO
>> but is mapped.
>>
>> Signed-off-by: huawei xie <huawei.xhw@alibaba-inc.com>
>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> <...>
>
>> +
>> +static inline void iowrite8(uint8_t val, void *addr)
>> +{
>> + (uint64_t)(uintptr_t)addr >= PIO_MAX ?
>> + *(volatile uint8_t *)addr = val :
>> + outb(val, (unsigned long)addr);
>
> //copying question from previous version:
>
> Is the 'outb_p' to 'outb' conversion intentional? And if so why?
>
> Same of the all 'outb_p', 'outw_p', 'outl_p'.
There is no need to delay for a virtio device, as we can see in the
virtio legacy driver.
IMO, the delay is for ugly old devices. The device itself should ensure
the previous I/O has completed when the subsequent I/O instruction arrives.
On 2/23/2021 2:20 PM, 谢华伟(此时此刻) wrote:
>
> On 2021/2/23 1:25, Ferruh Yigit wrote:
>> On 2/22/2021 5:15 PM, 谢华伟(此时此刻) wrote:
>>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>>
>>> With IO BAR, we get PIO(programmed IO) address.
>>> With MMIO BAR, we get mapped virtual address.
>>> We distinguish PIO(Programmed IO) and MMIO(memory mapped IO) by their address
>>> like how kernel does.
>>> ioread/write8/16/32 is provided to access PIO/MMIO.
>>> By the way, for virtio on arch other than x86, BAR flag indicates PIO but is
>>> mapped.
>>>
>>> Signed-off-by: huawei xie <huawei.xhw@alibaba-inc.com>
>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>
>> <...>
>>
>>> +
>>> +static inline void iowrite8(uint8_t val, void *addr)
>>> +{
>>> + (uint64_t)(uintptr_t)addr >= PIO_MAX ?
>>> + *(volatile uint8_t *)addr = val :
>>> + outb(val, (unsigned long)addr);
>>
>> //copying question from previous version:
>>
>> Is the 'outb_p' to 'outb' conversion intentional? And if so why?
>>
>> Same of the all 'outb_p', 'outw_p', 'outl_p'.
>
> There is no need to delay for virtio device, as we can see in virtio legacy driver.
>
> IMO, the delay is for ugly old device. The device itself should assure the
> previous IO completes when the subsequent IO instruction arrives.
>
Can there be any virtio legacy device needing this?
What is the downside of using "pause until the I/O completes" versions?
On 2021/2/24 23:45, Ferruh Yigit wrote:
> On 2/23/2021 2:20 PM, 谢华伟(此时此刻) wrote:
>>
>> On 2021/2/23 1:25, Ferruh Yigit wrote:
>>> On 2/22/2021 5:15 PM, 谢华伟(此时此刻) wrote:
>>>> From: "huawei.xhw" <huawei.xhw@alibaba-inc.com>
>>>>
>>>> With IO BAR, we get PIO(programmed IO) address.
>>>> With MMIO BAR, we get mapped virtual address.
>>>> We distinguish PIO(Programmed IO) and MMIO(memory mapped IO) by
>>>> their address like how kernel does.
>>>> ioread/write8/16/32 is provided to access PIO/MMIO.
>>>> By the way, for virtio on arch other than x86, BAR flag indicates
>>>> PIO but is mapped.
>>>>
>>>> Signed-off-by: huawei xie <huawei.xhw@alibaba-inc.com>
>>>> Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>
>>> <...>
>>>
>>>> +
>>>> +static inline void iowrite8(uint8_t val, void *addr)
>>>> +{
>>>> + (uint64_t)(uintptr_t)addr >= PIO_MAX ?
>>>> + *(volatile uint8_t *)addr = val :
>>>> + outb(val, (unsigned long)addr);
>>>
>>> //copying question from previous version:
>>>
>>> Is the 'outb_p' to 'outb' conversion intentional? And if so why?
>>>
>>> Same of the all 'outb_p', 'outw_p', 'outl_p'.
>>
>> There is no need to delay for virtio device, as we can see in virtio
>> legacy driver.
>>
>> IMO, the delay is for ugly old device. The device itself should
>> assure the previous IO completes when the subsequent IO instruction
>> arrives.
>>
>
> Can there be any virtio legacy device needing this?
The pause version adds a delay by writing to the 0x80 debug port. virtio
doesn't need this, and the virtio legacy PMD driver doesn't use it.
Any device relying on this is, I think, buggy. How could a device rely
on an uncertain number of CPU cycles to behave correctly?
>
> What is the downside of using "pause until the I/O completes" versions?
The downside in the virtio PMD is a small performance penalty when we use
it to notify the backend: the CPU executes an unnecessary serializing I/O
instruction.
I checked the kernel code; the I/O wrappers for in/out don't use the _p
versions.
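To make the difference between the plain and the pausing variant concrete, here is a small simulation. It assumes the behaviour described in the reply above (the `_p` form issues the real write and then a dummy write to port 0x80). It records writes in a log instead of touching hardware, so `port_log`, `sim_outb` and `sim_outb_p` are illustrative names only, not the real `<sys/io.h>` functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DELAY_PORT 0x80  /* debug port used for the dummy write */
#define LOG_CAP    16

/* Simulated port bus: records every write instead of touching hardware. */
struct port_log {
	uint16_t port[LOG_CAP];
	uint8_t  val[LOG_CAP];
	size_t   n;
};

/* Plain variant: exactly one port write. */
void sim_outb(struct port_log *log, uint8_t val, uint16_t port)
{
	assert(log != NULL && log->n < LOG_CAP);
	log->port[log->n] = port;
	log->val[log->n] = val;
	log->n++;
}

/* "_p" variant: the real write, then a dummy write to port 0x80. */
void sim_outb_p(struct port_log *log, uint8_t val, uint16_t port)
{
	sim_outb(log, val, port);
	sim_outb(log, val, DELAY_PORT);
}
```

In this model every notification through the `_p` variant costs two port writes instead of one, which is the overhead the reply refers to.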
On Thu, Feb 25, 2021 at 5:00 AM 谢华伟(此时此刻) <huawei.xhw@alibaba-inc.com> wrote:
> >>> Is the 'outb_p' to 'outb' conversion intentional? And if so why?
> >>>
> >>> Same of the all 'outb_p', 'outw_p', 'outl_p'.
> >>
> >> There is no need to delay for virtio device, as we can see in virtio
> >> legacy driver.
> >>
> >> IMO, the delay is for ugly old device. The device itself should
> >> assure the previous IO completes when the subsequent IO instruction
> >> arrives.
> >>
> >
> > Can there be any virtio legacy device needing this?
>
> The pause version delays sometime by writing to 0x80 debug port. virtio
> doesn't need this. virtio legacy PMD driver doesn't use this.
>
> Any device relying on this i think is buggy. How could the device rely
> on some uncertain cpu cycles to behave correct?
>
> >
> > What is the downside of using "pause until the I/O completes" versions?
>
> The downside in virtio PMD is a small performance penalty when we use it
> to notify backend. CPU executes unnecessary serializing IO instruction.
>
> I check kernel code, io wrapper for in/out doesn't use p version.
This change is a fix/optimisation.
This is a separate topic from adding MMIO support with x86 ioport.
I would split as a separate patch.
On 2021/2/25 17:52, David Marchand wrote:
> On Thu, Feb 25, 2021 at 5:00 AM 谢华伟(此时此刻) <huawei.xhw@alibaba-inc.com> wrote:
>>>>> Is the 'outb_p' to 'outb' conversion intentional? And if so why?
>>>>>
>>>>> Same of the all 'outb_p', 'outw_p', 'outl_p'.
>>>> There is no need to delay for virtio device, as we can see in virtio
>>>> legacy driver.
>>>>
>>>> IMO, the delay is for ugly old device. The device itself should
>>>> assure the previous IO completes when the subsequent IO instruction
>>>> arrives.
>>>>
>>> Can there be any virtio legacy device needing this?
>> The pause version delays sometime by writing to 0x80 debug port. virtio
>> doesn't need this. virtio legacy PMD driver doesn't use this.
>>
>> Any device relying on this i think is buggy. How could the device rely
>> on some uncertain cpu cycles to behave correct?
>>
>>> What is the downside of using "pause until the I/O completes" versions?
>> The downside in virtio PMD is a small performance penalty when we use it
>> to notify backend. CPU executes unnecessary serializing IO instruction.
>>
>> I check kernel code, io wrapper for in/out doesn't use p version.
> This change is a fix/optimisation.
> This is a separate topic from adding MMIO support with x86 ioport.
> I would split as a separate patch.
Hi David:
Maybe there is some confusion? There is no change; the out/in accessors
are newly added. I didn't remove _p on purpose.
>
>
On Mon, Mar 1, 2021 at 4:44 PM 谢华伟(此时此刻) <huawei.xhw@alibaba-inc.com> wrote:
> >>> What is the downside of using "pause until the I/O completes" versions?
> >> The downside in virtio PMD is a small performance penalty when we use it
> >> to notify backend. CPU executes unnecessary serializing IO instruction.
> >>
> >> I check kernel code, io wrapper for in/out doesn't use p version.
> > This change is a fix/optimisation.
> > This is a separate topic from adding MMIO support with x86 ioport.
> > I would split as a separate patch.
>
> Hi David:
>
> Maybe there is confuse? There is no change. The out/in is added. I don't
> remove _p on purpose.
Looking at v8 and repeating previous mails:
+#if defined(RTE_ARCH_X86)
...
+static inline void iowrite8(uint8_t val, void *addr)
+{
+ (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint8_t *)addr = val :
+ outb(val, (unsigned long)addr); <======
+}
[...]
-#if defined(RTE_ARCH_X86)
- outb_p(*s, reg); <======
-#else
- *(volatile uint8_t *)reg = *s;
-#endif
+ iowrite8(*s, (void *)reg);
This almost went unnoticed (thanks Ferruh for spotting).
Do we _need_ this change on outX_p -> outX?
I am not comfortable with touching such low-level internal routines that
have been in dpdk since v1.5.0.
If there is a good reason, it has nothing to do with adding MMIO
support and must be split in a separate patch.
If there is no reason, please restore outX_p, since the safest is not to touch.
On 2021/3/2 21:14, David Marchand wrote:
>>> This change is a fix/optimisation.
>>> This is a separate topic from adding MMIO support with x86 ioport.
>>> I would split as a separate patch.
>> Hi David:
>>
>> Maybe there is confuse? There is no change. The out/in is added. I don't
>> remove _p on purpose.
> Looking at v8 and repeating previous mails:
>
> +#if defined(RTE_ARCH_X86)
> ...
> +static inline void iowrite8(uint8_t val, void *addr)
> +{
> + (uint64_t)(uintptr_t)addr >= PIO_MAX ?
> + *(volatile uint8_t *)addr = val :
> + outb(val, (unsigned long)addr); <======
> +}
>
> [...]
>
>
> -#if defined(RTE_ARCH_X86)
> - outb_p(*s, reg); <======
> -#else
> - *(volatile uint8_t *)reg = *s;
> -#endif
> + iowrite8(*s, (void *)reg);
>
>
> This almost went unnoticed (thanks Ferruh for spotting).
>
> Do we _need_ this change on outX_p -> outX?
I understand where the confusion comes from. The previous
implementation used the _p version; however, I think _p is not needed.
In the initial DPDK virtio PMD there was no _p, if my memory is correct.
I don't know who added it, or for what reason.
Anyway, I will send v9 with _p.
> I am not comfortable at touching such low level internal routines that
> have been in dpdk since v1.5.0.
>
> If there is a good reason, it has nothing to do with adding MMIO
> support and must be split in a separate patch.
> If there is no reason, please restore outX_p, since the safest is not to touch.
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -715,8 +715,6 @@ int rte_pci_write_config(const struct rte_pci_device *device,
break;
#endif
case RTE_PCI_KDRV_IGB_UIO:
- pci_uio_ioport_read(p, data, len, offset);
- break;
case RTE_PCI_KDRV_UIO_GENERIC:
pci_uio_ioport_read(p, data, len, offset);
break;
@@ -736,8 +734,6 @@ int rte_pci_write_config(const struct rte_pci_device *device,
break;
#endif
case RTE_PCI_KDRV_IGB_UIO:
- pci_uio_ioport_write(p, data, len, offset);
- break;
case RTE_PCI_KDRV_UIO_GENERIC:
pci_uio_ioport_write(p, data, len, offset);
break;
diff --git a/drivers/bus/pci/linux/pci_uio.c b/drivers/bus/pci/linux/pci_uio.c
--- a/drivers/bus/pci/linux/pci_uio.c
+++ b/drivers/bus/pci/linux/pci_uio.c
@@ -368,6 +368,8 @@
return -1;
}
+#define PIO_MAX 0x10000
+
#if defined(RTE_ARCH_X86)
int
pci_uio_ioport_map(struct rte_pci_device *dev, int bar,
@@ -381,12 +383,6 @@
unsigned long base;
int i;
- if (rte_eal_iopl_init() != 0) {
- RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for PCI device %s\n",
- __func__, dev->name);
- return -1;
- }
-
/* open and read addresses of the corresponding resource in sysfs */
snprintf(filename, sizeof(filename), "%s/" PCI_PRI_FMT "/resource",
rte_pci_get_sysfs_path(), dev->addr.domain, dev->addr.bus,
@@ -408,15 +404,27 @@
&end_addr, &flags) < 0)
goto error;
- if (!(flags & IORESOURCE_IO)) {
- RTE_LOG(ERR, EAL, "%s(): bar resource other than IO is not supported\n", __func__);
- goto error;
- }
- base = (unsigned long)phys_addr;
- RTE_LOG(INFO, EAL, "%s(): PIO BAR %08lx detected\n", __func__, base);
+ if (flags & IORESOURCE_IO) {
+ if (rte_eal_iopl_init()) {
+ RTE_LOG(ERR, EAL, "%s(): insufficient ioport permissions for PCI device %s\n",
+ __func__, dev->name);
+ goto error;
+ }
- if (base > UINT16_MAX)
+ base = (unsigned long)phys_addr;
+ if (base > PIO_MAX) {
+ RTE_LOG(ERR, EAL, "%s(): %08lx too large PIO resource\n", __func__, base);
+ goto error;
+ }
+
+ RTE_LOG(DEBUG, EAL, "%s(): PIO BAR %08lx detected\n", __func__, base);
+ } else if (flags & IORESOURCE_MEM) {
+ base = (unsigned long)dev->mem_resource[bar].addr;
+ RTE_LOG(DEBUG, EAL, "%s(): MMIO BAR %08lx detected\n", __func__, base);
+ } else {
+ RTE_LOG(ERR, EAL, "%s(): unknown BAR type\n", __func__);
goto error;
+ }
/* FIXME only for primary process ? */
if (dev->intr_handle.type == RTE_INTR_HANDLE_UNKNOWN) {
@@ -517,6 +525,92 @@
}
#endif
+#if defined(RTE_ARCH_X86)
+static inline uint8_t ioread8(void *addr)
+{
+ uint8_t val;
+
+ val = (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint8_t *)addr :
+ inb((unsigned long)addr);
+
+ return val;
+}
+
+static inline uint16_t ioread16(void *addr)
+{
+ uint16_t val;
+
+ val = (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint16_t *)addr :
+ inw((unsigned long)addr);
+
+ return val;
+}
+
+static inline uint32_t ioread32(void *addr)
+{
+ uint32_t val;
+
+ val = (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint32_t *)addr :
+ inl((unsigned long)addr);
+
+ return val;
+}
+
+static inline void iowrite8(uint8_t val, void *addr)
+{
+ (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint8_t *)addr = val :
+ outb(val, (unsigned long)addr);
+}
+
+static inline void iowrite16(uint16_t val, void *addr)
+{
+ (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint16_t *)addr = val :
+ outw(val, (unsigned long)addr);
+}
+
+static inline void iowrite32(uint32_t val, void *addr)
+{
+ (uint64_t)(uintptr_t)addr >= PIO_MAX ?
+ *(volatile uint32_t *)addr = val :
+ outl(val, (unsigned long)addr);
+}
+#else
+static inline uint8_t ioread8(void *addr)
+{
+ return *(volatile uint8_t *)addr;
+}
+
+static inline uint16_t ioread16(void *addr)
+{
+ return *(volatile uint16_t *)addr;
+}
+
+static inline uint32_t ioread32(void *addr)
+{
+ return *(volatile uint32_t *)addr;
+}
+
+static inline void iowrite8(uint8_t val, void *addr)
+{
+ *(volatile uint8_t *)addr = val;
+}
+
+static inline void iowrite16(uint16_t val, void *addr)
+{
+ *(volatile uint16_t *)addr = val;
+}
+
+static inline void iowrite32(uint32_t val, void *addr)
+{
+ *(volatile uint32_t *)addr = val;
+}
+#endif
+
void
pci_uio_ioport_read(struct rte_pci_ioport *p,
void *data, size_t len, off_t offset)
@@ -528,25 +622,13 @@
for (d = data; len > 0; d += size, reg += size, len -= size) {
if (len >= 4) {
size = 4;
-#if defined(RTE_ARCH_X86)
- *(uint32_t *)d = inl(reg);
-#else
- *(uint32_t *)d = *(volatile uint32_t *)reg;
-#endif
+ *(uint32_t *)d = ioread32((void *)reg);
} else if (len >= 2) {
size = 2;
-#if defined(RTE_ARCH_X86)
- *(uint16_t *)d = inw(reg);
-#else
- *(uint16_t *)d = *(volatile uint16_t *)reg;
-#endif
+ *(uint16_t *)d = ioread16((void *)reg);
} else {
size = 1;
-#if defined(RTE_ARCH_X86)
- *d = inb(reg);
-#else
- *d = *(volatile uint8_t *)reg;
-#endif
+ *d = ioread8((void *)reg);
}
}
}
@@ -562,25 +644,13 @@
for (s = data; len > 0; s += size, reg += size, len -= size) {
if (len >= 4) {
size = 4;
-#if defined(RTE_ARCH_X86)
- outl_p(*(const uint32_t *)s, reg);
-#else
- *(volatile uint32_t *)reg = *(const uint32_t *)s;
-#endif
+ iowrite32(*(const uint32_t *)s, (void *)reg);
} else if (len >= 2) {
size = 2;
-#if defined(RTE_ARCH_X86)
- outw_p(*(const uint16_t *)s, reg);
-#else
- *(volatile uint16_t *)reg = *(const uint16_t *)s;
-#endif
+ iowrite16(*(const uint16_t *)s, (void *)reg);
} else {
size = 1;
-#if defined(RTE_ARCH_X86)
- outb_p(*s, reg);
-#else
- *(volatile uint8_t *)reg = *s;
-#endif
+ iowrite8(*s, (void *)reg);
}
}
}
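The read and write loops in the diff split a transfer into the widest accesses that still fit: 4 bytes while at least 4 remain, then 2, then 1. A memory-backed sketch of the read side is shown below. `chunked_read` is a hypothetical name, and the width casts mirror the driver code in assuming the register window is suitably aligned for each access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Copy len bytes from a register window into data, using the widest
 * access (4, then 2, then 1 byte) that fits in the remaining length.
 * Returns the number of individual accesses issued.
 */
size_t chunked_read(const volatile uint8_t *reg, void *data, size_t len)
{
	uint8_t *d = data;
	size_t accesses = 0;
	size_t size = 0;

	assert(reg != NULL && data != NULL);
	for (; len > 0; d += size, reg += size, len -= size, accesses++) {
		if (len >= 4) {
			uint32_t v = *(const volatile uint32_t *)reg;
			size = sizeof(v);
			memcpy(d, &v, size);
		} else if (len >= 2) {
			uint16_t v = *(const volatile uint16_t *)reg;
			size = sizeof(v);
			memcpy(d, &v, size);
		} else {
			*d = *reg;
			size = 1;
		}
	}
	return accesses;
}
```

For example, a 7-byte read is issued as one 4-byte, one 2-byte and one 1-byte access. The destination is filled with memcpy here so the sketch makes no alignment assumption about the caller's buffer, which differs slightly from the direct stores in the patch.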