[v2,1/4] igb_uio: add wc option

Message ID 1530191753-18689-2-git-send-email-rk@semihalf.com
State Superseded, archived
Delegated to: Thomas Monjalon
Series
  • support for write combining

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Rafal Kozik June 28, 2018, 1:15 p.m. UTC
Write combining (WC) increases NIC performance by making better
use of the PCI bus, but it cannot be used by all PMDs.

To get internal_addr, the memory needs to be mapped. But as memory could not be
mapped twice: with and without WC, this mapping should be skipped for WC. [1]

To avoid breaking other drivers that could potentially use internal_addr,
the wc_activate parameter makes it possible to skip the mapping for those PMDs
that do not use it.

[1] https://www.kernel.org/doc/ols/2008/ols2008v2-pages-135-144.pdf
	section 5.3 and 5.4

Signed-off-by: Rafal Kozik <rk@semihalf.com>
Acked-by: Bruce Richardson <bruce.richardson@intel.com>
---
 kernel/linux/igb_uio/igb_uio.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)
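
The branch this patch adds to igbuio_pci_setup_iomem() can be modelled in
userspace as below. This is only a sketch under stated assumptions:
fake_ioremap() is a hypothetical stand-in for the kernel's ioremap(), and
setup_iomem_model() is an illustrative name, not a function in igb_uio.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ioremap(): returns a non-NULL cookie and
 * counts calls so the caller can see when the mapping was skipped. */
static int ioremap_calls;
static char fake_bar[64];

static void *fake_ioremap(unsigned long addr, unsigned long len)
{
	(void)addr;
	(void)len;
	ioremap_calls++;
	return fake_bar;
}

/* Model of the patched branch: returns 0 on success, -1 on failure,
 * and stores the kernel-side mapping (or NULL) in *internal_addr. */
static int setup_iomem_model(int wc_activate, unsigned long addr,
			     unsigned long len, void **internal_addr)
{
	assert(internal_addr != NULL);
	if (addr == 0 || len == 0)
		return -1;
	if (wc_activate == 0) {
		*internal_addr = fake_ioremap(addr, len);
		if (*internal_addr == NULL)
			return -1;
	} else {
		/* WC requested: leave the BAR unmapped on the kernel side. */
		*internal_addr = NULL;
	}
	return 0;
}
```

With wc_activate == 0 the model behaves like the unpatched code; with any
other value no kernel-side mapping is created and internal_addr stays NULL.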

Comments

Ferruh Yigit June 28, 2018, 2:32 p.m. UTC | #1
Hi Rafal,

Thank you for the additional information, but I have a few more questions:

- What do you mean by "But as memory could not be mapped twice: with and without WC"?

ioremap() maps the physical address for kernel usage, and via uio we are mapping
it to userspace. Do you mean these two?

- "internal_addr" should be for kernel usage, not for DPDK drivers, which are in
userspace. Why is it a concern for us?

- What happens if you don't update this code at all? Won't you be able to map
the device address into userspace?
I tested adding RTE_PCI_DRV_WC_ACTIVATE to i40e, on top of your patch, and was able
to map without the igb_uio update.
I do not understand the need for the modification.
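
For reference on the userspace side of the questions above, here is a minimal
sketch, assuming the BAR is mapped through the sysfs resource files, where the
kernel exposes a resourceN_wc file for prefetchable BARs. The helper names
bar_path() and open_bar() are hypothetical and are not taken from DPDK.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Build the sysfs path of a PCI BAR resource file. With wc != 0 the
 * "_wc" variant is used. Returns the snprintf() result. */
static int bar_path(char *buf, size_t len, const char *bdf, int bar, int wc)
{
	return snprintf(buf, len, "/sys/bus/pci/devices/%s/resource%d%s",
			bdf, bar, wc ? "_wc" : "");
}

/* Open a BAR for a later mmap(), preferring the WC file and falling
 * back to the plain one. *got_wc reports whether the WC file was
 * opened. Returns a file descriptor, or -1 if neither file opens. */
static int open_bar(const char *bdf, int bar, int *got_wc)
{
	char path[128];
	int fd;

	assert(got_wc != NULL);
	bar_path(path, sizeof(path), bdf, bar, 1);
	fd = open(path, O_RDWR);
	*got_wc = (fd >= 0);
	if (fd < 0) {
		bar_path(path, sizeof(path), bdf, bar, 0);
		fd = open(path, O_RDWR);
	}
	return fd;
}
```

A caller would pass the returned descriptor to mmap() and could log got_wc to
see which of the two files was actually used.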

Rafal Kozik June 29, 2018, 8:35 a.m. UTC | #2
2018-06-28 16:32 GMT+02:00 Ferruh Yigit <ferruh.yigit@intel.com>:
> Hi Rafal,
>
> Thank you for the additional information, but I have a few more questions:
>
> - What do you mean by "But as memory could not be mapped twice: with and without WC"?
>
> ioremap() maps the physical address for kernel usage, and via uio we are mapping
> it to userspace. Do you mean these two?
>
> - "internal_addr" should be for kernel usage, not for DPDK drivers, which are in
> userspace. Why is it a concern for us?
>
> - What happens if you don't update this code at all? Won't you be able to map
> the device address into userspace?
> I tested adding RTE_PCI_DRV_WC_ACTIVATE to i40e, on top of your patch, and was able
> to map without the igb_uio update.
> I do not understand the need for the modification.
>

Hello Ferruh,

I was not precise. Memory can be mapped multiple times,
but it cannot be mapped with and without WC support simultaneously.
When wc_activate is not set, the memory mapping works, but it silently
falls back to non-prefetchable mode.

I measured the write speed.
With the wc_activate parameter set I got 4.81 GB/s.
Without this parameter the result was 0.07 GB/s.
The code used for testing is located here:
gist.github.com/semihalf-kozik-rafal/327208cd52a2fac2d12250028becf9b3

Best regards,
Rafal
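
A write-rate figure of the kind quoted above can be obtained with a timing
loop like the following. This is a generic sketch and not the code from the
linked gist: write_rate_gbps() is a hypothetical helper, and it measures
whatever buffer it is given, so a real test would pass a pointer to the
mapped BAR instead of ordinary memory.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Write `len` bytes (rounded down to whole 8-byte words) to dst,
 * `iters` times, and return the rate in GB/s (1e9 bytes per second).
 * Returns 0.0 if the elapsed time could not be measured. */
static double write_rate_gbps(volatile uint64_t *dst, size_t len, int iters)
{
	struct timespec t0, t1;
	size_t words = len / sizeof(uint64_t);
	double secs;

	assert(dst != NULL && words > 0 && iters > 0);
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (int it = 0; it < iters; it++)
		for (size_t i = 0; i < words; i++)
			dst[i] = (uint64_t)i;	/* volatile: store is not elided */
	clock_gettime(CLOCK_MONOTONIC, &t1);

	secs = (double)(t1.tv_sec - t0.tv_sec) +
	       (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	if (secs <= 0.0)
		return 0.0;
	return (double)words * sizeof(uint64_t) * iters / secs / 1e9;
}
```

Running the same loop against the BAR mapped with and without WC gives two
numbers that can be compared directly.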

Ferruh Yigit June 29, 2018, 10:08 a.m. UTC | #3
On 6/29/2018 9:35 AM, Rafał Kozik wrote:
> Hello Ferruh,
> 
> I was not precise. Memory can be mapped multiple times,
> but it cannot be mapped with and without WC support simultaneously.
> When wc_activate is not set, the memory mapping works, but it silently
> falls back to non-prefetchable mode.

How can I confirm this silent fallback behavior? Is there any log I can turn
on in the kernel, or anything from proc/sysfs?
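
One possible way to check this, assuming an x86 kernel with PAT enabled and
debugfs mounted, is to look at /sys/kernel/debug/x86/pat_memtype_list, which
lists the memory type recorded for each mapped range. The sketch below only
matches substrings, because the exact line layout differs between kernel
versions. The sample line in the comment is an assumed format, and both
helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* True if `line` mentions `needle` (for example the BAR address in
 * hex) and is typed write-combining. Assumed sample line:
 *   "write-combining @ 0xf0000000-0xf0800000" */
static int line_is_wc(const char *line, const char *needle)
{
	assert(line != NULL && needle != NULL);
	return strstr(line, needle) != NULL &&
	       strstr(line, "write-combining") != NULL;
}

/* Count matching lines in the file at `path`. Returns -1 if the file
 * cannot be opened (no debugfs, no PAT, or insufficient privileges). */
static int count_wc_entries(const char *path, const char *needle)
{
	char line[256];
	int n = 0;
	FILE *f = fopen(path, "r");

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL)
		if (line_is_wc(line, needle))
			n++;
	fclose(f);
	return n;
}
```

A count of zero for the BAR address while the application is running would
suggest that the range was not mapped write-combining.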

Patch

diff --git a/kernel/linux/igb_uio/igb_uio.c b/kernel/linux/igb_uio/igb_uio.c
index b3233f1..3382fb1 100644
--- a/kernel/linux/igb_uio/igb_uio.c
+++ b/kernel/linux/igb_uio/igb_uio.c
@@ -30,6 +30,7 @@  struct rte_uio_pci_dev {
 	int refcnt;
 };
 
+static int wc_activate;
 static char *intr_mode;
 static enum rte_intr_mode igbuio_intr_mode_preferred = RTE_INTR_MODE_MSIX;
 /* sriov sysfs */
@@ -375,9 +376,13 @@  igbuio_pci_setup_iomem(struct pci_dev *dev, struct uio_info *info,
 	len = pci_resource_len(dev, pci_bar);
 	if (addr == 0 || len == 0)
 		return -1;
-	internal_addr = ioremap(addr, len);
-	if (internal_addr == NULL)
-		return -1;
+	if (wc_activate == 0) {
+		internal_addr = ioremap(addr, len);
+		if (internal_addr == NULL)
+			return -1;
+	} else {
+		internal_addr = NULL;
+	}
 	info->mem[n].name = name;
 	info->mem[n].addr = addr;
 	info->mem[n].internal_addr = internal_addr;
@@ -650,6 +655,12 @@  MODULE_PARM_DESC(intr_mode,
 "    " RTE_INTR_MODE_LEGACY_NAME "     Use Legacy interrupt\n"
 "\n");
 
+module_param(wc_activate, int, 0);
+MODULE_PARM_DESC(wc_activate,
+"Activate support for write combining (WC) (default=0)\n"
+"    0 - disable\n"
+"    other - enable\n");
+
 MODULE_DESCRIPTION("UIO driver for Intel IGB PCI cards");
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Intel Corporation");