[v4,02/11] dma/ioat: create dmadev instances on PCI probe

Message ID 20210917154227.737554-3-conor.walsh@intel.com (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers
Series dma: add dmadev driver for ioat devices |

Checks

Context Check Description
ci/checkpatch success coding style OK

Commit Message

Conor Walsh Sept. 17, 2021, 3:42 p.m. UTC
  When a suitable device is found during the PCI probe, create a dmadev
instance for each channel. Internal structures and HW definitions required
for device creation are also included.

Signed-off-by: Conor Walsh <conor.walsh@intel.com>
Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
---
 drivers/dma/ioat/ioat_dmadev.c   | 119 ++++++++++++++++++++++++++++++-
 drivers/dma/ioat/ioat_hw_defs.h  |  45 ++++++++++++
 drivers/dma/ioat/ioat_internal.h |  24 +++++++
 3 files changed, 186 insertions(+), 2 deletions(-)
  

Comments

Bruce Richardson Sept. 20, 2021, 1:31 p.m. UTC | #1
On Fri, Sep 17, 2021 at 03:42:18PM +0000, Conor Walsh wrote:
> When a suitable device is found during the PCI probe, create a dmadev
> instance for each channel. Internal structures and HW definitions required
> for device creation are also included.
> 
> Signed-off-by: Conor Walsh <conor.walsh@intel.com>
> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
> ---
>  drivers/dma/ioat/ioat_dmadev.c   | 119 ++++++++++++++++++++++++++++++-
>  drivers/dma/ioat/ioat_hw_defs.h  |  45 ++++++++++++
>  drivers/dma/ioat/ioat_internal.h |  24 +++++++
>  3 files changed, 186 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/dma/ioat/ioat_dmadev.c b/drivers/dma/ioat/ioat_dmadev.c
> index f3491d45b1..b815d30bcf 100644
> --- a/drivers/dma/ioat/ioat_dmadev.c
> +++ b/drivers/dma/ioat/ioat_dmadev.c
> @@ -4,6 +4,7 @@
>  

<snip>

> +/* Destroy a DMA device. */
> +static int
> +ioat_dmadev_destroy(const char *name)
> +{
> +	struct rte_dma_dev *dev;
> +	struct ioat_dmadev *ioat;
> +	int ret;
> +
> +	if (!name) {
> +		IOAT_PMD_ERR("Invalid device name");
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_dma_devices[rte_dma_get_dev_id(name)];
> +	if (!dev) {
> +		IOAT_PMD_ERR("Invalid device name (%s)", name);
> +		return -EINVAL;
> +	}
> +

I think you need to independently check the return value from
rte_dma_get_dev_id, rather than assuming when it returns an error value the
resultant index location will hold a null pointer.

> +	ioat = dev->dev_private;
> +	if (!ioat) {
> +		IOAT_PMD_ERR("Error getting dev_private");
> +		return -EINVAL;
> +	}
> +
> +	dev->dev_private = NULL;
> +	rte_free(ioat->desc_ring);
> +
> +	ret = rte_dma_pmd_release(name);

The rte_dma_pmd_allocate function reserves memory for the private data, so
the release function should free that memory too. However, you have
assigned private_data to NULL just above, so that probably won't work.

> +	if (ret)
> +		IOAT_PMD_DEBUG("Device cleanup failed");
> +
> +	return 0;
> +}
> +

<snip>
  
Conor Walsh Sept. 21, 2021, 4:25 p.m. UTC | #2
> On Fri, Sep 17, 2021 at 03:42:18PM +0000, Conor Walsh wrote:
>> When a suitable device is found during the PCI probe, create a dmadev
>> instance for each channel. Internal structures and HW definitions required
>> for device creation are also included.
>>
>> Signed-off-by: Conor Walsh <conor.walsh@intel.com>
>> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
>> ---
>>   drivers/dma/ioat/ioat_dmadev.c   | 119 ++++++++++++++++++++++++++++++-
>>   drivers/dma/ioat/ioat_hw_defs.h  |  45 ++++++++++++
>>   drivers/dma/ioat/ioat_internal.h |  24 +++++++
>>   3 files changed, 186 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/dma/ioat/ioat_dmadev.c b/drivers/dma/ioat/ioat_dmadev.c
>> index f3491d45b1..b815d30bcf 100644
>> --- a/drivers/dma/ioat/ioat_dmadev.c
>> +++ b/drivers/dma/ioat/ioat_dmadev.c
>> @@ -4,6 +4,7 @@
>>   
> <snip>
>
>> +/* Destroy a DMA device. */
>> +static int
>> +ioat_dmadev_destroy(const char *name)
>> +{
>> +	struct rte_dma_dev *dev;
>> +	struct ioat_dmadev *ioat;
>> +	int ret;
>> +
>> +	if (!name) {
>> +		IOAT_PMD_ERR("Invalid device name");
>> +		return -EINVAL;
>> +	}
>> +
>> +	dev = &rte_dma_devices[rte_dma_get_dev_id(name)];
>> +	if (!dev) {
>> +		IOAT_PMD_ERR("Invalid device name (%s)", name);
>> +		return -EINVAL;
>> +	}
>> +
> I think you need to independently check the return value from
> rte_dma_get_dev_id, rather than assuming when it returns an error value the
> resultant index location will hold a null pointer.

I will assign rte_dma_get_dev_id(name) to a variable and check it before 
use in v5.

>
>> +	ioat = dev->dev_private;
>> +	if (!ioat) {
>> +		IOAT_PMD_ERR("Error getting dev_private");
>> +		return -EINVAL;
>> +	}
>> +
>> +	dev->dev_private = NULL;
>> +	rte_free(ioat->desc_ring);
>> +
>> +	ret = rte_dma_pmd_release(name);
> The rte_dma_pmd_allocate function reserves memory for the private data, so
> the release function should free that memory too. However, you have
> assigned private_data to NULL just above, so that probably won't work.

I think I will actually remove the "dev->dev_private = NULL" in v5 as 
this is handled by the lib.

Thanks,

Conor.

>> +	if (ret)
>> +		IOAT_PMD_DEBUG("Device cleanup failed");
>> +
>> +	return 0;
>> +}
>> +
> <snip>
  
Chengwen Feng Sept. 22, 2021, 8:04 a.m. UTC | #3
On 2021/9/17 23:42, Conor Walsh wrote:
> When a suitable device is found during the PCI probe, create a dmadev
> instance for each channel. Internal structures and HW definitions required
> for device creation are also included.
> 
> Signed-off-by: Conor Walsh <conor.walsh@intel.com>
> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
> ---
>  drivers/dma/ioat/ioat_dmadev.c   | 119 ++++++++++++++++++++++++++++++-
>  drivers/dma/ioat/ioat_hw_defs.h  |  45 ++++++++++++
>  drivers/dma/ioat/ioat_internal.h |  24 +++++++
>  3 files changed, 186 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/dma/ioat/ioat_dmadev.c b/drivers/dma/ioat/ioat_dmadev.c
> index f3491d45b1..b815d30bcf 100644
> --- a/drivers/dma/ioat/ioat_dmadev.c
> +++ b/drivers/dma/ioat/ioat_dmadev.c
> @@ -4,6 +4,7 @@
>  
>  #include <rte_bus_pci.h>
>  #include <rte_dmadev_pmd.h>
> +#include <rte_malloc.h>
>  
>  #include "ioat_internal.h"
>  
> @@ -14,6 +15,120 @@ RTE_LOG_REGISTER_DEFAULT(ioat_pmd_logtype, INFO);
>  #define IOAT_PMD_NAME dmadev_ioat
>  #define IOAT_PMD_NAME_STR RTE_STR(IOAT_PMD_NAME)
>  
> +/* Create a DMA device. */
> +static int
> +ioat_dmadev_create(const char *name, struct rte_pci_device *dev)
> +{
> +	static const struct rte_dma_dev_ops ioat_dmadev_ops = { };
> +
> +	struct rte_dma_dev *dmadev = NULL;
> +	struct ioat_dmadev *ioat = NULL;
> +	int retry = 0;
> +
> +	if (!name) {
> +		IOAT_PMD_ERR("Invalid name of the device!");
> +		return -EINVAL;
> +	}
> +
> +	/* Allocate device structure. */
> +	dmadev = rte_dma_pmd_allocate(name, dev->device.numa_node,
> +			sizeof(dmadev->dev_private));
> +	if (dmadev == NULL) {
> +		IOAT_PMD_ERR("Unable to allocate dma device");
> +		return -ENOMEM;
> +	}
> +
> +	dmadev->device = &dev->device;
> +
> +	dmadev->data->dev_private = rte_malloc_socket(NULL, sizeof(*ioat),
> +			0, dmadev->device->numa_node);
> +	dmadev->dev_private = dmadev->data->dev_private;

The dmalib will malloc dev_private, so please invoke like:
  dmadev = rte_dma_pmd_allocate(name, dev->device.numa_node, sizeof(*ioat));

> +
> +	dmadev->dev_ops = &ioat_dmadev_ops;
> +
> +	ioat = dmadev->data->dev_private;
> +	ioat->dmadev = dmadev;
> +	ioat->regs = dev->mem_resource[0].addr;
> +	ioat->doorbell = &ioat->regs->dmacount;
> +	ioat->qcfg.nb_desc = 0;
> +	ioat->desc_ring = NULL;
> +
> +	/* Do device initialization - reset and set error behaviour. */
> +	if (ioat->regs->chancnt != 1)
> +		IOAT_PMD_WARN("%s: Channel count == %d\n", __func__,
> +				ioat->regs->chancnt);
> +
> +	/* Locked by someone else. */
> +	if (ioat->regs->chanctrl & IOAT_CHANCTRL_CHANNEL_IN_USE) {
> +		IOAT_PMD_WARN("%s: Channel appears locked\n", __func__);
> +		ioat->regs->chanctrl = 0;
> +	}
> +
> +	/* clear any previous errors */
> +	if (ioat->regs->chanerr != 0) {
> +		uint32_t val = ioat->regs->chanerr;
> +		ioat->regs->chanerr = val;
> +	}
> +
> +	ioat->regs->chancmd = IOAT_CHANCMD_SUSPEND;
> +	rte_delay_ms(1);
> +	ioat->regs->chancmd = IOAT_CHANCMD_RESET;
> +	rte_delay_ms(1);
> +	while (ioat->regs->chancmd & IOAT_CHANCMD_RESET) {
> +		ioat->regs->chainaddr = 0;
> +		rte_delay_ms(1);
> +		if (++retry >= 200) {
> +			IOAT_PMD_ERR("%s: cannot reset device. CHANCMD=%#"PRIx8
> +					", CHANSTS=%#"PRIx64", CHANERR=%#"PRIx32"\n",
> +					__func__,
> +					ioat->regs->chancmd,
> +					ioat->regs->chansts,
> +					ioat->regs->chanerr);
please release the dmadev.

> +			return -EIO;
> +		}
> +	}
> +	ioat->regs->chanctrl = IOAT_CHANCTRL_ANY_ERR_ABORT_EN |
> +			IOAT_CHANCTRL_ERR_COMPLETION_EN;
> +

please change dmadev state to RTE_DMA_DEV_READY.

> +	return 0;
> +
> +}
> +
> +/* Destroy a DMA device. */
> +static int
> +ioat_dmadev_destroy(const char *name)
> +{
> +	struct rte_dma_dev *dev;
> +	struct ioat_dmadev *ioat;
> +	int ret;
> +
> +	if (!name) {
> +		IOAT_PMD_ERR("Invalid device name");
> +		return -EINVAL;
> +	}
> +
> +	dev = &rte_dma_devices[rte_dma_get_dev_id(name)];
> +	if (!dev) {
> +		IOAT_PMD_ERR("Invalid device name (%s)", name);
> +		return -EINVAL;
> +	}
> +
> +	ioat = dev->dev_private;
> +	if (!ioat) {
> +		IOAT_PMD_ERR("Error getting dev_private");
> +		return -EINVAL;
> +	}
> +
> +	dev->dev_private = NULL;
> +	rte_free(ioat->desc_ring);
> +
> +	ret = rte_dma_pmd_release(name);
> +	if (ret)
> +		IOAT_PMD_DEBUG("Device cleanup failed");

driver only need call rte_dma_pmd_relese, and the dev_private will freed
in dmalib.

> +
> +	return 0;
> +}
> +
>  /* Probe DMA device. */
>  static int
>  ioat_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
> @@ -24,7 +139,7 @@ ioat_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
>  	IOAT_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
>  
>  	dev->device.driver = &drv->driver;
> -	return 0;
> +	return ioat_dmadev_create(name, dev);
>  }
>  
>  /* Remove DMA device. */
> @@ -38,7 +153,7 @@ ioat_dmadev_remove(struct rte_pci_device *dev)
>  	IOAT_PMD_INFO("Closing %s on NUMA node %d",
>  			name, dev->device.numa_node);
>  
> -	return 0;
> +	return ioat_dmadev_destroy(name);
>  }
>  
>  static const struct rte_pci_id pci_id_ioat_map[] = {
> diff --git a/drivers/dma/ioat/ioat_hw_defs.h b/drivers/dma/ioat/ioat_hw_defs.h
> index eeabba41ef..73bdf548b3 100644
> --- a/drivers/dma/ioat/ioat_hw_defs.h
> +++ b/drivers/dma/ioat/ioat_hw_defs.h
> @@ -11,6 +11,8 @@ extern "C" {
>  
>  #include <stdint.h>
>  
> +#define IOAT_PCI_CHANERR_INT_OFFSET	0x180
> +
>  #define IOAT_VER_3_0	0x30
>  #define IOAT_VER_3_3	0x33
>  
> @@ -28,6 +30,49 @@ extern "C" {
>  #define IOAT_DEVICE_ID_BDXF	0x6f2F
>  #define IOAT_DEVICE_ID_ICX	0x0b00
>  
> +#define IOAT_COMP_UPDATE_SHIFT	3
> +#define IOAT_CMD_OP_SHIFT	24
> +
> +/* DMA Channel Registers */
> +#define IOAT_CHANCTRL_CHANNEL_PRIORITY_MASK		0xF000
> +#define IOAT_CHANCTRL_COMPL_DCA_EN			0x0200
> +#define IOAT_CHANCTRL_CHANNEL_IN_USE			0x0100
> +#define IOAT_CHANCTRL_DESCRIPTOR_ADDR_SNOOP_CONTROL	0x0020
> +#define IOAT_CHANCTRL_ERR_INT_EN			0x0010
> +#define IOAT_CHANCTRL_ANY_ERR_ABORT_EN			0x0008
> +#define IOAT_CHANCTRL_ERR_COMPLETION_EN			0x0004
> +#define IOAT_CHANCTRL_INT_REARM				0x0001
> +
> +struct ioat_registers {
> +	uint8_t		chancnt;
> +	uint8_t		xfercap;
> +	uint8_t		genctrl;
> +	uint8_t		intrctrl;
> +	uint32_t	attnstatus;
> +	uint8_t		cbver;		/* 0x08 */
> +	uint8_t		reserved4[0x3]; /* 0x09 */
> +	uint16_t	intrdelay;	/* 0x0C */
> +	uint16_t	cs_status;	/* 0x0E */
> +	uint32_t	dmacapability;	/* 0x10 */
> +	uint8_t		reserved5[0x6C]; /* 0x14 */
> +	uint16_t	chanctrl;	/* 0x80 */
> +	uint8_t		reserved6[0x2];	/* 0x82 */
> +	uint8_t		chancmd;	/* 0x84 */
> +	uint8_t		reserved3[1];	/* 0x85 */
> +	uint16_t	dmacount;	/* 0x86 */
> +	uint64_t	chansts;	/* 0x88 */
> +	uint64_t	chainaddr;	/* 0x90 */
> +	uint64_t	chancmp;	/* 0x98 */
> +	uint8_t		reserved2[0x8];	/* 0xA0 */
> +	uint32_t	chanerr;	/* 0xA8 */
> +	uint32_t	chanerrmask;	/* 0xAC */
> +} __rte_packed;
> +
> +#define IOAT_CHANCMD_RESET	0x20
> +#define IOAT_CHANCMD_SUSPEND	0x04
> +
> +#define IOAT_CHANCMP_ALIGN	8 /* CHANCMP address must be 64-bit aligned */
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/drivers/dma/ioat/ioat_internal.h b/drivers/dma/ioat/ioat_internal.h
> index f1ec12a919..a4e323f360 100644
> --- a/drivers/dma/ioat/ioat_internal.h
> +++ b/drivers/dma/ioat/ioat_internal.h
> @@ -7,6 +7,30 @@
>  
>  #include "ioat_hw_defs.h"
>  
> +struct ioat_dmadev {
> +	struct rte_dma_dev *dmadev;

maybe the rte_dma_dev_data more appropriate if support multi-process.

> +	struct rte_dma_vchan_conf qcfg;
> +	struct rte_dma_stats stats;
> +
> +	volatile uint16_t *doorbell __rte_cache_aligned;
> +	phys_addr_t status_addr;
> +	phys_addr_t ring_addr;
> +
> +	struct ioat_dma_hw_desc *desc_ring;
> +
> +	unsigned short next_read;
> +	unsigned short next_write;
> +	unsigned short last_write; /* Used to compute submitted count. */
> +	unsigned short offset; /* Used after a device recovery when counts -> 0. */
> +	unsigned int failure; /* Used to store chanerr for error handling. */
> +
> +	/* To report completions, the device will write status back here. */
> +	volatile uint64_t status __rte_cache_aligned;
> +
> +	/* Pointer to the register bar. */
> +	volatile struct ioat_registers *regs;
> +};
> +
>  extern int ioat_pmd_logtype;
>  
>  #define IOAT_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \
>
  
Conor Walsh Sept. 22, 2021, 4:40 p.m. UTC | #4
On 22/09/2021 09:04, fengchengwen wrote:
> On 2021/9/17 23:42, Conor Walsh wrote:
>> When a suitable device is found during the PCI probe, create a dmadev
>> instance for each channel. Internal structures and HW definitions required
>> for device creation are also included.
>>
>> Signed-off-by: Conor Walsh <conor.walsh@intel.com>
>> Reviewed-by: Kevin Laatz <kevin.laatz@intel.com>
>> ---
>>   drivers/dma/ioat/ioat_dmadev.c   | 119 ++++++++++++++++++++++++++++++-
>>   drivers/dma/ioat/ioat_hw_defs.h  |  45 ++++++++++++
>>   drivers/dma/ioat/ioat_internal.h |  24 +++++++
>>   3 files changed, 186 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/dma/ioat/ioat_dmadev.c b/drivers/dma/ioat/ioat_dmadev.c
>> index f3491d45b1..b815d30bcf 100644
>> --- a/drivers/dma/ioat/ioat_dmadev.c
>> +++ b/drivers/dma/ioat/ioat_dmadev.c
>> @@ -4,6 +4,7 @@
>>   
>>   #include <rte_bus_pci.h>
>>   #include <rte_dmadev_pmd.h>
>> +#include <rte_malloc.h>
>>   
>>   #include "ioat_internal.h"
>>   
>> @@ -14,6 +15,120 @@ RTE_LOG_REGISTER_DEFAULT(ioat_pmd_logtype, INFO);
>>   #define IOAT_PMD_NAME dmadev_ioat
>>   #define IOAT_PMD_NAME_STR RTE_STR(IOAT_PMD_NAME)
>>   
>> +/* Create a DMA device. */
>> +static int
>> +ioat_dmadev_create(const char *name, struct rte_pci_device *dev)
>> +{
>> +	static const struct rte_dma_dev_ops ioat_dmadev_ops = { };
>> +
>> +	struct rte_dma_dev *dmadev = NULL;
>> +	struct ioat_dmadev *ioat = NULL;
>> +	int retry = 0;
>> +
>> +	if (!name) {
>> +		IOAT_PMD_ERR("Invalid name of the device!");
>> +		return -EINVAL;
>> +	}
>> +
>> +	/* Allocate device structure. */
>> +	dmadev = rte_dma_pmd_allocate(name, dev->device.numa_node,
>> +			sizeof(dmadev->dev_private));
>> +	if (dmadev == NULL) {
>> +		IOAT_PMD_ERR("Unable to allocate dma device");
>> +		return -ENOMEM;
>> +	}
>> +
>> +	dmadev->device = &dev->device;
>> +
>> +	dmadev->data->dev_private = rte_malloc_socket(NULL, sizeof(*ioat),
>> +			0, dmadev->device->numa_node);
>> +	dmadev->dev_private = dmadev->data->dev_private;
> The dmalib will malloc dev_private, so please invoke like:
>    dmadev = rte_dma_pmd_allocate(name, dev->device.numa_node, sizeof(*ioat));

I will update this in v5.

>> +	dmadev->dev_ops = &ioat_dmadev_ops;
>> +
>> +	ioat = dmadev->data->dev_private;
>> +	ioat->dmadev = dmadev;
>> +	ioat->regs = dev->mem_resource[0].addr;
>> +	ioat->doorbell = &ioat->regs->dmacount;
>> +	ioat->qcfg.nb_desc = 0;
>> +	ioat->desc_ring = NULL;
>> +
>> +	/* Do device initialization - reset and set error behaviour. */
>> +	if (ioat->regs->chancnt != 1)
>> +		IOAT_PMD_WARN("%s: Channel count == %d\n", __func__,
>> +				ioat->regs->chancnt);
>> +
>> +	/* Locked by someone else. */
>> +	if (ioat->regs->chanctrl & IOAT_CHANCTRL_CHANNEL_IN_USE) {
>> +		IOAT_PMD_WARN("%s: Channel appears locked\n", __func__);
>> +		ioat->regs->chanctrl = 0;
>> +	}
>> +
>> +	/* clear any previous errors */
>> +	if (ioat->regs->chanerr != 0) {
>> +		uint32_t val = ioat->regs->chanerr;
>> +		ioat->regs->chanerr = val;
>> +	}
>> +
>> +	ioat->regs->chancmd = IOAT_CHANCMD_SUSPEND;
>> +	rte_delay_ms(1);
>> +	ioat->regs->chancmd = IOAT_CHANCMD_RESET;
>> +	rte_delay_ms(1);
>> +	while (ioat->regs->chancmd & IOAT_CHANCMD_RESET) {
>> +		ioat->regs->chainaddr = 0;
>> +		rte_delay_ms(1);
>> +		if (++retry >= 200) {
>> +			IOAT_PMD_ERR("%s: cannot reset device. CHANCMD=%#"PRIx8
>> +					", CHANSTS=%#"PRIx64", CHANERR=%#"PRIx32"\n",
>> +					__func__,
>> +					ioat->regs->chancmd,
>> +					ioat->regs->chansts,
>> +					ioat->regs->chanerr);
> please release the dmadev.

Will add in v5.

>> +			return -EIO;
>> +		}
>> +	}
>> +	ioat->regs->chanctrl = IOAT_CHANCTRL_ANY_ERR_ABORT_EN |
>> +			IOAT_CHANCTRL_ERR_COMPLETION_EN;
>> +
> please change dmadev state to RTE_DMA_DEV_READY.

I will add this in v5.

>> +	return 0;
>> +
>> +}
>> +
>> +/* Destroy a DMA device. */
>> +static int
>> +ioat_dmadev_destroy(const char *name)
>> +{
>> +	struct rte_dma_dev *dev;
>> +	struct ioat_dmadev *ioat;
>> +	int ret;
>> +
>> +	if (!name) {
>> +		IOAT_PMD_ERR("Invalid device name");
>> +		return -EINVAL;
>> +	}
>> +
>> +	dev = &rte_dma_devices[rte_dma_get_dev_id(name)];
>> +	if (!dev) {
>> +		IOAT_PMD_ERR("Invalid device name (%s)", name);
>> +		return -EINVAL;
>> +	}
>> +
>> +	ioat = dev->dev_private;
>> +	if (!ioat) {
>> +		IOAT_PMD_ERR("Error getting dev_private");
>> +		return -EINVAL;
>> +	}
>> +
>> +	dev->dev_private = NULL;
>> +	rte_free(ioat->desc_ring);
>> +
>> +	ret = rte_dma_pmd_release(name);
>> +	if (ret)
>> +		IOAT_PMD_DEBUG("Device cleanup failed");
> driver only need call rte_dma_pmd_relese, and the dev_private will freed
> in dmalib.

I have removed the "dev->dev_private = NULL;" line for v5, that was 
probably stopping the free in the lib.

>
>> +	return 0;
>> +}
>> +
>>   /* Probe DMA device. */
>>   static int
>>   ioat_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
>> @@ -24,7 +139,7 @@ ioat_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
>>   	IOAT_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
>>   
>>   	dev->device.driver = &drv->driver;
>> -	return 0;
>> +	return ioat_dmadev_create(name, dev);
>>   }
>>   
>>   /* Remove DMA device. */
>> @@ -38,7 +153,7 @@ ioat_dmadev_remove(struct rte_pci_device *dev)
>>   	IOAT_PMD_INFO("Closing %s on NUMA node %d",
>>   			name, dev->device.numa_node);
>>   
>> -	return 0;
>> +	return ioat_dmadev_destroy(name);
>>   }
>>   
>>   static const struct rte_pci_id pci_id_ioat_map[] = {
>> diff --git a/drivers/dma/ioat/ioat_hw_defs.h b/drivers/dma/ioat/ioat_hw_defs.h
>> index eeabba41ef..73bdf548b3 100644
>> --- a/drivers/dma/ioat/ioat_hw_defs.h
>> +++ b/drivers/dma/ioat/ioat_hw_defs.h
>> @@ -11,6 +11,8 @@ extern "C" {
>>   
>>   #include <stdint.h>
>>   
>> +#define IOAT_PCI_CHANERR_INT_OFFSET	0x180
>> +
>>   #define IOAT_VER_3_0	0x30
>>   #define IOAT_VER_3_3	0x33
>>   
>> @@ -28,6 +30,49 @@ extern "C" {
>>   #define IOAT_DEVICE_ID_BDXF	0x6f2F
>>   #define IOAT_DEVICE_ID_ICX	0x0b00
>>   
>> +#define IOAT_COMP_UPDATE_SHIFT	3
>> +#define IOAT_CMD_OP_SHIFT	24
>> +
>> +/* DMA Channel Registers */
>> +#define IOAT_CHANCTRL_CHANNEL_PRIORITY_MASK		0xF000
>> +#define IOAT_CHANCTRL_COMPL_DCA_EN			0x0200
>> +#define IOAT_CHANCTRL_CHANNEL_IN_USE			0x0100
>> +#define IOAT_CHANCTRL_DESCRIPTOR_ADDR_SNOOP_CONTROL	0x0020
>> +#define IOAT_CHANCTRL_ERR_INT_EN			0x0010
>> +#define IOAT_CHANCTRL_ANY_ERR_ABORT_EN			0x0008
>> +#define IOAT_CHANCTRL_ERR_COMPLETION_EN			0x0004
>> +#define IOAT_CHANCTRL_INT_REARM				0x0001
>> +
>> +struct ioat_registers {
>> +	uint8_t		chancnt;
>> +	uint8_t		xfercap;
>> +	uint8_t		genctrl;
>> +	uint8_t		intrctrl;
>> +	uint32_t	attnstatus;
>> +	uint8_t		cbver;		/* 0x08 */
>> +	uint8_t		reserved4[0x3]; /* 0x09 */
>> +	uint16_t	intrdelay;	/* 0x0C */
>> +	uint16_t	cs_status;	/* 0x0E */
>> +	uint32_t	dmacapability;	/* 0x10 */
>> +	uint8_t		reserved5[0x6C]; /* 0x14 */
>> +	uint16_t	chanctrl;	/* 0x80 */
>> +	uint8_t		reserved6[0x2];	/* 0x82 */
>> +	uint8_t		chancmd;	/* 0x84 */
>> +	uint8_t		reserved3[1];	/* 0x85 */
>> +	uint16_t	dmacount;	/* 0x86 */
>> +	uint64_t	chansts;	/* 0x88 */
>> +	uint64_t	chainaddr;	/* 0x90 */
>> +	uint64_t	chancmp;	/* 0x98 */
>> +	uint8_t		reserved2[0x8];	/* 0xA0 */
>> +	uint32_t	chanerr;	/* 0xA8 */
>> +	uint32_t	chanerrmask;	/* 0xAC */
>> +} __rte_packed;
>> +
>> +#define IOAT_CHANCMD_RESET	0x20
>> +#define IOAT_CHANCMD_SUSPEND	0x04
>> +
>> +#define IOAT_CHANCMP_ALIGN	8 /* CHANCMP address must be 64-bit aligned */
>> +
>>   #ifdef __cplusplus
>>   }
>>   #endif
>> diff --git a/drivers/dma/ioat/ioat_internal.h b/drivers/dma/ioat/ioat_internal.h
>> index f1ec12a919..a4e323f360 100644
>> --- a/drivers/dma/ioat/ioat_internal.h
>> +++ b/drivers/dma/ioat/ioat_internal.h
>> @@ -7,6 +7,30 @@
>>   
>>   #include "ioat_hw_defs.h"
>>   
>> +struct ioat_dmadev {
>> +	struct rte_dma_dev *dmadev;
> maybe the rte_dma_dev_data more appropriate if support multi-process.

Changed for v5.

>> +	struct rte_dma_vchan_conf qcfg;
>> +	struct rte_dma_stats stats;
>> +
>> +	volatile uint16_t *doorbell __rte_cache_aligned;
>> +	phys_addr_t status_addr;
>> +	phys_addr_t ring_addr;
>> +
>> +	struct ioat_dma_hw_desc *desc_ring;
>> +
>> +	unsigned short next_read;
>> +	unsigned short next_write;
>> +	unsigned short last_write; /* Used to compute submitted count. */
>> +	unsigned short offset; /* Used after a device recovery when counts -> 0. */
>> +	unsigned int failure; /* Used to store chanerr for error handling. */
>> +
>> +	/* To report completions, the device will write status back here. */
>> +	volatile uint64_t status __rte_cache_aligned;
>> +
>> +	/* Pointer to the register bar. */
>> +	volatile struct ioat_registers *regs;
>> +};
>> +
>>   extern int ioat_pmd_logtype;
>>   
>>   #define IOAT_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \
>>
  

Patch

diff --git a/drivers/dma/ioat/ioat_dmadev.c b/drivers/dma/ioat/ioat_dmadev.c
index f3491d45b1..b815d30bcf 100644
--- a/drivers/dma/ioat/ioat_dmadev.c
+++ b/drivers/dma/ioat/ioat_dmadev.c
@@ -4,6 +4,7 @@ 
 
 #include <rte_bus_pci.h>
 #include <rte_dmadev_pmd.h>
+#include <rte_malloc.h>
 
 #include "ioat_internal.h"
 
@@ -14,6 +15,120 @@  RTE_LOG_REGISTER_DEFAULT(ioat_pmd_logtype, INFO);
 #define IOAT_PMD_NAME dmadev_ioat
 #define IOAT_PMD_NAME_STR RTE_STR(IOAT_PMD_NAME)
 
+/* Create a DMA device. */
+static int
+ioat_dmadev_create(const char *name, struct rte_pci_device *dev)
+{
+	static const struct rte_dma_dev_ops ioat_dmadev_ops = { };
+
+	struct rte_dma_dev *dmadev = NULL;
+	struct ioat_dmadev *ioat = NULL;
+	int retry = 0;
+
+	if (!name) {
+		IOAT_PMD_ERR("Invalid name of the device!");
+		return -EINVAL;
+	}
+
+	/* Allocate device structure. */
+	dmadev = rte_dma_pmd_allocate(name, dev->device.numa_node,
+			sizeof(dmadev->dev_private));
+	if (dmadev == NULL) {
+		IOAT_PMD_ERR("Unable to allocate dma device");
+		return -ENOMEM;
+	}
+
+	dmadev->device = &dev->device;
+
+	dmadev->data->dev_private = rte_malloc_socket(NULL, sizeof(*ioat),
+			0, dmadev->device->numa_node);
+	dmadev->dev_private = dmadev->data->dev_private;
+
+	dmadev->dev_ops = &ioat_dmadev_ops;
+
+	ioat = dmadev->data->dev_private;
+	ioat->dmadev = dmadev;
+	ioat->regs = dev->mem_resource[0].addr;
+	ioat->doorbell = &ioat->regs->dmacount;
+	ioat->qcfg.nb_desc = 0;
+	ioat->desc_ring = NULL;
+
+	/* Do device initialization - reset and set error behaviour. */
+	if (ioat->regs->chancnt != 1)
+		IOAT_PMD_WARN("%s: Channel count == %d\n", __func__,
+				ioat->regs->chancnt);
+
+	/* Locked by someone else. */
+	if (ioat->regs->chanctrl & IOAT_CHANCTRL_CHANNEL_IN_USE) {
+		IOAT_PMD_WARN("%s: Channel appears locked\n", __func__);
+		ioat->regs->chanctrl = 0;
+	}
+
+	/* clear any previous errors */
+	if (ioat->regs->chanerr != 0) {
+		uint32_t val = ioat->regs->chanerr;
+		ioat->regs->chanerr = val;
+	}
+
+	ioat->regs->chancmd = IOAT_CHANCMD_SUSPEND;
+	rte_delay_ms(1);
+	ioat->regs->chancmd = IOAT_CHANCMD_RESET;
+	rte_delay_ms(1);
+	while (ioat->regs->chancmd & IOAT_CHANCMD_RESET) {
+		ioat->regs->chainaddr = 0;
+		rte_delay_ms(1);
+		if (++retry >= 200) {
+			IOAT_PMD_ERR("%s: cannot reset device. CHANCMD=%#"PRIx8
+					", CHANSTS=%#"PRIx64", CHANERR=%#"PRIx32"\n",
+					__func__,
+					ioat->regs->chancmd,
+					ioat->regs->chansts,
+					ioat->regs->chanerr);
+			return -EIO;
+		}
+	}
+	ioat->regs->chanctrl = IOAT_CHANCTRL_ANY_ERR_ABORT_EN |
+			IOAT_CHANCTRL_ERR_COMPLETION_EN;
+
+	return 0;
+
+}
+
+/* Destroy a DMA device. */
+static int
+ioat_dmadev_destroy(const char *name)
+{
+	struct rte_dma_dev *dev;
+	struct ioat_dmadev *ioat;
+	int ret;
+
+	if (!name) {
+		IOAT_PMD_ERR("Invalid device name");
+		return -EINVAL;
+	}
+
+	dev = &rte_dma_devices[rte_dma_get_dev_id(name)];
+	if (!dev) {
+		IOAT_PMD_ERR("Invalid device name (%s)", name);
+		return -EINVAL;
+	}
+
+	ioat = dev->dev_private;
+	if (!ioat) {
+		IOAT_PMD_ERR("Error getting dev_private");
+		return -EINVAL;
+	}
+
+	dev->dev_private = NULL;
+	rte_free(ioat->desc_ring);
+
+	ret = rte_dma_pmd_release(name);
+	if (ret)
+		IOAT_PMD_DEBUG("Device cleanup failed");
+
+	return 0;
+}
+
 /* Probe DMA device. */
 static int
 ioat_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
@@ -24,7 +139,7 @@  ioat_dmadev_probe(struct rte_pci_driver *drv, struct rte_pci_device *dev)
 	IOAT_PMD_INFO("Init %s on NUMA node %d", name, dev->device.numa_node);
 
 	dev->device.driver = &drv->driver;
-	return 0;
+	return ioat_dmadev_create(name, dev);
 }
 
 /* Remove DMA device. */
@@ -38,7 +153,7 @@  ioat_dmadev_remove(struct rte_pci_device *dev)
 	IOAT_PMD_INFO("Closing %s on NUMA node %d",
 			name, dev->device.numa_node);
 
-	return 0;
+	return ioat_dmadev_destroy(name);
 }
 
 static const struct rte_pci_id pci_id_ioat_map[] = {
diff --git a/drivers/dma/ioat/ioat_hw_defs.h b/drivers/dma/ioat/ioat_hw_defs.h
index eeabba41ef..73bdf548b3 100644
--- a/drivers/dma/ioat/ioat_hw_defs.h
+++ b/drivers/dma/ioat/ioat_hw_defs.h
@@ -11,6 +11,8 @@  extern "C" {
 
 #include <stdint.h>
 
+#define IOAT_PCI_CHANERR_INT_OFFSET	0x180
+
 #define IOAT_VER_3_0	0x30
 #define IOAT_VER_3_3	0x33
 
@@ -28,6 +30,49 @@  extern "C" {
 #define IOAT_DEVICE_ID_BDXF	0x6f2F
 #define IOAT_DEVICE_ID_ICX	0x0b00
 
+#define IOAT_COMP_UPDATE_SHIFT	3
+#define IOAT_CMD_OP_SHIFT	24
+
+/* DMA Channel Registers */
+#define IOAT_CHANCTRL_CHANNEL_PRIORITY_MASK		0xF000
+#define IOAT_CHANCTRL_COMPL_DCA_EN			0x0200
+#define IOAT_CHANCTRL_CHANNEL_IN_USE			0x0100
+#define IOAT_CHANCTRL_DESCRIPTOR_ADDR_SNOOP_CONTROL	0x0020
+#define IOAT_CHANCTRL_ERR_INT_EN			0x0010
+#define IOAT_CHANCTRL_ANY_ERR_ABORT_EN			0x0008
+#define IOAT_CHANCTRL_ERR_COMPLETION_EN			0x0004
+#define IOAT_CHANCTRL_INT_REARM				0x0001
+
+struct ioat_registers {
+	uint8_t		chancnt;
+	uint8_t		xfercap;
+	uint8_t		genctrl;
+	uint8_t		intrctrl;
+	uint32_t	attnstatus;
+	uint8_t		cbver;		/* 0x08 */
+	uint8_t		reserved4[0x3]; /* 0x09 */
+	uint16_t	intrdelay;	/* 0x0C */
+	uint16_t	cs_status;	/* 0x0E */
+	uint32_t	dmacapability;	/* 0x10 */
+	uint8_t		reserved5[0x6C]; /* 0x14 */
+	uint16_t	chanctrl;	/* 0x80 */
+	uint8_t		reserved6[0x2];	/* 0x82 */
+	uint8_t		chancmd;	/* 0x84 */
+	uint8_t		reserved3[1];	/* 0x85 */
+	uint16_t	dmacount;	/* 0x86 */
+	uint64_t	chansts;	/* 0x88 */
+	uint64_t	chainaddr;	/* 0x90 */
+	uint64_t	chancmp;	/* 0x98 */
+	uint8_t		reserved2[0x8];	/* 0xA0 */
+	uint32_t	chanerr;	/* 0xA8 */
+	uint32_t	chanerrmask;	/* 0xAC */
+} __rte_packed;
+
+#define IOAT_CHANCMD_RESET	0x20
+#define IOAT_CHANCMD_SUSPEND	0x04
+
+#define IOAT_CHANCMP_ALIGN	8 /* CHANCMP address must be 64-bit aligned */
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/drivers/dma/ioat/ioat_internal.h b/drivers/dma/ioat/ioat_internal.h
index f1ec12a919..a4e323f360 100644
--- a/drivers/dma/ioat/ioat_internal.h
+++ b/drivers/dma/ioat/ioat_internal.h
@@ -7,6 +7,30 @@ 
 
 #include "ioat_hw_defs.h"
 
+struct ioat_dmadev {
+	struct rte_dma_dev *dmadev;
+	struct rte_dma_vchan_conf qcfg;
+	struct rte_dma_stats stats;
+
+	volatile uint16_t *doorbell __rte_cache_aligned;
+	phys_addr_t status_addr;
+	phys_addr_t ring_addr;
+
+	struct ioat_dma_hw_desc *desc_ring;
+
+	unsigned short next_read;
+	unsigned short next_write;
+	unsigned short last_write; /* Used to compute submitted count. */
+	unsigned short offset; /* Used after a device recovery when counts -> 0. */
+	unsigned int failure; /* Used to store chanerr for error handling. */
+
+	/* To report completions, the device will write status back here. */
+	volatile uint64_t status __rte_cache_aligned;
+
+	/* Pointer to the register bar. */
+	volatile struct ioat_registers *regs;
+};
+
 extern int ioat_pmd_logtype;
 
 #define IOAT_PMD_LOG(level, fmt, args...) rte_log(RTE_LOG_ ## level, \