[dpdk-dev,v4,12/26] net/bnxt: add support to set MTU

Message ID 20170601170723.48709-13-ajit.khaparde@broadcom.com (mailing list archive)
State Accepted, archived
Delegated to: Ferruh Yigit
Checks

Context               Check    Description
ci/checkpatch         success  coding style OK
ci/Intel-compilation  fail     apply patch file failure

Commit Message

Ajit Khaparde June 1, 2017, 5:07 p.m. UTC
  This patch adds support for modifying the MTU using the set_mtu dev_op.
To support frames > 2k, the PMD creates an aggregator ring.
When a frame greater than 2k is received, it is fragmented
and the resulting fragments are DMA'ed to the aggregator ring.
The driver can now support jumbo frames up to 9500 bytes.

Signed-off-by: Steeven Li <steeven.li@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>

--
v1->v2: regroup related patches and incorporate other review comments

v2->v3:
  - rebasing to next-net tree
  - Use net/bnxt instead of just bnxt in patch subject
---
 doc/guides/nics/features/bnxt.ini      |   2 +
 drivers/net/bnxt/bnxt.h                |   4 +-
 drivers/net/bnxt/bnxt_cpr.c            |   3 +-
 drivers/net/bnxt/bnxt_ethdev.c         |  59 ++++++++++
 drivers/net/bnxt/bnxt_hwrm.c           | 121 ++++++++++++--------
 drivers/net/bnxt/bnxt_hwrm.h           |   4 +-
 drivers/net/bnxt/bnxt_ring.c           |  81 ++++++++++++--
 drivers/net/bnxt/bnxt_ring.h           |   2 +
 drivers/net/bnxt/bnxt_rxq.c            |  29 +++--
 drivers/net/bnxt/bnxt_rxq.h            |   1 +
 drivers/net/bnxt/bnxt_rxr.c            | 195 +++++++++++++++++++++++++++++----
 drivers/net/bnxt/bnxt_rxr.h            |   6 +
 drivers/net/bnxt/hsi_struct_def_dpdk.h |   1 -
 13 files changed, 422 insertions(+), 86 deletions(-)
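
The new callback is reached through the standard ethdev API; for
reference, a minimal usage sketch (the helper name is illustrative,
not part of the patch):

#include <stdio.h>
#include <rte_ethdev.h>

/* Assumes the port was configured with the usual
 * rte_eth_dev_configure()/Rx queue setup sequence.
 * rte_eth_dev_set_mtu() dispatches to the PMD's .mtu_set callback,
 * i.e. bnxt_mtu_set_op() on a bnxt port.
 */
static int set_jumbo_mtu(uint16_t port_id, uint16_t mtu)
{
        int ret = rte_eth_dev_set_mtu(port_id, mtu);

        if (ret != 0)
                printf("failed to set MTU %d on port %d: %d\n",
                       mtu, port_id, ret);
        return ret;
}

E.g. set_jumbo_mtu(0, 9500) requests the maximum the driver now
supports; anything above the limit derived from max_rx_pktlen is
rejected with -EINVAL.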
  

Comments

Ferruh Yigit June 6, 2017, 12:47 p.m. UTC | #1
On 6/1/2017 6:07 PM, Ajit Khaparde wrote:
> This patch adds support for modifying the MTU using the set_mtu dev_op.
> To support frames > 2k, the PMD creates an aggregator ring.
> When a frame greater than 2k is received, it is fragmented
> and the resulting fragments are DMA'ed to the aggregator ring.
> The driver can now support jumbo frames up to 9500 bytes.
> 
> Signed-off-by: Steeven Li <steeven.li@broadcom.com>
> Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
> 
> --
> v1->v2: regroup related patches and incorporate other review comments
> 
> v2->v3:
>   - rebasing to next-net tree
>   - Use net/bnxt instead of just bnxt in patch subject

<...>

> +int bnxt_hwrm_vnic_plcmode_cfg(struct bnxt *bp,
> +			struct bnxt_vnic_info *vnic)
> +{
> +	int rc = 0;
> +	struct hwrm_vnic_plcmodes_cfg_input req = {.req_type = 0 };
> +	struct hwrm_vnic_plcmodes_cfg_output *resp = bp->hwrm_cmd_resp_addr;
> +	uint16_t size;
> +
> +	HWRM_PREP(req, VNIC_PLCMODES_CFG, -1, resp);
> +
> +	req.flags = rte_cpu_to_le_32(
> +//			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_REGULAR_PLACEMENT |
> +			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_JUMBO_PLACEMENT);
> +//			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_HDS_IPV4 | //TODO
> +//			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_HDS_IPV6);

Hi Ajit,

Would you mind if I remove this commented-out code, in this patch and in
other patches, while applying?

Of course it would be better if you sent a new version of the patch to
fix this, but I believe I can do it faster. Please just let me know.

Thanks,
ferruh

> +	req.enables = rte_cpu_to_le_32(
> +		HWRM_VNIC_PLCMODES_CFG_INPUT_ENABLES_JUMBO_THRESH_VALID);
> +//		HWRM_VNIC_PLCMODES_CFG_INPUT_ENABLES_HDS_THRESHOLD_VALID);
> +
> +	size = rte_pktmbuf_data_room_size(bp->rx_queues[0]->mb_pool);
> +	size -= RTE_PKTMBUF_HEADROOM;
> +
> +	req.jumbo_thresh = rte_cpu_to_le_16(size);
> +//	req.hds_threshold = rte_cpu_to_le_16(size);
> +	req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id);
> +
> +	rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
> +
> +	HWRM_CHECK_RESULT;
> +
> +	return rc;
> +}

<...>
  
Ajit Khaparde June 6, 2017, 2 p.m. UTC | #2
Ferruh, if it saves time, can you please do that?

Thanks
Ajit

On Tue, Jun 6, 2017 at 7:47 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:

> <...>
  
Ferruh Yigit June 6, 2017, 2:25 p.m. UTC | #3
On 6/6/2017 3:00 PM, Ajit Khaparde wrote:
> Ferruh, if it saves time, can you please do that?

Done.

> <...>
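
The jumbo placement threshold configured by
bnxt_hwrm_vnic_plcmode_cfg() above is simply the usable data room of
an Rx mbuf: frames larger than that spill into the aggregator ring.
A minimal sketch of the same computation, assuming a mempool mp
created with rte_pktmbuf_pool_create() (helper name illustrative):

#include <rte_mbuf.h>

/* Usable per-mbuf data size; the firmware places any frame larger
 * than this into aggregation buffers.
 */
static uint16_t rx_jumbo_thresh(struct rte_mempool *mp)
{
        return rte_pktmbuf_data_room_size(mp) - RTE_PKTMBUF_HEADROOM;
}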
  

Patch

diff --git a/doc/guides/nics/features/bnxt.ini b/doc/guides/nics/features/bnxt.ini
index b1be3fd..066cb01 100644
--- a/doc/guides/nics/features/bnxt.ini
+++ b/doc/guides/nics/features/bnxt.ini
@@ -17,6 +17,8 @@  Basic stats          = Y
 Extended stats       = Y
 SR-IOV               = Y
 FW version           = Y
+Jumbo frame          = Y
+MTU update           = Y
 Linux VFIO           = Y
 Linux UIO            = Y
 x86-64               = Y
diff --git a/drivers/net/bnxt/bnxt.h b/drivers/net/bnxt/bnxt.h
index fde2202..8e6689f 100644
--- a/drivers/net/bnxt/bnxt.h
+++ b/drivers/net/bnxt/bnxt.h
@@ -45,7 +45,7 @@ 
 
 #include "bnxt_cpr.h"
 
-#define BNXT_MAX_MTU		9000
+#define BNXT_MAX_MTU		9500
 #define VLAN_TAG_SIZE		4
 
 enum bnxt_hw_context {
@@ -131,6 +131,7 @@  struct bnxt {
 #define BNXT_FLAG_REGISTERED	(1 << 0)
 #define BNXT_FLAG_VF		(1 << 1)
 #define BNXT_FLAG_PORT_STATS	(1 << 2)
+#define BNXT_FLAG_JUMBO		(1 << 3)
 #define BNXT_PF(bp)		(!((bp)->flags & BNXT_FLAG_VF))
 #define BNXT_VF(bp)		((bp)->flags & BNXT_FLAG_VF)
 #define BNXT_NPAR_ENABLED(bp)	((bp)->port_partition_type)
@@ -233,4 +234,5 @@  struct rte_pmd_bnxt_mb_event_param {
 int bnxt_link_update_op(struct rte_eth_dev *eth_dev, int wait_to_complete);
 int bnxt_rcv_msg_from_vf(struct bnxt *bp, uint16_t vf_id, void *msg);
 
+#define RX_PROD_AGG_BD_TYPE_RX_PROD_AGG		0x6
 #endif
diff --git a/drivers/net/bnxt/bnxt_cpr.c b/drivers/net/bnxt/bnxt_cpr.c
index d13e6cc..f369cf6 100644
--- a/drivers/net/bnxt/bnxt_cpr.c
+++ b/drivers/net/bnxt/bnxt_cpr.c
@@ -159,7 +159,8 @@  int bnxt_alloc_def_cp_ring(struct bnxt *bp)
 
 	rc = bnxt_hwrm_ring_alloc(bp, cp_ring,
 				  HWRM_RING_ALLOC_INPUT_RING_TYPE_L2_CMPL,
-				  0, HWRM_NA_SIGNATURE);
+				  0, HWRM_NA_SIGNATURE,
+				  HWRM_NA_SIGNATURE);
 	if (rc)
 		goto err_out;
 	cpr->cp_doorbell = bp->pdev->mem_resource[2].addr;
diff --git a/drivers/net/bnxt/bnxt_ethdev.c b/drivers/net/bnxt/bnxt_ethdev.c
index e987732..6197be1 100644
--- a/drivers/net/bnxt/bnxt_ethdev.c
+++ b/drivers/net/bnxt/bnxt_ethdev.c
@@ -199,6 +199,14 @@  static int bnxt_init_chip(struct bnxt *bp)
 	struct rte_eth_link new;
 	int rc;
 
+	if (bp->eth_dev->data->mtu > ETHER_MTU) {
+		bp->eth_dev->data->dev_conf.rxmode.jumbo_frame = 1;
+		bp->flags |= BNXT_FLAG_JUMBO;
+	} else {
+		bp->eth_dev->data->dev_conf.rxmode.jumbo_frame = 0;
+		bp->flags &= ~BNXT_FLAG_JUMBO;
+	}
+
 	rc = bnxt_alloc_all_hwrm_stat_ctxs(bp);
 	if (rc) {
 		RTE_LOG(ERR, PMD, "HWRM stat ctx alloc failure rc: %x\n", rc);
@@ -1375,6 +1383,56 @@  bnxt_fw_version_get(struct rte_eth_dev *dev, char *fw_version, size_t fw_size)
 		return 0;
 }
 
+static int bnxt_mtu_set_op(struct rte_eth_dev *eth_dev, uint16_t new_mtu)
+{
+	struct bnxt *bp = eth_dev->data->dev_private;
+	struct rte_eth_dev_info dev_info;
+	uint32_t max_dev_mtu;
+	uint32_t rc = 0;
+	uint32_t i;
+
+	bnxt_dev_info_get_op(eth_dev, &dev_info);
+	max_dev_mtu = dev_info.max_rx_pktlen -
+		      ETHER_HDR_LEN - ETHER_CRC_LEN - VLAN_TAG_SIZE * 2;
+
+	if (new_mtu < ETHER_MIN_MTU || new_mtu > max_dev_mtu) {
+		RTE_LOG(ERR, PMD, "MTU requested must be within (%d, %d)\n",
+			ETHER_MIN_MTU, max_dev_mtu);
+		return -EINVAL;
+	}
+
+
+	if (new_mtu > ETHER_MTU) {
+		bp->flags |= BNXT_FLAG_JUMBO;
+		eth_dev->data->dev_conf.rxmode.jumbo_frame = 1;
+	} else {
+		eth_dev->data->dev_conf.rxmode.jumbo_frame = 0;
+		bp->flags &= ~BNXT_FLAG_JUMBO;
+	}
+
+	eth_dev->data->dev_conf.rxmode.max_rx_pkt_len =
+		new_mtu + ETHER_HDR_LEN + ETHER_CRC_LEN + VLAN_TAG_SIZE * 2;
+
+	eth_dev->data->mtu = new_mtu;
+	RTE_LOG(INFO, PMD, "New MTU is %d\n", eth_dev->data->mtu);
+
+	for (i = 0; i < bp->nr_vnics; i++) {
+		struct bnxt_vnic_info *vnic = &bp->vnic_info[i];
+
+		vnic->mru = bp->eth_dev->data->mtu + ETHER_HDR_LEN +
+					ETHER_CRC_LEN + VLAN_TAG_SIZE * 2;
+		rc = bnxt_hwrm_vnic_cfg(bp, vnic);
+		if (rc)
+			break;
+
+		rc = bnxt_hwrm_vnic_plcmode_cfg(bp, vnic);
+		if (rc)
+			return rc;
+	}
+
+	return rc;
+}
+
 /*
  * Initialization
  */
@@ -1410,6 +1468,7 @@  static const struct eth_dev_ops bnxt_dev_ops = {
 	.udp_tunnel_port_del  = bnxt_udp_tunnel_port_del_op,
 	.vlan_filter_set = bnxt_vlan_filter_set_op,
 	.vlan_offload_set = bnxt_vlan_offload_set_op,
+	.mtu_set = bnxt_mtu_set_op,
 	.mac_addr_set = bnxt_set_default_mac_addr_op,
 	.xstats_get = bnxt_dev_xstats_get_op,
 	.xstats_get_names = bnxt_dev_xstats_get_names_op,
diff --git a/drivers/net/bnxt/bnxt_hwrm.c b/drivers/net/bnxt/bnxt_hwrm.c
index 22e18d6..eb4f540 100644
--- a/drivers/net/bnxt/bnxt_hwrm.c
+++ b/drivers/net/bnxt/bnxt_hwrm.c
@@ -665,20 +665,20 @@  int bnxt_hwrm_queue_qportcfg(struct bnxt *bp)
 int bnxt_hwrm_ring_alloc(struct bnxt *bp,
 			 struct bnxt_ring *ring,
 			 uint32_t ring_type, uint32_t map_index,
-			 uint32_t stats_ctx_id)
+			 uint32_t stats_ctx_id, uint32_t cmpl_ring_id)
 {
 	int rc = 0;
+	uint32_t enables = 0;
 	struct hwrm_ring_alloc_input req = {.req_type = 0 };
 	struct hwrm_ring_alloc_output *resp = bp->hwrm_cmd_resp_addr;
 
 	HWRM_PREP(req, RING_ALLOC, -1, resp);
 
-	req.enables = rte_cpu_to_le_32(0);
-
 	req.page_tbl_addr = rte_cpu_to_le_64(ring->bd_dma);
 	req.fbo = rte_cpu_to_le_32(0);
 	/* Association of ring index with doorbell index */
 	req.logical_id = rte_cpu_to_le_16(map_index);
+	req.length = rte_cpu_to_le_32(ring->ring_size);
 
 	switch (ring_type) {
 	case HWRM_RING_ALLOC_INPUT_RING_TYPE_TX:
@@ -686,12 +686,11 @@  int bnxt_hwrm_ring_alloc(struct bnxt *bp,
 		/* FALLTHROUGH */
 	case HWRM_RING_ALLOC_INPUT_RING_TYPE_RX:
 		req.ring_type = ring_type;
-		req.cmpl_ring_id =
-		    rte_cpu_to_le_16(bp->grp_info[map_index].cp_fw_ring_id);
-		req.length = rte_cpu_to_le_32(ring->ring_size);
+		req.cmpl_ring_id = rte_cpu_to_le_16(cmpl_ring_id);
 		req.stat_ctx_id = rte_cpu_to_le_16(stats_ctx_id);
-		req.enables = rte_cpu_to_le_32(rte_le_to_cpu_32(req.enables) |
-			HWRM_RING_ALLOC_INPUT_ENABLES_STAT_CTX_ID_VALID);
+		if (stats_ctx_id != INVALID_STATS_CTX_ID)
+			enables |=
+			HWRM_RING_ALLOC_INPUT_ENABLES_STAT_CTX_ID_VALID;
 		break;
 	case HWRM_RING_ALLOC_INPUT_RING_TYPE_L2_CMPL:
 		req.ring_type = ring_type;
@@ -700,13 +699,13 @@  int bnxt_hwrm_ring_alloc(struct bnxt *bp,
 		 * HWRM_RING_ALLOC_INPUT_INT_MODE_POLL
 		 */
 		req.int_mode = HWRM_RING_ALLOC_INPUT_INT_MODE_MSIX;
-		req.length = rte_cpu_to_le_32(ring->ring_size);
 		break;
 	default:
 		RTE_LOG(ERR, PMD, "hwrm alloc invalid ring type %d\n",
 			ring_type);
 		return -1;
 	}
+	req.enables = rte_cpu_to_le_32(enables);
 
 	rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
 
@@ -837,8 +836,8 @@  int bnxt_hwrm_stat_clear(struct bnxt *bp, struct bnxt_cp_ring_info *cpr)
 	return rc;
 }
 
-int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
-			     struct bnxt_cp_ring_info *cpr, unsigned int idx)
+int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+				unsigned int idx __rte_unused)
 {
 	int rc;
 	struct hwrm_stat_ctx_alloc_input req = {.req_type = 0 };
@@ -857,13 +856,12 @@  int bnxt_hwrm_stat_ctx_alloc(struct bnxt *bp,
 	HWRM_CHECK_RESULT;
 
 	cpr->hw_stats_ctx_id = rte_le_to_cpu_16(resp->stat_ctx_id);
-	bp->grp_info[idx].fw_stats_ctx = cpr->hw_stats_ctx_id;
 
 	return rc;
 }
 
-int bnxt_hwrm_stat_ctx_free(struct bnxt *bp,
-			    struct bnxt_cp_ring_info *cpr, unsigned int idx)
+int bnxt_hwrm_stat_ctx_free(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+				unsigned int idx __rte_unused)
 {
 	int rc;
 	struct hwrm_stat_ctx_free_input req = {.req_type = 0 };
@@ -878,9 +876,6 @@  int bnxt_hwrm_stat_ctx_free(struct bnxt *bp,
 
 	HWRM_CHECK_RESULT;
 
-	cpr->hw_stats_ctx_id = HWRM_NA_SIGNATURE;
-	bp->grp_info[idx].fw_stats_ctx = cpr->hw_stats_ctx_id;
-
 	return rc;
 }
 
@@ -891,15 +886,10 @@  int bnxt_hwrm_vnic_alloc(struct bnxt *bp, struct bnxt_vnic_info *vnic)
 	struct hwrm_vnic_alloc_output *resp = bp->hwrm_cmd_resp_addr;
 
 	/* map ring groups to this vnic */
-	for (i = vnic->start_grp_id, j = 0; i <= vnic->end_grp_id; i++, j++) {
-		if (bp->grp_info[i].fw_grp_id == (uint16_t)HWRM_NA_SIGNATURE) {
-			RTE_LOG(ERR, PMD,
-				"Not enough ring groups avail:%x req:%x\n", j,
-				(vnic->end_grp_id - vnic->start_grp_id) + 1);
-			break;
-		}
+	RTE_LOG(DEBUG, PMD, "Alloc VNIC. Start %x, End %x\n",
+		vnic->start_grp_id, vnic->end_grp_id);
+	for (i = vnic->start_grp_id, j = 0; i <= vnic->end_grp_id; i++, j++)
 		vnic->fw_grp_ids[j] = bp->grp_info[i].fw_grp_id;
-	}
 	vnic->dflt_ring_grp = bp->grp_info[vnic->start_grp_id].fw_grp_id;
 	vnic->rss_rule = (uint16_t)HWRM_NA_SIGNATURE;
 	vnic->cos_rule = (uint16_t)HWRM_NA_SIGNATURE;
@@ -1151,6 +1141,39 @@  int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,
 	return rc;
 }
 
+int bnxt_hwrm_vnic_plcmode_cfg(struct bnxt *bp,
+			struct bnxt_vnic_info *vnic)
+{
+	int rc = 0;
+	struct hwrm_vnic_plcmodes_cfg_input req = {.req_type = 0 };
+	struct hwrm_vnic_plcmodes_cfg_output *resp = bp->hwrm_cmd_resp_addr;
+	uint16_t size;
+
+	HWRM_PREP(req, VNIC_PLCMODES_CFG, -1, resp);
+
+	req.flags = rte_cpu_to_le_32(
+//			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_REGULAR_PLACEMENT |
+			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_JUMBO_PLACEMENT);
+//			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_HDS_IPV4 | //TODO
+//			HWRM_VNIC_PLCMODES_CFG_INPUT_FLAGS_HDS_IPV6);
+	req.enables = rte_cpu_to_le_32(
+		HWRM_VNIC_PLCMODES_CFG_INPUT_ENABLES_JUMBO_THRESH_VALID);
+//		HWRM_VNIC_PLCMODES_CFG_INPUT_ENABLES_HDS_THRESHOLD_VALID);
+
+	size = rte_pktmbuf_data_room_size(bp->rx_queues[0]->mb_pool);
+	size -= RTE_PKTMBUF_HEADROOM;
+
+	req.jumbo_thresh = rte_cpu_to_le_16(size);
+//	req.hds_threshold = rte_cpu_to_le_16(size);
+	req.vnic_id = rte_cpu_to_le_32(vnic->fw_vnic_id);
+
+	rc = bnxt_hwrm_send_message(bp, &req, sizeof(req));
+
+	HWRM_CHECK_RESULT;
+
+	return rc;
+}
+
 int bnxt_hwrm_func_vf_mac(struct bnxt *bp, uint16_t vf, const uint8_t *mac_addr)
 {
 	struct hwrm_func_cfg_input req = {0};
@@ -1209,14 +1232,20 @@  int bnxt_free_all_hwrm_stat_ctxs(struct bnxt *bp)
 	struct bnxt_cp_ring_info *cpr;
 
 	for (i = 0; i < bp->rx_cp_nr_rings + bp->tx_cp_nr_rings; i++) {
-		unsigned int idx = i + 1;
 
 		if (i >= bp->rx_cp_nr_rings)
 			cpr = bp->tx_queues[i - bp->rx_cp_nr_rings]->cp_ring;
 		else
 			cpr = bp->rx_queues[i]->cp_ring;
 		if (cpr->hw_stats_ctx_id != HWRM_NA_SIGNATURE) {
-			rc = bnxt_hwrm_stat_ctx_free(bp, cpr, idx);
+			rc = bnxt_hwrm_stat_ctx_free(bp, cpr, i);
+			cpr->hw_stats_ctx_id = HWRM_NA_SIGNATURE;
+			/*
+			 * TODO. Need a better way to reset grp_info.stats_ctx
+			 * for Rx rings only. stats_ctx is not saved for Tx
+			 * in grp_info.
+			 */
+			bp->grp_info[i].fw_stats_ctx = cpr->hw_stats_ctx_id;
 			if (rc)
 				return rc;
 		}
@@ -1233,7 +1262,6 @@  int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp)
 		struct bnxt_tx_queue *txq;
 		struct bnxt_rx_queue *rxq;
 		struct bnxt_cp_ring_info *cpr;
-		unsigned int idx = i + 1;
 
 		if (i >= bp->rx_cp_nr_rings) {
 			txq = bp->tx_queues[i - bp->rx_cp_nr_rings];
@@ -1243,7 +1271,7 @@  int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp)
 			cpr = rxq->cp_ring;
 		}
 
-		rc = bnxt_hwrm_stat_ctx_alloc(bp, cpr, idx);
+		rc = bnxt_hwrm_stat_ctx_alloc(bp, cpr, i);
 
 		if (rc)
 			return rc;
@@ -1253,11 +1281,10 @@  int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp)
 
 int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp)
 {
-	uint16_t i;
+	uint16_t idx;
 	uint32_t rc = 0;
 
-	for (i = 0; i < bp->rx_cp_nr_rings; i++) {
-		unsigned int idx = i + 1;
+	for (idx = 0; idx < bp->rx_cp_nr_rings; idx++) {
 
 		if (bp->grp_info[idx].fw_grp_id == INVALID_HW_RING_ID) {
 			RTE_LOG(ERR, PMD,
@@ -1274,8 +1301,8 @@  int bnxt_free_all_hwrm_ring_grps(struct bnxt *bp)
 	return rc;
 }
 
-static void bnxt_free_cp_ring(struct bnxt *bp,
-			      struct bnxt_cp_ring_info *cpr, unsigned int idx)
+static void bnxt_free_cp_ring(struct bnxt *bp, struct bnxt_cp_ring_info *cpr,
+				unsigned int idx __rte_unused)
 {
 	struct bnxt_ring *cp_ring = cpr->cp_ring_struct;
 
@@ -1313,8 +1340,10 @@  int bnxt_free_all_hwrm_rings(struct bnxt *bp)
 			txr->tx_prod = 0;
 			txr->tx_cons = 0;
 		}
-		if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID)
+		if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID) {
 			bnxt_free_cp_ring(bp, cpr, idx);
+			cpr->cp_ring_struct->fw_ring_id = INVALID_HW_RING_ID;
+		}
 	}
 
 	for (i = 0; i < bp->rx_cp_nr_rings; i++) {
@@ -1336,17 +1365,26 @@  int bnxt_free_all_hwrm_rings(struct bnxt *bp)
 					rxr->rx_ring_struct->ring_size *
 					sizeof(*rxr->rx_buf_ring));
 			rxr->rx_prod = 0;
+			memset(rxr->ag_buf_ring, 0,
+					rxr->ag_ring_struct->ring_size *
+					sizeof(*rxr->ag_buf_ring));
+			rxr->ag_prod = 0;
 		}
-		if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID)
+		if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID) {
 			bnxt_free_cp_ring(bp, cpr, idx);
+			bp->grp_info[i].cp_fw_ring_id = INVALID_HW_RING_ID;
+			cpr->cp_ring_struct->fw_ring_id = INVALID_HW_RING_ID;
+		}
 	}
 
 	/* Default completion ring */
 	{
 		struct bnxt_cp_ring_info *cpr = bp->def_cp_ring;
 
-		if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID)
+		if (cpr->cp_ring_struct->fw_ring_id != INVALID_HW_RING_ID) {
 			bnxt_free_cp_ring(bp, cpr, 0);
+			cpr->cp_ring_struct->fw_ring_id = INVALID_HW_RING_ID;
+		}
 	}
 
 	return rc;
@@ -1358,14 +1396,7 @@  int bnxt_alloc_all_hwrm_ring_grps(struct bnxt *bp)
 	uint32_t rc = 0;
 
 	for (i = 0; i < bp->rx_cp_nr_rings; i++) {
-		unsigned int idx = i + 1;
-
-		if (bp->grp_info[idx].cp_fw_ring_id == INVALID_HW_RING_ID ||
-		    bp->grp_info[idx].rx_fw_ring_id == INVALID_HW_RING_ID)
-			continue;
-
-		rc = bnxt_hwrm_ring_grp_alloc(bp, idx);
-
+		rc = bnxt_hwrm_ring_grp_alloc(bp, i);
 		if (rc)
 			return rc;
 	}
diff --git a/drivers/net/bnxt/bnxt_hwrm.h b/drivers/net/bnxt/bnxt_hwrm.h
index c15404f..79b27ee 100644
--- a/drivers/net/bnxt/bnxt_hwrm.h
+++ b/drivers/net/bnxt/bnxt_hwrm.h
@@ -70,7 +70,7 @@  int bnxt_hwrm_queue_qportcfg(struct bnxt *bp);
 int bnxt_hwrm_ring_alloc(struct bnxt *bp,
 			 struct bnxt_ring *ring,
 			 uint32_t ring_type, uint32_t map_index,
-			 uint32_t stats_ctx_id);
+			 uint32_t stats_ctx_id, uint32_t cmpl_ring_id);
 int bnxt_hwrm_ring_free(struct bnxt *bp,
 			struct bnxt_ring *ring, uint32_t ring_type);
 int bnxt_hwrm_ring_grp_alloc(struct bnxt *bp, unsigned int idx);
@@ -93,6 +93,8 @@  int bnxt_hwrm_vnic_ctx_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_free(struct bnxt *bp, struct bnxt_vnic_info *vnic);
 int bnxt_hwrm_vnic_rss_cfg(struct bnxt *bp,
 			   struct bnxt_vnic_info *vnic);
+int bnxt_hwrm_vnic_plcmode_cfg(struct bnxt *bp,
+				struct bnxt_vnic_info *vnic);
 
 int bnxt_alloc_all_hwrm_stat_ctxs(struct bnxt *bp);
 int bnxt_clear_all_hwrm_stat_ctxs(struct bnxt *bp);
diff --git a/drivers/net/bnxt/bnxt_ring.c b/drivers/net/bnxt/bnxt_ring.c
index 5e4236a..a12c0aa 100644
--- a/drivers/net/bnxt/bnxt_ring.c
+++ b/drivers/net/bnxt/bnxt_ring.c
@@ -115,8 +115,15 @@  int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 	int rx_vmem_len = rx_ring_info ?
 		RTE_CACHE_LINE_ROUNDUP(rx_ring_info->
 						rx_ring_struct->vmem_size) : 0;
+	int ag_vmem_start = 0;
+	int ag_vmem_len = 0;
+	int cp_ring_start =  0;
+
+	ag_vmem_start = rx_vmem_start + rx_vmem_len;
+	ag_vmem_len = rx_ring_info ? RTE_CACHE_LINE_ROUNDUP(
+				rx_ring_info->ag_ring_struct->vmem_size) : 0;
+	cp_ring_start = ag_vmem_start + ag_vmem_len;
 
-	int cp_ring_start = rx_vmem_start + rx_vmem_len;
 	int cp_ring_len = RTE_CACHE_LINE_ROUNDUP(cp_ring->ring_size *
 						 sizeof(struct cmpl_base));
 
@@ -131,6 +138,10 @@  int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 		sizeof(struct rx_prod_pkt_bd)) : 0;
 
 	int total_alloc_len = rx_ring_start + rx_ring_len;
+	int ag_ring_start = 0;
+
+	ag_ring_start = rx_ring_start + rx_ring_len;
+	total_alloc_len = ag_ring_start + rx_ring_len * AGG_RING_SIZE_FACTOR;
 
 	snprintf(mz_name, RTE_MEMZONE_NAMESIZE,
 		 "bnxt_%04x:%02x:%02x:%02x-%04x_%s", pdev->addr.domain,
@@ -201,6 +212,24 @@  int bnxt_alloc_rings(struct bnxt *bp, uint16_t qidx,
 			rx_ring_info->rx_buf_ring =
 			    (struct bnxt_sw_rx_bd *)rx_ring->vmem;
 		}
+
+		rx_ring = rx_ring_info->ag_ring_struct;
+
+		rx_ring->bd = ((char *)mz->addr + ag_ring_start);
+		rx_ring_info->ag_desc_ring =
+		    (struct rx_prod_pkt_bd *)rx_ring->bd;
+		rx_ring->bd_dma = mz->phys_addr + ag_ring_start;
+		rx_ring_info->ag_desc_mapping = rx_ring->bd_dma;
+		rx_ring->mem_zone = (const void *)mz;
+
+		if (!rx_ring->bd)
+			return -ENOMEM;
+		if (rx_ring->vmem_size) {
+			rx_ring->vmem =
+			    (void **)((char *)mz->addr + ag_vmem_start);
+			rx_ring_info->ag_buf_ring =
+			    (struct bnxt_sw_rx_bd *)rx_ring->vmem;
+		}
 	}
 
 	cp_ring->bd = ((char *)mz->addr + cp_ring_start);
@@ -239,35 +268,64 @@  int bnxt_alloc_hwrm_rings(struct bnxt *bp)
 		struct bnxt_rx_ring_info *rxr = rxq->rx_ring;
 		struct bnxt_ring *ring = rxr->rx_ring_struct;
 		unsigned int idx = i + 1;
+		unsigned int map_idx = idx + bp->rx_cp_nr_rings;
+
+		bp->grp_info[i].fw_stats_ctx = cpr->hw_stats_ctx_id;
 
 		/* Rx cmpl */
 		rc = bnxt_hwrm_ring_alloc(bp, cp_ring,
 					HWRM_RING_ALLOC_INPUT_RING_TYPE_L2_CMPL,
-					idx, HWRM_NA_SIGNATURE);
+					idx, HWRM_NA_SIGNATURE,
+					HWRM_NA_SIGNATURE);
 		if (rc)
 			goto err_out;
 		cpr->cp_doorbell = (char *)pci_dev->mem_resource[2].addr +
 		    idx * 0x80;
-		bp->grp_info[idx].cp_fw_ring_id = cp_ring->fw_ring_id;
+		bp->grp_info[i].cp_fw_ring_id = cp_ring->fw_ring_id;
 		B_CP_DIS_DB(cpr, cpr->cp_raw_cons);
 
 		/* Rx ring */
 		rc = bnxt_hwrm_ring_alloc(bp, ring,
 					HWRM_RING_ALLOC_INPUT_RING_TYPE_RX,
-					idx, cpr->hw_stats_ctx_id);
+					idx, cpr->hw_stats_ctx_id,
+					cp_ring->fw_ring_id);
 		if (rc)
 			goto err_out;
 		rxr->rx_prod = 0;
 		rxr->rx_doorbell = (char *)pci_dev->mem_resource[2].addr +
 		    idx * 0x80;
-		bp->grp_info[idx].rx_fw_ring_id = ring->fw_ring_id;
+		bp->grp_info[i].rx_fw_ring_id = ring->fw_ring_id;
 		B_RX_DB(rxr->rx_doorbell, rxr->rx_prod);
+
+		ring = rxr->ag_ring_struct;
+		/* Agg ring */
+		if (ring == NULL)
+			RTE_LOG(ERR, PMD, "Alloc AGG Ring is NULL!\n");
+
+		rc = bnxt_hwrm_ring_alloc(bp, ring,
+				HWRM_RING_ALLOC_INPUT_RING_TYPE_RX,
+				map_idx, HWRM_NA_SIGNATURE,
+				cp_ring->fw_ring_id);
+		if (rc)
+			goto err_out;
+		RTE_LOG(DEBUG, PMD, "Alloc AGG Done!\n");
+		rxr->ag_prod = 0;
+		rxr->ag_doorbell =
+		    (char *)pci_dev->mem_resource[2].addr +
+		    map_idx * 0x80;
+		bp->grp_info[i].ag_fw_ring_id = ring->fw_ring_id;
+		B_RX_DB(rxr->ag_doorbell, rxr->ag_prod);
+
+		rxq->rx_buf_use_size = BNXT_MAX_MTU + ETHER_HDR_LEN +
+					ETHER_CRC_LEN + (2 * VLAN_TAG_SIZE);
 		if (bnxt_init_one_rx_ring(rxq)) {
 			RTE_LOG(ERR, PMD, "bnxt_init_one_rx_ring failed!\n");
 			bnxt_rx_queue_release_op(rxq);
 			return -ENOMEM;
 		}
 		B_RX_DB(rxr->rx_doorbell, rxr->rx_prod);
+		B_RX_DB(rxr->ag_doorbell, rxr->ag_prod);
+		rxq->index = idx;
 	}
 
 	for (i = 0; i < bp->tx_cp_nr_rings; i++) {
@@ -276,29 +334,34 @@  int bnxt_alloc_hwrm_rings(struct bnxt *bp)
 		struct bnxt_ring *cp_ring = cpr->cp_ring_struct;
 		struct bnxt_tx_ring_info *txr = txq->tx_ring;
 		struct bnxt_ring *ring = txr->tx_ring_struct;
-		unsigned int idx = 1 + bp->rx_cp_nr_rings + i;
+		unsigned int idx = i + 1 + bp->rx_cp_nr_rings;
+
+		/* Account for AGG Rings. AGG ring cnt = Rx Cmpl ring cnt */
+		idx += bp->rx_cp_nr_rings;
 
 		/* Tx cmpl */
 		rc = bnxt_hwrm_ring_alloc(bp, cp_ring,
 					HWRM_RING_ALLOC_INPUT_RING_TYPE_L2_CMPL,
-					idx, HWRM_NA_SIGNATURE);
+					idx, HWRM_NA_SIGNATURE,
+					HWRM_NA_SIGNATURE);
 		if (rc)
 			goto err_out;
 
 		cpr->cp_doorbell = (char *)pci_dev->mem_resource[2].addr +
 		    idx * 0x80;
-		bp->grp_info[idx].cp_fw_ring_id = cp_ring->fw_ring_id;
 		B_CP_DIS_DB(cpr, cpr->cp_raw_cons);
 
 		/* Tx ring */
 		rc = bnxt_hwrm_ring_alloc(bp, ring,
 					HWRM_RING_ALLOC_INPUT_RING_TYPE_TX,
-					idx, cpr->hw_stats_ctx_id);
+					idx, cpr->hw_stats_ctx_id,
+					cp_ring->fw_ring_id);
 		if (rc)
 			goto err_out;
 
 		txr->tx_doorbell = (char *)pci_dev->mem_resource[2].addr +
 		    idx * 0x80;
+		txq->index = idx;
 	}
 
 err_out:
diff --git a/drivers/net/bnxt/bnxt_ring.h b/drivers/net/bnxt/bnxt_ring.h
index 8656549..b5bd287 100644
--- a/drivers/net/bnxt/bnxt_ring.h
+++ b/drivers/net/bnxt/bnxt_ring.h
@@ -58,6 +58,7 @@ 
 #define DEFAULT_TX_RING_SIZE	256
 
 #define MAX_TPA		128
+#define AGG_RING_SIZE_FACTOR 2
 
 /* These assume 4k pages */
 #define MAX_RX_DESC_CNT (8 * 1024)
@@ -65,6 +66,7 @@ 
 #define MAX_CP_DESC_CNT (16 * 1024)
 
 #define INVALID_HW_RING_ID      ((uint16_t)-1)
+#define INVALID_STATS_CTX_ID		((uint16_t)-1)
 
 struct bnxt_ring {
 	void			*bd;
diff --git a/drivers/net/bnxt/bnxt_rxq.c b/drivers/net/bnxt/bnxt_rxq.c
index 7625fb1..0d7d708 100644
--- a/drivers/net/bnxt/bnxt_rxq.c
+++ b/drivers/net/bnxt/bnxt_rxq.c
@@ -84,9 +84,8 @@  int bnxt_mq_rx_configure(struct bnxt *bp)
 
 		vnic->func_default = true;
 		vnic->ff_pool_idx = 0;
-		vnic->start_grp_id = 1;
-		vnic->end_grp_id = vnic->start_grp_id +
-				   bp->rx_cp_nr_rings - 1;
+		vnic->start_grp_id = 0;
+		vnic->end_grp_id = vnic->start_grp_id;
 		filter = bnxt_alloc_filter(bp);
 		if (!filter) {
 			RTE_LOG(ERR, PMD, "L2 filter alloc failed\n");
@@ -126,8 +125,8 @@  int bnxt_mq_rx_configure(struct bnxt *bp)
 			pools = ETH_64_POOLS;
 		}
 		nb_q_per_grp = bp->rx_cp_nr_rings / pools;
-		start_grp_id = 1;
-		end_grp_id = start_grp_id + nb_q_per_grp - 1;
+		start_grp_id = 0;
+		end_grp_id = nb_q_per_grp;
 
 		ring_idx = 0;
 		for (i = 0; i < pools; i++) {
@@ -188,9 +187,8 @@  int bnxt_mq_rx_configure(struct bnxt *bp)
 
 	vnic->func_default = true;
 	vnic->ff_pool_idx = 0;
-	vnic->start_grp_id = 1;
-	vnic->end_grp_id = vnic->start_grp_id +
-			   bp->rx_cp_nr_rings - 1;
+	vnic->start_grp_id = 0;
+	vnic->end_grp_id = bp->rx_cp_nr_rings;
 	filter = bnxt_alloc_filter(bp);
 	if (!filter) {
 		RTE_LOG(ERR, PMD, "L2 filter alloc failed\n");
@@ -228,6 +226,16 @@  static void bnxt_rx_queue_release_mbufs(struct bnxt_rx_queue *rxq)
 				}
 			}
 		}
+		/* Free up mbufs in Agg ring */
+		sw_ring = rxq->rx_ring->ag_buf_ring;
+		if (sw_ring) {
+			for (i = 0; i < rxq->nb_rx_desc; i++) {
+				if (sw_ring[i].mbuf) {
+					rte_pktmbuf_free_seg(sw_ring[i].mbuf);
+					sw_ring[i].mbuf = NULL;
+				}
+			}
+		}
 	}
 }
 
@@ -251,6 +259,8 @@  void bnxt_rx_queue_release_op(void *rx_queue)
 
 		/* Free RX ring hardware descriptors */
 		bnxt_free_ring(rxq->rx_ring->rx_ring_struct);
+		/* Free RX Agg ring hardware descriptors */
+		bnxt_free_ring(rxq->rx_ring->ag_ring_struct);
 
 		/* Free RX completion ring hardware descriptors */
 		bnxt_free_ring(rxq->cp_ring->cp_ring_struct);
@@ -295,6 +305,9 @@  int bnxt_rx_queue_setup_op(struct rte_eth_dev *eth_dev,
 	rxq->nb_rx_desc = nb_desc;
 	rxq->rx_free_thresh = rx_conf->rx_free_thresh;
 
+	RTE_LOG(DEBUG, PMD, "RX Buf size is %d\n", rxq->rx_buf_use_size);
+	RTE_LOG(DEBUG, PMD, "RX Buf MTU %d\n", eth_dev->data->mtu);
+
 	rc = bnxt_init_rx_ring_struct(rxq, socket_id);
 	if (rc)
 		goto out;
diff --git a/drivers/net/bnxt/bnxt_rxq.h b/drivers/net/bnxt/bnxt_rxq.h
index 9554329..0695214 100644
--- a/drivers/net/bnxt/bnxt_rxq.h
+++ b/drivers/net/bnxt/bnxt_rxq.h
@@ -52,6 +52,7 @@  struct bnxt_rx_queue {
 	uint8_t			crc_len; /* 0 if CRC stripped, 4 otherwise */
 
 	struct bnxt		*bp;
+	int			index;
 	struct bnxt_vnic_info	*vnic;
 
 	uint32_t			rx_buf_size;
diff --git a/drivers/net/bnxt/bnxt_rxr.c b/drivers/net/bnxt/bnxt_rxr.c
index 5d93de2..04ac673 100644
--- a/drivers/net/bnxt/bnxt_rxr.c
+++ b/drivers/net/bnxt/bnxt_rxr.c
@@ -77,6 +77,32 @@  static inline int bnxt_alloc_rx_data(struct bnxt_rx_queue *rxq,
 	return 0;
 }
 
+static inline int bnxt_alloc_ag_data(struct bnxt_rx_queue *rxq,
+				     struct bnxt_rx_ring_info *rxr,
+				     uint16_t prod)
+{
+	struct rx_prod_pkt_bd *rxbd = &rxr->ag_desc_ring[prod];
+	struct bnxt_sw_rx_bd *rx_buf = &rxr->ag_buf_ring[prod];
+	struct rte_mbuf *data;
+
+	data = __bnxt_alloc_rx_data(rxq->mb_pool);
+	if (!data)
+		return -ENOMEM;
+
+	if (rxbd == NULL)
+		RTE_LOG(ERR, PMD, "Jumbo Frame. rxbd is NULL\n");
+	if (rx_buf == NULL)
+		RTE_LOG(ERR, PMD, "Jumbo Frame. rx_buf is NULL\n");
+
+
+	rx_buf->mbuf = data;
+
+	rxbd->addr = rte_cpu_to_le_64(RTE_MBUF_DATA_DMA_ADDR(rx_buf->mbuf));
+
+	return 0;
+}
+
+#ifdef BNXT_DEBUG
 static void bnxt_reuse_rx_mbuf(struct bnxt_rx_ring_info *rxr, uint16_t cons,
 			       struct rte_mbuf *mbuf)
 {
@@ -94,6 +120,24 @@  static void bnxt_reuse_rx_mbuf(struct bnxt_rx_ring_info *rxr, uint16_t cons,
 	prod_bd->addr = cons_bd->addr;
 }
 
+static void bnxt_reuse_ag_mbuf(struct bnxt_rx_ring_info *rxr, uint16_t cons,
+			       struct rte_mbuf *mbuf)
+{
+	uint16_t prod = rxr->ag_prod;
+	struct bnxt_sw_rx_bd *prod_rx_buf;
+	struct rx_prod_pkt_bd *prod_bd, *cons_bd;
+
+	prod_rx_buf = &rxr->ag_buf_ring[prod];
+
+	prod_rx_buf->mbuf = mbuf;
+
+	prod_bd = &rxr->ag_desc_ring[prod];
+	cons_bd = &rxr->ag_desc_ring[cons];
+
+	prod_bd->addr = cons_bd->addr;
+}
+#endif
+
 static uint16_t bnxt_rx_pkt(struct rte_mbuf **rx_pkt,
 			    struct bnxt_rx_queue *rxq, uint32_t *raw_cons)
 {
@@ -104,9 +148,12 @@  static uint16_t bnxt_rx_pkt(struct rte_mbuf **rx_pkt,
 	uint32_t tmp_raw_cons = *raw_cons;
 	uint16_t cons, prod, cp_cons =
 	    RING_CMP(cpr->cp_ring_struct, tmp_raw_cons);
+	uint16_t ag_cons, ag_prod = rxr->ag_prod;
 	struct bnxt_sw_rx_bd *rx_buf;
 	struct rte_mbuf *mbuf;
 	int rc = 0;
+	uint8_t i;
+	uint8_t agg_buf = 0;
 
 	rxcmp = (struct rx_pkt_cmpl *)
 	    &cpr->cp_desc_ring[cp_cons];
@@ -126,6 +173,9 @@  static uint16_t bnxt_rx_pkt(struct rte_mbuf **rx_pkt,
 	mbuf = rx_buf->mbuf;
 	rte_prefetch0(mbuf);
 
+	if (mbuf == NULL)
+		return -ENOMEM;
+
 	mbuf->nb_segs = 1;
 	mbuf->next = NULL;
 	mbuf->pkt_len = rxcmp->len;
@@ -139,6 +189,63 @@  static uint16_t bnxt_rx_pkt(struct rte_mbuf **rx_pkt,
 		mbuf->hash.fdir.id = rxcmp1->cfa_code;
 		mbuf->ol_flags |= PKT_RX_FDIR | PKT_RX_FDIR_ID;
 	}
+
+	agg_buf = (rxcmp->agg_bufs_v1 & RX_PKT_CMPL_AGG_BUFS_MASK)
+			>> RX_PKT_CMPL_AGG_BUFS_SFT;
+	if (agg_buf) {
+		cp_cons = RING_CMP(cpr->cp_ring_struct, tmp_raw_cons + agg_buf);
+		rxcmp = (struct rx_pkt_cmpl *)
+					&cpr->cp_desc_ring[cp_cons];
+		if (!CMP_VALID(rxcmp, tmp_raw_cons + agg_buf,
+			       cpr->cp_ring_struct))
+			return -EBUSY;
+		RTE_LOG(DEBUG, PMD, "JUMBO Frame %d. %x, agg_buf %x,\n",
+			mbuf->pkt_len, rxcmp->agg_bufs_v1,  agg_buf);
+	}
+
+	for (i = 0; i < agg_buf; i++) {
+		struct bnxt_sw_rx_bd *ag_buf;
+		struct rte_mbuf *ag_mbuf;
+		tmp_raw_cons = NEXT_RAW_CMP(tmp_raw_cons);
+		cp_cons = RING_CMP(cpr->cp_ring_struct, tmp_raw_cons);
+		rxcmp = (struct rx_pkt_cmpl *)
+					&cpr->cp_desc_ring[cp_cons];
+		ag_cons = rxcmp->opaque;
+		ag_buf = &rxr->ag_buf_ring[ag_cons];
+		ag_mbuf = ag_buf->mbuf;
+		ag_mbuf->nb_segs = 1;
+		ag_mbuf->data_len = rxcmp->len;
+
+		mbuf->nb_segs++;
+		mbuf->pkt_len += ag_mbuf->data_len;
+		if (mbuf->next == NULL) {
+			mbuf->next = ag_mbuf;
+		} else {
+			struct rte_mbuf *temp_mbuf = mbuf;
+
+			while (temp_mbuf->next != NULL)
+				temp_mbuf = temp_mbuf->next;
+			temp_mbuf->next = ag_mbuf;
+		}
+		ag_buf->mbuf = NULL;
+
+		ag_prod = RING_NEXT(rxr->ag_ring_struct, ag_prod);
+		if (bnxt_alloc_ag_data(rxq, rxr, ag_prod)) {
+			RTE_LOG(ERR, PMD,
+				"agg mbuf alloc failed: prod=0x%x\n",
+				ag_prod);
+			rc = -ENOMEM;
+		}
+		rxr->ag_prod = ag_prod;
+
+#ifdef BNXT_DEBUG
+		if (!CMP_VALID((struct cmpl_base *)
+			&cpr->cp_desc_ring[cp_cons], tmp_raw_cons,
+			cpr->cp_ring_struct))
+			return -EBUSY;
+#endif
+	}
+
 	if (rxcmp1->flags2 & RX_PKT_CMPL_FLAGS2_META_FORMAT_VLAN) {
 		mbuf->vlan_tci = rxcmp1->metadata &
 			(RX_PKT_CMPL_METADATA_VID_MASK |
@@ -148,13 +255,17 @@  static uint16_t bnxt_rx_pkt(struct rte_mbuf **rx_pkt,
 	}
 
 	rx_buf->mbuf = NULL;
+#ifdef BNXT_DEBUG
 	if (rxcmp1->errors_v2 & RX_CMP_L2_ERRORS) {
 		/* Re-install the mbuf back to the rx ring */
 		bnxt_reuse_rx_mbuf(rxr, cons, mbuf);
+		if (agg_buf)
+			bnxt_reuse_ag_mbuf(rxr, ag_cons, mbuf);
 
 		rc = -EIO;
 		goto next_rx;
 	}
+#endif
 	/*
 	 * TODO: Redesign this....
 	 * If the allocation fails, the packet does not get received.
@@ -170,24 +281,23 @@  static uint16_t bnxt_rx_pkt(struct rte_mbuf **rx_pkt,
 	 * calls in favour of a tight loop with the same function being called
 	 * in it.
 	 */
+	prod = RING_NEXT(rxr->rx_ring_struct, prod);
 	if (bnxt_alloc_rx_data(rxq, rxr, prod)) {
 		RTE_LOG(ERR, PMD, "mbuf alloc failed with prod=0x%x\n", prod);
 		rc = -ENOMEM;
-		goto next_rx;
+		//goto next_rx;
 	}
-
+	rxr->rx_prod = prod;
 	/*
 	 * All MBUFs are allocated with the same size under DPDK,
 	 * no optimization for rx_copy_thresh
 	 */
 
-	/* AGG buf operation is deferred */
-
-	/* EW - VLAN reception.  Must compare against the ol_flags */
-
 	*rx_pkt = mbuf;
+#ifdef BNXT_DEBUG
 next_rx:
-	rxr->rx_prod = RING_NEXT(rxr->rx_ring_struct, prod);
+#endif
+	//rxr->rx_prod = RING_NEXT(rxr->rx_ring_struct, prod);
 
 	*raw_cons = tmp_raw_cons;
 
@@ -203,8 +313,9 @@  uint16_t bnxt_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	uint32_t raw_cons = cpr->cp_raw_cons;
 	uint32_t cons;
 	int nb_rx_pkts = 0;
-	bool rx_event = false;
 	struct rx_pkt_cmpl *rxcmp;
+	uint16_t prod = rxr->rx_prod;
+	uint16_t ag_prod = rxr->ag_prod;
 
 	/* Handle RX burst request */
 	while (1) {
@@ -224,13 +335,13 @@  uint16_t bnxt_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 				nb_rx_pkts++;
 			else if (rc == -EBUSY)	/* partial completion */
 				break;
-			rx_event = true;
 		}
 		raw_cons = NEXT_RAW_CMP(raw_cons);
 		if (nb_rx_pkts == nb_pkts)
 			break;
 	}
-	if (raw_cons == cpr->cp_raw_cons) {
+
+	if (prod == rxr->rx_prod && ag_prod == rxr->ag_prod) {
 		/*
 		 * For PMD, there is no need to keep on pushing to REARM
 		 * the doorbell if there are no new completions
@@ -240,8 +351,9 @@  uint16_t bnxt_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 	cpr->cp_raw_cons = raw_cons;
 
 	B_CP_DIS_DB(cpr, cpr->cp_raw_cons);
-	if (rx_event)
-		B_RX_DB(rxr->rx_doorbell, rxr->rx_prod);
+	B_RX_DB(rxr->rx_doorbell, rxr->rx_prod);
+	/* Ring the AGG ring DB */
+	B_RX_DB(rxr->ag_doorbell, rxr->ag_prod);
 	return nb_rx_pkts;
 }
 
@@ -257,6 +369,12 @@  void bnxt_free_rx_rings(struct bnxt *bp)
 
 		bnxt_free_ring(rxq->rx_ring->rx_ring_struct);
 		rte_free(rxq->rx_ring->rx_ring_struct);
+
+		/* Free the Aggregator ring */
+		bnxt_free_ring(rxq->rx_ring->ag_ring_struct);
+		rte_free(rxq->rx_ring->ag_ring_struct);
+		rxq->rx_ring->ag_ring_struct = NULL;
+
 		rte_free(rxq->rx_ring);
 
 		bnxt_free_ring(rxq->cp_ring->cp_ring_struct);
@@ -270,13 +388,11 @@  void bnxt_free_rx_rings(struct bnxt *bp)
 
 int bnxt_init_rx_ring_struct(struct bnxt_rx_queue *rxq, unsigned int socket_id)
 {
-	struct bnxt *bp = rxq->bp;
 	struct bnxt_cp_ring_info *cpr;
 	struct bnxt_rx_ring_info *rxr;
 	struct bnxt_ring *ring;
 
-	rxq->rx_buf_use_size = bp->eth_dev->data->mtu +
-			       ETHER_HDR_LEN + ETHER_CRC_LEN +
+	rxq->rx_buf_use_size = BNXT_MAX_MTU + ETHER_HDR_LEN + ETHER_CRC_LEN +
 			       (2 * VLAN_TAG_SIZE);
 	rxq->rx_buf_size = rxq->rx_buf_use_size + sizeof(struct rte_mbuf);
 
@@ -313,13 +429,29 @@  int bnxt_init_rx_ring_struct(struct bnxt_rx_queue *rxq, unsigned int socket_id)
 	if (ring == NULL)
 		return -ENOMEM;
 	cpr->cp_ring_struct = ring;
-	ring->ring_size = rxr->rx_ring_struct->ring_size * 2;
+	ring->ring_size = rte_align32pow2(rxr->rx_ring_struct->ring_size *
+					  (2 + AGG_RING_SIZE_FACTOR));
 	ring->ring_mask = ring->ring_size - 1;
 	ring->bd = (void *)cpr->cp_desc_ring;
 	ring->bd_dma = cpr->cp_desc_mapping;
 	ring->vmem_size = 0;
 	ring->vmem = NULL;
 
+	/* Allocate Aggregator rings */
+	ring = rte_zmalloc_socket("bnxt_rx_ring_struct",
+				   sizeof(struct bnxt_ring),
+				   RTE_CACHE_LINE_SIZE, socket_id);
+	if (ring == NULL)
+		return -ENOMEM;
+	rxr->ag_ring_struct = ring;
+	ring->ring_size = rte_align32pow2(rxq->nb_rx_desc *
+					  AGG_RING_SIZE_FACTOR);
+	ring->ring_mask = ring->ring_size - 1;
+	ring->bd = (void *)rxr->ag_desc_ring;
+	ring->bd_dma = rxr->ag_desc_mapping;
+	ring->vmem_size = ring->ring_size * sizeof(struct bnxt_sw_rx_bd);
+	ring->vmem = (void **)&rxr->ag_buf_ring;
+
 	return 0;
 }
 
@@ -332,8 +464,8 @@  static void bnxt_init_rxbds(struct bnxt_ring *ring, uint32_t type,
 	if (!rx_bd_ring)
 		return;
 	for (j = 0; j < ring->ring_size; j++) {
-		rx_bd_ring[j].flags_type = type;
-		rx_bd_ring[j].len = len;
+		rx_bd_ring[j].flags_type = rte_cpu_to_le_16(type);
+		rx_bd_ring[j].len = rte_cpu_to_le_16(len);
 		rx_bd_ring[j].opaque = j;
 	}
 }
@@ -344,12 +476,17 @@  int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq)
 	struct bnxt_ring *ring;
 	uint32_t prod, type;
 	unsigned int i;
+	uint16_t size;
+
+	size = rte_pktmbuf_data_room_size(rxq->mb_pool) - RTE_PKTMBUF_HEADROOM;
+	if (rxq->rx_buf_use_size <= size)
+		size = rxq->rx_buf_use_size;
 
-	type = RX_PROD_PKT_BD_TYPE_RX_PROD_PKT | RX_PROD_PKT_BD_FLAGS_EOP_PAD;
+	type = RX_PROD_PKT_BD_TYPE_RX_PROD_PKT;
 
 	rxr = rxq->rx_ring;
 	ring = rxr->rx_ring_struct;
-	bnxt_init_rxbds(ring, type, rxq->rx_buf_use_size);
+	bnxt_init_rxbds(ring, type, size);
 
 	prod = rxr->rx_prod;
 	for (i = 0; i < ring->ring_size; i++) {
@@ -362,6 +499,24 @@  int bnxt_init_one_rx_ring(struct bnxt_rx_queue *rxq)
 		rxr->rx_prod = prod;
 		prod = RING_NEXT(rxr->rx_ring_struct, prod);
 	}
+	RTE_LOG(DEBUG, PMD, "%s\n", __func__);
+
+	ring = rxr->ag_ring_struct;
+	type = RX_PROD_AGG_BD_TYPE_RX_PROD_AGG;
+	bnxt_init_rxbds(ring, type, size);
+	prod = rxr->ag_prod;
+
+	for (i = 0; i < ring->ring_size; i++) {
+		if (bnxt_alloc_ag_data(rxq, rxr, prod) != 0) {
+			RTE_LOG(WARNING, PMD,
+			"init'ed AG ring %d with %d/%d mbufs only\n",
+			rxq->queue_id, i, ring->ring_size);
+			break;
+		}
+		rxr->ag_prod = prod;
+		prod = RING_NEXT(rxr->ag_ring_struct, prod);
+	}
+	RTE_LOG(DEBUG, PMD, "%s AGG Done!\n", __func__);
 
 	return 0;
 }
diff --git a/drivers/net/bnxt/bnxt_rxr.h b/drivers/net/bnxt/bnxt_rxr.h
index f766b26..e104fbd 100644
--- a/drivers/net/bnxt/bnxt_rxr.h
+++ b/drivers/net/bnxt/bnxt_rxr.h
@@ -43,14 +43,20 @@  struct bnxt_sw_rx_bd {
 
 struct bnxt_rx_ring_info {
 	uint16_t		rx_prod;
+	uint16_t		ag_prod;
 	void			*rx_doorbell;
+	void			*ag_doorbell;
 
 	struct rx_prod_pkt_bd	*rx_desc_ring;
+	struct rx_prod_pkt_bd	*ag_desc_ring;
 	struct bnxt_sw_rx_bd	*rx_buf_ring; /* sw ring */
+	struct bnxt_sw_rx_bd	*ag_buf_ring; /* sw ring */
 
 	phys_addr_t		rx_desc_mapping;
+	phys_addr_t		ag_desc_mapping;
 
 	struct bnxt_ring	*rx_ring_struct;
+	struct bnxt_ring	*ag_ring_struct;
 };
 
 uint16_t bnxt_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
diff --git a/drivers/net/bnxt/hsi_struct_def_dpdk.h b/drivers/net/bnxt/hsi_struct_def_dpdk.h
index 518f16d..2fcfce6 100644
--- a/drivers/net/bnxt/hsi_struct_def_dpdk.h
+++ b/drivers/net/bnxt/hsi_struct_def_dpdk.h
@@ -7252,7 +7252,6 @@  struct hwrm_vnic_tpa_cfg_output {
 	 */
 } __attribute__((packed));
 
-
 /* hwrm_ring_alloc */
 /*
  * Description: This command allocates and does basic preparation for a ring.
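
As a sanity check of the frame-size arithmetic used by
bnxt_mtu_set_op() and BNXT_MAX_MTU above, a standalone sketch with the
constants copied from the patch and DPDK's rte_ether.h:

#include <assert.h>

#define ETHER_HDR_LEN   14
#define ETHER_CRC_LEN   4
#define VLAN_TAG_SIZE   4
#define BNXT_MAX_MTU    9500

int main(void)
{
        unsigned int max_rx_pkt_len = BNXT_MAX_MTU + ETHER_HDR_LEN +
                                      ETHER_CRC_LEN + VLAN_TAG_SIZE * 2;

        /* 9500 + 14 + 4 + 8 = 9526 bytes on the wire */
        assert(max_rx_pkt_len == 9526);
        return 0;
}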