From patchwork Thu Nov 30 09:04:56 2017
X-Patchwork-Submitter: Alan Dewar
X-Patchwork-Id: 31783
From: alangordondewar@gmail.com
X-Google-Original-From: alan.dewar@att.com
To: cristian.dumitrescu@intel.com
Cc: dev@dpdk.org, Alan Dewar
Date: Thu, 30 Nov 2017 09:04:56 +0000
Message-Id: <1512032696-30765-1-git-send-email-alan.dewar@att.com>
X-Mailer: git-send-email 2.1.4
Subject: [dpdk-dev] [PATCH v2] sched: fix overflow errors in WRR weighting code

From: Alan Dewar

Revised patch - this version fixes an issue where a small wrr_cost is
shifted so far right that its value becomes zero.

The WRR code calculates the lowest common denominator between the four
WRR weights as a uint32_t, divides the LCD by each of the WRR weights,
and casts the results to uint8_t.  This cast can distort the ratios of
the computed WRR costs.  For example, with WRR weights of 3, 5, 7 and
11, the LCD is computed to be 1155.  The WRR costs are computed as
1155/3 = 385, 1155/5 = 231, 1155/7 = 165 and 1155/11 = 105.  When the
value 385 is cast to a uint8_t it ends up as 129.

Rather than casting straight to uint8_t, this patch shifts the
computed WRR costs right so that the largest value is only eight bits
wide.

In grinder_schedule, the packet length is multiplied by the WRR cost
and added to the grinder's wrr_tokens value.  The grinder's wrr_tokens
field is a uint16_t, so the combination of a 1500-byte packet length
and a WRR cost of 44 overflows this field on the very first packet.
This patch increases the width of the grinder's wrr_tokens and
wrr_mask fields from uint16_t to uint32_t.

In grinder_wrr_store, the remaining tokens in the grinder's wrr_tokens
array are copied back into the appropriate pipe's wrr_tokens array.
However, the pipe's wrr_tokens array is only a uint8_t array, so
unused tokens were quite frequently lost, which upsets the balance of
traffic across the four WRR queues.  This patch increases the width of
the pipe's wrr_tokens array from uint8_t to uint32_t.

Signed-off-by: Alan Dewar
Reviewed-by: Luca Boccassi
---
v2 - fixed bug in the wrr_cost calculation code that could result in a
zero wrr_cost

 lib/librte_sched/rte_sched.c        | 59 +++++++++++++++++++++++++++++--------
 lib/librte_sched/rte_sched_common.h | 15 ++++++++++
 2 files changed, 61 insertions(+), 13 deletions(-)

diff --git a/lib/librte_sched/rte_sched.c b/lib/librte_sched/rte_sched.c
index 7252f85..324743d 100644
--- a/lib/librte_sched/rte_sched.c
+++ b/lib/librte_sched/rte_sched.c
@@ -130,7 +130,7 @@ struct rte_sched_pipe {
 	uint32_t tc_credits[RTE_SCHED_TRAFFIC_CLASSES_PER_PIPE];
 
 	/* Weighted Round Robin (WRR) */
-	uint8_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
+	uint32_t wrr_tokens[RTE_SCHED_QUEUES_PER_PIPE];
 
 	/* TC oversubscription */
 	uint32_t tc_ov_credits;
@@ -205,8 +205,8 @@ struct rte_sched_grinder {
 	struct rte_mbuf *pkt;
 
 	/* WRR */
-	uint16_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
-	uint16_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
+	uint32_t wrr_tokens[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
+	uint32_t wrr_mask[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
 	uint8_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
 };
 
@@ -542,6 +542,17 @@ rte_sched_time_ms_to_bytes(uint32_t time_ms, uint32_t rate)
 	return time;
 }
 
+static uint32_t rte_sched_reduce_to_byte(uint32_t value)
+{
+	uint32_t shift = 0;
+
+	while (value & 0xFFFFFF00) {
+		value >>= 1;
+		shift++;
+	}
+	return shift;
+}
+
 static void
 rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port,
 	struct rte_sched_port_params *params)
 {
@@ -583,6 +594,8 @@ rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port, struct rte
 		uint32_t wrr_cost[RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS];
 		uint32_t lcd, lcd1, lcd2;
 		uint32_t qindex;
+		uint32_t low_pos;
+		uint32_t shift;
 
 		qindex = j * RTE_SCHED_QUEUES_PER_TRAFFIC_CLASS;
@@ -594,12 +607,28 @@ rte_sched_port_config_pipe_profile_table(struct rte_sched_port *port, struct rte
 		lcd1 = rte_get_lcd(wrr_cost[0], wrr_cost[1]);
 		lcd2 = rte_get_lcd(wrr_cost[2], wrr_cost[3]);
 		lcd = rte_get_lcd(lcd1, lcd2);
 
+		low_pos = rte_min_pos_4_u32(wrr_cost);
 		wrr_cost[0] = lcd / wrr_cost[0];
 		wrr_cost[1] = lcd / wrr_cost[1];
 		wrr_cost[2] = lcd / wrr_cost[2];
 		wrr_cost[3] = lcd / wrr_cost[3];
 
+		shift = rte_sched_reduce_to_byte(wrr_cost[low_pos]);
+		wrr_cost[0] >>= shift;
+		wrr_cost[1] >>= shift;
+		wrr_cost[2] >>= shift;
+		wrr_cost[3] >>= shift;
+
+		if (wrr_cost[0] == 0)
+			wrr_cost[0]++;
+		if (wrr_cost[1] == 0)
+			wrr_cost[1]++;
+		if (wrr_cost[2] == 0)
+			wrr_cost[2]++;
+		if (wrr_cost[3] == 0)
+			wrr_cost[3]++;
+
 		dst->wrr_cost[qindex] = (uint8_t) wrr_cost[0];
 		dst->wrr_cost[qindex + 1] = (uint8_t) wrr_cost[1];
 		dst->wrr_cost[qindex + 2] = (uint8_t) wrr_cost[2];
@@ -1941,15 +1970,19 @@ grinder_wrr_load(struct rte_sched_port *port, uint32_t pos)
 
 	qindex = tc_index * 4;
 
-	grinder->wrr_tokens[0] = ((uint16_t) pipe->wrr_tokens[qindex]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[1] = ((uint16_t) pipe->wrr_tokens[qindex + 1]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[2] = ((uint16_t) pipe->wrr_tokens[qindex + 2]) << RTE_SCHED_WRR_SHIFT;
-	grinder->wrr_tokens[3] = ((uint16_t) pipe->wrr_tokens[qindex + 3]) << RTE_SCHED_WRR_SHIFT;
+	grinder->wrr_tokens[0] = pipe->wrr_tokens[qindex] <<
+		RTE_SCHED_WRR_SHIFT;
+	grinder->wrr_tokens[1] = pipe->wrr_tokens[qindex + 1] <<
+		RTE_SCHED_WRR_SHIFT;
+	grinder->wrr_tokens[2] = pipe->wrr_tokens[qindex + 2] <<
+		RTE_SCHED_WRR_SHIFT;
+	grinder->wrr_tokens[3] = pipe->wrr_tokens[qindex + 3] <<
+		RTE_SCHED_WRR_SHIFT;
 
-	grinder->wrr_mask[0] = (qmask & 0x1) * 0xFFFF;
-	grinder->wrr_mask[1] = ((qmask >> 1) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[2] = ((qmask >> 2) & 0x1) * 0xFFFF;
-	grinder->wrr_mask[3] = ((qmask >> 3) & 0x1) * 0xFFFF;
+	grinder->wrr_mask[0] = (qmask & 0x1) * 0xFFFFFFFF;
+	grinder->wrr_mask[1] = ((qmask >> 1) & 0x1) * 0xFFFFFFFF;
+	grinder->wrr_mask[2] = ((qmask >> 2) & 0x1) * 0xFFFFFFFF;
+	grinder->wrr_mask[3] = ((qmask >> 3) & 0x1) * 0xFFFFFFFF;
 
 	grinder->wrr_cost[0] = pipe_params->wrr_cost[qindex];
 	grinder->wrr_cost[1] = pipe_params->wrr_cost[qindex + 1];
@@ -1981,14 +2014,14 @@ static inline void
 grinder_wrr(struct rte_sched_port *port, uint32_t pos)
 {
 	struct rte_sched_grinder *grinder = port->grinder + pos;
-	uint16_t wrr_tokens_min;
+	uint32_t wrr_tokens_min;
 
 	grinder->wrr_tokens[0] |= ~grinder->wrr_mask[0];
 	grinder->wrr_tokens[1] |= ~grinder->wrr_mask[1];
 	grinder->wrr_tokens[2] |= ~grinder->wrr_mask[2];
 	grinder->wrr_tokens[3] |= ~grinder->wrr_mask[3];
 
-	grinder->qpos = rte_min_pos_4_u16(grinder->wrr_tokens);
+	grinder->qpos = rte_min_pos_4_u32(grinder->wrr_tokens);
 	wrr_tokens_min = grinder->wrr_tokens[grinder->qpos];
 
 	grinder->wrr_tokens[0] -= wrr_tokens_min;
diff --git a/lib/librte_sched/rte_sched_common.h b/lib/librte_sched/rte_sched_common.h
index aed144b..a3c6bc2 100644
--- a/lib/librte_sched/rte_sched_common.h
+++ b/lib/librte_sched/rte_sched_common.h
@@ -77,6 +77,21 @@ rte_min_pos_4_u16(uint16_t *x)
 
 	return pos0;
 }
 
+static inline uint32_t
+rte_min_pos_4_u32(uint32_t *x)
+{
+	uint32_t pos0 = 0;
+	uint32_t pos1 = 2;
+
+	if (x[1] <= x[0])
+		pos0 = 1;
+	if (x[3] <= x[2])
+		pos1 = 3;
+	if (x[pos1] <= x[pos0])
+		pos0 = pos1;
+
+	return pos0;
+}
 
 #endif /*