From patchwork Mon Aug 15 07:31:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Qi Zhang X-Patchwork-Id: 115033 X-Patchwork-Delegate: qi.z.zhang@intel.com Return-Path: X-Original-To: patchwork@inbox.dpdk.org Delivered-To: patchwork@inbox.dpdk.org Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id F0E1EA00C3; Mon, 15 Aug 2022 01:22:44 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 4FA4742B88; Mon, 15 Aug 2022 01:22:25 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by mails.dpdk.org (Postfix) with ESMTP id B0CEE42905; Mon, 15 Aug 2022 01:22:22 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1660519343; x=1692055343; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=3LarfuzS9mghQcNBkML3FS9NUQeDZieUUzSMokegb/c=; b=bVIaLKpM/8mr/3+CN0BXcAAN0m25QiAkgiC+lpCa1qxbVQ09jF1Ct5ew QBF7/jOV2bMa6DdLi+/H8uYC/1HfYS+wwR1m+J4hRzgXwlUat5W7Df8Bu LHt1EyJNku/tajT5W25U+ov5BdjGADHMqAU+r2NZFlPFmWW55attm8hPf GIHU1P7rgKk536D0r+chzaKOAqb8iyQsLPZwVh06J2NjUu9g4Q2AKUimN khCZGUK+Xjz0VSMIcq7QLy97O92OwVph5iFSgvpbz0xnpubZe3SR6X+jw 6vyk3fJ3pohI4dBLBtGJ3CDUqlHkz4BeSNdKtYPcr0wSQ1cU296URpVwk Q==; X-IronPort-AV: E=McAfee;i="6400,9594,10439"; a="291857916" X-IronPort-AV: E=Sophos;i="5.93,237,1654585200"; d="scan'208";a="291857916" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 14 Aug 2022 16:22:22 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.93,237,1654585200"; d="scan'208";a="635282998" Received: from dpdk-qzhan15-test02.sh.intel.com ([10.67.115.4]) by orsmga008.jf.intel.com with ESMTP; 14 Aug 2022 16:22:20 -0700 From: Qi Zhang To: qiming.yang@intel.com Cc: dev@dpdk.org, Qi Zhang , stable@dpdk.org, Jacob Keller Subject: [PATCH v2 05/70] net/ice/base: fix incorrect division during E822 PTP init Date: Mon, 15 Aug 2022 03:31:01 -0400 Message-Id: <20220815073206.2917968-6-qi.z.zhang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20220815073206.2917968-1-qi.z.zhang@intel.com> References: <20220815071306.2910599-1-qi.z.zhang@intel.com> <20220815073206.2917968-1-qi.z.zhang@intel.com> MIME-Version: 1.0 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org When initializing the device hardware for PTP, the E822 devices requirea number of values to be calculated and programmed to hardware.These values are calculated using unsigned 64-bit division. The DIV_64BIT macro currently translates into a specific Linux functionthat triggers a *signed* division. This produces incorrect results when operating on a dividend larger than an s64. The division calculation effectively overflows and results in totally unexpected behavior. In this case, the UIX value for 10Gb/40Gb link speeds are calculated incorrectly. This ultimately cascades into a failure of the Tx timestamps. Specifically, the reported Tx timestamps become wildly inaccurate and not representing nominal time. The root cause of this bug is the assumption that DIV_64BIT can correctly handle both signed and unsigned division. In fact the entire reason we need this is because the Linux kernel compilation target does not provide native 64 bit division ops, and requires explicit use of kernel functions which explicitly do either signed or unsigned division. To correctly solve this, introduce new functions, DIV_U64 and DIV_S64 which are specifically intended for signed or unsigned division. To help catch issues, use static inline functions so that we get strict type checking. Fixes: 97f4f78bbd9f ("net/ice/base: add functions for device clock control") Cc: stable@dpdk.org Signed-off-by: Jacob Keller Signed-off-by: Qi Zhang --- drivers/net/ice/base/ice_ptp_hw.c | 56 +++++++++++++++---------------- drivers/net/ice/base/ice_sched.c | 24 ++++++------- drivers/net/ice/base/ice_type.h | 30 +++++++++++++++-- 3 files changed, 68 insertions(+), 42 deletions(-) diff --git a/drivers/net/ice/base/ice_ptp_hw.c b/drivers/net/ice/base/ice_ptp_hw.c index 632a3f5bae..76119364e4 100644 --- a/drivers/net/ice/base/ice_ptp_hw.c +++ b/drivers/net/ice/base/ice_ptp_hw.c @@ -1634,7 +1634,7 @@ static enum ice_status ice_phy_cfg_uix_e822(struct ice_hw *hw, u8 port) #define LINE_UI_25G_100G 256 /* 6600 UIs is 256 nanoseconds at 25Gb/100Gb */ /* Program the 10Gb/40Gb conversion ratio */ - uix = DIV_64BIT(tu_per_sec * LINE_UI_10G_40G, 390625000); + uix = DIV_U64(tu_per_sec * LINE_UI_10G_40G, 390625000); status = ice_write_64b_phy_reg_e822(hw, port, P_REG_UIX66_10G_40G_L, uix); @@ -1645,7 +1645,7 @@ static enum ice_status ice_phy_cfg_uix_e822(struct ice_hw *hw, u8 port) } /* Program the 25Gb/100Gb conversion ratio */ - uix = DIV_64BIT(tu_per_sec * LINE_UI_25G_100G, 390625000); + uix = DIV_U64(tu_per_sec * LINE_UI_25G_100G, 390625000); status = ice_write_64b_phy_reg_e822(hw, port, P_REG_UIX66_25G_100G_L, uix); @@ -1727,8 +1727,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_PAR_TX_TUS */ if (e822_vernier[link_spd].tx_par_clk) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].tx_par_clk); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].tx_par_clk); else phy_tus = 0; @@ -1739,8 +1739,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_PAR_RX_TUS */ if (e822_vernier[link_spd].rx_par_clk) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].rx_par_clk); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].rx_par_clk); else phy_tus = 0; @@ -1751,8 +1751,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_PCS_TX_TUS */ if (e822_vernier[link_spd].tx_pcs_clk) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].tx_pcs_clk); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].tx_pcs_clk); else phy_tus = 0; @@ -1763,8 +1763,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_PCS_RX_TUS */ if (e822_vernier[link_spd].rx_pcs_clk) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].rx_pcs_clk); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].rx_pcs_clk); else phy_tus = 0; @@ -1775,8 +1775,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_DESK_PAR_TX_TUS */ if (e822_vernier[link_spd].tx_desk_rsgb_par) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].tx_desk_rsgb_par); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].tx_desk_rsgb_par); else phy_tus = 0; @@ -1787,8 +1787,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_DESK_PAR_RX_TUS */ if (e822_vernier[link_spd].rx_desk_rsgb_par) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].rx_desk_rsgb_par); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].rx_desk_rsgb_par); else phy_tus = 0; @@ -1799,8 +1799,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_DESK_PCS_TX_TUS */ if (e822_vernier[link_spd].tx_desk_rsgb_pcs) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].tx_desk_rsgb_pcs); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].tx_desk_rsgb_pcs); else phy_tus = 0; @@ -1811,8 +1811,8 @@ static enum ice_status ice_phy_cfg_parpcs_e822(struct ice_hw *hw, u8 port) /* P_REG_DESK_PCS_RX_TUS */ if (e822_vernier[link_spd].rx_desk_rsgb_pcs) - phy_tus = DIV_64BIT(tu_per_sec, - e822_vernier[link_spd].rx_desk_rsgb_pcs); + phy_tus = DIV_U64(tu_per_sec, + e822_vernier[link_spd].rx_desk_rsgb_pcs); else phy_tus = 0; @@ -1844,9 +1844,9 @@ ice_calc_fixed_tx_offset_e822(struct ice_hw *hw, enum ice_ptp_link_spd link_spd) * overflows 64 bit integer arithmetic, so break it up into two * divisions by 1e4 first then by 1e7. */ - fixed_offset = DIV_64BIT(tu_per_sec, 10000); + fixed_offset = DIV_U64(tu_per_sec, 10000); fixed_offset *= e822_vernier[link_spd].tx_fixed_delay; - fixed_offset = DIV_64BIT(fixed_offset, 10000000); + fixed_offset = DIV_U64(fixed_offset, 10000000); return fixed_offset; } @@ -2074,9 +2074,9 @@ ice_phy_calc_pmd_adj_e822(struct ice_hw *hw, u8 port, * divide by 125, and then handle remaining divisor based on the link * speed pmd_adj_divisor value. */ - adj = DIV_64BIT(tu_per_sec, 125); + adj = DIV_U64(tu_per_sec, 125); adj *= mult; - adj = DIV_64BIT(adj, pmd_adj_divisor); + adj = DIV_U64(adj, pmd_adj_divisor); /* Finally, for 25G-RS and 50G-RS, a further adjustment for the Rx * cycle count is necessary. @@ -2097,9 +2097,9 @@ ice_phy_calc_pmd_adj_e822(struct ice_hw *hw, u8 port, if (rx_cycle) { mult = (4 - rx_cycle) * 40; - cycle_adj = DIV_64BIT(tu_per_sec, 125); + cycle_adj = DIV_U64(tu_per_sec, 125); cycle_adj *= mult; - cycle_adj = DIV_64BIT(cycle_adj, pmd_adj_divisor); + cycle_adj = DIV_U64(cycle_adj, pmd_adj_divisor); adj += cycle_adj; } @@ -2119,9 +2119,9 @@ ice_phy_calc_pmd_adj_e822(struct ice_hw *hw, u8 port, if (rx_cycle) { mult = rx_cycle * 40; - cycle_adj = DIV_64BIT(tu_per_sec, 125); + cycle_adj = DIV_U64(tu_per_sec, 125); cycle_adj *= mult; - cycle_adj = DIV_64BIT(cycle_adj, pmd_adj_divisor); + cycle_adj = DIV_U64(cycle_adj, pmd_adj_divisor); adj += cycle_adj; } @@ -2157,9 +2157,9 @@ ice_calc_fixed_rx_offset_e822(struct ice_hw *hw, enum ice_ptp_link_spd link_spd) * overflows 64 bit integer arithmetic, so break it up into two * divisions by 1e4 first then by 1e7. */ - fixed_offset = DIV_64BIT(tu_per_sec, 10000); + fixed_offset = DIV_U64(tu_per_sec, 10000); fixed_offset *= e822_vernier[link_spd].rx_fixed_delay; - fixed_offset = DIV_64BIT(fixed_offset, 10000000); + fixed_offset = DIV_U64(fixed_offset, 10000000); return fixed_offset; } diff --git a/drivers/net/ice/base/ice_sched.c b/drivers/net/ice/base/ice_sched.c index 1b060d3567..71b5677f43 100644 --- a/drivers/net/ice/base/ice_sched.c +++ b/drivers/net/ice/base/ice_sched.c @@ -3916,8 +3916,8 @@ static u16 ice_sched_calc_wakeup(struct ice_hw *hw, s32 bw) u16 wakeup = 0; /* Get the wakeup integer value */ - bytes_per_sec = DIV_64BIT(((s64)bw * 1000), BITS_PER_BYTE); - wakeup_int = DIV_64BIT(hw->psm_clk_freq, bytes_per_sec); + bytes_per_sec = DIV_S64((s64)bw * 1000, BITS_PER_BYTE); + wakeup_int = DIV_S64(hw->psm_clk_freq, bytes_per_sec); if (wakeup_int > 63) { wakeup = (u16)((1 << 15) | wakeup_int); } else { @@ -3925,18 +3925,18 @@ static u16 ice_sched_calc_wakeup(struct ice_hw *hw, s32 bw) * Convert Integer value to a constant multiplier */ wakeup_b = (s64)ICE_RL_PROF_MULTIPLIER * wakeup_int; - wakeup_a = DIV_64BIT((s64)ICE_RL_PROF_MULTIPLIER * - hw->psm_clk_freq, bytes_per_sec); + wakeup_a = DIV_S64((s64)ICE_RL_PROF_MULTIPLIER * + hw->psm_clk_freq, bytes_per_sec); /* Get Fraction value */ wakeup_f = wakeup_a - wakeup_b; /* Round up the Fractional value via Ceil(Fractional value) */ - if (wakeup_f > DIV_64BIT(ICE_RL_PROF_MULTIPLIER, 2)) + if (wakeup_f > DIV_S64(ICE_RL_PROF_MULTIPLIER, 2)) wakeup_f += 1; - wakeup_f_int = (s32)DIV_64BIT(wakeup_f * ICE_RL_PROF_FRACTION, - ICE_RL_PROF_MULTIPLIER); + wakeup_f_int = (s32)DIV_S64(wakeup_f * ICE_RL_PROF_FRACTION, + ICE_RL_PROF_MULTIPLIER); wakeup |= (u16)(wakeup_int << 9); wakeup |= (u16)(0x1ff & wakeup_f_int); } @@ -3968,20 +3968,20 @@ ice_sched_bw_to_rl_profile(struct ice_hw *hw, u32 bw, return status; /* Bytes per second from Kbps */ - bytes_per_sec = DIV_64BIT(((s64)bw * 1000), BITS_PER_BYTE); + bytes_per_sec = DIV_S64((s64)bw * 1000, BITS_PER_BYTE); /* encode is 6 bits but really useful are 5 bits */ for (i = 0; i < 64; i++) { u64 pow_result = BIT_ULL(i); - ts_rate = DIV_64BIT((s64)hw->psm_clk_freq, - pow_result * ICE_RL_PROF_TS_MULTIPLIER); + ts_rate = DIV_S64((s64)hw->psm_clk_freq, + pow_result * ICE_RL_PROF_TS_MULTIPLIER); if (ts_rate <= 0) continue; /* Multiplier value */ - mv_tmp = DIV_64BIT(bytes_per_sec * ICE_RL_PROF_MULTIPLIER, - ts_rate); + mv_tmp = DIV_S64(bytes_per_sec * ICE_RL_PROF_MULTIPLIER, + ts_rate); /* Round to the nearest ICE_RL_PROF_MULTIPLIER */ mv = round_up_64bit(mv_tmp, ICE_RL_PROF_MULTIPLIER); diff --git a/drivers/net/ice/base/ice_type.h b/drivers/net/ice/base/ice_type.h index d4d0cab089..3da3de38af 100644 --- a/drivers/net/ice/base/ice_type.h +++ b/drivers/net/ice/base/ice_type.h @@ -87,11 +87,37 @@ static inline bool ice_is_tc_ena(ice_bitmap_t bitmap, u8 tc) return ice_is_bit_set(&bitmap, tc); } -#define DIV_64BIT(n, d) ((n) / (d)) +/** + * DIV_S64 - Divide signed 64-bit value with signed 64-bit divisor + * @dividend: value to divide + * @divisor: value to divide by + * + * Use DIV_S64 for any 64-bit divide which operates on signed 64-bit dividends. + * Do not use this for unsigned 64-bit dividends as it will not produce + * correct results if the dividend is larger than S64_MAX. + */ +static inline s64 DIV_S64(s64 dividend, s64 divisor) +{ + return dividend / divisor; +} + +/** + * DIV_U64 - Divide unsigned 64-bit value by unsigned 64-bit divisor + * @dividend: value to divide + * @divisor: value to divide by + * + * Use DIV_U64 for any 64-bit divide which operates on unsigned 64-bit + * dividends. Do not use this for signed 64-bit dividends as it will not + * handle negative values correctly. + */ +static inline u64 DIV_U64(u64 dividend, u64 divisor) +{ + return dividend / divisor; +} static inline u64 round_up_64bit(u64 a, u32 b) { - return DIV_64BIT(((a) + (b) / 2), (b)); + return DIV_U64(((a) + (b) / 2), (b)); } static inline u32 ice_round_to_num(u32 N, u32 R)