Message ID | 20230613165845.19109-1-viacheslavo@nvidia.com (mailing list archive) |
---|---|
Headers |
From: Viacheslav Ovsiienko <viacheslavo@nvidia.com> To: <dev@dpdk.org> Subject: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing Date: Tue, 13 Jun 2023 19:58:40 +0300 Message-ID: <20230613165845.19109-1-viacheslavo@nvidia.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: <20230420100803.494-1-viacheslavo@nvidia.com> References: <20230420100803.494-1-viacheslavo@nvidia.com> MIME-Version: 1.0 Content-Type: text/plain List-Id: DPDK patches and discussions <dev.dpdk.org> |
Series |
net/mlx5: introduce Tx datapath tracing
|
|
Message
Slava Ovsiienko
June 13, 2023, 4:58 p.m. UTC
The mlx5 driver provides send scheduling at a specific moment of time,
and for applications relying on this feature it would be extremely useful
to have extra debug information: when and how packets were scheduled
and when the actual sending was completed by the NIC hardware (this helps
the application to track internal delay issues).
Because the DPDK Tx datapath API does not provide any feedback
from the driver, and the feature appears to be mlx5-specific, it seems
reasonable to engage the existing DPDK datapath tracing capability.
The work cycle is supposed to be:
- compile the application with tracing enabled
- run the application with EAL parameters configuring the tracing in the mlx5
Tx datapath
- store the dump file with the gathered tracing information
- run the analyzing script (in Python) to combine related events (packet
firing and completion) and see the data in a human-readable view
Below are the detailed "how to" instructions for gathering all the debug
data with an mlx5 NIC, including the full timing information.
1. Build the DPDK application with datapath tracing enabled
The meson option should be specified:
-Denable_trace_fp=true
The c_args should be specified:
-DALLOW_EXPERIMENTAL_API
The DPDK configuration examples:
meson configure --buildtype=debug -Denable_trace_fp=true \
-Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
meson configure --buildtype=debug -Denable_trace_fp=true \
-Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
meson configure --buildtype=release -Denable_trace_fp=true \
-Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API' build
meson configure --buildtype=release -Denable_trace_fp=true \
-Dc_args='-DALLOW_EXPERIMENTAL_API' build
2. Configuring the NIC
If the send completion timings are important, the NIC should be configured
to provide real-time timestamps: the REAL_TIME_CLOCK_ENABLE NV settings parameter
should be set to TRUE, for example with the following command (followed by a
FW/driver reset):
sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s REAL_TIME_CLOCK_ENABLE=1
3. Run the DPDK application to gather the traces
EAL parameters controlling the trace capability at runtime:
--trace=pmd.net.mlx5.tx - the regular expression enabling the tracepoints
with matching names; at least "pmd.net.mlx5.tx"
must be enabled to gather all events needed
to analyze the mlx5 Tx datapath and its timings.
By default all tracepoints are disabled.
--trace-dir=/var/log - trace storing directory
--trace-bufsz=<val>B|<val>K|<val>M - optional, trace data buffer size
per thread. The default is 1MB.
--trace-mode=overwrite|discard - optional, selects trace data buffer mode.
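
As a rough illustration of how the options above fit together, the following Python sketch assembles them into an argument list that could be appended to an application's EAL command line. The helper name build_trace_eal_args and its parameters are hypothetical; the size check only mirrors the <val>B|<val>K|<val>M format described above and is not taken from the EAL sources.

```python
import re

# Size format for --trace-bufsz as described above: <val>B, <val>K or <val>M.
_BUFSZ_RE = re.compile(r"^[0-9]+[BKM]$")


def build_trace_eal_args(trace_dir, pattern="pmd.net.mlx5.tx", bufsz=None, mode=None):
    """Return the EAL trace options as a list of strings (hypothetical helper)."""
    # --trace and --trace-dir are always emitted; the other two are optional.
    args = [f"--trace={pattern}", f"--trace-dir={trace_dir}"]
    if bufsz is not None:
        if not _BUFSZ_RE.match(bufsz):
            raise ValueError(f"unexpected buffer size: {bufsz!r}")
        args.append(f"--trace-bufsz={bufsz}")
    if mode is not None:
        if mode not in ("overwrite", "discard"):
            raise ValueError(f"unexpected trace mode: {mode!r}")
        args.append(f"--trace-mode={mode}")
    return args
```

For instance, build_trace_eal_args("/var/log", bufsz="2M", mode="overwrite") yields the four options in the order they are listed above.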
4. Installing or Building the Babeltrace2 Package
The gathered trace data can be analyzed with the provided Python script.
To parse the trace data, the script uses the Babeltrace2 library.
The package should be either installed or built from source code as
shown below:
git clone https://github.com/efficios/babeltrace.git
cd babeltrace
./bootstrap
./configure --help
./configure --disable-api-doc --disable-man-pages \
--disable-python-bindings-doc --enable-python-plugins \
--enable-python-bindings
5. Running the Analyzing Script
The analyzing script is located in the folder: ./drivers/net/mlx5/tools
It requires Python 3.6 and the Babeltrace2 package, and it takes a single
parameter: the trace data file. For example:
./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39
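
To check that a gathered trace is readable before running the analyzing script, a minimal sketch using the Babeltrace2 Python bindings might look as follows. This is not the mlx5_trace.py implementation; the function names are hypothetical, and the sketch only counts events per tracepoint name, assuming the bt2 module is installed.

```python
from collections import Counter


def count_events(names):
    """Count how many times each event name occurs."""
    return Counter(names)


def iter_event_names(trace_path):
    """Yield the name of every event found in the trace directory."""
    import bt2  # imported lazily: requires the Babeltrace2 Python bindings

    for msg in bt2.TraceCollectionMessageIterator(trace_path):
        # Only event messages carry tracepoint data; skip the other message types.
        if type(msg) is bt2._EventMessageConst:
            yield msg.event.name


def main(argv):
    """Print a per-tracepoint event count for the trace given in argv[1]."""
    counts = count_events(iter_event_names(argv[1]))
    for name, count in sorted(counts.items()):
        print(f"{count:10d}  {name}")
```

Calling main(sys.argv) with the trace directory as the only argument prints one line per tracepoint name; an empty result would suggest that the tracepoints were not enabled at runtime.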
6. Interpreting the Script Output Data
All the timings are given in nanoseconds.
The list of Tx (and upcoming Rx) bursts per port/queue is presented in the output.
Each list element contains the list of built WQEs with specific opcodes, and
each WQE contains the list of the encompassed packets to send.
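
The nesting described above (burst, then WQEs, then packets) can be pictured with a small Python sketch. All class and field names here are hypothetical and chosen only for illustration; the real script output may carry different fields and a different layout.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Packet:
    length: int  # hypothetical field: packet length in bytes


@dataclass
class Wqe:
    opcode: int                         # hypothetical field: WQE opcode
    fired_ns: int                       # time the WQE was fired, in nanoseconds
    completed_ns: Optional[int] = None  # completion time, if already seen
    packets: List[Packet] = field(default_factory=list)

    def delay_ns(self):
        """Return completion minus firing time, or None if not completed."""
        if self.completed_ns is None:
            return None
        return self.completed_ns - self.fired_ns


@dataclass
class Burst:
    port: int
    queue: int
    wqes: List[Wqe] = field(default_factory=list)


def format_burst(burst):
    """Render one burst as indented text lines: burst, its WQEs, their packets."""
    lines = [f"burst port={burst.port} queue={burst.queue}"]
    for wqe in burst.wqes:
        delay = wqe.delay_ns()
        delay_txt = "pending" if delay is None else f"{delay} ns"
        lines.append(
            f"  wqe opcode={wqe.opcode:#x} packets={len(wqe.packets)} delay={delay_txt}"
        )
        for pkt in wqe.packets:
            lines.append(f"    packet len={pkt.length}")
    return lines
```

Pairing a firing event with its completion then reduces to filling completed_ns on the matching WQE, after which the delay is a plain difference in nanoseconds.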
Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
--
v2: - comment addressed: "dump_trace" command is replaced with "save_trace"
- Windows build failure addressed, Windows does not support tracing
Viacheslav Ovsiienko (5):
app/testpmd: add trace save command
common/mlx5: introduce tracepoints for mlx5 drivers
net/mlx5: add Tx datapath tracing
net/mlx5: add comprehensive send completion trace
net/mlx5: add Tx datapath trace analyzing script
app/test-pmd/cmdline.c | 38 ++++
drivers/common/mlx5/meson.build | 1 +
drivers/common/mlx5/mlx5_trace.c | 25 +++
drivers/common/mlx5/mlx5_trace.h | 72 +++++++
drivers/common/mlx5/version.map | 8 +
drivers/net/mlx5/linux/mlx5_verbs.c | 8 +-
drivers/net/mlx5/mlx5_devx.c | 8 +-
drivers/net/mlx5/mlx5_rx.h | 19 --
drivers/net/mlx5/mlx5_rxtx.h | 19 ++
drivers/net/mlx5/mlx5_tx.c | 9 +
drivers/net/mlx5/mlx5_tx.h | 88 ++++++++-
drivers/net/mlx5/tools/mlx5_trace.py | 271 +++++++++++++++++++++++++++
12 files changed, 537 insertions(+), 29 deletions(-)
create mode 100644 drivers/common/mlx5/mlx5_trace.c
create mode 100644 drivers/common/mlx5/mlx5_trace.h
create mode 100755 drivers/net/mlx5/tools/mlx5_trace.py
Comments
Hi,
> -----Original Message-----
> From: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> Sent: Tuesday, June 13, 2023 7:59 PM
> To: dev@dpdk.org
> Subject: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
>
> [...]
Series applied to next-net-mlx,
Kindest regards
Raslan Darawsheh
20/06/2023 14:00, Raslan Darawsheh:
> Hi,
>
> > -----Original Message-----
> > From: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> > Sent: Tuesday, June 13, 2023 7:59 PM
> > To: dev@dpdk.org
> > Subject: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
> >
> > [...]
This information should be in the documentation.
I think we should request a review of the Python script from people familiar with tracing and from people more familiar with Python scripting for user tools.
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, June 27, 2023 3:46 AM
> To: Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>;
> rjarry@redhat.com; jerinj@marvell.com
> Subject: Re: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
>
> 20/06/2023 14:00, Raslan Darawsheh:
> > [...]
>
> This information should be in the documentation.
OK, should we make this cover letter part of mlx5.rst?
> I think we should request a review of the Python script from people familiar
> with tracing and from people more familiar with Python scripting for user
> tools.
Would be very helpful, could you recommend/ask someone?
With best regards,
Slava
27/06/2023 13:24, Slava Ovsiienko:
>
> > -----Original Message-----
> > From: Thomas Monjalon <thomas@monjalon.net>
> > Sent: Tuesday, June 27, 2023 3:46 AM
> > To: Slava Ovsiienko <viacheslavo@nvidia.com>
> > Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>;
> > rjarry@redhat.com; jerinj@marvell.com
> > Subject: Re: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
> >
> > 20/06/2023 14:00, Raslan Darawsheh:
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
> > > > Sent: Tuesday, June 13, 2023 7:59 PM
> > > > To: dev@dpdk.org
> > > > Subject: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
> > > >
> > > > The mlx5 driver provides send scheduling at a specific moment in time,
> > > > and for the related kind of applications it would be extremely
> > > > useful to have extra debug information - when and how packets were
> > > > scheduled and when the actual sending was completed by the NIC
> > > > hardware (it helps the application to track internal delay issues).
> > > >
> > > > Because the DPDK Tx datapath API is not supposed to return any
> > > > feedback from the driver, and the feature looks to be mlx5
> > > > specific, it seems reasonable to engage the existing DPDK datapath
> > > > tracing capability.
> > > >
> > > > The work cycle is supposed to be:
> > > >   - compile the application with tracing enabled
> > > >   - run the application with EAL parameters configuring the tracing in
> > > >     the mlx5 Tx datapath
> > > >   - store the dump file with the gathered tracing information
> > > >   - run the analyzing script (in Python) to combine related events (packet
> > > >     firing and completion) and see the data in a human-readable view
> > > >
> > > > Below are the detailed "how to" instructions for gathering all the
> > > > debug data, including the full timing information, with an mlx5 NIC.
> > > >
> > > >
> > > > 1. Build DPDK application with enabled datapath tracing
> > > >
> > > > The meson option should be specified:
> > > >    -Denable_trace_fp=true
> > > >
> > > > The c_args should be specified:
> > > >    -DALLOW_EXPERIMENTAL_API
> > > >
> > > > The DPDK configuration examples:
> > > >
> > > >   meson configure --buildtype=debug -Denable_trace_fp=true
> > > >         -Dc_args='-DRTE_LIBRTE_MLX5_DEBUG -DRTE_ENABLE_ASSERT
> > > >         -DALLOW_EXPERIMENTAL_API' build
> > > >
> > > >   meson configure --buildtype=debug -Denable_trace_fp=true
> > > >         -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API'
> > > >         build
> > > >
> > > >   meson configure --buildtype=release -Denable_trace_fp=true
> > > >         -Dc_args='-DRTE_ENABLE_ASSERT -DALLOW_EXPERIMENTAL_API'
> > > >         build
> > > >
> > > >   meson configure --buildtype=release -Denable_trace_fp=true
> > > >         -Dc_args='-DALLOW_EXPERIMENTAL_API' build
> > > >
> > > >
> > > > 2. Configuring the NIC
> > > >
> > > > If the send completion timings are important, the NIC should be
> > > > configured to provide realtime timestamps: the
> > > > REAL_TIME_CLOCK_ENABLE NV settings parameter should be set to
> > > > TRUE, for example with the command below (followed by a FW/driver
> > > > reset):
> > > >
> > > >   sudo mlxconfig -d /dev/mst/mt4125_pciconf0 s
> > > >        REAL_TIME_CLOCK_ENABLE=1
> > > >
> > > >
> > > > 3. Run DPDK application to gather the traces
> > > >
> > > > EAL parameters controlling the trace capability at runtime:
> > > >
> > > >   --trace=pmd.net.mlx5.tx - the regular expression enabling the tracepoints
> > > >                             with matching names; at least "pmd.net.mlx5.tx"
> > > >                             must be enabled to gather all events needed
> > > >                             to analyze the mlx5 Tx datapath and its timings.
> > > >                             By default all tracepoints are disabled.
> > > >
> > > >   --trace-dir=/var/log - trace storing directory
> > > >
> > > >   --trace-bufsz=<val>B|<val>K|<val>M - optional, trace data buffer size
> > > >                                        per thread. The default is 1MB.
> > > >
> > > >   --trace-mode=overwrite|discard - optional, selects trace data buffer
> > > >                                    mode.
> > > >
> > > >
> > > > 4. Installing or Building Babeltrace2 Package
> > > >
> > > > The gathered trace data can be analyzed with the provided Python script.
> > > > To parse the trace data, the script uses the Babeltrace2 library.
> > > > The package should be either installed or built from source code as
> > > > shown below:
> > > >
> > > >   git clone https://github.com/efficios/babeltrace.git
> > > >   cd babeltrace
> > > >   ./bootstrap
> > > >   ./configure -help
> > > >   ./configure --disable-api-doc --disable-man-pages
> > > >               --disable-python-bindings-doc --enable-python-plugins
> > > >               --enable-python-bindings
> > > >
> > > > 5. Running the Analyzing Script
> > > >
> > > > The analyzing script is located in the folder:
> > > > ./drivers/net/mlx5/tools
> > > > It requires Python 3.6 and the Babeltrace2 packages, and it takes the
> > > > trace data file as its only parameter. For example:
> > > >
> > > >   ./mlx5_trace.py /var/log/rte-2023-01-23-AM-11-52-39
> > > >
> > > >
> > > > 6. Interpreting the Script Output Data
> > > >
> > > > All the timings are given in nanoseconds.
> > > > The list of Tx (and coming Rx) bursts per port/queue is presented in
> > > > the output.
> > > > Each list element contains the list of built WQEs with specific
> > > > opcodes, and each WQE contains the list of the encompassed packets to
> > > > send.
> >
> > This information should be in the documentation.
>
> OK, should we make this cover-letter part of mlx5.rst?

Kind of, yes.

> > I think we should request a review of the Python script from people familiar
> > with tracing and from people more familiar with Python scripting for user
> > tools.
>
> Would be very helpful, could you recommend/ask someone?

Jerin, what do you think of such a script?

Robin, would you have time to look at this trace processing script please?
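
For readers following step 3 of the quoted cover letter, the EAL trace options could be assembled programmatically before launching the application. The sketch below is only an illustration: the helper name `build_trace_eal_args` and its defaults are hypothetical and not part of DPDK or of the patch series; only the four option names and their value syntax come from the cover letter.

```python
import re

# Buffer size syntax as described in the cover letter: <val>B|<val>K|<val>M
_BUFSZ_RE = re.compile(r"[0-9]+[BKM]")
# Buffer modes as described in the cover letter
_MODES = ("overwrite", "discard")


def build_trace_eal_args(pattern="pmd.net.mlx5.tx", trace_dir="/var/log",
                         bufsz=None, mode=None):
    """Return the EAL trace options as a list of strings.

    Hypothetical helper: it only formats and validates the options,
    it does not start any DPDK application.
    """
    args = [f"--trace={pattern}", f"--trace-dir={trace_dir}"]
    if bufsz is not None:
        # Optional per-thread trace buffer size
        if not _BUFSZ_RE.fullmatch(bufsz):
            raise ValueError(f"invalid trace buffer size: {bufsz!r}")
        args.append(f"--trace-bufsz={bufsz}")
    if mode is not None:
        # Optional trace buffer mode
        if mode not in _MODES:
            raise ValueError(f"invalid trace mode: {mode!r}")
        args.append(f"--trace-mode={mode}")
    return args
```

The returned list can be appended to the EAL part of the application command line; the two optional parameters are simply omitted when left unset, which leaves the defaults described in the cover letter in effect.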
Thomas Monjalon, Jun 27, 2023 at 13:34:
> Robin, would you have time to look at this trace processing script
> please?

Hi there,

I've had a brief look at the script. I don't exactly know what it is taking as
input and should be producing as output. Could you give some examples?

Maybe I could suggest a few ideas to make it "feel" more python-esque.

Cheers,
Hi, Robin

Thank you for your courtesy about script reviewing.
Please see an attachment - the raw data gathered as a result of tracing, and brief description.

With best regards,
Slava

> -----Original Message-----
> From: Robin Jarry <rjarry@redhat.com>
> Sent: Wednesday, June 28, 2023 5:19 PM
> To: NBU-Contact-Thomas Monjalon (EXTERNAL) <thomas@monjalon.net>;
> Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; Raslan Darawsheh <rasland@nvidia.com>;
> jerinj@marvell.com; david.marchand@redhat.com
> Subject: Re: [PATCH v2 0/5] net/mlx5: introduce Tx datapath tracing
>
> Thomas Monjalon, Jun 27, 2023 at 13:34:
> > Robin, would you have time to look at this trace processing script
> > please?
>
> Hi there,
>
> I've had a brief look at the script. I don't exactly know what it is taking as
> input and should be producing as output. Could you give some examples?
>
> Maybe I could suggest a few ideas to make it "feel" more python-esque.
>
> Cheers,
Slava Ovsiienko, Jun 29, 2023 at 09:16:
> Hi, Robin
>
> Thank you for your courtesy about script reviewing.
> Please see an attachment - the raw data gathered as a result of tracing, and brief description.

Thanks for the details. I think that most of the contents of the
included pdf file should go into the docs and/or into the script help.

As for the script itself, the first thing to do would be to fix all
warnings reported by pylint:

    $ pylint --enable=all mlx5_trace.py

After that, I have a few general remarks:

* do not use global variables except for constants
* most of the time, there is no need to use sys.exit() explicitly
* print errors on stderr
* remember that python has exceptions, it makes error handling easier

I would also advise formatting your code using [black][1] so that you
don't have to bother about coding style.

[1]: https://github.com/psf/black

Feel free to take inspiration from the general structure that is present
in some of the scripts that I have written:

* usertools/dpdk-pmdinfo.py
* usertools/dpdk-rss-flows.py (not yet applied,
  http://patches.dpdk.org/project/dpdk/patch/20230628134748.117697-3-rjarry@redhat.com/)

Cheers,
Robin
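
The general remarks above could translate into a script layout along the following lines. This is a minimal sketch and not the actual mlx5_trace.py: the names `TraceError`, `parse_args`, `process` and `main` are hypothetical, and the analysis step is a placeholder that does not touch Babeltrace2.

```python
"""Sketch of a trace-processing script layout (hypothetical example)."""

import argparse
import sys


class TraceError(Exception):
    """Raised when the trace input cannot be processed."""


def parse_args(argv=None):
    """Parse the command line; the trace path is the only parameter."""
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument("trace", help="path to the trace data")
    return parser.parse_args(argv)


def process(trace):
    """Placeholder for the real analysis; returns the lines to print."""
    if not trace:
        # Errors are reported with exceptions, not with sys.exit()
        raise TraceError("empty trace path")
    return [f"would analyze {trace}"]


def main(argv=None):
    """Run the script and return a process exit status."""
    # All state lives in local variables, no globals are needed
    args = parse_args(argv)
    try:
        for line in process(args.trace):
            print(line)
    except TraceError as exc:
        # Errors go to stderr, regular output stays on stdout
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0
```

A real script would typically end with a single `sys.exit(main())` under the usual `if __name__ == "__main__":` guard, so that the exit status is set in one place while the rest of the code relies on return values and exceptions.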