Message ID | 20190327061935.19572-1-barbette@kth.se (mailing list archive) |
---|---|
Headers | From: Tom Barbette <barbette@kth.se>; To: dev@dpdk.org; Cc: bruce.richardson@intel.com, john.mcnamara@intel.com, Thomas Monjalon <thomas@monjalon.net>, Ferruh Yigit <ferruh.yigit@intel.com>, Andrew Rybchenko <arybchenko@solarflare.com>, Shahaf Shuler <shahafs@mellanox.com>, Yongseok Koh <yskoh@mellanox.com>, Tom Barbette <barbette@kth.se>; Date: Wed, 27 Mar 2019 07:19:32 +0100; Subject: [dpdk-dev] [PATCH v2 0/3] Add rte_eth_read_clock API; List-Archive: <http://mails.dpdk.org/archives/dev/> |
Series | Add rte_eth_read_clock API | |
Message
Tom Barbette
March 27, 2019, 6:19 a.m. UTC
Some NICs can timestamp packets but do not support the full PTP synchronization process. Hence, the value set in the mbuf timestamp field is only the raw value of an internal clock.

To make sense of this value, one at least needs to be able to query the current hardware clock value. As with the TSC, a frequency can then be derived by querying the current value of the internal clock multiple times, with a known delay between the queries (example provided in the API doc).

This patch series adds support for MLX5.

An example is provided in the rxtx_callbacks application. It has been updated to display, on top of the software latency in cycles, the total latency since the packet was received in hardware. The API is used to compute a delta in the TX callback. The raw number of ticks is converted to cycles using a variation of the technique described above.

Aside from offloading timestamping, which relieves the software of a few operations, this gives much more precision when studying the sources of latency in a system. E.g., in our 100G CX5 setup, the rxtx_callbacks application shows that SW latency is around 74 cycles (TSC is 3.2 GHz), but the latency including NIC processing, PCIe, and queuing is around 196 cycles.

At first, one may think this API overlaps with rte_eth_timesync_read_time. However, rte_eth_timesync_read_time is clearly identified as part of a set of functions for PTP synchronization, whereas the raw device clock is not "synced" in any way. More importantly, the returned value is not a timespec but a number of ticks. We could have a cast-based solution, but on top of being ugly, some people seeing the timespec type of rte_eth_timesync_read_time could use it blindly.
Change in v2:
 - Rebase on current master

Tom Barbette (3):
  rte_ethdev: Add API function to read dev clock
  mlx5: Implement support for read_clock
  rxtx_callbacks: Add support for HW timestamp

 doc/guides/nics/features.rst                | 1 +
 doc/guides/sample_app_ug/rxtx_callbacks.rst | 9 ++-
 drivers/net/mlx5/mlx5.c                     | 1 +
 drivers/net/mlx5/mlx5.h                     | 1 +
 drivers/net/mlx5/mlx5_ethdev.c              | 29 +++++++
 drivers/net/mlx5/mlx5_glue.c                | 8 ++
 drivers/net/mlx5/mlx5_glue.h                | 2 +
 examples/rxtx_callbacks/Makefile            | 2 +
 examples/rxtx_callbacks/main.c              | 86 ++++++++++++++++++++-
 examples/rxtx_callbacks/meson.build         | 1 +
 lib/librte_ethdev/rte_ethdev.c              | 13 ++++
 lib/librte_ethdev/rte_ethdev.h              | 44 +++++++++++
 lib/librte_ethdev/rte_ethdev_core.h         | 6 ++
 lib/librte_ethdev/rte_ethdev_version.map    | 1 +
 lib/librte_mbuf/rte_mbuf.h                  | 2 +
 15 files changed, 201 insertions(+), 5 deletions(-)
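
The frequency derivation mentioned in the cover letter (two clock reads separated by a known delay) can be sketched roughly as follows. This is only an illustration, not the code from the patch: `read_clock_fn`, `delay_fn`, `struct fake_dev` and all helper names are hypothetical. The fake device simulates a free-running counter so the sketch stays self-contained; in a real application the read callback would presumably wrap the device clock read and the delay would be a real sleep.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a device clock read: returns 0 on success
 * and stores the raw tick count in *ticks. */
typedef int (*read_clock_fn)(void *dev, uint64_t *ticks);

/* Hypothetical delay hook: wait for roughly delay_ns nanoseconds. */
typedef void (*delay_fn)(void *dev, uint64_t delay_ns);

/* Simulated device with a free-running counter, for demonstration only. */
struct fake_dev {
	uint64_t ticks; /* current raw counter value */
	uint64_t hz;    /* true frequency of the simulated clock */
};

static int
fake_read_clock(void *dev, uint64_t *ticks)
{
	*ticks = ((struct fake_dev *)dev)->ticks;
	return 0;
}

static void
fake_delay(void *dev, uint64_t delay_ns)
{
	struct fake_dev *d = dev;

	/* Advance the counter as if delay_ns had elapsed. */
	d->ticks += d->hz * delay_ns / UINT64_C(1000000000);
}

/* Estimate the clock frequency in Hz from two reads separated by a known
 * delay. The intermediate product can overflow for very long delays or
 * very fast clocks; a real implementation would need to guard against it. */
static int
estimate_clock_hz(read_clock_fn read_clock, delay_fn delay, void *dev,
		  uint64_t delay_ns, uint64_t *hz)
{
	uint64_t start, end;
	int ret;

	assert(delay_ns > 0);
	ret = read_clock(dev, &start);
	if (ret != 0)
		return ret;
	delay(dev, delay_ns);
	ret = read_clock(dev, &end);
	if (ret != 0)
		return ret;
	*hz = (end - start) * UINT64_C(1000000000) / delay_ns;
	return 0;
}

/* Convert a delta in device ticks to TSC cycles, rounded to nearest. */
static uint64_t
ticks_to_tsc_cycles(uint64_t ticks, uint64_t dev_hz, uint64_t tsc_hz)
{
	return (uint64_t)((double)ticks * (double)tsc_hz / (double)dev_hz + 0.5);
}

/* Convert TSC cycles to nanoseconds. */
static double
cycles_to_ns(uint64_t cycles, uint64_t tsc_hz)
{
	return (double)cycles * 1e9 / (double)tsc_hz;
}
```

With the 3.2 GHz TSC quoted above, `cycles_to_ns(74, 3200000000)` gives about 23 ns for the software latency, and 196 cycles correspond to about 61 ns.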
Comments
On Wed, 27 Mar 2019 07:19:32 +0100
Tom Barbette <barbette@kth.se> wrote:

> [...]
> This patch series adds support for MLX5.
> [...]

I like this approach but would like to see the same API supported
on multiple devices.

The current timestamp API is a mess because not all devices behave the
same way. Trying to write an application that uses timestamping is therefore
very difficult.
27/03/2019 15:41, Stephen Hemminger:
> On Wed, 27 Mar 2019 07:19:32 +0100
> Tom Barbette <barbette@kth.se> wrote:
>
> > [...]
> > This patch series adds support for MLX5.
> > [...]
>
> I like this approach but would like to see the same API supported
> on multiple devices.
>
> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is therefore
> very difficult.

So what do you suggest?
> On Mar 27, 2019, at 9:41 AM, Stephen Hemminger <stephen@networkplumber.org> wrote:
>
> On Wed, 27 Mar 2019 07:19:32 +0100
> Tom Barbette <barbette@kth.se> wrote:
>
>> [...]
>> This patch series adds support for MLX5.
>> [...]
>
> I like this approach but would like to see the same API supported
> on multiple devices.
>
> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is therefore
> very difficult.

Another question: is this an optional API for a PMD? I assume it is.

I know the API rte_eth_read_clock() is attempting to read the NIC for this timestamp, but if the PMD does not support this request, can we just default to rte_rdtsc() as a return value?

Regards,
Keith
On 2019-03-27 15:54, Wiles, Keith wrote:
> I know the API rte_eth_read_clock() is attempting to read the NIC for this timestamp, but if the PMD does not support this request can we just default to the rte_rdtsc() as a return value?

I would not advise that, because the goal of the function is to return a value in the same unit as the hardware timestamp given in the mbufs.

> The current timestamp API is a mess because not all devices behave the
> same way. Trying to write an application that uses timestamping is
> therefore very difficult.

This is different from timesync; no other devices implement hardware timestamping. For me, it's a different feature.

Cheers,
Tom
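
To illustrate the point about units, here is a rough sketch of how an optional driver op could report "not supported" instead of silently falling back to the TSC. All of the types and names below (`demo_dev_ops`, `demo_dev`, `demo_read_clock`, `demo_fixed_clock`) are hypothetical and much simpler than the real ethdev structures. Returning `-ENOTSUP` for a missing op is assumed here as the convention; the thread itself does not settle what the API returns in that case.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified driver op table (not the real DPDK layout). */
struct demo_dev_ops {
	/* NULL when the driver cannot read its hardware clock. */
	int (*read_clock)(void *priv, uint64_t *ticks);
};

struct demo_dev {
	const struct demo_dev_ops *ops;
	void *priv; /* driver-private state */
};

/* Example driver op: reports a counter stored in the private data. */
static int
demo_fixed_clock(void *priv, uint64_t *ticks)
{
	*ticks = *(const uint64_t *)priv;
	return 0;
}

/* Read the raw device clock, or fail explicitly if the driver has no
 * such op. There is deliberately no TSC fallback: TSC cycles are not in
 * the same unit as the hardware timestamps stored in the mbufs, so mixing
 * the two would give meaningless deltas. */
static int
demo_read_clock(const struct demo_dev *dev, uint64_t *ticks)
{
	if (dev == NULL || ticks == NULL)
		return -EINVAL;
	if (dev->ops == NULL || dev->ops->read_clock == NULL)
		return -ENOTSUP;
	return dev->ops->read_clock(dev->priv, ticks);
}
```

An application can then check the return value once at setup and disable hardware-latency reporting on ports whose driver lacks the op, rather than comparing mbuf timestamps against an unrelated clock.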