Message ID | cover.1594638050.git.vladimir.medvedkin@intel.com (mailing list archive) |
---|---|
Headers |
From: Vladimir Medvedkin <vladimir.medvedkin@intel.com>
To: dev@dpdk.org
Cc: david.marchand@redhat.com, jerinj@marvell.com, mdr@ashroe.eu, thomas@monjalon.net, konstantin.ananyev@intel.com, bruce.richardson@intel.com
Date: Mon, 13 Jul 2020 12:11:19 +0100
Message-Id: <cover.1594638050.git.vladimir.medvedkin@intel.com>
In-Reply-To: <cover.1594389240.git.vladimir.medvedkin@intel.com>
References: <cover.1594389240.git.vladimir.medvedkin@intel.com>
Subject: [dpdk-dev] [PATCH v6 0/8] fib: implement AVX512 vector lookup |
Series | fib: implement AVX512 vector lookup |
Message
Vladimir Medvedkin
July 13, 2020, 11:11 a.m. UTC
This patch series implements vectorized lookup using AVX512 for
ipv4 dir24_8 and ipv6 trie algorithms.
Also introduced rte_fib_set_lookup_fn() to change lookup function type.
Added option to select lookup function type in testfib application.

v6:
 - style fixes

v5:
 - prefix zmm macro in rte_vect.h with RTE_X86
 - remove unnecessary typedef for _x86_zmm_t
 - reword commit title
 - fix typos

v4:
 - use __rte_aligned() instead of using compiler attribute directly
 - rework and add comments to meson.build

v3:
 - separate out the AVX-512 code into a separate file

v2:
 - rename rte_zmm to __rte_x86_zmm to reflect its internal usage
 - make runtime decision to use avx512 lookup

Vladimir Medvedkin (8):
  eal/x86: introduce AVX 512-bit type
  fib: make lookup function type configurable
  fib: move lookup definition into the header file
  fib: introduce AVX512 lookup
  fib6: make lookup function type configurable
  fib6: move lookup definition into the header file
  fib6: introduce AVX512 lookup
  app/testfib: add support for different lookup functions

 app/test-fib/main.c                   |  58 +++++-
 lib/librte_eal/x86/include/rte_vect.h |  19 ++
 lib/librte_fib/Makefile               |  24 +++
 lib/librte_fib/dir24_8.c              | 281 +++++---------------------
 lib/librte_fib/dir24_8.h              | 226 ++++++++++++++++++++-
 lib/librte_fib/dir24_8_avx512.c       | 165 +++++++++++++++
 lib/librte_fib/dir24_8_avx512.h       |  24 +++
 lib/librte_fib/meson.build            |  31 +++
 lib/librte_fib/rte_fib.c              |  21 +-
 lib/librte_fib/rte_fib.h              |  24 +++
 lib/librte_fib/rte_fib6.c             |  20 +-
 lib/librte_fib/rte_fib6.h             |  22 ++
 lib/librte_fib/rte_fib_version.map    |   2 +
 lib/librte_fib/trie.c                 | 161 +++------------
 lib/librte_fib/trie.h                 | 119 ++++++++++-
 lib/librte_fib/trie_avx512.c          | 269 ++++++++++++++++++++++++
 lib/librte_fib/trie_avx512.h          |  20 ++
 17 files changed, 1114 insertions(+), 372 deletions(-)
 create mode 100644 lib/librte_fib/dir24_8_avx512.c
 create mode 100644 lib/librte_fib/dir24_8_avx512.h
 create mode 100644 lib/librte_fib/trie_avx512.c
 create mode 100644 lib/librte_fib/trie_avx512.h
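The idea of a configurable lookup function can be illustrated with a small stand-alone sketch. This is not the code from the series: the toy_* names, the 256-entry table indexed by the top byte of the address, and the error codes are all assumptions made for illustration, and the "vector" variant is only a stub that forwards to the scalar loop. The one real call is the GCC/Clang builtin __builtin_cpu_supports(), used here for the runtime AVX512 check.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lookup types for this sketch (not the DPDK enum). */
enum toy_lookup_type {
	TOY_LOOKUP_SCALAR,
	TOY_LOOKUP_VECTOR_AVX512,
};

typedef void (*toy_lookup_fn)(const uint32_t *tbl, const uint32_t *ips,
			      uint64_t *next_hops, unsigned int n);

struct toy_fib {
	const uint32_t *tbl;  /* 256 next hops, indexed by the top byte of the IP */
	toy_lookup_fn lookup; /* currently selected bulk lookup function */
};

/* Scalar bulk lookup: one table read per address. */
static void
toy_lookup_scalar(const uint32_t *tbl, const uint32_t *ips,
		  uint64_t *next_hops, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		next_hops[i] = tbl[ips[i] >> 24];
}

/*
 * Stand-in for a vectorized variant. A real implementation would use
 * AVX512 intrinsics; this stub forwards to the scalar loop so that the
 * sketch builds on any platform.
 */
static void
toy_lookup_vector_stub(const uint32_t *tbl, const uint32_t *ips,
		       uint64_t *next_hops, unsigned int n)
{
	toy_lookup_scalar(tbl, ips, next_hops, n);
}

/* Runtime CPU check; reports "no AVX512" on non-x86 or unknown compilers. */
static int
toy_cpu_has_avx512f(void)
{
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
	return __builtin_cpu_supports("avx512f") != 0;
#else
	return 0;
#endif
}

/* The scalar lookup is the default; the vector one is opt-in. */
static void
toy_fib_init(struct toy_fib *fib, const uint32_t *tbl)
{
	assert(fib != NULL && tbl != NULL);
	fib->tbl = tbl;
	fib->lookup = toy_lookup_scalar;
}

/* Returns 0 on success, -ENOTSUP if the CPU lacks AVX512F, -EINVAL otherwise. */
static int
toy_fib_set_lookup_fn(struct toy_fib *fib, enum toy_lookup_type type)
{
	assert(fib != NULL);
	switch (type) {
	case TOY_LOOKUP_SCALAR:
		fib->lookup = toy_lookup_scalar;
		return 0;
	case TOY_LOOKUP_VECTOR_AVX512:
		if (!toy_cpu_has_avx512f())
			return -ENOTSUP;
		fib->lookup = toy_lookup_vector_stub;
		return 0;
	default:
		return -EINVAL;
	}
}
```

A caller would run toy_fib_init() once, optionally call toy_fib_set_lookup_fn(), and then always go through fib->lookup, so the fast path does not need to know which variant was chosen.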
Comments
On Mon, 13 Jul 2020 12:11:19 +0100
Vladimir Medvedkin <vladimir.medvedkin@intel.com> wrote:

> This patch series implements vectorized lookup using AVX512 for
> ipv4 dir24_8 and ipv6 trie algorithms.
> Also introduced rte_fib_set_lookup_fn() to change lookup function type.
> Added option to select lookup function type in testfib application.
> [...]

Did anyone else see the recent AVX512 discussion from Linus:
"I hope AVX512 dies a painful death, and that Intel starts fixing real problems
instead of trying to create magic instructions to then create benchmarks that they can look good on."
On 13/07/2020 23:19, Stephen Hemminger wrote:
> On Mon, 13 Jul 2020 12:11:19 +0100
> Vladimir Medvedkin <vladimir.medvedkin@intel.com> wrote:
>
>> This patch series implements vectorized lookup using AVX512 for
>> ipv4 dir24_8 and ipv6 trie algorithms.
>> [...]
>
> Did anyone else see the recent AVX512 discussion from Linus:
> "I hope AVX512 dies a painful death, and that Intel starts fixing real problems
> instead of trying to create magic instructions to then create benchmarks that they can look good on."

Yup - I saw this one.
Sweeping statements like these are good to provoke debate, but the truth is generally more nuanced.
If you continue to read the post, Linus appears to be mostly questioning microprocessor design decisions.

That is an interesting discussion; however, the reality is that the technology does exist and may be beneficial for packet processing.

I would suggest we continue to apply the same logic governing adoption of any technology by DPDK.
When the technology is present and a clear benefit is shown, we use it with caution.

In the case of Vladimir's patch,
the user has to explicitly switch on the AVX512 lookup with RTE_FIB_DIR24_8_VECTOR_AVX512.

Thanks,

Ray K
On Tue, 14 Jul 2020 08:31:32 +0100
"Kinsella, Ray" <mdr@ashroe.eu> wrote:

> [...]
> I would suggest we continue to apply the same logic governing adoption of any technology by DPDK.
> When the technology is present and a clear benefit is shown, we use it with caution.
>
> In the case of Vladimir's patch,
> the user has to explicitly switch on the AVX512 lookup with RTE_FIB_DIR24_8_VECTOR_AVX512.

Using what is available makes sense in DPDK.
14/07/2020 16:38, Stephen Hemminger:
> "Kinsella, Ray" <mdr@ashroe.eu> wrote:
> > [...]
> > In the case of Vladimir's patch,
> > the user has to explicitly switch on the AVX512 lookup with RTE_FIB_DIR24_8_VECTOR_AVX512.
>
> Using what is available makes sense in DPDK.

Why does it require explicit enabling in the application?
Is AVX512 not reliable enough to be used automatically when available?
On 15/07/2020 10:47, Thomas Monjalon wrote:
> 14/07/2020 16:38, Stephen Hemminger:
>> Using what is available makes sense in DPDK.
>
> Why does it require explicit enabling in the application?
> Is AVX512 not reliable enough to be used automatically when available?

It is reliable enough. The user has to explicitly switch to the AVX512 lookup
because using AVX512 instructions can reduce the frequency of the
cores. The user knows their environment better, so the scalar version is
used by default so as not to affect the frequency.
15/07/2020 12:35, Medvedkin, Vladimir:
> On 15/07/2020 10:47, Thomas Monjalon wrote:
> > Why does it require explicit enabling in the application?
> > Is AVX512 not reliable enough to be used automatically when available?
>
> It is reliable enough. The user has to explicitly switch to the AVX512 lookup
> because using AVX512 instructions can reduce the frequency of the
> cores. The user knows their environment better, so the scalar version is
> used by default so as not to affect the frequency.

So the user must know which micro-optimization is better for code
they don't know. Reminder: a user is not a developer.
I understand we have no better solution though.
Can we improve the user experience with some recommendations, numbers, etc.?
On 15/07/2020 12:59, Thomas Monjalon wrote:
> 15/07/2020 12:35, Medvedkin, Vladimir:
>> It is reliable enough. The user has to explicitly switch to the AVX512 lookup
>> because using AVX512 instructions can reduce the frequency of the
>> cores. [...]
>
> So the user must know which micro-optimization is better for code
> they don't know. Reminder: a user is not a developer.
> I understand we have no better solution though.
> Can we improve the user experience with some recommendations, numbers, etc.?

When the user is a developer (DPDK users are mostly devs, aren't they?) who uses the fib library in their app, they may decide to switch to the AVX512 lookup using rte_fib_set_lookup_fn() when they know that their code is already using AVX512 (ifdef, startup check, etc.).
Otherwise, an app developer could, for example, give the user a command-line option or an interactive command to switch the lookup function.
I'd recommend running the testfib app with the various "-v" options to evaluate lookup performance on the target system before making a decision.
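The command-line approach suggested above could look roughly like the sketch below. It is only an illustration: the option values "scalar" and "avx512", the toy_* names and the silent fallback policy are assumptions, not anything defined by the fib library or testfib. The result of a startup CPU check is passed in as a plain flag so the logic stays testable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical choices an application might expose to its users. */
enum toy_lookup_choice {
	TOY_CHOICE_SCALAR,
	TOY_CHOICE_AVX512,
	TOY_CHOICE_INVALID,
};

/*
 * Map an option value to a lookup choice.
 * A missing option (NULL) keeps the scalar default.
 */
static enum toy_lookup_choice
toy_parse_lookup_choice(const char *arg)
{
	if (arg == NULL || strcmp(arg, "scalar") == 0)
		return TOY_CHOICE_SCALAR;
	if (strcmp(arg, "avx512") == 0)
		return TOY_CHOICE_AVX512;
	return TOY_CHOICE_INVALID;
}

/*
 * Combine the user's request with the result of a startup CPU check.
 * Falls back to scalar when AVX512 was requested but is not available.
 */
static enum toy_lookup_choice
toy_resolve_lookup_choice(enum toy_lookup_choice want, int cpu_has_avx512)
{
	assert(want != TOY_CHOICE_INVALID);
	if (want == TOY_CHOICE_AVX512 && !cpu_has_avx512)
		return TOY_CHOICE_SCALAR;
	return want;
}
```

An application would parse the option once at startup, reject TOY_CHOICE_INVALID with a usage message, and then pass the resolved choice to whatever call selects the lookup function. Falling back silently is a design choice; returning an error instead may be preferable when the user asked for AVX512 explicitly.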
15/07/2020 14:29, Medvedkin, Vladimir:
> On 15/07/2020 12:59, Thomas Monjalon wrote:
> > Can we improve the user experience with some recommendations, numbers, etc.?
>
> When the user is a developer (DPDK users are mostly devs, aren't
> they?) who uses the fib library in their app, they may decide to switch to
> the AVX512 lookup using rte_fib_set_lookup_fn() when they know that their
> code is already using AVX512 (ifdef, startup check, etc.).
> Otherwise, an app developer could, for example, give the user a
> command-line option or an interactive command to switch the lookup function.
> I'd recommend running the testfib app with the various "-v" options to evaluate
> lookup performance on the target system before making a decision.

I think this is the difference between a library for hackers, and a product for end-users.
We are not building a product, but we can make a step in that direction by documenting some knowledge.
I don't know exactly what it means in this case, so I'll let others suggest some doc improvements (if anyone cares).
> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, July 15, 2020 1:45 PM
> To: Medvedkin, Vladimir <vladimir.medvedkin@intel.com>
> Cc: Kinsella, Ray <mdr@ashroe.eu>; Stephen Hemminger
> <stephen@networkplumber.org>; dev@dpdk.org; david.marchand@redhat.com;
> jerinj@marvell.com; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> Richardson, Bruce <bruce.richardson@intel.com>; Mcnamara, John
> <john.mcnamara@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>
> Subject: Re: [dpdk-dev] [PATCH v6 0/8] fib: implement AVX512 vector lookup
>
> 15/07/2020 14:29, Medvedkin, Vladimir:
> > [...]
> > I'd recommend to run testfib app with various "-v" options to evaluate
> > lookup performance on a target system to make a decision.
>
> I think this is the difference between a library for hackers, and a
> product for end-users.
> We are not building a product, but we can make a step in that direction by
> documenting some knowledge.
> I don't know exactly what it means in this case, so I'll let others
> suggest some doc improvements (if anyone cares).
>

We have got a patchset in the works to try and make AVX-512 use simpler for 20.11,
by providing both developer APIs and end-user cmdline flags to control this
centrally for DPDK, rather than having each library provide its own magic hooks
to optionally enable this support. As part of that set, we'll see about what
doc updates need to be made also - again covering both developer and end-app user.

Hopefully we can get that set out soon to get early feedback and reach a good
conclusion.

/Bruce
17/07/2020 18:43, Richardson, Bruce:
> From: Thomas Monjalon <thomas@monjalon.net>
> > [...]
> > We are not building a product, but we can make a step in that direction by
> > documenting some knowledge.
> > I don't know exactly what it means in this case, so I'll let others
> > suggest some doc improvements (if anyone cares).
>
> We have got a patchset in the works to try and make AVX-512 use simpler for 20.11,
> by providing both developer APIs and end-user cmdline flags to control this
> centrally for DPDK, rather than having each library provide its own magic hooks
> to optionally enable this support. As part of that set, we'll see about what
> doc updates need to be made also - again covering both developer and end-app user.
>
> Hopefully we can get that set out soon to get early feedback and reach a good
> conclusion.

We can no longer merge this in 20.08 because we have passed -rc1.
I am in favor of merging this feature the day after the 20.08 release.