[v2] devtools: add script to check for non inclusive naming
Commit Message
Shell script to find uses of words that should not be used.
By default it prints the names of files with matches. The -q
(quiet) option prints only the counts. There is also a -l option
which lists the matching lines instead (unlike grep -l, which
lists file names).
Uses the word lists from the Inclusive Naming Initiative;
see https://inclusivenaming.org/word-lists/
Examples:
$ ./devtools/check-naming-policy.sh -q
Total files: 37 errors, 90 warnings, 2 suggestions
$ ./devtools/check-naming-policy.sh -q -l lib/eal
Total lines: 32 errors, 8 warnings, 0 suggestions
Add a MAINTAINERS file entry for the new tool and re-sort
the file list back into alphabetical order.
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
v2 - fix typo in words
- add subtree (pathspec) option
- update maintainers file (and fix alphabetic order)
MAINTAINERS | 8 ++-
devtools/check-naming-policy.sh | 107 ++++++++++++++++++++++++++++++++
devtools/naming/tier1.txt | 8 +++
devtools/naming/tier2.txt | 1 +
devtools/naming/tier3.txt | 4 ++
5 files changed, 125 insertions(+), 3 deletions(-)
create mode 100755 devtools/check-naming-policy.sh
create mode 100644 devtools/naming/tier1.txt
create mode 100644 devtools/naming/tier2.txt
create mode 100644 devtools/naming/tier3.txt
Comments
On Mon, 3 Apr 2023 at 15:47, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> Shell script to find uses of words that should not be used.
> By default it prints the names of files with matches. The -q
> (quiet) option prints only the counts. There is also a -l option
> which lists the matching lines instead (unlike grep -l, which
> lists file names).
>
> Uses the word lists from Inclusive Naming Initiative
> see https://inclusivenaming.org/word-lists/
>
> Examples:
> $ ./devtools/check-naming-policy.sh -q
> Total files: 37 errors, 90 warnings, 2 suggestions
>
> $ ./devtools/check-naming-policy.sh -q -l lib/eal
> Total lines: 32 errors, 8 warnings, 0 suggestions
>
> Add a MAINTAINERS file entry for the new tool and re-sort
> the file list back into alphabetical order.
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> v2 - fix typo in words
> - add subtree (pathspec) option
> - update maintainers file (and fix alphabetic order)
There's a json file on the website, how about downloading that on the
fly rather than storing a local copy that will go out of date?
https://inclusivenaming.org/word-lists/index.json
On Tue, 4 Apr 2023 00:08:30 +0100
Luca Boccassi <bluca@debian.org> wrote:
> There's a json file on the website, how about downloading that on the
> fly rather than storing a local copy that will go out of date?
> https://inclusivenaming.org/word-lists/index.json
OK, but that would mean using Python, and it would also mean that terms like
segregation, which are not on the official list, would not be caught.
On Mon, 2023-04-03 at 19:17 -0700, Stephen Hemminger wrote:
> On Tue, 4 Apr 2023 00:08:30 +0100
> Luca Boccassi <bluca@debian.org> wrote:
>
> > There's a json file on the website, how about downloading that on the
> > fly rather than storing a local copy that will go out of date?
> > https://inclusivenaming.org/word-lists/index.json
>
> Ok, but that would mean using python and would also mean that terms like
> segregation which are not on the official list would not be caught
No need for Python, it can be done with 'jq' very easily. Also there's
'segregate', which is close enough; it's tier 3. E.g.:
$ wget https://inclusivenaming.org/word-lists/index.json -q -O- | jq -r '.data[] | select ((.tier == "3")) | .term'
man-in-the-middle
Segregate
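The jq pipeline above prints terms as published (mixed case, hyphenated), whereas the tier files in this patch hold lower-case patterns such as man.in.the.middle. A minimal sketch of how downloaded terms might be normalised before being handed to git grep -f; the helper name terms_to_patterns is hypothetical and not part of the patch, and the sample input is just the two terms shown above:

```shell
#!/bin/bash
# Hypothetical helper, not part of the patch: convert terms from the
# published list into lower-case patterns like those in the tier files.
terms_to_patterns() {
	# lower-case everything, let '.' stand in for the hyphen so that
	# other separators also match, and drop duplicate entries
	tr '[:upper:]' '[:lower:]' | sed 's/-/./g' | LC_ALL=C sort -u
}

# Sample input copied from the jq output; in practice the terms would be
# piped in from the wget | jq command instead of printf.
printf '%s\n' 'man-in-the-middle' 'Segregate' | terms_to_patterns
```

This would still not add extra terms such as segregation or tribe, so a local supplement file would probably be needed alongside the downloaded list.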
On Tue, 04 Apr 2023 23:00:42 +0100
Luca Boccassi <bluca@debian.org> wrote:
> >
> > Ok, but that would mean using python and would also mean that terms like
> > segregation which are not on the official list would not be caught
>
> No need for python, it can be done with 'jq' very easily. Also there's
> 'segregate' which is close enough, it's tier 3. eg:
>
> $ wget https://inclusivenaming.org/word-lists/index.json -q -O- | jq -r '.data[] | select ((.tier == "3")) | .term'
> man-in-the-middle
> Segregate
Doing it in Python allows for a better UI, and makes it easier to add enhancements
such as showing tier 1 only, adding more words on the command line, or excluding additional directories.
Feature creep can be fun...
@@ -83,26 +83,28 @@ Developers and Maintainers Tools
M: Thomas Monjalon <thomas@monjalon.net>
F: MAINTAINERS
F: devtools/build-dict.sh
-F: devtools/check-abi.sh
F: devtools/check-abi-version.sh
+F: devtools/check-abi.sh
F: devtools/check-doc-vs-code.sh
F: devtools/check-dup-includes.sh
-F: devtools/check-maintainers.sh
F: devtools/check-forbidden-tokens.awk
F: devtools/check-git-log.sh
+F: devtools/check-maintainers.sh
+F: devtools/check-naming-policy.sh
F: devtools/check-spdx-tag.sh
F: devtools/check-symbol-change.sh
F: devtools/check-symbol-maps.sh
F: devtools/checkpatches.sh
F: devtools/get-maintainer.sh
F: devtools/git-log-fixes.sh
+F: devtools/libabigail.abignore
F: devtools/load-devel-config
+F: devtools/naming/
F: devtools/parse-flow-support.sh
F: devtools/process-iwyu.py
F: devtools/update-abi.sh
F: devtools/update-patches.py
F: devtools/update_version_map_abi.py
-F: devtools/libabigail.abignore
F: devtools/words-case.txt
F: license/
F: .editorconfig
new file mode 100755
@@ -0,0 +1,107 @@
+#! /bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2023 Stephen Hemminger
+#
+# This script scans the source tree and creates a list of files
+# containing words that the Inclusive Naming Initiative
+# recommends avoiding.
+# See: https://inclusivenaming.org/word-lists/
+#
+# The options are:
+# -q = quiet mode, produces summary count only
+# -l = show lines instead of files with recommendations
+# -v = verbose, show a header between each tier
+#
+# The default is to scan all of the DPDK source and documentation.
+# An optional pathspec can be used to limit the scan to a specific subtree.
+#
+# Example:
+# check-naming-policy.sh -q doc/*
+#
+
+errors=0
+warnings=0
+suggestions=0
+quiet=false
+verbose=false
+lines='-l'
+
+print_usage () {
+ echo "usage: $(basename "$0") [-l] [-q] [-v] [-h] [<pathspec>]"
+ # no exit here: the caller chooses the exit status (0 for -h, 1 on error)
+}
+
+# Locate word list files
+selfdir=$(dirname "$(readlink -f "$0")")
+words="$selfdir/naming"
+
+# These give false positives
+skipfiles=( ':^devtools/naming/' \
+ ':^doc/guides/rel_notes/' \
+ ':^doc/guides/contributing/coding_style.rst' \
+ ':^doc/guides/prog_guide/glossary.rst' \
+)
+# These are obsolete
+skipfiles+=( \
+ ':^drivers/net/liquidio/' \
+ ':^drivers/net/bnx2x/' \
+ ':^lib/table/' \
+ ':^lib/port/' \
+ ':^lib/pipeline/' \
+ ':^examples/pipeline/' \
+)
+
+#
+# check_wordlist wordfile description
+check_wordlist() {
+ local list=$words/$1
+ local description=$2
+
+ git grep -i $lines -f "$list" -- "${skipfiles[@]}" $pathspec > "$tmpfile"
+ count=$(wc -l < "$tmpfile")
+ if ! $quiet; then
+ if [ "$count" -gt 0 ]; then
+ if $verbose; then
+ echo "$description"
+ echo "$description" | tr '[:print:]' '-'
+ fi
+ cat "$tmpfile"
+ echo
+ fi
+ fi
+ return $count
+}
+
+while getopts lqvh ARG ; do
+ case $ARG in
+ l ) lines= ;;
+ q ) quiet=true ;;
+ v ) verbose=true ;;
+ h ) print_usage ; exit 0 ;;
+ ? ) print_usage ; exit 1 ;;
+ esac
+done
+shift $(($OPTIND - 1))
+
+tmpfile=$(mktemp -t dpdk.checknames.XXXXXX)
+trap 'rm -f -- "$tmpfile"' INT TERM HUP EXIT
+
+pathspec=$*
+
+check_wordlist tier1.txt "Tier 1: Replace immediately"
+errors=$count # global set by check_wordlist; $? would wrap at 256
+
+check_wordlist tier2.txt "Tier 2: Strongly consider replacing"
+warnings=$count
+
+check_wordlist tier3.txt "Tier 3: Recommend to replace"
+suggestions=$count
+
+if [ -z "$lines" ] ; then
+ echo -n "Total lines: "
+else
+ echo -n "Total files: "
+fi
+
+echo $errors "errors," $warnings "warnings," $suggestions "suggestions"
+exit $((errors > 0)) # exit statuses wrap at 256, so do not pass the raw count
new file mode 100644
@@ -0,0 +1,8 @@
+abort
+blackhat
+blacklist
+cripple
+master
+slave
+whitehat
+whitelist
new file mode 100644
@@ -0,0 +1 @@
+sanity
new file mode 100644
@@ -0,0 +1,4 @@
+man.in.the.middle
+segregate
+segregation
+tribe
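
A note on how the counts are passed around in the script: exit statuses are only 8 bits wide, and bash truncates the value given to return in the same way, so a count of 256 matches would be reported as 0. A minimal sketch of the wrap and of one possible alternative, passing the count through a variable; the function names here are hypothetical and not part of the patch:

```shell
#!/bin/bash
# Hypothetical demonstration, not part of the patch.

# Report a count through an exit status (only the low 8 bits survive).
count_via_status() (
	exit "$1"
)

# Report a count through a global variable (no 8-bit limit).
count_via_variable() {
	result=$1
}

wrapped=0
count_via_status 300 || wrapped=$?	# 300 modulo 256

count_via_variable 300
echo "status: $wrapped, variable: $result"
```

Since check_wordlist already stores its result in the global count, reading that variable after each call would probably be the smallest change.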