[v2] devtools: add script to check for non inclusive naming

Message ID 20230403144707.8413-1-stephen@networkplumber.org (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon

Checks

Context                         Check    Description
ci/checkpatch                   warning  coding style issues
ci/loongarch-compilation        success  Compilation OK
ci/loongarch-unit-testing       success  Unit Testing PASS
ci/Intel-compilation            success  Compilation OK
ci/iol-mellanox-Performance     success  Performance Testing PASS
ci/iol-broadcom-Functional      success  Functional Testing PASS
ci/iol-intel-Performance        success  Performance Testing PASS
ci/iol-aarch64-unit-testing     success  Testing PASS
ci/iol-broadcom-Performance     success  Performance Testing PASS
ci/iol-x86_64-compile-testing   success  Testing PASS
ci/intel-Testing                success  Testing PASS
ci/iol-intel-Functional         success  Functional Testing PASS
ci/iol-unit-testing             success  Testing PASS
ci/iol-testing                  success  Testing PASS
ci/iol-x86_64-unit-testing      success  Testing PASS
ci/github-robot: build          success  github build: passed
ci/intel-Functional             success  Functional PASS
ci/iol-aarch64-compile-testing  success  Testing PASS
ci/iol-abi-testing              success  Testing PASS

Commit Message

Stephen Hemminger April 3, 2023, 2:47 p.m. UTC
Shell script to find uses of words that should not be used.
By default it prints the names of files containing matches.
The -q (quiet) option prints only the summary counts.
There is also a -l option which lists the matching lines
instead (the default output is like grep -l).

Uses the word lists from Inclusive Naming Initiative
see https://inclusivenaming.org/word-lists/

Examples:
 $ ./devtools/check-naming-policy.sh -q
 Total files: 37 errors, 90 warnings, 2 suggestions

 $ ./devtools/check-naming-policy.sh -q -l lib/eal
 Total lines: 32 errors, 8 warnings, 0 suggestions

Add a MAINTAINERS file entry for the new tool and re-sort
the list of files back into alphabetical order.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
v2 - fix typo in words
   - add subtree (pathspec) option
   - update maintainers file (and fix alphabetic order)

 MAINTAINERS                     |   8 ++-
 devtools/check-naming-policy.sh | 107 ++++++++++++++++++++++++++++++++
 devtools/naming/tier1.txt       |   8 +++
 devtools/naming/tier2.txt       |   1 +
 devtools/naming/tier3.txt       |   4 ++
 5 files changed, 125 insertions(+), 3 deletions(-)
 create mode 100755 devtools/check-naming-policy.sh
 create mode 100644 devtools/naming/tier1.txt
 create mode 100644 devtools/naming/tier2.txt
 create mode 100644 devtools/naming/tier3.txt
  

Comments

Luca Boccassi April 3, 2023, 11:08 p.m. UTC | #1
There's a json file on the website, how about downloading that on the
fly rather than storing a local copy that will go out of date?
https://inclusivenaming.org/word-lists/index.json
  
Stephen Hemminger April 4, 2023, 2:17 a.m. UTC | #2
On Tue, 4 Apr 2023 00:08:30 +0100
Luca Boccassi <bluca@debian.org> wrote:

> There's a json file on the website, how about downloading that on the
> fly rather than storing a local copy that will go out of date?
> https://inclusivenaming.org/word-lists/index.json

Ok, but that would mean using python and would also mean that terms like
segregation which are not on the official list would not be caught
  
Luca Boccassi April 4, 2023, 10 p.m. UTC | #3
On Mon, 2023-04-03 at 19:17 -0700, Stephen Hemminger wrote:
> On Tue, 4 Apr 2023 00:08:30 +0100
> Luca Boccassi <bluca@debian.org> wrote:
> 
> > There's a json file on the website, how about downloading that on the
> > fly rather than storing a local copy that will go out of date?
> > https://inclusivenaming.org/word-lists/index.json
> 
> Ok, but that would mean using python and would also mean that terms like
> segregation which are not on the official list would not be caught

No need for python, it can be done with 'jq' very easily. Also there's
'segregate' which is close enough, it's tier 3. eg:

$ wget https://inclusivenaming.org/word-lists/index.json -q -O- | jq -r '.data[] | select ((.tier == "3")) | .term'
man-in-the-middle
Segregate
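
The one-liner above could be wrapped into a small helper that prints the
terms of one tier in a form usable as a pattern file. This is only a rough
sketch and not part of the patch: it assumes jq is installed and that the
JSON keeps the .data[] / .tier / .term layout used above. The function name
terms_for_tier and the sample document are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not in the patch): read the word-list JSON on stdin
# and print the terms of one tier, lowercased, one per line.
# Assumed layout: {"data": [{"term": "...", "tier": "N"}, ...]}
terms_for_tier() {
    jq -r --arg tier "$1" \
        '.data[] | select(.tier == $tier) | .term | ascii_downcase'
}

# Invented sample standing in for the downloaded index.json
sample='{"data":[
  {"term":"Segregate","tier":"3"},
  {"term":"sanity","tier":"2"},
  {"term":"man-in-the-middle","tier":"3"}]}'

tier3=$(printf '%s\n' "$sample" | terms_for_tier 3)
printf '%s\n' "$tier3"
```

In real use the input would presumably come from the same download as
above, e.g. wget ... -q -O- | terms_for_tier 1 > tier1.txt, which still
leaves open how locally added words would be merged in.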
  
Stephen Hemminger April 5, 2023, 1:23 a.m. UTC | #4
On Tue, 04 Apr 2023 23:00:42 +0100
Luca Boccassi <bluca@debian.org> wrote:

> > 
> > Ok, but that would mean using python and would also mean that terms like
> > segregation which are not on the official list would not be caught
> 
> No need for python, it can be done with 'jq' very easily. Also there's
> 'segregate' which is close enough, it's tier 3. eg:
> 
> $ wget https://inclusivenaming.org/word-lists/index.json -q -O- | jq -r '.data[] | select ((.tier == "3")) | .term'
> man-in-the-middle
> Segregate

Doing it in python allows for better UI. And makes it easier to do enhancements
like show tier 1 only, or add more words on command line, or exclude additional directories.

Feature creep can be fun...
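
For reference, two of those enhancements (selecting a tier and adding extra
words on the command line) might look roughly like the sketch below. It is
written in shell only to match the posted script and is not a statement
about which language suits the final tool. The option letters -t and -w,
the function name build_patterns and the demo data are all hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch: -t selects a tier, -w adds an extra word (repeatable).
# Neither option exists in the posted patch.
build_patterns() {
    local naming_dir=$1; shift
    local tier=1 extra=() opt OPTIND=1
    while getopts t:w: opt; do
        case $opt in
            t ) tier=$OPTARG ;;
            w ) extra+=("$OPTARG") ;;
            * ) return 2 ;;
        esac
    done
    # Patterns from the chosen tier file come first
    cat "$naming_dir/tier$tier.txt"
    # printf with no arguments would still print an empty line, so guard it
    [ ${#extra[@]} -eq 0 ] || printf '%s\n' "${extra[@]}"
}

# Demo with an invented one-word tier file
demo=$(mktemp -d)
echo sanity > "$demo/tier2.txt"
patterns=$(build_patterns "$demo" -t 2 -w exampleword)
rm -rf -- "$demo"
printf '%s\n' "$patterns"
```

The resulting list could be written to a temporary file and handed to
git grep -f in the same way the tier files are used in the patch.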
  

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df23e50999f..b5881113ba85 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -83,26 +83,28 @@  Developers and Maintainers Tools
 M: Thomas Monjalon <thomas@monjalon.net>
 F: MAINTAINERS
 F: devtools/build-dict.sh
-F: devtools/check-abi.sh
 F: devtools/check-abi-version.sh
+F: devtools/check-abi.sh
 F: devtools/check-doc-vs-code.sh
 F: devtools/check-dup-includes.sh
-F: devtools/check-maintainers.sh
 F: devtools/check-forbidden-tokens.awk
 F: devtools/check-git-log.sh
+F: devtools/check-maintainers.sh
+F: devtools/check-naming-policy.sh
 F: devtools/check-spdx-tag.sh
 F: devtools/check-symbol-change.sh
 F: devtools/check-symbol-maps.sh
 F: devtools/checkpatches.sh
 F: devtools/get-maintainer.sh
 F: devtools/git-log-fixes.sh
+F: devtools/libabigail.abignore
 F: devtools/load-devel-config
+F: devtools/naming/
 F: devtools/parse-flow-support.sh
 F: devtools/process-iwyu.py
 F: devtools/update-abi.sh
 F: devtools/update-patches.py
 F: devtools/update_version_map_abi.py
-F: devtools/libabigail.abignore
 F: devtools/words-case.txt
 F: license/
 F: .editorconfig
diff --git a/devtools/check-naming-policy.sh b/devtools/check-naming-policy.sh
new file mode 100755
index 000000000000..90347b415652
--- /dev/null
+++ b/devtools/check-naming-policy.sh
@@ -0,0 +1,107 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2023 Stephen Hemminger
+#
+# This script scans the source tree and creates a list of files
+# containing words that the Inclusive Naming Initiative
+# recommends avoiding.
+# See: https://inclusivenaming.org/word-lists/
+#
+# The options are:
+#   -q = quiet mode, produces summary count only
+#   -l = show lines instead of files with recommendations
+#   -v = verbose, show a header between each tier
+#
+# Default is to scan all of DPDK source and documentation.
+# Optional pathspec can be used to limit specific tree.
+#
+#  Example:
+#    check-naming-policy.sh -q doc/*
+#
+
+errors=0
+warnings=0
+suggestions=0
+quiet=false
+verbose=false
+lines='-l'
+
+print_usage () {
+    echo "usage: $(basename "$0") [-l] [-q] [-v] [<pathspec>]"
+    echo "  -l: show lines, -q: summary count only, -v: tier headers"
+}
+
+# Locate word list files
+selfdir=$(dirname "$(readlink -f "$0")")
+words=$selfdir/naming
+
+# These give false positives
+skipfiles=( ':^devtools/naming/' \
+	    ':^doc/guides/rel_notes/' \
+	    ':^doc/guides/contributing/coding_style.rst' \
+	    ':^doc/guides/prog_guide/glossary.rst' \
+)
+# These are obsolete
+skipfiles+=( \
+	    ':^drivers/net/liquidio/' \
+	    ':^drivers/net/bnx2x/' \
+	    ':^lib/table/' \
+	    ':^lib/port/' \
+	    ':^lib/pipeline/' \
+	    ':^examples/pipeline/' \
+)
+
+#
+# check_wordlist wordfile description
+check_wordlist() {
+    local list=$words/$1
+    local description=$2
+
+    git grep -i $lines -f "$list" -- "${skipfiles[@]}" $pathspec > "$tmpfile"
+    count=$(wc -l < "$tmpfile")
+    if ! $quiet; then
+	if [ "$count" -gt 0 ]; then
+	    if $verbose; then
+		echo "$description"
+		echo "$description" | tr '[:print:]' '-'
+	    fi
+	    cat "$tmpfile"
+	    echo
+	fi
+    fi
+    return $count
+}
+
+while getopts lqvh ARG ; do
+	case $ARG in
+		l ) lines= ;;
+		q ) quiet=true ;;
+		v ) verbose=true ;;
+		h ) print_usage ; exit 0 ;;
+		? ) print_usage ; exit 1 ;;
+	esac
+done
+shift $(($OPTIND - 1))
+
+tmpfile=$(mktemp -t dpdk.checknames.XXXXXX)
+trap 'rm -f -- "$tmpfile"' INT TERM HUP EXIT
+
+pathspec=$*
+
+check_wordlist tier1.txt "Tier 1: Replace immediately"
+errors=$count		# global set by check_wordlist; $? wraps at 256
+
+check_wordlist tier2.txt "Tier 2: Strongly consider replacing"
+warnings=$count
+
+check_wordlist tier3.txt "Tier 3: Recommend to replace"
+suggestions=$count
+
+if [ -z "$lines" ] ; then
+    echo -n "Total lines: "
+else
+    echo -n "Total files: "
+fi
+
+echo "$errors errors, $warnings warnings, $suggestions suggestions"
+exit $((errors > 0))	# 1 if any tier 1 match; a raw count would wrap at 256
diff --git a/devtools/naming/tier1.txt b/devtools/naming/tier1.txt
new file mode 100644
index 000000000000..a0e9b549c218
--- /dev/null
+++ b/devtools/naming/tier1.txt
@@ -0,0 +1,8 @@ 
+abort
+blackhat
+blacklist
+cripple
+master
+slave
+whitehat
+whitelist
diff --git a/devtools/naming/tier2.txt b/devtools/naming/tier2.txt
new file mode 100644
index 000000000000..cd4280d1625c
--- /dev/null
+++ b/devtools/naming/tier2.txt
@@ -0,0 +1 @@ 
+sanity
diff --git a/devtools/naming/tier3.txt b/devtools/naming/tier3.txt
new file mode 100644
index 000000000000..072f6468ea47
--- /dev/null
+++ b/devtools/naming/tier3.txt
@@ -0,0 +1,4 @@ 
+man.in.the.middle
+segregate
+segregation
+tribe
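
To illustrate how these lists are consumed: the script passes each file to
git grep -i -f, so every line is a case-insensitive regular expression and
the dots in man.in.the.middle match any separator character. The sketch
below shows the difference between counting files and counting lines. It
uses plain grep on a scratch directory instead of git grep on the tree, and
the sample files and their contents are invented.

```shell
#!/bin/bash
# Sketch only: plain grep stands in for git grep; sample data is invented.
work=$(mktemp -d)

printf '%s\n' 'man.in.the.middle' 'segregate' > "$work/tier3.txt"
printf '%s\n' 'Protects against Man-in-the-Middle attacks' > "$work/a.txt"
printf '%s\n' 'man in the middle' 'do not segregate' 'nothing to see' \
    > "$work/b.txt"
printf '%s\n' 'clean file' > "$work/c.txt"

# With -l: one output line per file that has a match (the script's default)
files=$(( $(cd "$work" && grep -i -l -f tier3.txt a.txt b.txt c.txt | wc -l) ))
# Without -l: one output line per matching line (the script's -l option)
lines=$(( $(cd "$work" && grep -i -f tier3.txt a.txt b.txt c.txt | wc -l) ))

rm -rf -- "$work"
echo "files: $files, lines: $lines"
```

This prints "files: 2, lines: 3": a.txt and b.txt each match, and b.txt
contributes two matching lines.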