[v2] devtools: add script to check for non inclusive naming

Message ID	20230403144707.8413-1-stephen@networkplumber.org (mailing list archive)
State	Superseded, archived
Delegated to:	Thomas Monjalon
Headers	From: Stephen Hemminger <stephen@networkplumber.org>
	To: dev@dpdk.org
	Cc: Stephen Hemminger <stephen@networkplumber.org>
	Subject: [PATCH v2] devtools: add script to check for non inclusive naming
	Date: Mon, 3 Apr 2023 07:47:07 -0700
	Message-Id: <20230403144707.8413-1-stephen@networkplumber.org>
	In-Reply-To: <20230331200824.195294-1-stephen@networkplumber.org>
	References: <20230331200824.195294-1-stephen@networkplumber.org>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Precedence: list
	Errors-To: dev-bounces@dpdk.org
Series	[v2] devtools: add script to check for non inclusive naming

Checks

Context	Check	Description
ci/checkpatch	warning	coding style issues
ci/loongarch-compilation	success	Compilation OK
ci/loongarch-unit-testing	success	Unit Testing PASS
ci/Intel-compilation	success	Compilation OK
ci/iol-mellanox-Performance	success	Performance Testing PASS
ci/iol-broadcom-Functional	success	Functional Testing PASS
ci/iol-intel-Performance	success	Performance Testing PASS
ci/iol-aarch64-unit-testing	success	Testing PASS
ci/iol-broadcom-Performance	success	Performance Testing PASS
ci/iol-x86_64-compile-testing	success	Testing PASS
ci/intel-Testing	success	Testing PASS
ci/iol-intel-Functional	success	Functional Testing PASS
ci/iol-unit-testing	success	Testing PASS
ci/iol-testing	success	Testing PASS
ci/iol-x86_64-unit-testing	success	Testing PASS
ci/github-robot: build	success	github build: passed
ci/intel-Functional	success	Functional PASS
ci/iol-aarch64-compile-testing	success	Testing PASS
ci/iol-abi-testing	success	Testing PASS

Commit Message

Stephen Hemminger April 3, 2023, 2:47 p.m. UTC

Shell script to find uses of words that should not be used.
By default it lists the files that contain matches (like
grep -l). The -q (quiet) option prints only the summary
counts, and the -l option lists matching lines instead.

Uses the word lists from the Inclusive Naming Initiative;
see https://inclusivenaming.org/word-lists/

Examples:
 $ ./devtools/check-naming-policy.sh -q
 Total files: 37 errors, 90 warnings, 2 suggestions

 $ ./devtools/check-naming-policy.sh -q -l lib/eal
 Total lines: 32 errors, 8 warnings, 0 suggestions

Add a MAINTAINERS file entry for the new tool and re-sort
the file list back into alphabetical order.

Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
---
v2 - fix typo in words
   - add subtree (pathspec) option
   - update maintainers file (and fix alphabetic order)

 MAINTAINERS                     |   8 ++-
 devtools/check-naming-policy.sh | 107 ++++++++++++++++++++++++++++++++
 devtools/naming/tier1.txt       |   8 +++
 devtools/naming/tier2.txt       |   1 +
 devtools/naming/tier3.txt       |   4 ++
 5 files changed, 125 insertions(+), 3 deletions(-)
 create mode 100755 devtools/check-naming-policy.sh
 create mode 100644 devtools/naming/tier1.txt
 create mode 100644 devtools/naming/tier2.txt
 create mode 100644 devtools/naming/tier3.txt

Comments

Luca Boccassi April 3, 2023, 11:08 p.m. UTC | #1

On Mon, 3 Apr 2023 at 15:47, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> Shell script to find use of words that not be used.
> By default it prints matches.  The -q (quiet) option
> is used to just count. There is also -l option
> which lists lines matching (like grep -l).
>
> Uses the word lists from Inclusive Naming Initiative
> see https://inclusivenaming.org/word-lists/
>
> Examples:
>  $ ./devtools/check-naming-policy.sh -q
>  Total files: 37 errors, 90 warnings, 2 suggestions
>
>  $ ./devtools/check-naming-policy.sh -q -l lib/eal
>  Total lines: 32 errors, 8 warnings, 0 suggestions
>
> Add MAINTAINERS file entry for the new tool and resort
> the list files back into to alphabetic order
>
> Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> ---
> v2 - fix typo in words
>    - add subtree (pathspec) option
>    - update maintainers file (and fix alphabetic order)

There's a json file on the website, how about downloading that on the
fly rather than storing a local copy that will go out of date?
https://inclusivenaming.org/word-lists/index.json

Stephen Hemminger April 4, 2023, 2:17 a.m. UTC | #2

On Tue, 4 Apr 2023 00:08:30 +0100
Luca Boccassi <bluca@debian.org> wrote:

> On Mon, 3 Apr 2023 at 15:47, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> >
> > Shell script to find use of words that not be used.
> > By default it prints matches.  The -q (quiet) option
> > is used to just count. There is also -l option
> > which lists lines matching (like grep -l).
> >
> > Uses the word lists from Inclusive Naming Initiative
> > see https://inclusivenaming.org/word-lists/
> >
> > Examples:
> >  $ ./devtools/check-naming-policy.sh -q
> >  Total files: 37 errors, 90 warnings, 2 suggestions
> >
> >  $ ./devtools/check-naming-policy.sh -q -l lib/eal
> >  Total lines: 32 errors, 8 warnings, 0 suggestions
> >
> > Add MAINTAINERS file entry for the new tool and resort
> > the list files back into to alphabetic order
> >
> > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > ---
> > v2 - fix typo in words
> >    - add subtree (pathspec) option
> >    - update maintainers file (and fix alphabetic order)  
> 
> There's a json file on the website, how about downloading that on the
> fly rather than storing a local copy that will go out of date?
> https://inclusivenaming.org/word-lists/index.json

Ok, but that would mean using python and would also mean that terms like
segregation, which are not on the official list, would not be caught.

Luca Boccassi April 4, 2023, 10 p.m. UTC | #3

On Mon, 2023-04-03 at 19:17 -0700, Stephen Hemminger wrote:
> On Tue, 4 Apr 2023 00:08:30 +0100
> Luca Boccassi <bluca@debian.org> wrote:
> 
> > On Mon, 3 Apr 2023 at 15:47, Stephen Hemminger
> > <stephen@networkplumber.org> wrote:
> > > 
> > > Shell script to find use of words that not be used.
> > > By default it prints matches.  The -q (quiet) option
> > > is used to just count. There is also -l option
> > > which lists lines matching (like grep -l).
> > > 
> > > Uses the word lists from Inclusive Naming Initiative
> > > see https://inclusivenaming.org/word-lists/
> > > 
> > > Examples:
> > >  $ ./devtools/check-naming-policy.sh -q
> > >  Total files: 37 errors, 90 warnings, 2 suggestions
> > > 
> > >  $ ./devtools/check-naming-policy.sh -q -l lib/eal
> > >  Total lines: 32 errors, 8 warnings, 0 suggestions
> > > 
> > > Add MAINTAINERS file entry for the new tool and resort
> > > the list files back into to alphabetic order
> > > 
> > > Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
> > > ---
> > > v2 - fix typo in words
> > >    - add subtree (pathspec) option
> > >    - update maintainers file (and fix alphabetic order)  
> > 
> > There's a json file on the website, how about downloading that on the
> > fly rather than storing a local copy that will go out of date?
> > https://inclusivenaming.org/word-lists/index.json
> 
> Ok, but that would mean using python and would also mean that terms like
> segreation which are not on the official list would not be caught

No need for python, it can be done with 'jq' very easily. Also, there's
'segregate', which is close enough; it's tier 3. E.g.:

$ wget https://inclusivenaming.org/word-lists/index.json -q -O- | jq -r '.data[] | select ((.tier == "3")) | .term'
man-in-the-middle
Segregate
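
Building on the jq command above, here is a minimal sketch of how the per-tier pattern files could be regenerated from the published JSON rather than maintained by hand. It assumes jq is installed and that the JSON keeps the .data[].tier / .data[].term layout used in that command. The function name terms_for_tier and the temporary file path are hypothetical. Fetching is left to the caller, so the function works on any JSON supplied on stdin.

```bash
#!/bin/bash
# Hypothetical helper: print the terms of one tier, lowercased, one per line.
# Reads the word-list JSON on stdin; $1 is the tier number ("1", "2" or "3").
terms_for_tier() {
    jq -r --arg tier "$1" \
        '.data[] | select(.tier == $tier) | .term | ascii_downcase' |
        LC_ALL=C sort -u
}

# Example use (network access assumed, not run here):
#   wget -q -O- https://inclusivenaming.org/word-lists/index.json > /tmp/index.json
#   for t in 1 2 3; do
#       terms_for_tier "$t" < /tmp/index.json > "devtools/naming/tier$t.txt"
#   done
```

One trade-off: the generated lists contain literal terms such as man-in-the-middle, while the hand-written pattern man.in.the.middle in the patch also matches spellings separated by spaces or underscores.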

Stephen Hemminger April 5, 2023, 1:23 a.m. UTC | #4

On Tue, 04 Apr 2023 23:00:42 +0100
Luca Boccassi <bluca@debian.org> wrote:

> > 
> > Ok, but that would mean using python and would also mean that terms like
> > segreation which are not on the official list would not be caught  
> 
> No need for python, it can be done with 'jq' very easily. Also there's
> 'segregate' which is close enough, it's tier 3. eg:
> 
> $ wget https://inclusivenaming.org/word-lists/index.json -q -O- | jq -r '.data[] | select ((.tier == "3")) | .term'
> man-in-the-middle
> Segregate

Doing it in python allows for a better UI, and makes it easier to add enhancements
like showing tier 1 only, adding more words on the command line, or excluding additional directories.

Feature creep can be fun...
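
For comparison, here is a rough sketch of how the three enhancements listed above might look if the tool stayed in bash. The options -t (single tier), -w (extra word) and -x (extra excluded directory) are hypothetical and not part of the patch. The sketch only prints the git grep arguments it would use, so the argument handling can be inspected without a source tree.

```bash
#!/bin/bash
# Sketch only: -t, -w and -x are hypothetical options, not part of the patch.
tiers=(1 2 3)      # tiers to check; -t narrows this to a single tier
extra_words=()     # additional patterns given with -w
excludes=()        # additional directories to skip, given with -x

parse_args() {
    local opt OPTIND=1
    while getopts "t:w:x:" opt; do
        case $opt in
            t ) tiers=("$OPTARG") ;;
            w ) extra_words+=("$OPTARG") ;;
            x ) excludes+=(":^$OPTARG") ;;  # pathspec exclusion, as in skipfiles
            * ) return 1 ;;
        esac
    done
}

# Print the git grep command for one tier, one argument per line.
grep_args_for_tier() {
    local tier=$1 word
    printf '%s\n' git grep -i -l -f "devtools/naming/tier$tier.txt"
    for word in "${extra_words[@]}"; do
        printf '%s\n' -e "$word"
    done
    printf '%s\n' -- "${excludes[@]}"
}
```

Calling parse_args -t 1 -w segreation -x drivers/net/foo and then grep_args_for_tier 1 prints the arguments for a tier 1 scan with one extra pattern and one extra exclusion.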

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index 8df23e50999f..b5881113ba85 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -83,26 +83,28 @@  Developers and Maintainers Tools
 M: Thomas Monjalon <thomas@monjalon.net>
 F: MAINTAINERS
 F: devtools/build-dict.sh
-F: devtools/check-abi.sh
 F: devtools/check-abi-version.sh
+F: devtools/check-abi.sh
 F: devtools/check-doc-vs-code.sh
 F: devtools/check-dup-includes.sh
-F: devtools/check-maintainers.sh
 F: devtools/check-forbidden-tokens.awk
 F: devtools/check-git-log.sh
+F: devtools/check-maintainers.sh
+F: devtools/check-naming-policy.sh
 F: devtools/check-spdx-tag.sh
 F: devtools/check-symbol-change.sh
 F: devtools/check-symbol-maps.sh
 F: devtools/checkpatches.sh
 F: devtools/get-maintainer.sh
 F: devtools/git-log-fixes.sh
+F: devtools/libabigail.abignore
 F: devtools/load-devel-config
+F: devtools/naming/
 F: devtools/parse-flow-support.sh
 F: devtools/process-iwyu.py
 F: devtools/update-abi.sh
 F: devtools/update-patches.py
 F: devtools/update_version_map_abi.py
-F: devtools/libabigail.abignore
 F: devtools/words-case.txt
 F: license/
 F: .editorconfig
diff --git a/devtools/check-naming-policy.sh b/devtools/check-naming-policy.sh
new file mode 100755
index 000000000000..90347b415652
--- /dev/null
+++ b/devtools/check-naming-policy.sh
@@ -0,0 +1,107 @@ 
+#! /bin/bash
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright 2023 Stephen Hemminger
+#
+# This script scans the source tree and creates a list of files
+# containing words that the Inclusive Naming Initiative
+# recommends avoiding.
+# See: https://inclusivenaming.org/word-lists/
+#
+# The options are:
+#   -q = quiet mode, produces summary count only
+#   -l = show lines instead of files with recommendations
+#   -v = verbose, show a header between each tier
+#
+# Default is to scan all of DPDK source and documentation.
+# An optional pathspec limits the scan to a specific subtree.
+#
+#  Example:
+#    check-naming-policy.sh -q doc/*
+#
+
+errors=0
+warnings=0
+suggestions=0
+quiet=false
+verbose=false
+lines='-l'
+
+print_usage () {
+    echo "usage: $(basename "$0") [-l] [-q] [-v] [<pathspec>]"
+    # Callers choose the exit status (0 for -h, 1 for a bad option).
+}
+
+# Locate word list files
+selfdir=$(dirname "$(readlink -f "$0")")
+words=$selfdir/naming
+
+# These give false positives
+skipfiles=( ':^devtools/naming/' \
+	    ':^doc/guides/rel_notes/' \
+	    ':^doc/guides/contributing/coding_style.rst' \
+	    ':^doc/guides/prog_guide/glossary.rst' \
+)
+# These are obsolete
+skipfiles+=( \
+	    ':^drivers/net/liquidio/' \
+	    ':^drivers/net/bnx2x/' \
+	    ':^lib/table/' \
+	    ':^lib/port/' \
+	    ':^lib/pipeline/' \
+	    ':^examples/pipeline/' \
+)
+
+#
+# check_wordlist wordfile description
+check_wordlist() {
+    local list=$words/$1
+    local description=$2
+
+    git grep -i $lines -f "$list" -- "${skipfiles[@]}" $pathspec > "$tmpfile"
+    count=$(wc -l < "$tmpfile")
+    if ! $quiet; then
+	if [ $count -gt 0 ]; then
+	    if $verbose; then
+		echo "$description"
+		echo "$description" | tr '[:print:]' '-'
+	    fi
+	    cat "$tmpfile"
+	    echo
+	fi
+    fi
+    return $count
+}
+
+while getopts lqvh ARG ; do
+	case $ARG in
+		l ) lines= ;;
+		q ) quiet=true ;;
+		v ) verbose=true ;;
+		h ) print_usage ; exit 0 ;;
+		? ) print_usage ; exit 1 ;;
+	esac
+done
+shift $(($OPTIND - 1))
+
+tmpfile=$(mktemp -t dpdk.checknames.XXXXXX)
+trap 'rm -f -- "$tmpfile"' INT TERM HUP EXIT
+
+pathspec=$*
+
+check_wordlist tier1.txt "Tier 1: Replace immediately"
+errors=$count
+
+check_wordlist tier2.txt "Tier 2: Strongly consider replacing"
+warnings=$count
+
+check_wordlist tier3.txt "Tier 3: Recommend to replace"
+suggestions=$count
+
+if [ -z "$lines" ] ; then
+    echo -n "Total lines: "
+else
+    echo -n "Total files: "
+fi
+
+echo $errors "errors," $warnings "warnings," $suggestions "suggestions"
+exit $(( errors > 0 ))
diff --git a/devtools/naming/tier1.txt b/devtools/naming/tier1.txt
new file mode 100644
index 000000000000..a0e9b549c218
--- /dev/null
+++ b/devtools/naming/tier1.txt
@@ -0,0 +1,8 @@ 
+abort
+blackhat
+blacklist
+cripple
+master
+slave
+whitehat
+whitelist
diff --git a/devtools/naming/tier2.txt b/devtools/naming/tier2.txt
new file mode 100644
index 000000000000..cd4280d1625c
--- /dev/null
+++ b/devtools/naming/tier2.txt
@@ -0,0 +1 @@ 
+sanity
diff --git a/devtools/naming/tier3.txt b/devtools/naming/tier3.txt
new file mode 100644
index 000000000000..072f6468ea47
--- /dev/null
+++ b/devtools/naming/tier3.txt
@@ -0,0 +1,4 @@ 
+man.in.the.middle
+segregate
+segregation
+tribe
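
Match counts in a large tree can exceed 255, and shell return and exit statuses carry only 8 bits, so a count handed back through return or exit wraps around. A minimal illustration of the effect and of one common alternative (the function names are hypothetical):

```bash
#!/bin/bash
# A function's return status is truncated to 8 bits, so a count of 300
# reaches the caller as 300 % 256.
count_via_status() {
    return "$1"
}

# Passing the number on stdout keeps the full value.
count_via_stdout() {
    echo "$1"
}

count_via_status 300
echo "status: $?"                       # prints "status: 44"
echo "stdout: $(count_via_stdout 300)"  # prints "stdout: 300"
```

A count of exactly 256 would come back as status 0, which a caller would read as "no matches".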