[dpdk-dev] devtools: add script to find duplicated includes

Message ID 20170714170707.22464-1-thomas@monjalon.net (mailing list archive)
State Superseded, archived
Delegated to: Thomas Monjalon
Headers

Checks

Context Check Description
ci/checkpatch success coding style OK
ci/Intel-compilation success Compilation OK

Commit Message

Thomas Monjalon July 14, 2017, 5:07 p.m. UTC
  Based on Stephen's idea (originally implemented in a Perl script),
this is a shell script to find duplicated includes in a file.
It looks for all the .c and .h files of the git repository.

It is fast enough because automatically well parallelized.

Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
---
 devtools/check-dup-includes.sh | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)
 create mode 100755 devtools/check-dup-includes.sh
  

Comments

Wiles, Keith July 14, 2017, 6:39 p.m. UTC | #1
> On Jul 14, 2017, at 12:07 PM, Thomas Monjalon <thomas@monjalon.net> wrote:

> 

> Based on Stephen's idea (originally implemented in a Perl script),

> this is a shell script to find duplicated includes in a file.

> It looks for all the .c and .h files of the git repository.

> 

> It is fast enough because automatically well parallelized.

> 

> Suggested-by: Stephen Hemminger <stephen@networkplumber.org>

> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

> ---

> devtools/check-dup-includes.sh | 35 +++++++++++++++++++++++++++++++++++

> 1 file changed, 35 insertions(+)

> create mode 100755 devtools/check-dup-includes.sh

> 

> diff --git a/devtools/check-dup-includes.sh b/devtools/check-dup-includes.sh

> new file mode 100755

> index 000000000..ba9871255

> --- /dev/null

> +++ b/devtools/check-dup-includes.sh

> @@ -0,0 +1,35 @@

> +#! /bin/sh -e

> +

> +# Copyright 2017 Mellanox Technologies, Ltd. All rights reserved.

> +#

> +# Redistribution and use in source and binary forms, with or without

> +# modification, are permitted provided that the following conditions

> +# are met:

> +#

> +#   * Redistributions of source code must retain the above copyright

> +#     notice, this list of conditions and the following disclaimer.

> +#   * Redistributions in binary form must reproduce the above copyright

> +#     notice, this list of conditions and the following disclaimer in

> +#     the documentation and/or other materials provided with the

> +#     distribution.

> +#

> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

> +# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

> +# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

> +# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

> +# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

> +# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

> +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE  SOFTWARE.

> +

> +# Check C files in the git repository for duplicated includes.

> +

> +cd $(dirname $(readlink -m $0))/..


I tried the script it works, but I am concerned about the ‘cd’ above as it does all of the the repo. I would like to see it do just the directory and sub-directories if a path is given. If I am testing a specific PMD or library then I would not want to do all of the repo.

> +

> +# speed up by ignoring Unicode details

> +export LC_ALL=C

> +

> +for file in $(git ls-files '*.[ch]') ; do

> +	sed -rn 's,^[[:space:]]*#include[[:space:]]*[<"](.*)[>"].*,\1,p' $file |

> +	sort | uniq -d |

> +	sed "s,^,$file: duplicated include: ,"

> +done

> -- 

> 2.13.2

> 


Regards,
Keith
  
Wiles, Keith July 14, 2017, 6:43 p.m. UTC | #2
> On Jul 14, 2017, at 1:39 PM, Wiles, Keith <keith.wiles@intel.com> wrote:

> 

> 

>> On Jul 14, 2017, at 12:07 PM, Thomas Monjalon <thomas@monjalon.net> wrote:

>> 

>> Based on Stephen's idea (originally implemented in a Perl script),

>> this is a shell script to find duplicated includes in a file.

>> It looks for all the .c and .h files of the git repository.

>> 

>> It is fast enough because automatically well parallelized.

>> 

>> Suggested-by: Stephen Hemminger <stephen@networkplumber.org>

>> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>

>> ---

>> devtools/check-dup-includes.sh | 35 +++++++++++++++++++++++++++++++++++

>> 1 file changed, 35 insertions(+)

>> create mode 100755 devtools/check-dup-includes.sh

>> 

>> diff --git a/devtools/check-dup-includes.sh b/devtools/check-dup-includes.sh

>> new file mode 100755

>> index 000000000..ba9871255

>> --- /dev/null

>> +++ b/devtools/check-dup-includes.sh

>> @@ -0,0 +1,35 @@

>> +#! /bin/sh -e

>> +

>> +# Copyright 2017 Mellanox Technologies, Ltd. All rights reserved.

>> +#

>> +# Redistribution and use in source and binary forms, with or without

>> +# modification, are permitted provided that the following conditions

>> +# are met:

>> +#

>> +#   * Redistributions of source code must retain the above copyright

>> +#     notice, this list of conditions and the following disclaimer.

>> +#   * Redistributions in binary form must reproduce the above copyright

>> +#     notice, this list of conditions and the following disclaimer in

>> +#     the documentation and/or other materials provided with the

>> +#     distribution.

>> +#

>> +# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,

>> +# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF

>> +# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.

>> +# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY

>> +# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,

>> +# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE

>> +# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE  SOFTWARE.

>> +

>> +# Check C files in the git repository for duplicated includes.

>> +

>> +cd $(dirname $(readlink -m $0))/..

> 

> I tried the script it works, but I am concerned about the ‘cd’ above as it does all of the the repo. I would like to see it do just the directory and sub-directories if a path is given. If I am testing a specific PMD or library then I would not want to do all of the repo.

> 


I was thinking about something like the following:

if [ "$1x" = "x" ]; then
    cd $(dirname $(readlink -m $0))/..
fi

>> +

>> +# speed up by ignoring Unicode details

>> +export LC_ALL=C

>> +

>> +for file in $(git ls-files '*.[ch]') ; do

>> +	sed -rn 's,^[[:space:]]*#include[[:space:]]*[<"](.*)[>"].*,\1,p' $file |

>> +	sort | uniq -d |

>> +	sed "s,^,$file: duplicated include: ,"

>> +done

>> -- 

>> 2.13.2

>> 

> 

> Regards,

> Keith

> 


Regards,
Keith
  
Thomas Monjalon July 15, 2017, 9:44 a.m. UTC | #3
14/07/2017 20:43, Wiles, Keith:
> > On Jul 14, 2017, at 1:39 PM, Wiles, Keith <keith.wiles@intel.com> wrote:
> >> On Jul 14, 2017, at 12:07 PM, Thomas Monjalon <thomas@monjalon.net> wrote:
> >> +cd $(dirname $(readlink -m $0))/..
> > 
> > I tried the script it works, but I am concerned about the ‘cd’ above as it does all of the the repo. I would like to see it do just the directory and sub-directories if a path is given. If I am testing a specific PMD or library then I would not want to do all of the repo.
> > 
> 
> I was thinking about something like the following:
> 
> if [ "$1x" = "x" ]; then
>     cd $(dirname $(readlink -m $0))/..
> fi

Yes we can add a parameter to the script.
The parameter would be the path to check, and the default value
is the whole repo.
So the script could be used also for another repo - like pktgen ;)

I'll send a v2
  
Neil Horman July 17, 2017, 11:50 a.m. UTC | #4
On Fri, Jul 14, 2017 at 07:07:07PM +0200, Thomas Monjalon wrote:
> Based on Stephen's idea (originally implemented in a Perl script),
> this is a shell script to find duplicated includes in a file.
> It looks for all the .c and .h files of the git repository.
> 
> It is fast enough because automatically well parallelized.
> 
> Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> ---
>  devtools/check-dup-includes.sh | 35 +++++++++++++++++++++++++++++++++++
>  1 file changed, 35 insertions(+)
>  create mode 100755 devtools/check-dup-includes.sh
> 
How many times has this been a problem?  Find the same file included twice in
any single file seems like a pretty rudimentary thing to catch by visual
inspection during development.  At the very least I recall coverity having a
header file analyer which would indicate that a header file was unused, and I
think that offered detection of duplicate includes.  

Neil
  
Thomas Monjalon July 17, 2017, 1:01 p.m. UTC | #5
17/07/2017 14:50, Neil Horman:
> On Fri, Jul 14, 2017 at 07:07:07PM +0200, Thomas Monjalon wrote:
> > Based on Stephen's idea (originally implemented in a Perl script),
> > this is a shell script to find duplicated includes in a file.
> > It looks for all the .c and .h files of the git repository.
> > 
> > It is fast enough because automatically well parallelized.
> > 
> > Suggested-by: Stephen Hemminger <stephen@networkplumber.org>
> > Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
> > ---
> >  devtools/check-dup-includes.sh | 35 +++++++++++++++++++++++++++++++++++
> >  1 file changed, 35 insertions(+)
> >  create mode 100755 devtools/check-dup-includes.sh
> > 
> How many times has this been a problem?

It is not a problem, just a trivial clean-up.

> Find the same file included twice in
> any single file seems like a pretty rudimentary thing to catch by visual
> inspection during development.  At the very least I recall coverity having a
> header file analyer which would indicate that a header file was unused, and I
> think that offered detection of duplicate includes.  

Good for Coverity. This simple script is Open Source ;)
  

Patch

diff --git a/devtools/check-dup-includes.sh b/devtools/check-dup-includes.sh
new file mode 100755
index 000000000..ba9871255
--- /dev/null
+++ b/devtools/check-dup-includes.sh
@@ -0,0 +1,35 @@ 
+#! /bin/sh -e
+
+# Copyright 2017 Mellanox Technologies, Ltd. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+#   * Redistributions of source code must retain the above copyright
+#     notice, this list of conditions and the following disclaimer.
+#   * Redistributions in binary form must reproduce the above copyright
+#     notice, this list of conditions and the following disclaimer in
+#     the documentation and/or other materials provided with the
+#     distribution.
+#
+# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+# MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+# IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+# CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+# TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+# SOFTWARE OR THE USE OR OTHER DEALINGS IN THE  SOFTWARE.
+
+# Check C files in the git repository for duplicated includes.
+
+cd $(dirname $(readlink -m $0))/..
+
+# speed up by ignoring Unicode details
+export LC_ALL=C
+
+for file in $(git ls-files '*.[ch]') ; do
+	sed -rn 's,^[[:space:]]*#include[[:space:]]*[<"](.*)[>"].*,\1,p' $file |
+	sort | uniq -d |
+	sed "s,^,$file: duplicated include: ,"
+done