[libvirt] [PATCH v3 05/22] build-aux: rewrite whitespace checker in Python

Daniel P. Berrangé berrange at redhat.com
Fri Sep 27 10:42:28 UTC 2019


On Fri, Sep 27, 2019 at 10:33:45AM +0100, Daniel P. Berrangé wrote:
> On Thu, Sep 26, 2019 at 06:08:14PM +0200, Ján Tomko wrote:
> > On Tue, Sep 24, 2019 at 03:58:46PM +0100, Daniel P. Berrangé wrote:
> > > As part of an goal to eliminate Perl from libvirt build tools,
> > > rewrite the check-spacing.pl tool in Python.
> > > 
> > > This was a straight conversion, manually going line-by-line to
> > > change the syntax from Perl to Python. Thus the overall structure
> > > of the file and approach is the same.
> > > 
> > > Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
> > > ---
> > > Makefile.am                |   2 +-
> > > build-aux/check-spacing.pl | 198 --------------------------------
> > > build-aux/check-spacing.py | 229 +++++++++++++++++++++++++++++++++++++
> > > cfg.mk                     |   4 +-
> > > 4 files changed, 232 insertions(+), 201 deletions(-)
> > > delete mode 100755 build-aux/check-spacing.pl
> > > create mode 100755 build-aux/check-spacing.py
> > > 
> > > diff --git a/build-aux/check-spacing.py b/build-aux/check-spacing.py
> > > new file mode 100755
> > > index 0000000000..6b9f3ec1ba
> > > --- /dev/null
> > > +++ b/build-aux/check-spacing.py
> > > @@ -0,0 +1,229 @@
> > > +#!/usr/bin/env python
> > > +#
> > > +# Copyright (C) 2012-2019 Red Hat, Inc.
> > > +#
> > > +# check-spacing.pl: Report any usage of 'function (..args..)'
> > > +# Also check for other syntax issues, such as correct use of ';'
> > > +#
> > > +# This library is free software; you can redistribute it and/or
> > > +# modify it under the terms of the GNU Lesser General Public
> > > +# License as published by the Free Software Foundation; either
> > > +# version 2.1 of the License, or (at your option) any later version.
> > > +#
> > > +# This library is distributed in the hope that it will be useful,
> > > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > > +# Lesser General Public License for more details.
> > > +#
> > > +# You should have received a copy of the GNU Lesser General Public
> > > +# License along with this library.  If not, see
> > > +# <http://www.gnu.org/licenses/>.
> > > +
> > > +from __future__ import print_function
> > > +
> > > +import re
> > > +import sys
> > > +
> > > +
> > > +def check_whitespace(filename):
> > > +    errs = False
> > > +    with open(filename, 'r') as fh:
> > > +        quotedmetaprog = re.compile(r"""'[";,=]'""")
> > > +        quotedstringprog = re.compile(r'''"(?:[^\\\"]|\\.)*"''')
> > > +        commentstartprog = re.compile(r'''^(.*)/\*.*$''')
> > > +        commentendprog = re.compile(r'''^.*\*/(.*)$''')
> > > +        commentprog = re.compile(r'''^(.*)/\*.*\*/(.*)''')
> > > +        funcprog = re.compile(r'''(\w+)\s\((?!\*)''')
> > > +        keywordprog = re.compile(
> > > +            r'''^.*\b(?:if|for|while|switch|return)\(.*$''')
> > > +        functypedefprog = re.compile(r'''^.*\(\*\w+\)\s+\(.*$''')
> > > +        whitespaceprog1 = re.compile(r'''^.*\s\).*$''')
> > > +        whitespaceprog2 = re.compile(r'''^\s+\);?$''')
> > > +        whitespaceprog3 = re.compile(r'''^.*\((?!$)\s.*''')
> > > +        commasemiprog1 = re.compile(r'''.*\s[;,].*''')
> > > +        commasemiprog2 = re.compile(r'''.*\S; ; .*''')
> > > +        commasemiprog3 = re.compile(r'''^\s+;''')
> > > +        semicolonprog = re.compile(r'''.*;[^	 \\\n;)].*''')
> > > +        commaprog = re.compile(r'''.*,[^ \\\n)}].*''')
> > > +        assignprog1 = re.compile(r'''[^ ]\b[!<>&|\-+*\/%\^=]?=''')
> > > +        assignprog2 = re.compile(r'''=[^= \\\n]''')
> > > +        condstartprog = re.compile(r'''^\s*(if|while|for)\b.*\{$''')
> > > +        statementprog = re.compile(r'''^[^;]*;[^;]*$''')
> > > +        condendprog = re.compile(r'''^\s*}\s*$''')
> > > +
> > 
> > I think even with descriptive names for the regexes and the Python
> > rewrite, this script is hard to read and comes too close to be a C
> > parser.
> 
> Yeah, trying to parse C code using regular expressions was not
> our most sensible decision.
> 
> > Also, the execution time goes from 1.5s to 6.5s, which is longer than
> > all the other checks combined run on my 8-core machine.
> 
> Thanks for pointing that out. I'll see if there's any low hanging
> fruit to optimize but I'm doubtful.
> 
> I dont mind postponing this patch, as all patches in this series
> are independant of each other.
> 
> > Can we get rid of it completely in favor of some proper formatting tool?
> 
> I think that would be a good idea because this script only handles
> a tiny subset of formatting rules we like.
> 
> > I have played with clang-format to try to match our style, the main
> > problems seem to be:
> > * pre-processor directives are indented by the same offset as code
> >  (I see that cppi is specifically intended to add just one space)
> 
> That's an interesting approach. I wouldn't object to such indentation
> style myself.
> 
> > * function calls sometimes leave an empty opening parenthesis
> 
> > * always rewrapping function arguments might create unnecessary churn
> > * parameters wrapping might not be smart enough, e.g. we like to do
> >  virReportError(VIR_ERR_CODE, "%s",
> >                 _("string"));
> >  in a lot of places.
> 
> Yeah these last two points are the ones I struggled with too when I
> looked at clang-format 6 months back.

Another tool is "uncrustify" which seems to have even more configurability
than clang-format does. On my SSD it takes 30 seconds to run over all the
.c file.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list