[libvirt] [PATCH v3 05/22] build-aux: rewrite whitespace checker in Python
Ján Tomko
jtomko at redhat.com
Thu Sep 26 16:08:14 UTC 2019
On Tue, Sep 24, 2019 at 03:58:46PM +0100, Daniel P. Berrangé wrote:
>As part of an goal to eliminate Perl from libvirt build tools,
>rewrite the check-spacing.pl tool in Python.
>
>This was a straight conversion, manually going line-by-line to
>change the syntax from Perl to Python. Thus the overall structure
>of the file and approach is the same.
>
>Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
>---
> Makefile.am | 2 +-
> build-aux/check-spacing.pl | 198 --------------------------------
> build-aux/check-spacing.py | 229 +++++++++++++++++++++++++++++++++++++
> cfg.mk | 4 +-
> 4 files changed, 232 insertions(+), 201 deletions(-)
> delete mode 100755 build-aux/check-spacing.pl
> create mode 100755 build-aux/check-spacing.py
>
>diff --git a/build-aux/check-spacing.py b/build-aux/check-spacing.py
>new file mode 100755
>index 0000000000..6b9f3ec1ba
>--- /dev/null
>+++ b/build-aux/check-spacing.py
>@@ -0,0 +1,229 @@
>+#!/usr/bin/env python
>+#
>+# Copyright (C) 2012-2019 Red Hat, Inc.
>+#
>+# check-spacing.pl: Report any usage of 'function (..args..)'
>+# Also check for other syntax issues, such as correct use of ';'
>+#
>+# This library is free software; you can redistribute it and/or
>+# modify it under the terms of the GNU Lesser General Public
>+# License as published by the Free Software Foundation; either
>+# version 2.1 of the License, or (at your option) any later version.
>+#
>+# This library is distributed in the hope that it will be useful,
>+# but WITHOUT ANY WARRANTY; without even the implied warranty of
>+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
>+# Lesser General Public License for more details.
>+#
>+# You should have received a copy of the GNU Lesser General Public
>+# License along with this library. If not, see
>+# <http://www.gnu.org/licenses/>.
>+
>+from __future__ import print_function
>+
>+import re
>+import sys
>+
>+
>+def check_whitespace(filename):
>+ errs = False
>+ with open(filename, 'r') as fh:
>+ quotedmetaprog = re.compile(r"""'[";,=]'""")
>+ quotedstringprog = re.compile(r'''"(?:[^\\\"]|\\.)*"''')
>+ commentstartprog = re.compile(r'''^(.*)/\*.*$''')
>+ commentendprog = re.compile(r'''^.*\*/(.*)$''')
>+ commentprog = re.compile(r'''^(.*)/\*.*\*/(.*)''')
>+ funcprog = re.compile(r'''(\w+)\s\((?!\*)''')
>+ keywordprog = re.compile(
>+ r'''^.*\b(?:if|for|while|switch|return)\(.*$''')
>+ functypedefprog = re.compile(r'''^.*\(\*\w+\)\s+\(.*$''')
>+ whitespaceprog1 = re.compile(r'''^.*\s\).*$''')
>+ whitespaceprog2 = re.compile(r'''^\s+\);?$''')
>+ whitespaceprog3 = re.compile(r'''^.*\((?!$)\s.*''')
>+ commasemiprog1 = re.compile(r'''.*\s[;,].*''')
>+ commasemiprog2 = re.compile(r'''.*\S; ; .*''')
>+ commasemiprog3 = re.compile(r'''^\s+;''')
>+ semicolonprog = re.compile(r'''.*;[^ \\\n;)].*''')
>+ commaprog = re.compile(r'''.*,[^ \\\n)}].*''')
>+ assignprog1 = re.compile(r'''[^ ]\b[!<>&|\-+*\/%\^=]?=''')
>+ assignprog2 = re.compile(r'''=[^= \\\n]''')
>+ condstartprog = re.compile(r'''^\s*(if|while|for)\b.*\{$''')
>+ statementprog = re.compile(r'''^[^;]*;[^;]*$''')
>+ condendprog = re.compile(r'''^\s*}\s*$''')
>+
I think even with descriptive names for the regexes and the Python
rewrite, this script is hard to read and comes too close to be a C
parser.
Also, the execution time goes from 1.5s to 6.5s, which is longer than
all the other checks combined run on my 8-core machine.
Can we get rid of it completely in favor of some proper formatting tool?
I have played with clang-format to try to match our style, the main
problems seem to be:
* pre-processor directives are indented by the same offset as code
(I see that cppi is specifically intended to add just one space)
* function calls sometimes leave an empty opening parenthesis
* always rewrapping function arguments might create unnecessary churn
* parameters wrapping might not be smart enough, e.g. we like to do
virReportError(VIR_ERR_CODE, "%s",
_("string"));
in a lot of places.
On the plus side, we would be able to properly check for spacing after
casts.
Jano
>+ incomment = False
>+ # Per-file variables for multiline Curly Bracket (cb_) check
>+ cb_lineno = 0
>+ cb_code = ""
>+ cb_scolon = False
>+
>+ lineno = 0
>+ for line in fh:
>+ lineno = lineno + 1
>+ data = line
>+ # For temporary modifications
>+
>+ # Kill any quoted , ; = or "
>+ data = quotedmetaprog.sub("'X'", data)
>+
>+ # Kill any quoted strings
>+ data = quotedstringprog.sub('"XXX"', data)
>+
>+ if data[0] == '#':
>+ continue
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20190926/99d88921/attachment-0001.sig>
More information about the libvir-list
mailing list