[libvirt] [PATCH v3 03/22] build-aux: rewrite po file minimizer in Python
Daniel P. Berrangé
berrange at redhat.com
Thu Sep 26 13:16:04 UTC 2019
On Thu, Sep 26, 2019 at 12:39:39PM +0200, Erik Skultety wrote:
> On Tue, Sep 24, 2019 at 03:58:44PM +0100, Daniel P. Berrangé wrote:
> > As part of a goal to eliminate Perl from libvirt build tools,
> > rewrite the minimize-po.pl tool in Python.
> >
> > This was a straight conversion, manually going line-by-line to
> > change the syntax from Perl to Python. Thus the overall structure
> > of the file and approach is the same.
> >
> > Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
> > ---
> > Makefile.am | 2 +-
> > build-aux/minimize-po.pl | 37 -------------------------
> > build-aux/minimize-po.py | 60 ++++++++++++++++++++++++++++++++++++++++
> > po/Makefile.am | 2 +-
> > 4 files changed, 62 insertions(+), 39 deletions(-)
> > delete mode 100755 build-aux/minimize-po.pl
> > create mode 100755 build-aux/minimize-po.py
> >
> > diff --git a/Makefile.am b/Makefile.am
> > index 17448a914e..8f688d40d0 100644
> > --- a/Makefile.am
> > +++ b/Makefile.am
> > @@ -45,7 +45,7 @@ EXTRA_DIST = \
> > build-aux/check-spacing.pl \
> > build-aux/gitlog-to-changelog \
> > build-aux/header-ifdef.pl \
> > - build-aux/minimize-po.pl \
> > + build-aux/minimize-po.py \
> > build-aux/mock-noinline.pl \
> > build-aux/prohibit-duplicate-header.pl \
> > build-aux/useless-if-before-free \
> > diff --git a/build-aux/minimize-po.pl b/build-aux/minimize-po.pl
> > deleted file mode 100755
> > index 497533a836..0000000000
> > --- a/build-aux/minimize-po.pl
> > +++ /dev/null
> > @@ -1,37 +0,0 @@
> > -#!/usr/bin/perl
> > -
> > -my @block;
> > -my $msgstr = 0;
> > -my $empty = 0;
> > -my $unused = 0;
> > -my $fuzzy = 0;
> > -while (<>) {
> > - if (/^$/) {
> > - if (!$empty && !$unused && !$fuzzy) {
> > - print @block;
> > - }
> > - @block = ();
> > - $msgstr = 0;
> > - $fuzzy = 0;
> > - push @block, $_;
> > - } else {
> > - if (/^msgstr/) {
> > - $msgstr = 1;
> > - $empty = 1;
> > - }
> > - if (/^#.*fuzzy/) {
> > - $fuzzy = 1;
> > - }
> > - if (/^#~ msgstr/) {
> > - $unused = 1;
> > - }
> > - if ($msgstr && /".+"/) {
> > - $empty = 0;
> > - }
> > - push @block, $_;
> > - }
> > -}
> > -
> > -if (@block && !$empty && !$unused) {
>
> I guess the fact !$fuzzy was missing in this condition was a bug that the new
> python code doesn't suffer from, right?
Yeah it was a pre-existing bug, but harmless.
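For reference, here is a rough sketch of the block filtering with the
fuzzy check applied to the trailing block as well. It is reconstructed
from the Perl logic quoted above and is not the code in the patch; the
function name minimize() and the list-in/list-out interface are made up
for illustration.

```python
import re


def minimize(lines):
    """Keep only the lines of PO blocks that carry a usable translation.

    Hypothetical helper mirroring the control flow of the Perl script.
    """
    out = []
    block = []
    msgstr = empty = unused = fuzzy = False

    for line in lines:
        if line.rstrip("\n") == "":
            # A blank line ends the current block: emit it if it is wanted.
            if not (empty or unused or fuzzy):
                out.extend(block)
            # The blank line itself starts the next block, as in Perl.
            block = [line]
            msgstr = False
            fuzzy = False
        else:
            if line.startswith("msgstr"):
                msgstr = True
                empty = True
            if re.search(r'^#.*fuzzy', line):
                fuzzy = True
            if line.startswith("#~ msgstr"):
                unused = True
            if msgstr and re.search(r'".+"', line):
                empty = False
            block.append(line)

    # Trailing block: same test as above, including the fuzzy flag.
    if block and not (empty or unused or fuzzy):
        out.extend(block)
    return out


sample = [
    'msgid "hello"\n',
    'msgstr "hallo"\n',
    '\n',
    'msgid "bye"\n',
    'msgstr ""\n',
    '\n',
    '#, fuzzy\n',
    'msgid "maybe"\n',
    'msgstr "vielleicht"\n',
]
kept = minimize(sample)
```

With this sample only the first entry survives: the second has an empty
msgstr and the last one is fuzzy with no blank line after it.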
> > diff --git a/build-aux/minimize-po.py b/build-aux/minimize-po.py
> > new file mode 100755
> > index 0000000000..5046bacede
> > --- /dev/null
> > +++ b/build-aux/minimize-po.py
> > @@ -0,0 +1,60 @@
> > +#!/usr/bin/env python
> > +#
> > +# Copyright (C) 2018-2019 Red Hat, Inc.
> > +#
> > +# This library is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU Lesser General Public
> > +# License as published by the Free Software Foundation; either
> > +# version 2.1 of the License, or (at your option) any later version.
> > +#
> > +# This library is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > +# Lesser General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU Lesser General Public
> > +# License along with this library. If not, see
> > +# <http://www.gnu.org/licenses/>.
> > +
> > +from __future__ import print_function
> > +
> > +import re
> > +import sys
> > +
> > +block = []
> > +msgstr = False
> > +empty = False
> > +unused = False
> > +fuzzy = False
> > +
> > +strprog = re.compile(r'''.*".+".*''')
>
> question 1) what's the benefit of compiling a regex and using it only once? Btw
> python does cache every pattern passed to re.match (and friends) so compilation
> IMO hardly ever makes sense unless you're doing 1000s of searches for the same
> pattern in which case the latency would naturally accumulate.
Ah, I've just seen the docs now
"The compiled versions of the most recent patterns passed to
re.compile() and the module-level matching functions are
cached, so programs that use only a few regular expressions
at a time needn’t worry about compiling regular expressions."
so with this in mind, I can probably just remove the 'compile' step from
all the scripts in this series. I haven't used it consistently to start
with.
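To illustrate, a small sketch showing that the precompiled pattern and
the module-level call give the same answers. The sample lines are made
up; the pattern is the Perl one without the .* padding.

```python
import re

lines = ['msgid "hello"\n', 'msgstr "hallo"\n', 'msgstr ""\n']

# Precompiled pattern object, in the style the patch currently uses.
strprog = re.compile(r'".+"')
with_compile = [bool(strprog.search(line)) for line in lines]

# Module-level call; per the docs quoted above, the re module caches
# the compiled form of recently used patterns internally.
without_compile = [bool(re.search(r'".+"', line)) for line in lines]
```

Both lists come out identical, so for a handful of patterns the explicit
compile step only changes style, not behaviour.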
> question 2) why do we need the '''.* and .*''' parts compared to the original
> perl regex? I'll go ahead and assume it's because re.match matches at the
> beginning of a string by default, in which case, shouldn't we use re.search
> which matches anywhere (that's what perl does by default) instead?
Yeah, I didn't notice the 're.search' function existed & had the semantics
closer to Perl.
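A short sketch of the difference, using a made-up sample line:

```python
import re

line = 'msgstr "hallo"\n'
pattern = r'".+"'

# re.match only matches at the start of the string, so the bare
# Perl-style pattern fails here.
anchored = re.match(pattern, line)

# Padding with .* works around that, which is what the patch does.
padded = re.match(r'.*".+".*', line)

# re.search scans the whole string, like an unanchored Perl match.
anywhere = re.search(pattern, line)
```

Here anchored is None, while padded and anywhere both match, so
re.search with the original pattern is the closer equivalent.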
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|