[libvirt] [PATCH v3 03/22] build-aux: rewrite po file minimizer in Python

Daniel P. Berrangé berrange at redhat.com
Thu Sep 26 13:16:04 UTC 2019


On Thu, Sep 26, 2019 at 12:39:39PM +0200, Erik Skultety wrote:
> On Tue, Sep 24, 2019 at 03:58:44PM +0100, Daniel P. Berrangé wrote:
> > As part of an goal to eliminate Perl from libvirt build tools,
> > rewrite the minimize-po.pl tool in Python.
> >
> > This was a straight conversion, manually going line-by-line to
> > change the syntax from Perl to Python. Thus the overall structure
> > of the file and approach is the same.
> >
> > Signed-off-by: Daniel P. Berrangé <berrange at redhat.com>
> > ---
> >  Makefile.am              |  2 +-
> >  build-aux/minimize-po.pl | 37 -------------------------
> >  build-aux/minimize-po.py | 60 ++++++++++++++++++++++++++++++++++++++++
> >  po/Makefile.am           |  2 +-
> >  4 files changed, 62 insertions(+), 39 deletions(-)
> >  delete mode 100755 build-aux/minimize-po.pl
> >  create mode 100755 build-aux/minimize-po.py
> >
> > diff --git a/Makefile.am b/Makefile.am
> > index 17448a914e..8f688d40d0 100644
> > --- a/Makefile.am
> > +++ b/Makefile.am
> > @@ -45,7 +45,7 @@ EXTRA_DIST = \
> >    build-aux/check-spacing.pl \
> >    build-aux/gitlog-to-changelog \
> >    build-aux/header-ifdef.pl \
> > -  build-aux/minimize-po.pl \
> > +  build-aux/minimize-po.py \
> >    build-aux/mock-noinline.pl \
> >    build-aux/prohibit-duplicate-header.pl \
> >    build-aux/useless-if-before-free \
> > diff --git a/build-aux/minimize-po.pl b/build-aux/minimize-po.pl
> > deleted file mode 100755
> > index 497533a836..0000000000
> > --- a/build-aux/minimize-po.pl
> > +++ /dev/null
> > @@ -1,37 +0,0 @@
> > -#!/usr/bin/perl
> > -
> > -my @block;
> > -my $msgstr = 0;
> > -my $empty = 0;
> > -my $unused = 0;
> > -my $fuzzy = 0;
> > -while (<>) {
> > -    if (/^$/) {
> > -        if (!$empty && !$unused && !$fuzzy) {
> > -            print @block;
> > -        }
> > -        @block = ();
> > -        $msgstr = 0;
> > -        $fuzzy = 0;
> > -        push @block, $_;
> > -    } else {
> > -        if (/^msgstr/) {
> > -            $msgstr = 1;
> > -            $empty = 1;
> > -        }
> > -        if (/^#.*fuzzy/) {
> > -            $fuzzy = 1;
> > -        }
> > -        if (/^#~ msgstr/) {
> > -            $unused = 1;
> > -        }
> > -        if ($msgstr && /".+"/) {
> > -            $empty = 0;
> > -        }
> > -        push @block, $_;
> > -    }
> > -}
> > -
> > -if (@block && !$empty && !$unused) {
> 
> I guess the fact !$fuzzy was missing in this condition was a bug that the new
> python code doesn't suffer from, right?

Yeah it was a pre-existing bug, but harmless.

> > diff --git a/build-aux/minimize-po.py b/build-aux/minimize-po.py
> > new file mode 100755
> > index 0000000000..5046bacede
> > --- /dev/null
> > +++ b/build-aux/minimize-po.py
> > @@ -0,0 +1,60 @@
> > +#!/usr/bin/env python
> > +#
> > +# Copyright (C) 2018-2019 Red Hat, Inc.
> > +#
> > +# This library is free software; you can redistribute it and/or
> > +# modify it under the terms of the GNU Lesser General Public
> > +# License as published by the Free Software Foundation; either
> > +# version 2.1 of the License, or (at your option) any later version.
> > +#
> > +# This library is distributed in the hope that it will be useful,
> > +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +# Lesser General Public License for more details.
> > +#
> > +# You should have received a copy of the GNU Lesser General Public
> > +# License along with this library.  If not, see
> > +# <http://www.gnu.org/licenses/>.
> > +
> > +from __future__ import print_function
> > +
> > +import re
> > +import sys
> > +
> > +block = []
> > +msgstr = False
> > +empty = False
> > +unused = False
> > +fuzzy = False
> > +
> > +strprog = re.compile(r'''.*".+".*''')
> 
> question 1) what's the benefit of compiling a regex and using it only once? Btw
> python does cache every pattern passed to re.match (and friends) so compilation
> IMO hardly ever makes sense unless you're doing 1000s of searches for the same
> pattern in which case the latency would naturally accumulate.

Ah, I've just seen the docs now

  "The compiled versions of the most recent patterns passed to 
   re.compile() and the module-level matching functions are 
   cached, so programs that use only a few regular expressions 
   at a time needn’t worry about compiling regular expressions."

so with this in mind, I can probably just remove the 'compile' step from
all the scripts in this series. I haven't used it consistently to start
with.

> question 2) why do we need the '''.* and .*''' parts compared to the original
> perl regex? I'll go ahead and assume it's because re.match matches at the
> beginning of a string by default, in which case, shouldn't we use re.search
> which matches anywhere (that's what perl does by default) instead?

Yeah, I didn't notice the 're.search' function existed & had the semantics
closer to Perl.


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list