[libvirt] [PATCH v3 03/22] build-aux: rewrite po file minimizer in Python

Daniel P. Berrangé berrange at redhat.com
Fri Sep 27 08:31:44 UTC 2019


On Fri, Sep 27, 2019 at 09:22:13AM +0200, Erik Skultety wrote:
> On Thu, Sep 26, 2019 at 04:38:49PM +0100, Daniel P. Berrangé wrote:
> > On Thu, Sep 26, 2019 at 05:34:49PM +0200, Ján Tomko wrote:
> > > On Thu, Sep 26, 2019 at 02:16:04PM +0100, Daniel P. Berrangé wrote:
> > > > On Thu, Sep 26, 2019 at 12:39:39PM +0200, Erik Skultety wrote:
> > > > > On Tue, Sep 24, 2019 at 03:58:44PM +0100, Daniel P. Berrangé wrote:
> > > > > question 1) what's the benefit of compiling a regex and using it only once? Btw
> > > > > python does cache every pattern passed to re.match (and friends) so compilation
> > > > > IMO hardly ever makes sense unless you're doing 1000s of searches for the same
> > >
> > > Some of the scripts here are run on the whole libvirt codebase so that
> > > is the case here. For example just removing the pre-compilation of
> > > regexes for comments from the spacing check script bumped the execution
> > > time from 6.5s to 7.4s
> > >
> > > Sadly, the one script where pre-compilation matters the most is the one
> > > where separating them puts them far away from the usage to not fit on
> > > one screen.
> >
> > I could do a little custom function that caches all regexes
> >
> >   import re
> >
> >   recache = {}
> >
> >   def research(regex, line):
> >     if regex not in recache:
> >       recache[regex] = re.compile(regex)
> >     return recache[regex].search(line)
> 
> I'm not sure how ^this would solve the slowdown Jano is seeing as this is
> exactly what python should already be doing internally, IOW the slowdown Jano
> reported is most likely caused by cache accesses which I don't think our own
> custom cache would solve, so we probably do want to keep the compilation in even
> though I personally don't mind the ~1 sec penalty here (compared to the 4x
> slowdown in the next patch which I think we need to do better to resolve).

Yeah, the slowdown Jano reports looks like the bigger problem to deal with.
I think it could still be worth keeping the pre-compilation for this patch,
though: 1 sec doesn't sound like much, but with the huge number of scripts
we have, it all adds up.
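
For anyone who wants to measure this locally, here is a rough sketch that
times the three approaches discussed in this thread: passing the pattern
string to re.search() each time, using a pre-compiled pattern, and going
through a small custom cache. The pattern, the sample lines and all the
function names are made up for illustration and are not taken from the
libvirt scripts, so the absolute numbers will not match the 6.5s vs 7.4s
figures quoted above.

```python
import re
import timeit

# Hypothetical stand-ins: a C comment regex and some fake source lines.
PATTERN = r"/\*.*?\*/"
LINES = ["int x = 0; /* counter */", "return x;"] * 5000

# Approach 2: compile once, up front.
COMPILED = re.compile(PATTERN)

# Approach 3: a custom cache along the lines of the research() idea.
_recache = {}


def research(regex, line):
    prog = _recache.get(regex)
    if prog is None:
        prog = _recache[regex] = re.compile(regex)
    return prog.search(line)


def use_module_level(lines):
    # Relies on the re module's own internal pattern cache.
    return sum(1 for line in lines if re.search(PATTERN, line))


def use_precompiled(lines):
    return sum(1 for line in lines if COMPILED.search(line))


def use_custom_cache(lines):
    return sum(1 for line in lines if research(PATTERN, line))


def compare(lines, repeat=20):
    """Return total seconds per approach for `repeat` passes over lines."""
    funcs = (use_module_level, use_precompiled, use_custom_cache)
    return {
        fn.__name__: timeit.timeit(lambda fn=fn: fn(lines), number=repeat)
        for fn in funcs
    }


if __name__ == "__main__":
    for name, secs in compare(LINES).items():
        print(f"{name:20s} {secs:.3f}s")
```

All three variants find the same matches; only the per-call overhead
differs. Both re.search() with a string and the custom research() wrapper
still pay for an extra function call and a cache lookup on every line,
which is the cost that pre-compilation avoids.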

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



