[Libguestfs] [PATCH 0/2] Add lightweight bindings for PCRE.

Pino Toscano ptoscano at redhat.com
Wed Aug 2 13:01:53 UTC 2017


On Wednesday, 2 August 2017 13:52:06 CEST Richard W.M. Jones wrote:
> On Wed, Aug 02, 2017 at 12:33:14PM +0200, Pino Toscano wrote:
> > Hi,
> > 
> > (replying here since v2 of the series does not have this explanation.)
> > 
> > On Tuesday, 1 August 2017 16:00:15 CEST Richard W.M. Jones wrote:
> > > We'd like to use PCRE instead of the awful Str module.  However I
> > > don't necessarily want to pull in the extra dependency of ocaml-pcre,
> > > and in any case ocaml-pcre is rather difficult to use.
> > > 
> > > This introduces very simplified and lightweight bindings for PCRE.
> > > 
> > > They work rather like Str in that there is some global state (actually
> > > thread-local in this implementation) between matching and getting
> > > the substring, so you can write code like this:
> > > 
> > >   let re = PCRE.compile "(a+)b"
> > >   ...
> > > 
> > >   if PCRE.matches re "ccaaaabb" then (
> > >     let whole = PCRE.sub 0 in (* returns "aaaab" *)
> > >     let first = PCRE.sub 1 in (* returns "aaaa" *)
> > >     ...
> > 
> > Since we are providing a better module, with a different API (which
> > needs changes), what about removing the usage of a global state, in
> > favour of a match object holding the captures?  Something like
> > (starting from your example above):
> > 
> >   let re = PCRE.compile "(a+)b" in
> >   try
> >     let m = PCRE.match re "ccaaaabb" in
> >     let whole = PCRE.sub m 0 in (* returns "aaaab" *)
> >     let first = PCRE.sub m 1 in (* returns "aaaa" *)
> >   with Not_matched _ ->
> >     ...
> 
> That's what I was trying to avoid.  I think the if statement with
> global state is much easier to use.
> 
> > This makes it possible to stop thinking about what was the last saved
> > state, and even keep the multiple results of matches at the same time.
> 
> I've converted all of the daemon code to this form, and this is
> not an issue that came up.

Right, because we already have these constraints because of Str.

> 
> > Also the results are properly GC'ed once they go out of scope, and do not
> > linger until the thread finishes (or the program shuts down).
> > The drawback I see is that many of the Str usages are in chains of
> > "if ... else if ...", which could make the code slightly more complex.
> > 
> > Of course PCRE.matches ought to be left, but it would just return
> > whether the re matched, without changing any global state, and without
> > any result available.
> 
> I think you're suggesting this:
> 
>   let m = PCRE.exec re "ccaaaabb" in
>   if PCRE.matches m then (
>     let whole = PCRE.sub m 0 in

Not really, my suggestion was to have a separate object representing
the result of a regex match -- much like other languages have in their
regex APIs.
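
A minimal sketch of the match-object idea, under loud assumptions: none
of the names below are the real bindings, and the "regex engine" is a
hand-written scan for the single pattern "(a+)b" so that the example
needs no external library.  The entry point is called exec_a_plus_b
rather than "match" because "match" is a reserved word in OCaml.  The
only point is that captures live in an ordinary value:

```ocaml
(* Sketch only: hypothetical names, not the libguestfs PCRE bindings.
   The matcher is hard-coded for the pattern "(a+)b". *)

exception Not_matched

(* A match result is a plain value: capture 0 is the whole match,
   capture 1 is the first group. *)
type match_result = { captures : string array }

(* Find the leftmost run of 'a's that is immediately followed by 'b'. *)
let exec_a_plus_b (s : string) : match_result =
  let len = String.length s in
  let rec scan i =
    if i >= len then raise Not_matched
    else if s.[i] <> 'a' then scan (i + 1)
    else begin
      (* j ends up as the index just past the run of 'a's starting at i *)
      let j = ref i in
      while !j < len && s.[!j] = 'a' do incr j done;
      if !j < len && s.[!j] = 'b' then
        { captures = [| String.sub s i (!j - i + 1);
                        String.sub s i (!j - i) |] }
      else scan !j
    end
  in
  scan 0

(* Return capture n of an earlier match; no global state is consulted. *)
let sub (m : match_result) (n : int) : string = m.captures.(n)

(* Plain yes/no test that leaves no result behind. *)
let matches (s : string) : bool =
  match exec_a_plus_b s with
  | _ -> true
  | exception Not_matched -> false

let () =
  let m1 = exec_a_plus_b "ccaaaabb" in
  let m2 = exec_a_plus_b "xab" in
  (* m1 is still usable after the second match has run *)
  Printf.printf "%s %s %s\n" (sub m1 0) (sub m1 1) (sub m2 0)
```

Both m1 and m2 stay valid side by side, and each becomes collectable
as soon as it goes out of scope.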

OTOH, this solution LGTM as well: the result of the regex is not saved
in a thread-local variable, but directly in the same regex object, so
it can be kept around and used, and it is GC'ed when no longer needed.
If you could apply that change, that'd be an LGTM.
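
To see how the exec/matches shape would read in the "if ... else if ..."
chains mentioned earlier, here is a rough sketch.  Everything in it is
an assumption made for illustration: the handle type, the function
names, and the device strings are invented, and the "regex" is reduced
to a literal prefix followed by one capture so that the code runs
without PCRE.

```ocaml
(* Sketch only: hypothetical API shape, not the real bindings.
   A "compiled regex" here stands in for the pattern "^prefix(.*)". *)

type re = { prefix : string }

(* The handle returned by exec records whether the match succeeded
   and, if so, its captures.  There is no thread-local state. *)
type handle = { matched : bool; captures : string array }

let compile (prefix : string) : re = { prefix }

let exec (re : re) (s : string) : handle =
  let n = String.length re.prefix in
  if String.length s >= n && String.sub s 0 n = re.prefix then
    { matched = true;
      captures = [| s; String.sub s n (String.length s - n) |] }
  else
    { matched = false; captures = [||] }

let matches (h : handle) : bool = h.matched

(* Capture n of a successful match; Not_found if exec did not match. *)
let sub (h : handle) (n : int) : string =
  if h.matched then h.captures.(n) else raise Not_found

let re_sd = compile "/dev/sd"
let re_vd = compile "/dev/vd"

(* An if / else-if chain, one handle per attempted pattern. *)
let classify (dev : string) : string =
  let m = exec re_sd dev in
  if matches m then "scsi:" ^ sub m 1
  else
    let m = exec re_vd dev in
    if matches m then "virtio:" ^ sub m 1
    else "unknown"
```

The chain stays close to the Str-style code, at the cost of one extra
"let m = ..." per branch.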

Thanks,
-- 
Pino Toscano