[Libguestfs] [PATCH 0/2] Add lightweight bindings for PCRE.
Pino Toscano
ptoscano at redhat.com
Wed Aug 2 13:01:53 UTC 2017
On Wednesday, 2 August 2017 13:52:06 CEST Richard W.M. Jones wrote:
> On Wed, Aug 02, 2017 at 12:33:14PM +0200, Pino Toscano wrote:
> > Hi,
> >
> > (replying here since v2 of the series does not have this explanation.)
> >
> > On Tuesday, 1 August 2017 16:00:15 CEST Richard W.M. Jones wrote:
> > > We'd like to use PCRE instead of the awful Str module. However I
> > > don't necessarily want to pull in the extra dependency of ocaml-pcre,
> > > and in any case ocaml-pcre is rather difficult to use.
> > >
> > > This introduces very simplified and lightweight bindings for PCRE.
> > >
> > > They work rather like Str in that there is some global state (actually
> > > thread-local in this implementation) between the matching and the
> > > getting the substring, so you can write code like this:
> > >
> > > let re = PCRE.compile "(a+)b"
> > > ...
> > >
> > > if PCRE.matches re "ccaaaabb" then (
> > > let whole = PCRE.sub 0 in (* returns "aaaab" *)
> > > let first = PCRE.sub 1 in (* returns "aaaa" *)
> > > ...
> >
> > Since we are providing a better module, with a different API (which
> > needs changes), what about removing the usage of a global state, in
> > favour of a match object holding the captures? Something like
> > (starting from your example above):
> >
> > let re = PCRE.compile "(a+)b" in
> > try
> > let m = PCRE.match re "ccaaaabb" in
> > let whole = PCRE.sub m 0 in (* returns "aaaab" *)
> > let first = PCRE.sub m 1 in (* returns "aaaa" *)
> > with Not_matched _ ->
> > ...
>
> That's what I was trying to avoid. I think the if statement with
> global state is much easier to use.
>
> > This makes it possible to stop thinking about what was the last saved
> > state, and even keep the multiple results of matches at the same time.
>
> I've converted all of the daemon code to this form, and this is
> not an issue that came up.
Right, because we have already these constraints because of Str.
>
> > Also the results are properly GC'ed once they get out of scope, and not
> > linger until the thread finish (or the program shutdown).
> > The drawback I see is that many of the Str usages are in chains of
> > "if ... else if ...", which could make the code slightly more complex.
> >
> > Of course PCRE.matches ought to be left, but it would just return
> > whether the re matched, without changing any global state, and without
> > any result available.
>
> I think you're suggesting this:
>
> let m = PCRE.exec re "ccaaaabb" in
> if PCRE.matches m then (
> let whole = PCRE.sub m 0 in
Not really, my suggestion was to have a separate object representing
the result of a regex match -- much like other language have in their
regex APIs.
OTOH, this solution LGTM as well: the result of the regex is not saved
in a thread-local variable, but directly in the same regex object, so
can be kept/used around, and it is GC'ed when not needed anymore.
If you could apply that change, that'd be a LGTM.
Thanks,
--
Pino Toscano
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: This is a digitally signed message part.
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20170802/b6591ee6/attachment.sig>
More information about the Libguestfs
mailing list