[augeas-devel] Greedy disambiguation
Francis Giraldeau
francis.giraldeau at usherbrooke.ca
Tue Sep 28 21:34:02 UTC 2010
Hi,
Augeas does typecheck iteration, concat and union operations. For
example, this lens is ambigous:
let ambig = [ key /[a]*/ ] *
ambig.aug:3.12-.28:exception: ambiguous iteration
Iterated regexp: /[a]*/
'a' can be split into
'|=|a'
and
'a|=|'
The problem for the "ambig" lens here can be fixed by removing double
kleene, but in the XML lens, where elements can contain text and/or
elements, I must have something like this:
let text = [ label "#text" . store /[^<>]*/ ]
let element (body:lens) : [ <...> body* <...> ]
let rec content = element (element|text)
and then, there is ambigous iteration with the text lens, because body
is iterated. We just can't do all combinations:
(text)|(element)|(text . element)|(element . text)| (...)
But, in fact, the lens is working fine. I guess it's because the
matching algorithm is actualy greedy. The typechecker seems to be too
strict in this case.
In it's recent paper, C. Brabrand [1] provides methods to disambiguate
this expression, and propose exactly to use a greedy matching strategy.
This way, there would be only one possible match for the "ambig" lens
above.
Could it be possible to get a weaker typecheck on case by case basis,
for example by with modified operators?
l1 || l2 --> l1 will match before l2 even if their intersection is not
empty (becomes l1 | l2 - l1)
l1 .. l2 ---> l1 will be greedy, even if there is a non empty overlap
with l2
l1** is a subcase of l1 .. l1*
Cheer,
Francis
---
[1] http://www.itu.dk/~brabrand/regexp.pdf
More information about the augeas-devel
mailing list