[augeas-devel] httpd.conf and beyond

David Lutterkort lutter at redhat.com
Fri Feb 19 23:19:51 UTC 2010


Hi Francis,

On Fri, 2010-02-19 at 17:39 -0500, Francis Giraldeau wrote:
> lutter at redhat.com wrote:
> > This series of patches adds support for context-free lenses, which means
> > that Augeas can now process funky stuff like the Apache config.
> >   
> 
> That's great! I looked at test lenses, and it seems really exciting.

The fly in the ointment is that I still haven't finished the lens for
httpd.conf :( The main holdup is that if you write a very fine-grained
lens, i.e. one that handles each possible Apache directive separately,
the typechecker does not finish checking the regular parts of that lens,
for example a regular lens that looks like

        let directives = (directive1|directive2|...|directiveN)*

Because there are so many directives, the typechecker runs out of
memory, even if you give it a lot of memory. This is particularly
annoying since it's perfectly fine to _use_ the lens: even though
you can't typecheck it, you can use it for parsing without a
noticeable performance problem (if you're willing to assume you have no
type errors ;).

I've been trying to address that by playing various tricks with regular
expressions inside libfa (the OOM happens in fa_ambig_example). The
next thing to try, when I get some time, is to convert regular
expressions to finite automata using a derivative-based approach[1]. I've
got some patches to do the canonicalization of regular expressions that
their construction requires; I just haven't had a chance to implement
their DFA construction.
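
To make the idea concrete, here is a minimal sketch of a
derivative-based DFA construction in Python. It is only an
illustration in the spirit of [1], not libfa code: the class and
function names (Chr, Cat, Alt, Star, build_dfa, accepts) are made up
for this example, the alphabet is a plain string of characters, and
the canonical form is just enough to keep the state set finite
(alternations are flattened into sets, trivial concatenations are
dropped).

```python
from dataclasses import dataclass

# Hypothetical regex AST; frozen dataclasses are hashable, so a
# regular expression can be used directly as a DFA state.
@dataclass(frozen=True)
class Empty:            # matches no string at all
    pass

@dataclass(frozen=True)
class Eps:              # matches only the empty string
    pass

@dataclass(frozen=True)
class Chr:              # matches the single character c
    c: str

@dataclass(frozen=True)
class Cat:              # l followed by r
    l: object
    r: object

@dataclass(frozen=True)
class Alt:              # any one of the alternatives
    alts: frozenset

@dataclass(frozen=True)
class Star:             # zero or more repetitions of r
    r: object

EMPTY, EPS = Empty(), Eps()

# Smart constructors: build terms in a canonical form so that
# equivalent derivatives compare equal.
def cat(l, r):
    if l == EMPTY or r == EMPTY:
        return EMPTY
    if l == EPS:
        return r
    if r == EPS:
        return l
    return Cat(l, r)

def alt(l, r):
    parts = set()
    for x in (l, r):
        if isinstance(x, Alt):
            parts |= x.alts          # flatten nested alternations
        elif x != EMPTY:
            parts.add(x)
    if not parts:
        return EMPTY
    if len(parts) == 1:
        return next(iter(parts))
    return Alt(frozenset(parts))     # set: order and duplicates vanish

def star(r):
    if r in (EMPTY, EPS):
        return EPS
    return r if isinstance(r, Star) else Star(r)

def nullable(r):
    """True if r matches the empty string."""
    if isinstance(r, (Eps, Star)):
        return True
    if isinstance(r, (Empty, Chr)):
        return False
    if isinstance(r, Cat):
        return nullable(r.l) and nullable(r.r)
    return any(nullable(a) for a in r.alts)

def deriv(r, c):
    """Regex matching every w such that r matches c + w."""
    if isinstance(r, (Empty, Eps)):
        return EMPTY
    if isinstance(r, Chr):
        return EPS if r.c == c else EMPTY
    if isinstance(r, Cat):
        d = cat(deriv(r.l, c), r.r)
        return alt(d, deriv(r.r, c)) if nullable(r.l) else d
    if isinstance(r, Star):
        return cat(deriv(r.r, c), r)
    out = EMPTY
    for a in r.alts:
        out = alt(out, deriv(a, c))
    return out

def build_dfa(r, alphabet):
    """States are canonical regexes; state 0 is r itself."""
    states, trans, work = {r: 0}, {}, [r]
    while work:
        s = work.pop()
        for c in alphabet:
            t = deriv(s, c)
            if t not in states:
                states[t] = len(states)
                work.append(t)
            trans[(states[s], c)] = states[t]
    accepting = {i for s, i in states.items() if nullable(s)}
    return trans, accepting

def accepts(dfa, text):
    """Run the DFA; text must only use characters of the alphabet."""
    trans, accepting = dfa
    state = 0
    for ch in text:
        state = trans[(state, ch)]
    return state in accepting

# (ab|a)* over the alphabet {a, b}
r = star(alt(cat(Chr("a"), Chr("b")), Chr("a")))
dfa = build_dfa(r, "ab")
print(accepts(dfa, "abaab"), accepts(dfa, "abb"))  # True False
```

The appeal of this style is that the DFA is built directly from the
expression, without going through an NFA first, and that how many
states you get depends on how aggressively the smart constructors
canonicalize. A real implementation would work on character classes
rather than single characters.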

Another approach would be to write an Apache lens that is less
fine-grained and lumps directives into fewer branches in the above lens.

> > Patch 15/16 contains a generic lens for JSON[1] files, which is a very
> > pleasant file format.
> >   
> 
> I tried to load the lens without success. Here is what I got with
> release 0.7.0 :
> 
> augtool> load
> augtool: get.c:952: rec_process: Assertion `lens->tag == L_REC &&
> lens->jmt != ((void *)0)' failed.

Can you send me the file that you tried this with? That's definitely a
bug somewhere. It might be fixed in git; if it's not, please file a bug.

> What exactly is the limitation implied by the fact that the put
> direction is based on regular trees?

Essentially, it means that if you look at the labels of nodes at a
certain level in the tree, the string language you get from
concatenating all possible labels must be regular and can't be
context-free. (It's actually a little more complicated, since you'd need
to look at the language formed by concatenating the label and value of
each node.)

In practical terms, I haven't found that to cause a real limitation.

> I saw that there is now an Earley parser for recursive lenses. Is it
> used only for context-free lenses or for regular ones too? I remember
> that regular-expression-based parsing, even with regular
> approximations, was a show stopper for supporting context-free
> languages. Is the regular approximation used only for ambiguity
> analysis?

The Earley parser is used only for the context-free parts of the lens,
and only in the string -> tree direction. It looks at any regular lens
as a terminal, which can be an arbitrarily complex construct (like the
directives lens above). Matches for those terminals are then parsed using
the existing infrastructure for regular lenses.
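
As a rough illustration of "regular constructs as terminals", here is
a small Earley recognizer sketch in Python in which every terminal is
a compiled regular expression. This is not how the Augeas parser is
implemented; earley_recognize, GRAMMAR and the toy JSON-array grammar
are invented for this example, and the sketch assumes that terminals
never match the empty string. It only answers yes or no; it does not
build a tree.

```python
import re

def earley_recognize(grammar, start, text):
    """Return True if text can be derived from the nonterminal start.

    grammar maps a nonterminal name (str) to a list of productions;
    each production is a tuple whose symbols are either nonterminal
    names (str) or compiled regexes standing in for regular lenses.
    """
    n = len(text)
    # chart[j] holds items (head, body, dot, origin) that end at offset j
    chart = [set() for _ in range(n + 1)]
    for body in grammar[start]:
        chart[0].add((start, body, 0, 0))

    for i in range(n + 1):
        changed = True
        while changed:                      # iterate to a fixpoint
            changed = False
            for head, body, dot, origin in list(chart[i]):
                if dot == len(body):
                    # Complete: advance every item waiting for head
                    for h2, b2, d2, o2 in list(chart[origin]):
                        if d2 < len(b2) and b2[d2] == head:
                            new = (h2, b2, d2 + 1, o2)
                            if new not in chart[i]:
                                chart[i].add(new)
                                changed = True
                elif isinstance(body[dot], str):
                    # Predict: expand the nonterminal after the dot
                    for b in grammar[body[dot]]:
                        new = (body[dot], b, 0, i)
                        if new not in chart[i]:
                            chart[i].add(new)
                            changed = True
                else:
                    # Scan: the regex terminal may cover any non-empty
                    # span text[i:j]; record every end offset it allows
                    for j in range(i + 1, n + 1):
                        if body[dot].fullmatch(text, i, j):
                            chart[j].add((head, body, dot + 1, origin))

    return any(h == start and d == len(b) and o == 0
               for h, b, d, o in chart[n])

# Toy grammar: numbers and arbitrarily nested arrays of numbers
NUM = re.compile(r"[0-9]+")
LB, RB, COMMA = re.compile(r"\["), re.compile(r"\]"), re.compile(r",")
GRAMMAR = {
    "value": [(NUM,), (LB, RB), (LB, "elems", RB)],
    "elems": [("value",), ("value", COMMA, "elems")],
}

print(earley_recognize(GRAMMAR, "value", "[1,[22,3],[]]"))  # True
print(earley_recognize(GRAMMAR, "value", "[1,[2]"))         # False
```

Trying every end offset in the scan step is wasteful, but it keeps the
sketch short and shows why a terminal can be as complicated as you
like: the parser only needs to know which spans it matches.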

I tried a few regular approximations, and none of them yielded anything
reasonable for the string -> tree direction (meaning that the
approximations are so coarse that you always get type errors, even for
perfectly unambiguous context-free languages). Until somebody has a
better way to solve this, there's no typechecking for the ctype of a
context-free lens. Discovering ambiguities is left to the parser.

The regular approximation for the atype (tree -> string direction) is
generally reasonable, and the atype of cf lenses is therefore properly
typechecked.

David

[1] http://www.ccs.neu.edu/home/turon/re-deriv.pdf




