[augeas-devel] Httpd strategy
Francis Giraldeau
francis.giraldeau at usherbrooke.ca
Tue Aug 3 16:14:17 UTC 2010
Hi,
> Given that we can't really list all possible directive or section names,
> the 'generic' approach seems the only one that won't choke on too many
> existing, valid httpd configurations; the downside is that we'll allow
> some illegal constructs, like providing too many or too few arguments to
> directives, or illegal nesting of sections.
Right, I'm fine with this, let's do it in a generic way.
>
> What I envision is a lens that knows about three things: sections,
> directives and arguments, and turns a file
>
> <Sec1 arg11 arg12>
> <Sec2 arg21 arg22>
> Directive1 arg31 arg32
> Directive2 arg41
> </Sec2>
> </Sec1>
>
> into the tree
>
> { "<Sec1" }
> { "param" = "arg11" } { "param" = "arg12" }
> { "<Sec2" }
> { "param" = "arg21" } { "param" = "arg22" }
> { "Directive1"
> { "param" = "arg31" } { "param" = "arg32" } }
> { "Directive2"
> { "param" = "arg41" } }
>
> This is also pretty much what the httpd lens for RHQ does (attached to
> bug #100 or in their git repo at [1]) As far as I can tell, that will
> avoid all typechecking headaches, and still lead to a reasonable tree.
There is a gotcha with the "<" in front of the section name. We need
it to resolve the put ambiguity: directive nodes don't have a "<", so
the two kinds of node can be told apart.
We can't have the "<" as part of the key in a square lens, otherwise it
will also be put in the closing tag, like this:
<Directive>
...
</<Directive>
We could get a tree like this:
{ "section" = "Sec1"
  { "param" = "arg11" }
  { "param" = "arg12" }
  { "section" = "Sec2"
    { "param" = "arg21" }
    { "param" = "arg22" }
    { "directive" = "Directive1"
      { "param" = "arg31" }
      { "param" = "arg32" }
    }
    { "directive" = "Directive2"
      { "param" = "arg41" }
    }
  }
}
But, you know, it's much less sexy...
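To make the shape of that tree concrete, here is a rough sketch in Python (not an Augeas lens) of a generic parser that only knows about sections, directives and arguments. The function name `parse` and the dict layout are made up for illustration; it ignores quoting and comments inside arguments, and like the generic lens it does not check argument counts or nesting.

```python
SAMPLE = """\
<Sec1 arg11 arg12>
<Sec2 arg21 arg22>
Directive1 arg31 arg32
Directive2 arg41
</Sec2>
</Sec1>
"""

def parse(text):
    """Turn httpd-style text into nested section/directive nodes."""
    root = {"kind": "root", "name": None, "params": [], "children": []}
    stack = [root]  # stack[-1] is the section currently being filled
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and whole-line comments
        if line.startswith("</"):
            # Closing tag: leave the current section. The tag name is not
            # checked, so mismatched nesting is silently accepted.
            if len(stack) > 1:
                stack.pop()
        elif line.startswith("<"):
            name, *params = line.strip("<>").split()
            node = {"kind": "section", "name": name,
                    "params": params, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        else:
            name, *params = line.split()
            stack[-1]["children"].append(
                {"kind": "directive", "name": name,
                 "params": params, "children": []})
    return root["children"]

tree = parse(SAMPLE)
```

Storing the kind next to the name, rather than encoding it in the label with a leading "<", is what lets sections and directives stay distinguishable here.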
> The RHQ lens needs some work though, since it doesn't accept any section
> name, just a fixed list of names - no wonder, because you need the
> square lens for that.
>
> > Here are the parsing benchmark results with a representative apache
> > configuration for the two lenses (average of 10 runs on an Intel duo
> > 1,8GHz). The first table is the real time to process the test and the
> > second is the total memory allocation reported by valgrind.
> >
> >               | time w check | time wo check
> > Httpd_exact   | 5,31 s       | 0,34 s
> > Httpd_generic | 0,09 s       | 0,05 s
> >
> >               | mem w check | mem wo check
> > Httpd_exact   | 1536 MB     | 61 MB
> > Httpd_generic | 3 MB        | 1 MB
>
> That's a very strong argument for using a generic lens; I am actually
> surprised you got the exact lens to typecheck. The ones I've written in
> the past all ran out of memory on a 4GB machine.
An earlier version was doing something like this:
let directives_regexp = /[a-zA-Z0-9]+/ - /Directory|.../
This is what it produces for one section name:
/Director((y[0-9A-Za-z]|[0-9A-Za-xz])[0-9A-Za-z]*|())|
Directo([0-9A-Za-qs-z][0-9A-Za-z]*|())|
Direct([0-9A-Za-np-z][0-9A-Za-z]*|())|
Direc([0-9A-Za-su-z][0-9A-Za-z]*|())|
Dire([0-9A-Zabd-z][0-9A-Za-z]*|())|
Dir([0-9A-Za-df-z][0-9A-Za-z]*|())|
Di([0-9A-Za-qs-z][0-9A-Za-z]*|())|
(D[0-9A-Za-hj-z]|[0-9A-CE-Za-z][0-9A-Za-z])[0-9A-Za-z]*
|D|[0-9A-CE-Za-z]/
And that was creating a huge automaton that was memory intensive. In
fact, the automata are smaller when listing every directive than when
subtracting from a general regexp.
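To give a feeling for why the subtraction gets so big, here is a small Python sketch that builds a plain regexp for "any alphanumeric word except one given word". This is not how libfa computes it, and the helper name `all_but` is made up for the example; it only shows that excluding a single word already needs one branch per position where the input may diverge, plus the proper prefixes and the longer words.

```python
import re
import string

ALPHABET = string.digits + string.ascii_letters  # same set as [0-9A-Za-z]
ANY = "[0-9A-Za-z]"

def all_but(word):
    """Plain regexp for one or more alphanumerics, except exactly `word`."""
    branches = []
    for k, ch in enumerate(word):
        others = "".join(c for c in ALPHABET if c != ch)
        # Agrees with `word` for k characters, then diverges at position k.
        branches.append(f"{word[:k]}[{others}]{ANY}*")
        # A proper, non-empty prefix of `word` is allowed as well.
        if k > 0:
            branches.append(word[:k])
    # Anything strictly longer that starts with `word`.
    branches.append(f"{word}{ANY}+")
    return "|".join(branches)

pattern = re.compile(all_but("Directory"))

for candidate in ["Directory", "Director", "Directorys", "Dir", "Listen"]:
    print(candidate, bool(pattern.fullmatch(candidate)))
```

Only "Directory" itself is rejected. The expression grows with every excluded name, whereas a plain list of accepted names is just an alternation of those names.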
The 1,5 GB sampled for typechecking is the total allocated memory, not
the peak memory the process occupied, so I never actually ran
out of memory.
But anyway, we will do something generic...
>
> > also another point to take into consideration. Httpd directives are case
> > insensitive; the generic lens handles this and the exact one doesn't.
>
> There's an additional wrinkle that I didn't think of before: besides
> making sure that all regexps used in the lens match case insensitively,
> we also need to make sure that path expressions match case
> insensitively, i.e. the path expression
>
> /files/etc/httpd/conf/httpd.conf/<sec1/<Sec2/directive1
>
> should match in the tree above.
>
> The most elegant way to achieve this would be to add two new flags to
> tree nodes 'label_nocase' and 'value_nocase'; when they are set, the
> interpreter for path expressions performs comparisons against the node
> label/value without regard for case. They get initialized in get.c: when
> a key or store lens is based on a case-insensitive regexp, we set these
> flags on the tree node that is constructed from them.
Ok, ticket created.
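For the ticket, the 'label_nocase' idea quoted above could look roughly like the following Python sketch. It is only a model of the behaviour, not the Augeas C code: the `Node` class and the `find` function are invented for illustration, only labels are handled (not values), and the real path expression language has far more than plain "/"-separated steps.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    label: str
    # Assumed to be set at get time when the key lens uses a
    # case-insensitive regexp.
    label_nocase: bool = False
    children: List["Node"] = field(default_factory=list)

def label_matches(node, step):
    """Compare one path step against a node label, honouring the flag."""
    if node.label_nocase:
        return node.label.lower() == step.lower()
    return node.label == step

def find(root, path):
    """Return the nodes reached from `root` by a '/'-separated path."""
    nodes = [root]
    for step in path.strip("/").split("/"):
        nodes = [c for n in nodes for c in n.children
                 if label_matches(c, step)]
    return nodes

directive = Node("Directive1", label_nocase=True)
sec2 = Node("<Sec2", label_nocase=True, children=[directive])
sec1 = Node("<Sec1", label_nocase=True, children=[sec2])
root = Node("httpd.conf", children=[sec1])

matches = find(root, "<sec1/<Sec2/directive1")
```

Keeping the flag on the node rather than on the path expression means the same path syntax keeps working for lenses that are case sensitive.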
Francis