[augeas-devel] Httpd strategy

David Lutterkort lutter at redhat.com
Fri Jul 23 23:50:42 UTC 2010


Hi Francis,

great that you are grabbing the bull by its horns ;)

On Thu, 2010-07-22 at 21:48 -0400, Francis Giraldeau wrote:
> There is one big issue with the Httpd lens: its huge configuration space.
> There are few section names and over 200 directives. There are two
> strategies: generic vs. exact. I wanted to compare both strategies for
> their runtime performance and how easy they are to use.

Don't forget that there are also any number of plugins that add custom
directives; there's also a heinous module out there that lets you define
new 'macro' sections - though the name escapes me right now.

> Let's say we want a generic lens, a lens that matches any directives and
> sections. Then we get a put ambiguity error, because both directive and
> section lenses will match a node like { /[a-zA-Z0-9]+/ }. One way to
> bypass this problem is to use the following nodes 
> 
> sections:    { /[a-zA-Z0-9]+/ } 
> directives:  { "#directive" = /[a-zA-Z0-9]+/ }
> 
> Because "#" cannot occur in a section name, we avoid the put ambiguity.
> The lens is small, because we avoid listing all directives, but
> notice that the directive name is a value, not a label. It's harder to
> use, because of the more complex path queries involved.

Given that we can't really list all possible directive or section names,
the 'generic' approach seems the only one that won't choke on too many
existing, valid httpd configurations; the downside is that we'll allow
some illegal constructs, like providing too many or too few arguments to
directives, or illegal nesting of sections.

What I envision is a lens that knows about three things: sections,
directives and arguments, and turns a file

  <Sec1 arg11 arg12>
     <Sec2 arg21 arg22>
       Directive1 arg31 arg32
       Directive2 arg41
     </Sec2>
  </Sec1>

into the tree

        { "<Sec1"
          { "param" = "arg11" } { "param" = "arg12" }
          { "<Sec2"
            { "param" = "arg21" } { "param" = "arg22" }
            { "Directive1"
              { "param" = "arg31" } { "param" = "arg32" } }
            { "Directive2"
              { "param" = "arg41" } } } }

This is also pretty much what the httpd lens for RHQ does (attached to
bug #100 or in their git repo at [1]). As far as I can tell, that will
avoid all typechecking headaches, and still lead to a reasonable tree.
The RHQ lens needs some work though, since it doesn't accept arbitrary
section names, just a fixed list of names - no wonder, because you need
the square lens for that.
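
To make the intended tree shape concrete, here is a rough sketch in Python
(not an Augeas lens, and not what the RHQ lens does internally). It only
illustrates the mapping from the file above to the tree above. The names
parse, params, OPEN and CLOSE are made up for this example, and it assumes
arguments are separated by whitespace, ignoring quoting and line
continuations.

```python
import re

# Hypothetical helpers: this mimics the tree shape only, it is not a lens.
OPEN = re.compile(r"<\s*([A-Za-z0-9]+)\s*(.*?)\s*>$")
CLOSE = re.compile(r"</\s*([A-Za-z0-9]+)\s*>$")


def params(text):
    """Turn 'a b' into [("param", "a", []), ("param", "b", [])]."""
    return [("param", arg, []) for arg in text.split()]


def parse(text):
    """Parse httpd-style text into nodes of the form (label, value, children).

    Any section or directive name is accepted; nothing is checked against a
    list of known names or argument counts.
    """
    root = ("", None, [])
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = CLOSE.match(line)
        if m:
            # The closing tag must match the innermost open section,
            # compared without regard for case.
            expected = "<" + m.group(1).lower()
            if len(stack) == 1 or stack[-1][0].lower() != expected:
                raise ValueError("unbalanced section close: " + line)
            stack.pop()
            continue
        m = OPEN.match(line)
        if m:
            node = ("<" + m.group(1), None, params(m.group(2)))
            stack[-1][2].append(node)
            stack.append(node)
            continue
        # Anything else is a directive: a name followed by arguments.
        parts = line.split()
        stack[-1][2].append((parts[0], None, params(" ".join(parts[1:]))))
    if len(stack) != 1:
        raise ValueError("unclosed section: " + stack[-1][0])
    return root[2]
```

Section labels keep the leading "<" so they can never collide with a
directive label, which is what sidesteps the ambiguity between the two.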

> Here are the parsing benchmark results with a representative apache
> configuration for the two lenses (average of 10 runs on intel duo
> 1,8GHz). The first table is the real time to process the test and the
> second is the total memory allocation reported by valgrind.
> 
>                | time w check  | time wo check 
> Httpd_exact    | 5,31 s        | 0,34 s
> Httpd_generic  | 0,09 s        | 0,05 s
> 
>                | mem w check   | mem wo check 
> Httpd_exact    | 1536 Mb       | 61 Mb
> Httpd_generic  |    3 Mb       |  1 Mb

That's a very strong argument for using a generic lens; I am actually
surprised you got the exact lens to typecheck. The ones I've written in
the past all ran out of memory on a 4GB machine.

> Also another point to take into consideration: Httpd directives are case
> insensitive; the generic lens handles this and the exact one doesn't.

There's an additional wrinkle that I didn't think of before: besides
making sure that all regexps used in the lens match case insensitively,
we also need to make sure that path expressions match case
insensitively, i.e. the path expression

        /files/etc/httpd/conf/httpd.conf/<sec1/<Sec2/directive1
        
should match in the tree above. 

The most elegant way to achieve this would be to add two new flags to
tree nodes, 'label_nocase' and 'value_nocase'; when they are set, the
interpreter for path expressions performs comparisons against the node
label/value without regard for case. They get initialized in get.c: when
a key or store lens is based on a case-insensitive regexp, we set these
flags on the tree node that is constructed from them.
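
As a rough illustration of the idea (in Python, not the actual C code in
augeas), matching could look like the sketch below. Only the flag names
label_nocase and value_nocase come from the proposal above; the Node
class, label_matches and match are made up for this example, and the path
handling is reduced to plain '/'-separated labels without predicates.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node carrying the two proposed flags."""
    label: str
    value: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    label_nocase: bool = False  # set when the key lens regexp ignores case
    value_nocase: bool = False  # set when the store lens regexp ignores case


def label_matches(node, step):
    """Compare one path step against a node label, honoring label_nocase."""
    if node.label_nocase:
        return node.label.lower() == step.lower()
    return node.label == step


def match(top_level, path):
    """Return all nodes reached by following the '/'-separated path."""
    nodes = [Node("", children=top_level)]  # synthetic root above the tree
    for step in (s for s in path.split("/") if s):
        nodes = [c for n in nodes for c in n.children
                 if label_matches(c, step)]
    return nodes
```

With label_nocase set on the section and directive nodes, a path such as
"<sec1/<Sec2/directive1" finds the Directive1 node, while nodes built from
case-sensitive lenses keep exact matching.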

David

[1] http://git.fedorahosted.org/git/?p=rhq/rhq.git;a=tree;f=modules/plugins/apache/src/main/resources




