[augeas-devel] Language revamp

David Lutterkort dlutter at redhat.com
Tue Mar 18 21:03:18 UTC 2008


Hi Bryan,

On Tue, 2008-03-18 at 09:35 -0400, Bryan Kearney wrote:
> As long as I can say that file X and Y and Z are all types of ini, or 
> properties, etc then I am fine not having nested modules. It does seem 
> like there is some basic grammar (words, blank lines, etc) that everyone 
> will share.

This is one of the big drivers of doing the language this way: to make
the language much more flexible, and allow much more reuse than is
possible with the current language.

The modules would be very similar to Python's module system, and are
really just a means to group similar things together under one name.

> >       * The 'del' lens is new; deleting entries from the input (i.e.,
> >         not including them in the tree) now needs to be more explicit,
> >         so that a default value can be included - the default was
> >         previously part of the token definition.
> Does this mean that if I read a file in with comments, and write it out, 
> the comments are lost?

No, maybe 'del' is a bad name for this. The behavior is exactly the same
as if you have just a plain regexp in a grammar rule right now: whatever
is matched by that regexp is not included in the tree, but will be
restored when the tree is saved back to the file.
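
To make that behavior concrete, here is a minimal sketch in Python (not
Augeas code). The names del_get and del_put are hypothetical, and using
the default only for newly created entries is an assumption of the
sketch: the matched text never enters the tree, it is kept on the side
and written back on save.

```python
import re

def del_get(pattern, text):
    """Consume the text matched by pattern at the start of text.

    Returns (skeleton, rest). The skeleton is remembered for saving;
    nothing is added to the tree.
    """
    m = re.match(pattern, text)
    if m is None:
        raise ValueError("input does not match")
    return m.group(0), text[m.end():]

def del_put(skeleton, default):
    """Write back the original text if there is one, else the default."""
    return skeleton if skeleton is not None else default

skel, rest = del_get(r"[ \t]+", "   localhost")
restored = del_put(skel, "\t")   # existing entry: original spacing is kept
created = del_put(None, "\t")    # new entry (assumed): default is used
```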

> I will be honest.. I don't know why I need the first sep function in the 
> example. What value is it buying me? The older syntax seemed closer to 
> a BNF, but I don't know if a pure BNF gives enough data to be 
> bi-directional.

You don't need the sep function at all; you could just write 'let
sep_tab = del /[ \t]+/ "\t"'. I just put it there to show that
functions give you a lot more flexibility, because you can parametrize
lots of things. But you don't have to use them by any means.

Think about processing some XML file with Augeas: in the current syntax,
you'd have to write down regular expressions for matching an opening
tag, attributes, and a closing tag over and over again, even though
they'd only differ in the name of the element (or attribute).

With this new syntax you could write something like (I left out any
del/key/store etc. lenses to keep the example concise):

        let open_tag_begin (element:string) = "<" . element . /[ \t]+/
        let attribute (name:string) = name . /[ \t]*=[ \t]*/ . /["'][^"']*["']/
        let open_tag_end = ">"
        let close_tag(element:string) = "</" . element . ">"

and then use those definitions to process concrete tags.
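
The same idea can be sketched in Python (not Augeas code), with
ordinary functions that build regular expressions. The function names
mirror the ones above; the exact form of an attribute (name, '=', a
quoted value) and the "host" element are assumptions made for
illustration only.

```python
import re

WS = r"[ \t]+"

def open_tag_begin(element):
    # "<" followed by the element name and mandatory whitespace
    return "<" + re.escape(element) + WS

def attribute(name):
    # name = "value" or name = 'value' (assumed attribute form)
    return re.escape(name) + r"[ \t]*=[ \t]*" + r"""["'][^"']*["']"""

def open_tag_end():
    return ">"

def close_tag(element):
    return "</" + re.escape(element) + ">"

# Reuse the parametrized pieces for one concrete (hypothetical) element.
host_open = re.compile(
    open_tag_begin("host") + attribute("name") + open_tag_end()
)
host_close = re.compile(close_tag("host"))
```

Only the element and attribute names change from tag to tag; the
whitespace and quoting rules are written down once.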

> > I also want to add a simple unit test facility, so that you can say
> > right in your .aug file something like
> >         test Hosts.lens put
> >           "127.0.0.1 localhost"
> >         after
> >           rm /0
> >         = ""
> > with the general structure "test LENS put S after COMMANDS = S'" where S
> > and S' are strings and COMMANDS is a list of augtool-like commands to
> > manipulate the tree. The test would parse S into a tree, run COMMANDS on
> > it and check that the result of turning the tree back into a string is
> > S'.
> 
> Any reason not to make it separate, perhaps part of a dev package?

I think it's more convenient to have those tests with the definitions of
the lenses. I intend to have augtool ignore them by default and/or have
another tool, similar to augparse, that is more geared towards
developing file format descriptions.

But in general, I would leave the decision whether you have tests in the
main definition file or an auxiliary file up to people's taste and not
make special provisions in the language to prefer one over the other.
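
For illustration, the "test LENS put S after COMMANDS = S'" structure
quoted above can be mimicked with a toy stand-in in Python. This is not
the Hosts lens: the sketch assumes a hypothetical lens that maps each
line to a tree entry keyed 0, 1, ... and supports only 'rm'.

```python
def get(text):
    """Toy lens, get direction: one tree entry per line, keyed "0", "1", ..."""
    return {str(i): line for i, line in enumerate(text.splitlines())}

def put(tree):
    """Toy lens, put direction: join the remaining entries back into text."""
    return "\n".join(tree.values())

def run_put_test(s, commands, expected):
    """Parse s into a tree, run the commands, compare the result to expected."""
    tree = get(s)
    for cmd, path in commands:
        if cmd == "rm":
            tree.pop(path.lstrip("/"), None)
        else:
            raise ValueError("unsupported command: " + cmd)
    return put(tree) == expected

# Counterpart of: test Hosts.lens put "127.0.0.1 localhost" after rm /0 = ""
ok = run_put_test("127.0.0.1 localhost", [("rm", "/0")], "")
```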

> I really don't like the idea of the config tree being different from the 
> actual file system. I understand that it would promote portability 
> between layouts, but if I am working on file system editing.. I think 
> it would improve readability to see the actual hierarchy.

As with so many things: it depends ;) For some things, you want a very
clear mapping between files and entries in the tree. For others, you
don't care - for example, for yum repos, I don't think anybody really
cares too much what file the fedora repo is defined in; all you care
about is turning gpgcheck on for that repo. Do you see that differently?

> So.. on start up.. I could see two models:
> 
> 1) Could you cache the mappings at startup, and then lazily read in 
> the grammars? So... you would only read in the grammar for pam when I 
> access something under /system/config/pam (well.. for me ideally /etc/pam)

Yeah, that's one thing I would like to support. I'm not quite sure how to
structure things to make that easy. One option I've been thinking about
is to have a special directory where you plonk a file that contains just
the 'autorun' directive from my example, and everything else in the
example goes into a different file that is processed only as needed.
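
A rough sketch of that split in Python (not a design for Augeas itself):
the cheap path-to-loader mapping is read at startup, and the expensive
part is loaded on first access. The class name, the loader function and
the "pam lens" string are hypothetical stand-ins.

```python
class LazyLenses:
    """Maps tree path prefixes to loaders; runs a loader on first access."""

    def __init__(self, mappings):
        self._loaders = dict(mappings)   # prefix -> zero-argument loader
        self._loaded = {}                # prefix -> loaded description

    def lookup(self, path):
        for prefix, loader in self._loaders.items():
            if path == prefix or path.startswith(prefix + "/"):
                if prefix not in self._loaded:
                    self._loaded[prefix] = loader()
                return self._loaded[prefix]
        raise KeyError(path)

loads = []

def load_pam():
    # Stand-in for parsing the full file format description.
    loads.append("pam")
    return "pam lens"

registry = LazyLenses({"/system/config/pam": load_pam})
nothing_loaded_yet = (loads == [])
first = registry.lookup("/system/config/pam/sshd")
second = registry.lookup("/system/config/pam/login")
```

The second lookup reuses the cached result, so the loader runs once.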

> 2) I could see a model where you "open" the file,

Definitely something I want to support - I've not done it mainly because
I want to wait until after the language revamp to have a clearer picture
of how best to expose that in the API. In particular, should that be a
special-purpose call just to load a file description and run it, or
should it be a more generic 'eval' call where you send an expression in
the language, and one of the things you can do is have it load a file
into the tree, something like 'eval("Hosts.files")'?

>  and then use the tree 
> syntax relative to that root. This is probably better for the api, but 
> something like
> edit /augeas/files/etc/hosts
> set 10000/ipaddr = 192.168.0.1
> set 10000/canonical pigiron.example.com
> clear /10000/aliases
> commit

If I understand you right, you're arguing to reduce the amount of
abstraction that Augeas specifications introduce. How do you view cases
where you don't really care where entries are stored in the file system,
like yum repo definitions or files in /etc/httpd/conf.d? Imagine you
just want to go through your Apache config and make sure Allow/Deny
statements are set to something sane. Should the user of Augeas have to
know where in the filesystem Apache config files are stored? Should they
be able to have Augeas follow Include directives, or should they have to
figure that out by themselves?

> What about specific validations? So.. I know that in hosts the ipaddr 
> should be in a known format. It seems like I could support this now, is 
> this correct?

Yeah, you could just write a more specific regular expression for
ipaddr. It's not clear to me though where to best draw the line between
being able to read as many config files as possible (i.e. even ones that
are mildly wrong, like with an invalid ipaddr) and being very precise in
specifying these things, like not accepting "127.0" as an ipaddr.
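
To illustrate the tradeoff, here is a small Python sketch with two
regular expressions. Both patterns are assumptions chosen for the
example, not what any Augeas hosts description actually uses.

```python
import re

# Lenient: any run of digits and dots, so mildly wrong files still parse.
LENIENT = re.compile(r"[0-9.]+")

# Strict: exactly four dot-separated octets, each in the range 0-255.
OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])"
STRICT = re.compile(OCTET + r"(?:\." + OCTET + r"){3}")

def accepts(pattern, text):
    """True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None
```

With these, accepts(LENIENT, "127.0") is true while accepts(STRICT,
"127.0") is false; a file containing "127.0" could be read with the
first pattern but would be rejected with the second.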

David




