[augeas-devel] regex syntax

Tue Jul 22 17:42:21 UTC 2008

Before I respond, quick apologies to the list for not replying to the previous
message in the right spot.

(and apologies to the rhel5 list that I accidentally moved this thread to)

David Lutterkort <dlutter at redhat.com> wrote on 07/17/2008 03:12:00 PM:

> On Thu, 2008-07-17 at 14:21 -0500, Greg_Swift at aotx.uscourts.gov wrote:
> >
> > David Lutterkort <dlutter at redhat.com> wrote on 07/17/2008 01:03:59 PM:
> > > How exactly did you do this ? Augeas uses extended POSIX regexp
> > > syntax[1] - that syntax is also used by some command line tools. For
> > > playing with individual regexps, it's sometimes useful to play e.g. with
> > > sed and do 'sed -r -e 's/MYREGEXP/FOO/' to see exactly what a regexp
> > > matches ... like 'sed -r -e 's/[ \t]*/<spaces>/' will replace
> > > whitespaces on an input line with the literal string '<spaces>'.
> >
> > I tried it multiple ways.  I used grep and sed... although I did not use
> > the '-r -e'.  And based on the check I just ran it behaves differently...
> > hrmph...
>
> The '-r' is essential since it switches to extended POSIX regexp syntax
> (instead of basic syntax, which is quite different)

Okay... that's helpful to keep in mind.
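
For anyone following along, here is a minimal sketch of what the '-r' switch
changes. It assumes GNU sed, and the sample input line is made up for
illustration (it is not from onconfig.test).

```shell
# Assumes GNU sed; the sample input line is made up for illustration.
line='FOO   bar'

# Basic syntax (no -r): '+' is an ordinary character, so the pattern
# "space followed by a literal plus sign" finds nothing to replace.
basic=$(printf '%s\n' "$line" | sed -e 's/ +/<spaces>/')

# Extended syntax (-r): '+' means "one or more", so the run of spaces matches.
extended=$(printf '%s\n' "$line" | sed -r -e 's/ +/<spaces>/')

echo "basic:    $basic"
echo "extended: $extended"
```

Without '-r' the line comes back unchanged; with '-r' the run of spaces is
replaced by the literal string '<spaces>'.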

> > Basically I started by slowly following the trail of the regex, using grep
> > to verify a line matches and sed to remove the match.
> >
> > So if augeas/files/*/error showed:
> > (((([A-Z_]*)([ \t]*))(((([a-z_]+)(=))([^,:= \t\n]*))|([a-z_]+)(=))([^,:= \t\n]*))|([a-z_]+)))*))(\n)
> >
> > I would break it down to ($file is the originally attached onconfig.test)
> >
> > grep "\([A-Z_]*\)" $file
> > sed "s/\([A-Z]*\)//" $file
>
> It might be easier to go the other way: take small snippets of the
> config file, and run 'test LNS get SNIPPET = ?' on it - you can use any
> lens in a test, not just the 'main' lens from a module.

Hopefully I'll get a chance to dig into that this week.  Implementing augeas
looks like it would be a major win for my org (whether they know it yet or
not).
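
One note on the grep breakdown quoted above: a pattern like '[A-Z_]*' can
match the empty string, so grep reports every line as a match and the check
does not tell you much. A rough sketch, assuming GNU grep; the three sample
lines and the sample() helper are made up for illustration, and LC_ALL=C is
only there so that [A-Z] means plain ASCII uppercase.

```shell
# Hypothetical helper that prints three made-up sample lines
# (a setting, a comment, and a blank line).
sample() { printf '%s\n' 'SOME_KEY value' '# a comment' ''; }

# '*' allows zero repetitions, so the pattern matches the empty string
# and every line counts as a match, even the blank one.
all=$(sample | LC_ALL=C grep -c -E '[A-Z_]*')

# '+' requires at least one character from the set, so only lines that
# contain an uppercase letter or underscore are counted.
some=$(sample | LC_ALL=C grep -c -E '[A-Z_]+')

echo "lines matching [A-Z_]*: $all"
echo "lines matching [A-Z_]+: $some"
```

Here the first count covers all three lines, while the second covers only the
'SOME_KEY value' line.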

> > But i did just try the -r with that and
> > sed "s/\([A-Z]*\)\([ \t]*\)//" $file
> > sed -r "s/\([A-Z]*\)\([ \t]*\)//" $file
> > give much different results
>
> Yeah, one of the big differences between extended and basic POSIX syntax
> is that to match 0 or more 'a's in extended syntax is 'a*' whereas in
> basic syntax it's 'a\*'.

I had a stupid question about that, but I figured it out myself (basically, the
difference between the results above is that ERE doesn't use backslashes on the
parentheses).
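
To make that concrete, a small sketch assuming GNU sed, with a made-up input
line rather than the real onconfig.test. As far as I can tell, '*' means "zero
or more" in both syntaxes; what changes is whether the parentheses need
backslashes to act as groups.

```shell
# Assumes GNU sed; the sample line is made up, not taken from onconfig.test.
line='FOO   bar'

# Basic syntax: \( \) are groups, so the uppercase word and the blanks
# after it are removed.
bre=$(printf '%s\n' "$line" | sed 's/\([A-Z]*\)\([ \t]*\)//')

# Same pattern with -r: \( and \) are now literal parentheses, and the
# input has none, so nothing is replaced.
ere_escaped=$(printf '%s\n' "$line" | sed -r 's/\([A-Z]*\)\([ \t]*\)//')

# Extended syntax with bare parentheses: groups again, same result as bre.
ere=$(printf '%s\n' "$line" | sed -r 's/([A-Z]*)([ \t]*)//')

# '*' behaves the same way with and without -r.
star_bre=$(printf 'aaab\n' | sed 's/a*/X/')
star_ere=$(printf 'aaab\n' | sed -r 's/a*/X/')

echo "bre=$bre ere_escaped=$ere_escaped ere=$ere"
echo "star_bre=$star_bre star_ere=$star_ere"
```

The first and third commands strip 'FOO' and the blanks after it, the second
leaves the line alone, and both '*' commands give the same output.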

-greg