[augeas-devel] Proper locale support

David Lutterkort lutter at redhat.com
Sat Oct 24 10:34:12 UTC 2009


On Fri, 2009-10-23 at 20:28 +0100, Daniel P. Berrange wrote:
> Would  PCRE work for Augeas ?
> 
>  http://www.pcre.org/
>  http://en.wikipedia.org/wiki/PCRE
> 
> 
> Apparently this is the regex lib used by Apache, Exim & a bunch of
> other apps

From a quick look at the docs, it seems that PCRE by default matches in
the C locale, which is what Augeas needs. What has kept me from using
PCRE is that its regexp syntax lets you describe way more than just
regular languages through things like lookahead assertions and
backreferences[1]. (I seem to remember that Perl supports some sort of
recursive matching, too, and I assume that PCRE does as well.)

Augeas needs to make sure that it only deals in regular languages, since
the computations in the typechecker can only be done for regular
languages. Using PCRE would require a syntax translator from Augeas'
POSIX ERE-ish syntax to PCRE's.

Besides the strange language class that PCRE can recognize[2], there is
another reason to stay with GNU regex: it is available on any platform
under the sun via gnulib. Ideally, somebody would write a replacement
for GNU regex that allows passing in an explicit locale, but that's
quite a project, even for POSIX extended regexps. Now that uselocale is
in POSIX, though, it's only a matter of time until every platform
supports it.
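
To illustrate the uselocale route: a minimal sketch, assuming a platform
that already provides the POSIX 2008 per-thread locale calls (newlocale,
uselocale, freelocale) next to regcomp/regexec. The helper name
match_in_c_locale is made up for this example and is not part of Augeas
or GNU regex; some platforms may need an extra header for locale_t.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: match `text` against the POSIX ERE `pattern`
 * while the calling thread is temporarily switched to the "C" locale.
 * Returns 1 on match, 0 on no match, -1 on error. */
int match_in_c_locale(const char *pattern, const char *text)
{
    locale_t c_loc = newlocale(LC_ALL_MASK, "C", (locale_t) 0);
    if (c_loc == (locale_t) 0)
        return -1;

    /* uselocale only affects the calling thread; it returns the
     * locale that was in effect before, so we can restore it. */
    locale_t old = uselocale(c_loc);

    regex_t re;
    int result = -1;
    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) == 0) {
        result = (regexec(&re, text, 0, NULL, 0) == 0) ? 1 : 0;
        regfree(&re);
    }

    /* Restore the caller's locale before releasing ours. */
    uselocale(old);
    freelocale(c_loc);
    return result;
}
```

A caller would use it as match_in_c_locale("^[a-z]+$", "abc"). The
point of the wrapper is that the rest of the process keeps whatever
locale the application set; only the regexp calls see "C".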

David

[1] Basic POSIX regexps also support backreferences, which regex(7)
rightly calls a 'dreadful botch' - GNU regex lets you turn that off.

[2] A backreference to something like /[a-z]+/ makes the 'regexp'
describe a language that is not even context free, let alone regular.
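
As a small illustration of [1] and [2], here is a sketch that uses a
POSIX basic regexp with a backreference to accept exactly the strings
made of some lowercase word written twice ("abcabc" but not "abcabd").
That set of strings is not regular. The function name is_doubled_word
is hypothetical, and the sketch assumes a regcomp that supports
backreferences in basic regexps, as POSIX specifies.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: returns 1 if `s` is a nonempty lowercase word
 * written twice in a row, 0 if it is not, -1 if compilation fails. */
int is_doubled_word(const char *s)
{
    regex_t re;

    /* Basic regexp (no REG_EXTENDED): \( \) is a group, \{1,\} means
     * "one or more", and \1 must repeat whatever the group matched. */
    if (regcomp(&re, "^\\([a-z]\\{1,\\}\\)\\1$", 0) != 0)
        return -1;

    int matched = (regexec(&re, s, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```

No finite automaton can check that the second half equals the first for
words of arbitrary length, which is why a pattern like this falls
outside what the typechecker can handle.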




