perl and UTF-8

Anand Buddhdev anand at celtelplus.com
Tue Jun 22 08:14:41 UTC 2004


On Mon, Jun 21, 2004 at 08:59:08PM -0300, Pedro Fernandes Macedo wrote:

> > Good idea. Just be aware that sed and grep are quite slow in UTF-8
> > environments, and you should run them like this if you know your text
> > is ASCII:
> >
> > LANG=C grep ....
>
> This shouldn't be needed. I remember seeing an update announcement for
> grep on fedora-announce-list some time ago that should fix this. If it
> is still slow, please file a report in Bugzilla.

On my FC1 system:

[arb@home arb]$ rpm -q grep
grep-2.5.1-17.4

[arb@home arb]$ echo $LANG
en_US.UTF-8
[arb@home arb]$ time grep zymology docs/sowpods.txt
enzymology
zymology
 
real    0m0.267s
user    0m0.260s
sys     0m0.000s

[arb@home arb]$ export LANG=C
[arb@home arb]$ time grep zymology docs/sowpods.txt
enzymology
zymology
 
real    0m0.012s
user    0m0.000s
sys     0m0.000s

Grep is clearly still much slower in a UTF-8 locale: 0.267s against
0.012s for the same search on the same file, roughly a 20x difference.
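
If you would rather not export the C locale for the whole session, the
override can be limited to a single command. This is only a minimal sketch,
assuming a POSIX shell: the three-word list is made up and merely stands in
for docs/sowpods.txt, and LC_ALL is used instead of LANG because it takes
precedence over LANG and any other LC_* variables that may be set.

```shell
# Hypothetical sample word list; stands in for docs/sowpods.txt.
wordlist=$(mktemp)
printf '%s\n' enzymology zoology zymology > "$wordlist"

# Prefixing the assignment sets the locale for this one grep only;
# the shell's own locale settings are left as they were.
matches=$(LC_ALL=C grep zymology "$wordlist")
printf '%s\n' "$matches"

# Clean up the temporary file.
rm -f "$wordlist"
```

This is only safe when the pattern and the data are plain ASCII, as noted
earlier in the thread.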

-- 
Anand Buddhdev
Celtel International




