perl and UTF-8

Anand Buddhdev anand at celtelplus.com
Tue Jun 22 14:25:52 UTC 2004


On Tue, Jun 22, 2004 at 03:15:56PM +0100, Jonathan Rawle wrote:

> Anand Buddhdev wrote:
> 
> > [arb at home arb]$ time grep zymology docs/sowpods.txt
> > enzymology
> > zymology
> >  
> > real    0m0.267s
> > user    0m0.260s
> > sys     0m0.000s
> > 
> > [arb at home arb]$ export LANG=C
> > [arb at home arb]$ time grep zymology docs/sowpods.txt
> > enzymology
> > zymology
> >  
> > real    0m0.012s
> > user    0m0.000s
> > sys     0m0.000s
> > 
> > Grep is clearly still much slower in UTF8.
> 
> I'm not sure that's quite a fair test as the files will be cached in the
> second case.

I know about caching, which is why I first ran the command once to fill
the cache, and only then timed it with the UTF-8 locale and again with
LANG=C. So I'm pretty sure caching wasn't an issue. Anyway, you can
repeat this test with various regular expressions, and you will see the
difference in speed very clearly.
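
For anyone who wants to repeat the comparison, here is a rough sketch of
the procedure as a bash function. The function name bench_grep and the
locale name en_US.UTF-8 are assumptions for illustration (use whatever
UTF-8 locale `locale -a` lists on your system), and it sets LC_ALL rather
than LANG so that any other LC_* settings cannot override the choice.

```shell
# Sketch only: bench_grep and en_US.UTF-8 are illustrative assumptions.
# Usage: bench_grep PATTERN FILE
bench_grep() {
    pattern=$1
    file=$2

    # Warm-up: read the file once so both timed runs hit the page cache.
    cat "$file" > /dev/null

    for loc in en_US.UTF-8 C; do
        echo "== LC_ALL=$loc =="
        # 'time' is a bash keyword; its report goes to stderr.
        LC_ALL=$loc bash -c 'time grep -- "$1" "$2"' _ "$pattern" "$file"
    done
}
```

Calling it as `bench_grep zymology docs/sowpods.txt` prints the matches
and the timing once per locale. Running it a few times and with different
patterns should give a fairer picture than a single measurement.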

-- 
Anand Buddhdev
Celtel International

More information about the fedora-list mailing list