rawhide report: 20050405 changes

Daniel Veillard veillard at redhat.com
Tue Apr 5 17:38:57 UTC 2005


On Tue, Apr 05, 2005 at 01:28:43PM -0400, Konstantin Ryabitsev wrote:
> On Apr 5, 2005 11:35 AM, seth vidal <skvidal at phy.duke.edu> wrote:
> >  Icon was working on repoview and decided to try out cElementTree b/c he
> > was using kid anyway and it used it. After some preliminary tests it
> > showed up as significantly faster parsing the metadata. For
> > primary.xml.gz the times went from 21s for 1800ish pkgs to 7s. Then when
> > he switched it to use iterparse() the memory footprint dropped below 10M
> > for the whole parse.
> 
> That was filelists.xml for development -- 36MB of XML, not
> primary.xml. There was a 2.5 times speed improvement with cElementTree
> code, compared to old yum code, using libxml2 -- around 20 seconds for
> libxml2, and around 7-8 seconds for cElementTree on an AMD Athlon
> 2600+.

  The bottleneck is not libxml2. On such a box, libxml2 by itself should
parse your 36MB of XML in a mere second; try xmllint --stream, for example.
The potential for improvement from fixing the Python string import would
have been huge if you had wanted to look at it, but it seems obvious that
you didn't.
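
For reference, the iterparse() approach mentioned in the quoted text can be
sketched roughly as follows. This is only a minimal sketch under stated
assumptions, not the actual yum or repoview code: it uses
xml.etree.ElementTree from the current Python standard library in place of
the separate cElementTree module, and the sample document, its element
layout, and the function name count_files are hypothetical.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical, tiny stand-in for a filelists.xml document (assumption:
# one <package> element per package, with <file> children).
SAMPLE = b"""<filelists>
  <package name="foo"><file>/usr/bin/foo</file><file>/etc/foo.conf</file></package>
  <package name="bar"><file>/usr/bin/bar</file></package>
</filelists>"""


def count_files(source):
    """Stream-parse `source`; return {package name: number of <file> children}."""
    counts = {}
    # "end" events fire once an element and all of its children are parsed.
    for event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "package":
            counts[elem.get("name")] = len(elem.findall("file"))
            # Discard the element's children, attributes and text so the
            # in-memory tree stays small. The emptied <package> element
            # itself remains attached to the root.
            elem.clear()
    return counts


if __name__ == "__main__":
    print(count_files(io.BytesIO(SAMPLE)))
```

Clearing each element after it has been processed is what keeps the memory
footprint low; without it, iterparse() still builds the whole tree.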

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard at redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/




More information about the fedora-devel-list mailing list