rawhide report: 20050405 changes

seth vidal skvidal at phy.duke.edu
Tue Apr 5 15:35:35 UTC 2005


>   Is that worth adding yet another XML parser package to the distribution,
> used by a single tool? Is there a compatibility layer to still use
> libxml2?
>   If I remember correctly, the performance problem wasn't libxml2 itself
> but the specific usage within yum, i.e. collecting the data, libxml2 by
> itself is parsing the megabyte sized file in less than a tenth of a second.
> I'm surprised the solution ends up using a Python-specific library
> instead of trying to find why the interface between libxml2 and yum generated
> that problem. I don't remember you saying you would switch library as a result.

well, what happened was this:
 Icon was working on repoview and decided to try out cElementTree because
he was already using kid, which uses it. Preliminary tests showed it
parsing the metadata significantly faster: for primary.xml.gz the time
went from 21s to 7s for 1800ish pkgs. Then when he switched it to use
iterparse(), the memory footprint dropped below 10M for the whole parse.
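
For anyone curious what the iterparse() approach looks like, here is a
minimal sketch, not the actual repoview or yum code. It assumes the
ElementTree API as shipped in the Python standard library
(xml.etree.ElementTree, which absorbed cElementTree), and it assumes
the primary.xml namespace shown in NS. The function name
iter_package_names is made up for illustration.

```python
import gzip
import io
import xml.etree.ElementTree as ET  # stdlib home of the cElementTree API

# Assumed namespace for primary.xml; adjust if your metadata differs.
NS = "{http://linux.duke.edu/metadata/common}"


def iter_package_names(source):
    """Yield package names from a gzip-compressed primary.xml.

    `source` may be a filename or a binary file object (hypothetical helper).
    """
    with gzip.open(source, "rb") as xml_stream:
        context = ET.iterparse(xml_stream, events=("start", "end"))
        _, root = next(context)  # first event is the start of the root element
        for event, elem in context:
            if event == "end" and elem.tag == NS + "package":
                yield elem.findtext(NS + "name")
                # Drop the packages already handled so the tree never
                # grows to hold the whole file.
                root.clear()
```

The memory saving comes from clearing the root after each package is
handled, so only one package's subtree is held at a time instead of the
whole document.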

Check out the numbers on the cElementTree webpage. They're fairly
compelling. 

The biggest reason I've not talked to you about it much is that for the
last few weeks I've been in kinda deep-hack mode and not communicating
as much as I have in the past.

Sorry for the problems.

-sv




