why doesn't yum cache anything?
Daniel Veillard
veillard at redhat.com
Fri Dec 31 13:04:59 UTC 2004
On Fri, Dec 31, 2004 at 01:27:49PM +0100, Farkas Levente wrote:
> Daniel Veillard wrote:
> > Parsing the XML file and building the associated Python objects.
> >
> >And before bashing XML and the cost of parsing, it's only a very small
> >fraction of the time spent, building the Python strings and objects is
> >the really costly part as we found with seth when doing basic tests.
> >My own tests led me to believe that Python string interning (taking a
> >string from the C layer or XML and getting the copy from Python's own
> >string implementation) is extremely costly, and of course we are
> >manipulating a very large number of strings when collecting the repodata.
>
> have you already made some real measurements?
Of what? Yes, I know exactly how long it takes libxml2 to parse
the data:
[root@localhost ~]# xmllint --stream --timing /var/cache/yum/base/primary.xml.gz
Parsing took 1094 ms
That is using the reader at the C level, and it includes decompressing the
archive and walking through all nodes. The main cost is turning the parsed
data into Python's internal representation, as I said.
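
A minimal sketch of how the same comparison could be tried from the Python
side. It is not yum's code: the document layout, the helper names
(make_sample, walk_only, walk_and_build, timed) and the package count are all
made up for illustration. It also uses the standard library's ElementTree
rather than the libxml2 reader, so even the "walk only" pass already creates
some Python objects and is not a pure C-level baseline like the xmllint run.

```python
import io
import time
import xml.etree.ElementTree as ET


def make_sample(n_packages):
    """Build a synthetic, repodata-like XML document (hypothetical layout)."""
    parts = ["<metadata>"]
    for i in range(n_packages):
        parts.append(
            f"<package><name>pkg{i}</name><arch>i386</arch>"
            f"<summary>Summary for package {i}</summary></package>"
        )
    parts.append("</metadata>")
    return "".join(parts).encode("utf-8")


def walk_only(data):
    """Stream through the document and count packages, keeping nothing."""
    count = 0
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "package":
            count += 1
            elem.clear()  # drop the subtree once it has been seen
    return count


def walk_and_build(data):
    """Stream through the document and keep one dict of strings per package."""
    packages = []
    for _event, elem in ET.iterparse(io.BytesIO(data), events=("end",)):
        if elem.tag == "package":
            packages.append({child.tag: child.text for child in elem})
            elem.clear()
    return packages


def timed(func, data):
    """Return (result, elapsed milliseconds) for one call of func(data)."""
    start = time.perf_counter()
    result = func(data)
    return result, (time.perf_counter() - start) * 1000.0


if __name__ == "__main__":
    sample = make_sample(20000)  # arbitrary size, chosen for illustration
    _, walk_ms = timed(walk_only, sample)
    _, build_ms = timed(walk_and_build, sample)
    print(f"walk only:      {walk_ms:.0f} ms")
    print(f"walk and build: {build_ms:.0f} ms")
```

The difference between the two timings gives a rough idea of what keeping
the data as Python strings and dicts adds on top of walking the document;
the actual numbers will depend on the machine and the Python version.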
> then wouldn't it be useful to
> implement that small portion in C? or isn't it such a small part?
The string interning is in the Python lib, probably in C, as it's a C API
as far as I can tell. And no, I didn't look at the Python internal code.
Daniel
--
Daniel Veillard | Red Hat Desktop team http://redhat.com/
veillard at redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/