ANNOUNCE: rpmgrok - a web-based tool for tracking a full distribution of RPMs

Axel Thimm Axel.Thimm at
Wed Aug 6 20:42:33 UTC 2008

On Wed, Aug 06, 2008 at 03:35:45PM -0400, David Malcolm wrote:
> I've been working on a new web-based tool for analyzing Fedora: rpmgrok
> It digests built RPMs, analysing the metadata and payload, and stores
> the results in a database.  There's a web UI for viewing the data, an
> XML-RPC interface for querying it, and a command-line tool for using the
> XML-RPC interface.
> I've got a prototype running on:
> More info (e.g. source code) can be seen at
> The idea is to provide a new way for Fedora developers, testers, and
> other enthusiasts to track various things across the entire
> distribution, without having to have a full tree installed.  It's
> probably usable by other Linux distributions.
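
For anyone wondering what a query against an XML-RPC interface like this
might look like from Python, here is a minimal in-process sketch using only
the standard library. The method name find_desktop_files and the sample data
are hypothetical; the announcement does not list rpmgrok's actual XML-RPC
methods, so treat this as an illustration of the protocol rather than of
rpmgrok's API.

```python
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCDispatcher

# Hypothetical data standing in for whatever the real database holds.
DESKTOP_FILES = {
    "application/pdf": ["evince.desktop", "xpdf.desktop"],
}


def find_desktop_files(mime_type):
    """Hypothetical query: desktop files that declare the given MIME type."""
    return DESKTOP_FILES.get(mime_type, [])


def call_in_process(dispatcher, method, *params):
    """Marshal a call to XML, dispatch it, and unmarshal the reply.

    No network is involved; a real client would use
    xmlrpc.client.ServerProxy(url) instead.
    """
    request = xmlrpc.client.dumps(params, methodname=method)
    # _marshaled_dispatch is the hook SimpleXMLRPCServer uses internally
    # to turn a request body into a response body.
    response = dispatcher._marshaled_dispatch(request)
    (result,), _ = xmlrpc.client.loads(response)
    return result


dispatcher = SimpleXMLRPCDispatcher()
dispatcher.register_function(find_desktop_files)

if __name__ == "__main__":
    print(call_in_process(dispatcher, "find_desktop_files", "application/pdf"))
```

Against a live server the same call would be a one-liner on a ServerProxy
object, with the URL and method names taken from the real service.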

I like the part with the "Errors and warnings from rpmlint". I can
imagine that someone who is, say, an expert on "invalid-desktopfile"
issues could now dig into this much more easily. Very nice!
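
To make the "browse by error message" idea concrete, here is a rough sketch
of grouping rpmlint output by tag. It assumes the usual
"package: E|W: tag details" line format; the sample lines and package names
are made up, and this is not how rpmgrok itself stores the results.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   foo.i386: E: invalid-desktopfile /usr/share/applications/foo.desktop
LINE_RE = re.compile(
    r"^(?P<pkg>\S+): (?P<sev>[EW]): (?P<tag>\S+)(?: (?P<detail>.*))?$"
)


def group_by_tag(lines):
    """Map each rpmlint tag to the sorted list of packages that trigger it."""
    groups = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # summary lines and blank lines are skipped
            groups[match.group("tag")].add(match.group("pkg"))
    return {tag: sorted(pkgs) for tag, pkgs in groups.items()}


# Made-up sample output for illustration only.
SAMPLE = """\
foo.i386: E: invalid-desktopfile /usr/share/applications/foo.desktop
bar.i386: W: no-documentation
baz.i386: E: invalid-desktopfile /usr/share/applications/baz.desktop
3 packages and 0 specfiles checked; 2 errors, 1 warnings.
""".splitlines()

if __name__ == "__main__":
    for tag, pkgs in sorted(group_by_tag(SAMPLE).items()):
        print(tag, pkgs)
```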

> rpmgrok is Free/Open Source software (licensed under the LGPLv2.1)
>
> == What does it track? ==
>   - all symbols in binaries/libraries, and the dependencies between
> them, so that you can see e.g. exactly what calls a particular function.
> This can also be used to locate instances of static linkage.  See e.g.
> (details
> of /lib/ from a built RPM)
>   - manifests of all RPMs, so that you can browse the files in packages
> via a web UI.  See
> The file view is only interesting at the moment for ELF files (binaries)
> and for .desktop files.
>   - all shared objects names, and the dependencies between them.  See
> e.g.
>     - (browsable
> view of all sonames in the distro)
>     -
> (details of /usr/lib/ from within a built libxml2 rpm)
>     - Everything implementing or linking against down to
> the level of individual binaries:
>   - results of rpmlint of all rpms.  See
> for a UI
> to browse by error message, and e.g.
> for an example of all error messages of a particular kind.  It may be
> worth fixing some rpmlint errors (though others look like false
> positives, and others are probably not worth it).
>   - all .desktop files and their fields so that you can e.g. find
> applications that can handle PDF files.  See e.g.
> for a view of all desktop files that can handle "application/pdf", and
> e.g.
> showing a specific desktop file
>   - SLOCcount stats (see ) for
> prepped source trees (e.g. "what % of Fedora is in C/C++/Python?" etc).
> I don't have the data prepped yet.
>   - any other kind of thing we want to add (provided there's a sane way
> to gather it in a script and slurp it into the database, of course...)
>     - sizes of packages; why is package foo so big?
>     - report on all fonts in the distro, and what packages provide them
>     etc
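
As an illustration of the .desktop item in the list above, the following is a
minimal sketch of finding applications that can handle "application/pdf" by
reading the MimeType key of each desktop file. The file names and contents are
invented samples, and rpmgrok presumably answers this from its database rather
than by parsing files on demand.

```python
import configparser


def handled_mime_types(desktop_text):
    """Return the MIME types listed in a .desktop file's MimeType key."""
    # interpolation=None so that field codes like %U in Exec= are left alone.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keys in .desktop files are case-sensitive
    parser.read_string(desktop_text)
    raw = parser.get("Desktop Entry", "MimeType", fallback="")
    # The value is a semicolon-separated list, usually with a trailing ';'.
    return [mime for mime in raw.split(";") if mime]


def apps_handling(mime_type, desktop_files):
    """Names of the desktop files that declare support for mime_type."""
    return sorted(
        name
        for name, text in desktop_files.items()
        if mime_type in handled_mime_types(text)
    )


# Invented sample data for illustration only.
SAMPLE_FILES = {
    "evince.desktop": (
        "[Desktop Entry]\n"
        "Name=Document Viewer\n"
        "Exec=evince %U\n"
        "Type=Application\n"
        "MimeType=application/pdf;application/postscript;\n"
    ),
    "gedit.desktop": (
        "[Desktop Entry]\n"
        "Name=Text Editor\n"
        "Exec=gedit %U\n"
        "Type=Application\n"
        "MimeType=text/plain;\n"
    ),
}

if __name__ == "__main__":
    print(apps_handling("application/pdf", SAMPLE_FILES))
```
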
> Note that due to my poor css there are lots of links that don't show up
> as such in the various table views.  You may need to explore with the
> mouse to find all of the cross-referencing that the web UI has.
>
> == What's it currently showing? ==
> I queued up an analysis of all of rawhide as of 2008-07-25 on i386; a
> little over 10000 built packages.  It took about a week to process, and
> about 200 of these jobs failed for one reason or another.  See
> for more info.
> So the db is currently just showing a snapshot in time of rawhide two
> weeks ago, on one architecture (and missing 2% of the packages due to
> errors).  
> Ultimately I want to build things up so that we can show time-based
> trend reports e.g. the size of a minimal install over time (or
> whatever).
>
> == Help Needed! ==
> Hopefully this looks of interest to people.
> I need help with coding, with sysadmin work, with making the UI better,
> and with things I probably haven't thought of yet.  I hope this can
> be a useful tool for Fedora.
> If you're interested in hacking on rpmgrok, get in touch.  The README
> file is hopefully of interest; see
>;a=blob_plain;f=README.txt;hb=HEAD README.txt
> It's implemented using TurboGears and SQLAlchemy (specifically,
> sqlalchemy 0.4, since it uses polymorphic inheritance features from that
> version).
> It also has a somewhat general-purpose task scheduler, used to control a
> pool of worker hosts that do the actual analysis.  It ought to be
> pluggable to do other types of task.
>
> == Source Code ==
> Git URLs are:
>   git://
>   ssh://
> (you need to be in the gitrpmgrok group of the Fedora Accounts System to have
> git push privileges; talk to me if you want to get involved)
>
> == Related work ==
> Inspiration includes
>   - the OpenGrok project (see
> though that appears to focus
> on source trees, whereas rpmgrok focuses on built packages)
>   - the Debian project's Lintian tool (see )
> Enjoy!
> Dave

More information about the fedora-devel-list mailing list