Boot speedup with readahead

Thu Sep 11 00:48:00 UTC 2008

On Wed, Sep 10, 2008 at 04:46:27PM -0400, Seth Vidal wrote:
>On Wed, 2008-09-10 at 16:16 -0400, Michael Stone wrote:
>
>> Please remember that people use packaging tools for more than simple
>> system maintenance; for example, I use an image-builder which is now
>> based on smart (even though it was originally based on yum) because
>> smart is more convenient for some of my programmatic use cases [1].
>
>I think I'd like to look at those use cases in more detail. 

I'm happy to explain in more detail, though I'll do so in my own order.

First, some introduction: my software [1] consists of a family of python
programs (called "compilations", since each is a primitive compiler), designed
to be run in a mock chroot, which convert collections of packages and hacks
into publishables (for example, a list of installed packages, an XO JFFS2 disk
image, and a tarball of the filesystem contents). 

   [1]: http://wiki.laptop.org/go/Puritan

Its primary goal is to ease the task of creating releasable disk images for
OLPC XOs in a reproducible and verifiable fashion by storing all effective
inputs to the build process in git commits. Its secondary goal is to overcome
some of the infelicities of pilgrim with respect to error detection, cleanup,
and debuggability. Its third goal is to be useful to expert Python programmers
who maintain Linux builds for fixed hardware regardless of their preferred
distro and package-format. It therefore exists in contrast to general-purpose
tools like debian-installer, anaconda, revisor, and livecd-tools, which are
distro-specific build tools with an interest in shielding their users from the
tools' underlying error-detection, analysis, and correction mechanisms and
workflows, often by means of a domain-specific language.

Smart is convenient for these goals in that it has a thoroughly documented shell
interface which permits the following useful operations:

Smart permits me to control my selection of package repositories more
succinctly and less distro-specifically than yum; e.g.

   smart channel -y --remove-all
   smart channel -y --add bootstrap type='rpm-md' baseurl='http://dev.laptop.org/~mstone/puritan-repo/'
   smart channel -y --add bootstrap-f9 type='rpm-md' baseurl='http://dev.laptop.org/~mstone/puritan-f9-repo/'
   smart channel -y --enable bootstrap
   smart channel -y --enable bootstrap-f9
   smart update
   smart install -y olpc-crcimg python-msutils

I was able to write these basic commands without reference to the smart source
code using only its man page; could I have done the same with yum?

   (In release compilations, the rpm repos being sourced would be included as
    git submodules of the compilation commit, thereby achieving full history
    and good reproducibility.)
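
For illustration only, here is a minimal Python sketch of how a compilation
might generate the channel commands above from a table of repositories. It is
not the actual Puritan code: the names channel_commands and run_all are
hypothetical, and the only smart invocations used are the ones shown above
(with the shell quoting dropped, since no shell is involved).

```python
import subprocess


def channel_commands(channels):
    """Build argv lists that reset, register, and enable smart channels.

    `channels` maps a channel alias to its rpm-md base URL (hypothetical
    input format, chosen for this sketch).
    """
    cmds = [["smart", "channel", "-y", "--remove-all"]]
    for alias, baseurl in channels.items():
        cmds.append(["smart", "channel", "-y", "--add", alias,
                     "type=rpm-md", "baseurl=" + baseurl])
    for alias in channels:
        cmds.append(["smart", "channel", "-y", "--enable", alias])
    cmds.append(["smart", "update"])
    return cmds


def run_all(cmds, runner=subprocess.check_call):
    """Run each command in order; check_call raises on a nonzero exit."""
    for argv in cmds:
        runner(argv)
```

Passing a different runner (for example, a list's append method) makes it
possible to inspect the generated commands without touching the system.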

Next, after adding two new repos (for olpc-specific packages and for general
F-9 packages), I run

   for pkg in firefox seamonkey mozplugger kdebase kernel; do
     smart priority --set $pkg olpc-joyride 100
   done

in order to bias several packages toward the OLPC repo. I could do something
similar in yum with the '--exclude' option or by modifying the yum repo
configuration, but it would be much more annoying either way in that I'd need
more complicated code to pipe the results of my configuration phase into the
installation phase.
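
As a rough sketch of what "piping configuration into installation" can look
like when the tool has a uniform command-line interface, the loop above can be
expressed as data plus one small function. The helper name priority_commands
is hypothetical; the command form is the one shown in the loop above.

```python
def priority_commands(packages, channel, priority):
    """Return one `smart priority --set` argv list per package name."""
    return [["smart", "priority", "--set", pkg, channel, str(priority)]
            for pkg in packages]


# The same package list and channel as in the shell loop above.
OLPC_BIASED = ["firefox", "seamonkey", "mozplugger", "kdebase", "kernel"]
commands = priority_commands(OLPC_BIASED, "olpc-joyride", 100)
```

Each resulting list can be handed to subprocess.check_call unchanged.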

Next, mirror configuration:

   smart mirror --add http://koji.fedoraproject.org/static-repos/dist-olpc3-build-current/i386/ http://koji.fedoraproject.org/packages/

This command essentially lets me perform url-rewriting as needed. This ability
is not _necessary_ for my purpose, but it sure is convenient from time to time.

>> priority system, 
>available in a yum plugin - though questionably useful to begin with.

Yum plugins seem, at first blush and in general rather than in specific, highly
inconvenient for my purpose because of issues like the:

  * lack of uniform quality control,
  * absence of centralized documentation,
  * potential lack of unified error-handling conventions,
  * perceived difficulty of installation and potential for version mismatch
    between plugin and host, and
  * lack of API stability guarantees.

(Plugins offer a different and compelling social benefit; namely, pluggability;
however, that's not the social benefit that _I_ happen to need at the moment.)

>> support for alternate package formats,
>not sure how this helps fedora.

First, must it help Fedora in order to be valuable to me or to be important to
my particular use case? Live and let live, friend. 

Second, please acknowledge that I am, for the time being, dedicated first to the
XO hardware and second to Fedora. This being what it is, and there being lots of
other people out there who would like to get their pet distribution running on
the XO hardware, it's important to me to be able to collaborate with others
using the same tools that I (hope to soon) use during the rest of my day. Isn't
that going to benefit Fedora insofar as it helps make Fedora packages work
better on all 400k XOs we've shipped it on and insofar as it stimulates more
people to make useful contributions to upstream projects?

>> built-in ability to download packages,
>yum can clearly download pkgs - not sure why it being built in rather
>than a plugin matters though.

For the reasons I discussed above -- less complexity for me, the tool-writer.

>> strictness options, etc.)
>
>strictness of what?

Error-checking. Pilgrim was notorious for producing broken builds because
pilgrim failed or was unable to detect situations in which yum was unable to
install all the requested packages. Perhaps this is a documentation problem --
I note that the yum man page contains no information about what error codes yum
will or can be made to return (if any?).

In particular, 

   yum -d 10 -e 10 install grhlkjoei

returns 0. This needlessly complicates my life: I _DO NOT_ want to have to
parse the yum output in order to learn that a problem occurred.
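
To make the point concrete, here is a minimal sketch of the kind of check a
build script would like to rely on. The names InstallError and run_checked are
hypothetical. Note that the check is only as good as the tool's exit status:
when a failed install exits with 0, as in the yum example above, nothing here
can notice the failure.

```python
import subprocess


class InstallError(Exception):
    """Raised when a packaging command reports failure via its exit status."""


def run_checked(argv):
    """Run a command and raise InstallError if it exits with nonzero status."""
    status = subprocess.call(argv)
    if status != 0:
        raise InstallError("%r exited with status %d" % (argv, status))
```

A build step would call run_checked with the full install command and let the
exception abort the build instead of scanning the tool's output for errors.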

>explain a bit more about the features you're looking for and I'll see
>what I can do to help.

In conclusion, yum does lots of helpful things but it doesn't do the exact
useful things that I need. Yum can certainly be made better by polishing some
of these infelicities, and I hope that my exposition offers you some useful
guidance on conventions and social properties that would make my life, as a
generic Unix programmer working in Python, much easier were yum's goals a
better match for my own.

Regarding the possibility of using the yum python modules directly: 

  * what documentation exists?
  * what guarantees of API stability exist?
  * what sample code exists and is it sane or contorted?

Also, consider the simplicity of my solution: I accomplished everything I
needed to (repo configuration, priorities, mirrors, downloading, installation,
cross-distro tool, and full error-detection) in something like 40-60 lines of
my code which I was able to write from a single man-page and the online help
for my tool of choice. I'd be interested in knowing whether you can offer me
something comparable. 

Kind regards,

Michael

P.S. - I hope I've successfully communicated that I really am interested in my
continued ability to make use of all the hard work that goes into maintaining
Fedora's packages even though I find myself presently unable to justify doing
so with yum?