"Stateless Linux" project

Alexander Holt lex at fixedpoint.org
Thu Sep 23 18:31:14 UTC 2004



Havoc Pennington wrote:

> Red Hat engineering is starting a new project we're calling "stateless
> Linux" ...

Good stuff.  The "ideal properties" of an OS (recreating state
bit-for-bit on new hardware, modifying groups of machines in a single
step): absolutely.  Further, I'd say, it should be straightforward to
rapidly modify single machines (or groups) for specialized purposes.

To me, this emphasizes the need for smart configuration management
technology.  Thin/cached/fat clients: largely orthogonal.  Home
directory syncing/backup strategies: almost entirely orthogonal.  What
we want are good ways to keep large numbers of diverse machines
configured as required -- once we have that, implementing specific
policies is mostly secondary.

As the document says, it's enough to be able to reconstruct all client
state from a central location.  John Hearns and Josh England have
already pointed out
that there are substantial mechanisms around for doing a good chunk of
this.  I'm not familiar with oneSIS <http://onesis.sf.net/> beyond
reading the documentation, but can maybe briefly describe how LCFG
<http://www.lcfg.org/> (and to a large extent, Quattor
<http://www.quattor.org/>) do things.

LCFG can be used to control any category of machine.  It's in use at
Edinburgh University's School of Informatics to manage around 1000
nodes, comprising desktops, laptops, servers & cluster machines.  To
each node you add a collection of small init-like scripts, called
components, responsible for controlling daemons and config files.  You
don't need a read-only root.  Components read a per-node "profile"
telling them, eg, what the current mail hub is.  The mail component is
responsible for creating/modifying the local MTA's config file so that
it specifies that mail hub -- and if there's a daemon, (re)starting it
as appropriate.  (Compare with oneSIS where, say, /etc/mail/sendmail.cf
would be linked to the per-node/group/cluster master copy via a RAM disk
-- if I've got it right.)
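
To make that concrete, here's a toy Python sketch of what such a
component boils down to.  It is emphatically not LCFG's real component
API: the profile path, the "mail.relay" resource name, the config file
and the restart command are all invented for illustration.

    #!/usr/bin/env python3
    # Toy sketch of the component idea; not LCFG's real component API.
    # The profile path, the "mail.relay" resource name and the config
    # file / restart command are all invented for illustration.
    import subprocess
    from pathlib import Path

    PROFILE = Path("/var/lib/profile/node.prof")   # hypothetical
    MTA_CONF = Path("/etc/mail/relay.conf")        # hypothetical

    def read_profile(path):
        """Parse the profile as simple 'key value' pairs."""
        pairs = {}
        for line in path.read_text().splitlines():
            if line.strip() and not line.startswith("#"):
                key, _, value = line.partition(" ")
                pairs[key] = value.strip()
        return pairs

    def configure_mail(profile):
        """Regenerate the MTA fragment; restart only if it changed."""
        wanted = "relay_host = %s\n" % profile.get("mail.relay",
                                                   "localhost")
        current = MTA_CONF.read_text() if MTA_CONF.exists() else ""
        if wanted != current:
            MTA_CONF.write_text(wanted)
            subprocess.run(["service", "sendmail", "restart"],
                           check=False)

    if __name__ == "__main__":
        configure_mail(read_profile(PROFILE))

The point is just that the component owns its config file and is
idempotent: run it as often as you like, it only acts when the profile
and reality disagree.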

The per-node profile -- much like a long list of key-value pairs -- is
compiled centrally from a declarative description of what each node
should look like.  This description permits machine grouping and
prototypes via a #include-type mechanism.  So it's easy to make small
per-machine or per-group tweaks.  The same description is used to specify
the complete RPM list for each node (all non-config files on nodes come
from RPMs).  Profiles are distributed to nodes by HTTP(S).  A node gets
a UDP notification when its profile has changed, or can just poll the
config server.  There's also a way to make controlled changes to the
local profile without the need to talk to the config server -- eg, on a
laptop when it moves between networks.  And nodes can be built entirely
automatically from bare metal (but no magical solution to bootstrapping
host keys in, eg, a Kerberos environment).
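
For flavour, an equally toy sketch of the compilation step.  The
source syntax and file names are invented (the real LCFG source
language is much richer), but it shows how #include-style prototypes
plus later overrides flatten into a per-node key-value profile:

    #!/usr/bin/env python3
    # Invented source syntax: '#include <file>' pulls in a prototype,
    # plain 'key value' lines set resources, and later settings win.
    from pathlib import Path

    SRC_DIR = Path("source")          # hypothetical source directory

    def compile_profile(name, seen=None):
        """Flatten one node's source file into a key-value dict."""
        seen = set() if seen is None else seen
        if name in seen:              # crude guard against cycles
            return {}
        seen.add(name)
        profile = {}
        for raw in (SRC_DIR / name).read_text().splitlines():
            line = raw.strip()
            if line.startswith("#include"):
                included = line.split("<", 1)[1].rstrip(">")
                profile.update(compile_profile(included, seen))
            elif line and not line.startswith("#"):
                key, _, value = line.partition(" ")
                profile[key] = value.strip()
        return profile

    if __name__ == "__main__":
        # e.g. node42.src might be just '#include <desktop.src>' plus
        # a one-line override such as 'mail.relay mailhub.example.org'
        for key, value in sorted(compile_profile("node42.src").items()):
            print(key, value)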
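
And a rough sketch of the client side of distribution, combining the
UDP "profile changed" nudge with polling as a fallback.  The URL, port
and poll interval are made up, and the real client is of course rather
more involved:

    #!/usr/bin/env python3
    # Illustration only: listen for a UDP nudge, but re-fetch over
    # HTTP(S) every few hours regardless, so a lost datagram or an
    # unreachable network just delays the update rather than losing it.
    import socket
    import urllib.request

    PROFILE_URL = "https://config.example.org/node42.prof"  # invented
    NOTIFY_PORT = 7777                                       # invented
    POLL_SECONDS = 6 * 60 * 60

    def fetch_profile():
        with urllib.request.urlopen(PROFILE_URL) as resp:
            return resp.read()

    def run():
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
        sock.bind(("", NOTIFY_PORT))
        sock.settimeout(POLL_SECONDS)
        current = b""
        while True:
            try:
                sock.recv(512)        # server says: profile changed
            except socket.timeout:
                pass                  # no news; poll anyway
            try:
                new = fetch_profile()
            except OSError:
                continue              # off the network; try again later
            if new != current:
                current = new
                # here the real thing would run the affected components

    if __name__ == "__main__":
        run()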

Couple of other thoughts:

(*) Mobility -- and disconnected operation -- is an important
    architectural constraint.  (For instance, the current Edinburgh LCFG
    deployment integrates laptops & desktops, and ends up replicating
    DNS & LDAP servers on *all* nodes.)

(*) The central location that stores client state doesn't ultimately
    have to be a central location.  It could be that intended client
    state is assembled from various kinds of config data distributed
    around the net -- perhaps via peer-to-peer or service discovery type
    mechanisms -- leading to improved resilience & autonomy (roughly
    sketched below).  This is just another take on "dynamic/automatic
    configuration", I guess.
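
Purely to illustrate that last point (this corresponds to no existing
tool; the sources and resource names are made up): intended state
could be assembled by merging whatever configuration fragments the
node can reach, in some priority order.

    #!/usr/bin/env python3
    # Made-up illustration: fragments from several sources (a peer
    # cache, a group policy, a local override) merge into one profile;
    # higher priority wins, and unreachable sources just drop out.
    FRAGMENT_SOURCES = [
        (10, "site defaults (peer)", {"mail.relay": "mailhub.example.org"}),
        (20, "laptop group policy",  {"auth.cache": "yes"}),
        (30, "off-site override",    {"mail.relay": "localhost"}),
    ]

    def assemble_state(sources):
        """Merge fragments, lowest priority first, so higher wins."""
        profile = {}
        for _, _, fragment in sorted(sources, key=lambda s: s[0]):
            profile.update(fragment)
        return profile

    if __name__ == "__main__":
        for key, value in sorted(assemble_state(FRAGMENT_SOURCES).items()):
            print(key, value)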

Lex