[Freeipa-users] any tips or horror stories about automating dynamic enrollment and removal of IPA clients?

Simo Sorce simo at redhat.com
Thu Apr 13 14:25:45 UTC 2017


On Thu, 2017-04-13 at 17:16 +0300, Alexander Bokovoy wrote:
> On Thu, 13 Apr 2017, Simo Sorce wrote:
> >On Thu, 2017-04-13 at 08:05 -0400, Chris Dagdigian wrote:
> >> Hi folks,
> >>
> >> I've got a high performance computing (HPC) use case that will need AD
> >> integration for user identity management. We've got a working IPA server
> >> in AWS that has 1-way trusts going to several remote AD forests and
> >> child domains. Works fine but so far all of the enrolled clients are
> >> largely static/persistent boxes.
> >>
> >> The issue is that the HPC cluster footprint is going to be elastic by
> >> design. We'll likely keep 3-5 nodes in the grid online 24x7 but the vast
> >> majority of the compute node fleet (hundreds of nodes quite likely) will
> >> be fired up on demand as a mixture of spot, RI and hourly-rate EC2
> >> instances. The cluster will automatically shrink in size as well when
> >> needed.
> >>
> >> Trying to think of which method I should use for managing users (mainly
> >> UID and GID values) on the compute fleet:
> >>
> >> [Option 1]  Script the enrollment and de-install actions via existing
> >> hooks we have for running scripts at "first boot" as well as
> >> "pre-termination".  I think this seems technically pretty
> >> straightforward but I'm not sure I really need to stuff our IPA server
> >> with host information for boxes that are considered anonymous and
> >> disposable. We don't care about them really and don't need to implement
> >> RBAC controls on them. I'm also slightly worried that a large-scale
> >> enrollment or uninstall action may bog down the server or, worse,
> >> only partially complete, leaving an HPC grid where jobs flow into a
> >> bad box and die en masse because "user does not exist..."
> >>
> >> [Option 2]  Steal from the HPC ops playbook and minimize network
> >> services that can cause failures. Distribute static files to the worker
> >> fleet --  Bind the 24x7 persistent systems to the IPA server and force
> >> all HPC users to provide a public SSH key. Then use commands like "id
> >> <username>" and getent utilities to dump the username/UID/GID values so
> >> that we can manufacture static /etc/passwd, /etc/shadow and /etc/group
> >> files that can be pushed out to the compute node fleet. The main win
> >> here is that we can maintain consistent IPA-derived
> >> UID/GID/username/group data cluster wide while totally removing the need
> >> for an elastic set of anonymous boxes to be individually enrolled and
> >> removed from IPA all the time.
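The file-generation step of Option 2 could be sketched as below. The assumption is that `getent passwd <user>` has already been run on a persistent, enrolled node to produce standard colon-separated records; the `build_passwd` helper name and the sample users/UIDs are hypothetical:

```python
# Sketch: turn getent-style output captured on an IPA-enrolled node
# into static /etc/passwd content that can be pushed to the compute
# fleet. /etc/group generation would follow the same pattern with
# 4-field 'name:x:gid:members' records.

def build_passwd(records):
    """records: iterable of getent passwd lines 'name:x:uid:gid:gecos:home:shell'.
    Returns /etc/passwd content sorted by numeric UID for stable diffs."""
    parsed = []
    for line in records:
        fields = line.strip().split(":")
        if len(fields) != 7:
            raise ValueError("malformed passwd record: %r" % line)
        parsed.append(fields)
    parsed.sort(key=lambda f: int(f[2]))  # sort by UID
    return "\n".join(":".join(f) for f in parsed) + "\n"

# Example with hypothetical IPA users (UID/GID values made up):
records = [
    "carol:x:120003:120003:Carol:/home/carol:/bin/bash",
    "alice:x:120001:120001:Alice:/home/alice:/bin/bash",
]
print(build_passwd(records))
```

Sorting by UID keeps the generated file deterministic, so the push tooling can diff successive runs and skip unchanged nodes.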
> >>
> >> Right now I'm leaning towards Option #2 but would love to hear
> >> experiences regarding moderate-scale automatic enrollment and removal of
> >> clients!
> >
> >One option could also be to keep a (set of) keytab(s) you can copy onto
> >the elastic hosts and preconfigure their sssd daemon. At boot you copy
> >the keytab onto the host, start sssd, and everything should magically
> >work. They are all basically the same identity, so using the same key
> >for all of them may be acceptable.
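A preconfigured sssd.conf for such interchangeable hosts might look roughly like this fragment (domain, server names, and the shared host name are placeholders; the boot script would be expected to drop the shared keytab at the default /etc/krb5.keytab path before starting sssd):

```ini
# /etc/sssd/sssd.conf -- baked into the machine image (sketch).
[sssd]
services = nss, pam
domains = example.com

[domain/example.com]
id_provider = ipa
auth_provider = ipa
ipa_domain = example.com
# DNS SRV discovery first, explicit server as fallback:
ipa_server = _srv_, ipa.example.com
# Shared identity: every elastic node presents the same host principal.
ipa_hostname = hpc-node.example.com
krb5_keytab = /etc/krb5.keytab
cache_credentials = True
```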
> It would be better to avoid using Kerberos authentication here at all.
> 
> Multiple hosts authenticating with the same key would cause a lot of
> updates to the LDAP entry representing this principal. That is going to
> break replication if this single key is used by multiple hosts against
> multiple IPA masters.

If replication is an issue, we should probably mask those attributes from
replication as well, just as we do for the failed-authentication
attributes.
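For reference, 389-ds replication agreements already support excluding attributes via nsDS5ReplicatedAttributeList, and IPA uses this for the authentication-tracking attributes mentioned above. A rough sketch of what such an exclusion looks like on an agreement entry follows; the agreement DN is a placeholder and the attribute list mirrors what IPA configures by default, so treat the exact names as an assumption:

```ldif
dn: cn=example-agreement,cn=replica,cn=<suffix>,cn=mapping tree,cn=config
changetype: modify
replace: nsDS5ReplicatedAttributeList
nsDS5ReplicatedAttributeList: (objectclass=*) $ EXCLUDE memberof
 idnssoaserial entryusn krblastsuccessfulauth krblastfailedauth
 krbloginfailedcount
```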

Simo.

-- 
Simo Sorce
Sr. Principal Software Engineer
Red Hat, Inc
