[Rdo-list] [OFI] Astapor, Foreman, Staypuft interaction

Mon May 12 13:30:31 UTC 2014

On 12/05/14 14:08 +0100, Martyn Taylor wrote:
>All,
>
>I had some discussion about HA orchestration this morning 
>with Petr Chalupa, particularly around the HA Controller node 
>deployment.  This role behaves slightly differently from the 
>other roles in a Staypuft deployment in that it requires more than one 
>puppet run to complete.
>
>Up to now we have worked on the assumption that once we have received 
>a successful puppet run report in foreman, then the node associated 
>with the role is configured and ready to go.  We use this for 
>scheduling the next list of nodes in a given deployment.
>
>We do have a workaround for the HA Controller issue described above in 
>the astapor modules.  Blocking is implemented in the subsequent puppet 
>modules that depend on the HA Controller services.  This means 
>that any dependent modules will wait until the controller completes 
>before proceeding.  This results in the following behaviour.
>
>Sequence:
>- Controller nodes are provisioned.
>- First controller puppet run reports success.
>- LVM block storage is provisioned.
>- Controller node puppet run 2 completes.
>- LVM block storage puppet run completes.
>
>In this case, the LVM block storage is provisioned before the 
>controllers are complete, but will block until the Controller puppet 
>run 2 completes.
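
A minimal sketch of the blocking behaviour described above, in Python
purely for illustration (the astapor modules implement this in puppet,
and their actual mechanism is not shown in this thread).  The name
wait_until_ready and the is_ready callback are hypothetical; the
callback stands in for whatever check a dependent node uses to see
that the controller services are up.

```python
import time
from typing import Callable


def wait_until_ready(is_ready: Callable[[], bool],
                     timeout: float = 600.0,
                     interval: float = 5.0,
                     sleep: Callable[[float], None] = time.sleep,
                     clock: Callable[[], float] = time.monotonic) -> bool:
    """Poll is_ready() until it returns True or `timeout` seconds pass.

    Returns True if the dependency became ready, False on timeout.
    `sleep` and `clock` are injectable so the loop can be tested
    without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A dependent role (LVM block storage in the sequence above) would call
this before configuring anything that needs the controller, which is
why its puppet run finishes only after controller run 2.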
>
>This workaround is sufficient for the time being.  But really what we 
>would like is to have Staypuft orchestrate the whole process, rather 
>than having it orchestrated partly by the puppet modules and partly by 
>Staypuft.
>
>The difficulty we have right now in Staypuft is that (without knowing 
>the specific implementation details of the puppet modules) there is 
>no clear way to detect whether a node with role X is complete, and so 
>whether we can schedule the next roles in the sequence.
>
>What we need here is a clear interface for determining the status of 
>a puppet class and/or HostGroup for the Astapor modules.
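
One way to picture such an interface, as a rough sketch only: the
orchestrator asks a single question per host ("is it ready?") and
advances role by role.  Everything here is hypothetical, including the
function name first_incomplete_role and the host_ready callback; it is
not Staypuft or Foreman code.

```python
from typing import Callable, Dict, List, Optional


def first_incomplete_role(sequence: List[str],
                          hosts_by_role: Dict[str, List[str]],
                          host_ready: Callable[[str], bool]) -> Optional[str]:
    """Return the first role in `sequence` that is not yet complete.

    A role counts as complete only when it has at least one host and
    every one of its hosts reports ready.  Returns None when all roles
    in the sequence are complete.
    """
    for role in sequence:
        hosts = hosts_by_role.get(role, [])
        if not hosts or not all(host_ready(h) for h in hosts):
            return role
    return None
```

The point of the design is that host_ready hides how readiness is
determined, so the scheduler does not depend on class-specific details
of the puppet modules.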
>
>I have 2 questions around this,
>
>1.  Is there currently any way to consistently detect the 
>status of a role/list of classes within Foreman for Astapor classes 
>that we can utilize?
>   - If so, can we do this without knowing the implementation details 
>of the Astapor puppet modules?  (We do not want to, for example, look 
>for class-specific facts in foreman, since these vary between classes 
>and may change in Astapor.)
>
>2.  If not, is it possible to add something to the puppet modules 
>to explicitly show that a class/Hostgroup is complete?  I am thinking 
>something along the lines of reporting a "Ready" flag back to foreman.
>
I'll have to think about it more, but we already have a similar fact
in quickstack that we use to determine whether ha-mysql is ready, so
we can decide whether to do certain other steps.  Crag had some
concern that we were seeing odd behavior with puppet agent running as
a service, though; I'm not sure whether he and Petr looked at it
Friday.  In case they did not, his theory was that the puppet facts
from the node were not getting updated correctly between agent runs
when the agent was running as a service.  It seemed that the node was
reporting in and the next run still did not have the new value for the
fact (so in this case, the second run should show ha_mysql_ready=true
or similar).  The fact was correct when puppet agent was run in the
foreground for each run, so I believe the thought was that when the
agent ran as a service, facts were being cached and not updated.  I am
unsure whether this has been either proved or disproved yet; just
mentioning it in case it is a real issue.

Anyway, if that were _not_ an issue, it would be simple enough to add
a controller_ready fact or similar to quickstack.  I am still not sure
this is the best approach, but it is definitely feasible: we have
all the information available to us to report back such a thing.
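
For illustration only, here is a rough sketch of what such a fact
could look like as a Facter external fact, i.e. an executable placed
in the facts.d directory that prints key=value lines.  It is written
in Python rather than as a Ruby custom fact, it is not the existing
quickstack fact, and the marker file path is a made-up assumption
standing in for whatever check actually signals that the controller
is done.

```python
#!/usr/bin/env python
import os
import sys

# Hypothetical marker file; assumed to be created by the final
# controller puppet run.  Not a real quickstack path.
MARKER = "/var/lib/quickstack/controller_ready"


def controller_ready_fact(marker_path=MARKER):
    """Return the fact as a 'key=value' line, the format Facter
    expects on stdout from an executable external fact."""
    ready = os.path.exists(marker_path)
    return "controller_ready=%s" % ("true" if ready else "false")


if __name__ == "__main__":
    sys.stdout.write(controller_ready_fact() + "\n")
```

Whether the value reaches foreman on the very next report would still
depend on the fact caching question above.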

-j
>If none of the above, any other suggestions?
>
>Cheers
>Martyn