[libvirt-users] Hook problem

Wed Jan 30 17:06:16 UTC 2019

On Wed, Jan 30, 2019 at 04:42:47PM +0000, David Gilmour wrote:
> I am trying to use /etc/libvirt/hooks/qemu to control the startup of
> several guests with interdependencies.  The goal is to delay the start
> of guest B until the DNS server on guest A is running.  To accomplish
> this, I wrote a qemu hook script that detects the normal startup of
> guest B and starts a second script in the background to wait until the
> preconditions to start B are fulfilled, then start B using a call to
> the virsh command.
> 
> For this strategy to work, it must handle the case where libvirt has
> chosen guest B as the first guest to attempt to start.  (Although
> renaming the symlinks in /etc/libvirt/qemu/autostart to force starting
> the guests in a particular order might work, I do not want to rely on
> this undocumented behavior).  In the case where libvirt happens to
> attempt to start guest B before it starts guest A, the hook script
> needs to somehow tell libvirt to skip guest B and go on to starting
> the next guest.  Otherwise a deadlock would result as libvirt waited
> for B to start, but B was waiting for A to start.   I have tried to
> handle this by returning failure from the hook script for the initial
> attempt to start B once the background script has been started to
> implement the DNS check and eventually the delayed start of B.
> 
> Unfortunately, I cannot find a way to force libvirt to continue until
> the background script exits.  No combination of background execution,
> nohup, disown, setsid -f, or at seems to detach the process
> sufficiently to "fool" libvirt into acting on the "exit 1" line in the
> qemu script and proceed on to start other guests.  As a result, the
> dependency of B on A deadlocks, and neither guest ever starts.
> 
> Can someone please either find an error in my approach or propose a
> different strategy to implement this customized dependency of the
> startup of one guest on another?

When libvirt runs the hooks, it will capture stdout/stderr of
the script. It will expect to see EOF from stdout/stderr before
it will allow execution to continue. It will wait for EOF, even
if the hook script itself has exited. It only checks for exit
once the EOF is seen, which leads me to the bug......

> Here is my qemu script:
> 
> #!/bin/bash
> if [[ "$2" == 'start' ]]; then
>         echo "$0: Starting $1..." |& logger
>         if [[ "$1" == 'B' ]]; then
>             # The next line is where the background script is invoked
>                 /bin/bash /usr/local/bin/startB &

.... startB is inheriting stdout + stderr of the hook script and so
will keep them open, even after the hook script exits. Thus libvirtd
still considers the script active.

Your startB script needs to explicitly close its stdout + stderr, and/or
replace them with /dev/null, or with a logfile if you want to see any
debug messages.
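
The same thing can also be done from the hook side, by redirecting the
descriptors when the background script is launched. A minimal sketch,
untested against libvirtd itself: the helper name spawn_detached and the
HOOK_CHILD_LOG variable are made up for illustration and are not part of
the original hook.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original hook): run a command in the
# background with stdin, stdout and stderr detached from the caller's, so the
# child does not hold open the pipes that the hook's caller is reading.
spawn_detached() {
    local log="${HOOK_CHILD_LOG:-/dev/null}"   # assumed variable name
    "$@" </dev/null >>"$log" 2>&1 &
}

# In the qemu hook, $1 is the guest name and $2 the operation.
if [[ "${2:-}" == 'start' && "${1:-}" == 'B' ]]; then
    spawn_detached /bin/bash /usr/local/bin/startB
    exit 1
fi
```

With the descriptors redirected, nothing started by startB should keep the
hook's stdout/stderr open once the hook itself has exited.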

>             # These also don't work:
>             # (/bin/bash /usr/local/bin/startB) & ; disown
>             # setsid -f (/bin/bash /usr/local/bin/startB) & ; disown
>             # Unfortunately, the exit in the following line doesn't force libvirt to move on to the next guest to start until the background command has itself exited
>                 exit 1;
>         fi
> fi
> 
> Here is the startB script, including a call to a program named in the $dnssuccess variable that does the testing of DNS availability on guest A:
> 
> #!/bin/bash
> until $dnssuccess
> do
>     echo "$0: Delaying start of guest B 10 seconds" |& logger
>     sleep 10;
> done
> # It's now OK to start guest
> echo "$0: Now starting guest B" |& logger
> virsh start B;
> 
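
If you would rather fix this inside startB, the redirection can be the
first thing the script does, before the polling loop. A rough sketch,
assuming a helper called detach_stdio and a log path of your choosing
(both hypothetical, neither is in your script):

```shell
#!/bin/bash
# Hypothetical helper for the top of startB (name and log path are
# assumptions, not part of the original script).
detach_stdio() {
    # Point stdin at /dev/null and append stdout + stderr to the given file
    # (default /dev/null), releasing the descriptors inherited from the hook.
    exec </dev/null >>"${1:-/dev/null}" 2>&1
}

# In startB this would be called first, before the DNS polling loop, e.g.:
#   detach_stdio /var/log/startB.log
```

Because exec with only redirections applies to the running shell, the
loop, the logger pipelines and the final virsh call would all inherit the
new descriptors rather than the ones libvirtd is waiting on.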

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|