Is it possible that "virsh destroy" does not stop a domain?

Peter Crowther peter.crowther at melandra.com
Wed Oct 7 17:26:53 UTC 2020


Bernd, another option would be a mismatch between the message that "virsh
destroy" issues and the message that force_stop() in the Pacemaker agent
expects to receive.  The agent decides whether the destroy succeeded by
matching patterns against the concatenation of the exit code and the
lower-cased text output by virsh.  If either of those has changed between
virsh versions, or if virsh destroy exits with a non-zero status and a
message the agent does not recognise, then you'll get that OCF error.
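
To make that concrete, here is a rough sketch of the matching, pulled out
into a standalone function so you can feed it strings by hand.  The name
classify_destroy and the sample messages are my own inventions for
illustration; they are not part of the resource agent and not verbatim
virsh output.  Only the case patterns are copied from the force_stop()
you quoted.

```shell
#!/bin/sh
# classify_destroy: hypothetical helper, NOT part of the resource agent.
# It reuses the case patterns from the quoted force_stop() and prints
# which branch a given (exit status, output) pair would take.
classify_destroy() {
    ex=$1     # exit status that virsh would have returned
    out=$2    # text that virsh would have printed
    translate=$(echo "$out" | tr 'A-Z' 'a-z')
    case $ex$translate in
        *"error:"*"domain is not running"*|*"error:"*"domain not found"*|\
        *"error:"*"failed to get domain"*)
            echo "already-stopped" ;;   # treated as success by the agent
        [!0]*)
            echo "failed" ;;            # agent returns OCF_ERR_GENERIC
        0*)
            echo "destroyed" ;;         # agent then polls the domain status
    esac
}

# Sample inputs (assumed message texts, for illustration only).
classify_destroy 0 "Domain vm1 destroyed"
classify_destroy 1 "error: Requested operation is not valid: domain is not running"
classify_destroy 1 "error: Failed to terminate process 1234 with SIGKILL"
```

Any non-zero exit whose text is not one of the three known "already
gone" messages lands in the "failed" branch, which is the path that ends
in the OCF error and the fencing you are seeing.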

Do you know what $VIRSH_OPTIONS ends up as in your Pacemaker config,
particularly whether --graceful is specified?

Cheers,

- Peter

On Wed, 7 Oct 2020 at 18:13, Lentes, Bernd <
bernd.lentes at helmholtz-muenchen.de> wrote:

> Hi,
>
> Is it possible that "virsh destroy" does not stop a domain?
> I'm asking because I have some domains running in a two-node HA cluster
> (Pacemaker).
> Sometimes one node gets fenced (killed) because it couldn't stop a
> domain.
> That's very ugly.
>
> This is also the reason why I asked before what "virsh destroy" really
> does.
> IIRC a kill -9 can't terminate a process which is in "D" state
> (uninterruptible sleep).
> So if the domain's process is in "D" state, it can't be killed.
> Right?
>
> Pacemaker tries to shut down or destroy a domain with a resource agent,
> which is a shell script, similar to an init script.
>
> Here is an excerpt from the resource agent for virtual domains:
>
> force_stop()
> {
>         local out ex translate
>         local status=0
>
>         ocf_log info "Issuing forced shutdown (destroy) request for domain ${DOMAIN_NAME}."
>         # the domain is destroyed here
>         out=$(LANG=C virsh $VIRSH_OPTIONS destroy ${DOMAIN_NAME} 2>&1)
>         ex=$?
>         translate=$(echo $out|tr 'A-Z' 'a-z')
>         echo >&2 "$translate"
>         case $ex$translate in
>                 *"error:"*"domain is not running"*|*"error:"*"domain not found"*|\
>                 *"error:"*"failed to get domain"*)
>                         : ;; # unexpected path to the intended outcome, all is well (success)
>                 [!0]*)
>                         # <== failure of destroy seems to be possible
>                         ocf_exit_reason "forced stop failed"
>                         return $OCF_ERR_GENERIC ;;
>                 0*)
>                         while [ $status != $OCF_NOT_RUNNING ]; do
>                                 VirtualDomain_status
>                                 status=$?
>                         done ;;
>         esac
>         return $OCF_SUCCESS
> }
>
> The function force_stop is responsible for stopping/destroying the domain,
> and it handles the case where "virsh destroy" fails.
> Is there a developer who can explain what "virsh destroy" really does?
> Or is there another ML for the developers?
>
> Bernd
>
> --
>
> Bernd Lentes
> Systemadministration
> Institute for Metabolism and Cell Death (MCD)
> Building 25 - office 122
> HelmholtzZentrum München
> bernd.lentes at helmholtz-muenchen.de
> phone: +49 89 3187 1241
> phone: +49 89 3187 3827
> fax: +49 89 3187 2294
> http://www.helmholtz-muenchen.de/mcd
>
> stay healthy
> Helmholtz Zentrum München