[libvirt] [PATCH 06/10] Make QEMU perform managed save of all VMs on stop of libvirtd
eblake at redhat.com
Thu Nov 29 17:26:43 UTC 2012
> On Tue, Nov 27, 2012 at 03:06:45PM -0500, Eric Blake wrote:
> > How does this interact with the libvirt-guests init script? I
> > guess
> > that if that is installed, it gets to run first before the system
> > libvirtd is shutdown, so the qemu:///system won't see any running
> > guests, and this patch would just affect qemu:///session?
> In theory it can do both qemu:///system and qemu:///session, however,
> the later patches only wire this up for qemu:///session, precisely
> because libvirt-guests already exists.
In other words, right now, there is no interaction, but we may
convert libvirt-guests in the future to use this method instead.
Fair enough, and shouldn't hold up this series.
> > Is managed save always the right thing, or should we allow the user
> > to
> > configure for a graceful shutdown and/or a migration to other host?
> > Do
> > we need timeouts, as managed save may take a while per guest?
> > Similar
> > to how /etc/sysconfig/libvirt-guests can request O_DIRECT to avoid
> > file system pollution, do we need a way for the user to configure
> > the
> > use of that flag? Is doing the same thing for all guests wise, or
> > do
> > we need to allow for the possibility of a per-guest choice of what
> > action
> > to take (maybe via a new <on_host_shutdown> element?) where we only
> > fall
> > back to the configured default if the guest XML didn't request
> > something?
> > Do we correctly and automatically restart all the guests that were
> > saved
> > by this hook the next time libvirtd restarts?
> We don't attempt to automatically restart all guests - we will only
> restart those marked as "autostart", which I think is the right
> thing todo in general.
Well, with 'libvirt-guests', we restarted everything that had been
saved, not just those guests marked autostart. (Or another way
of looking at it: saving a guest on libvirtd shutdown resulted
in the guest behaving as if it had a one-shot autostart). I think
it may be a bit awkward if session behaves differently from
libvirt-guests regarding the resume of guests that are not autostart.
> I think managed save is the only reasonable choice here, because
> we want something that is going to be reliable, and zero-conf
> for the user. Migration requires the user to pick a dest host,
> and is not guaranteed to converge. Graceful shutdown relies on
> a co-operating guest. So IMHO, neither of those are really
> satisfactory options for running on desktop logout / host
I agree that managed save is the only reasonable default, but I
still wonder if we need to allow for non-default configurations,
rather than forcing managed save as the only solution.
> Not sure about O_DIRECT - I'm inclined to say we should just
> *always* use O_DIRECT - unless someone can point out a downside
> with it ?
About the only possible downside I can see is that it _might_ be
slightly slower, as it forces libvirt to take all the I/O through
a pipe before playing it out to the file system; which is why we
exposed the option to request O_DIRECT when I first implemented
it. But later, we switched to always using a pipe anyways, because
current qemu doesn't know how to throttle when writing to regular
files (it only properly does bandwidth throttling on a file
descriptor that can report EAGAIN, such as a pipe), so now that
we are always using a pipe, always using O_DIRECT might make sense
as a default (and make the existing flag for requesting O_DIRECT
become a no-op because we are automatically using direct mode to
begin with). Thoughts? If we make O_DIRECT unconditional, then
we have several followup patches to write.
> Long term, I make no secret of the fact that I want to see
> libvirt-guests die in horribly painful way. This kind of
> functionality should be brought "in house" to libvirtd and
> obviously this new code is a start in that direction.
> > > + virDomainSuspend(domains[i]);
> > Should you be checking for errors here?
> Yes & no. Pausing the guests is just a performance tweak, so if it
> fails it isn't critical. If pausing fails, chances are the
> managed save willl fail too, so we'll likely get an erorr regardless.
> > > + }
> > > +
> > > + ret = 0;
> > > + /* Then we save the VMs to disk */
> > > + for (i = 0 ; i < numDomains ; i++)
> > > + if (virDomainManagedSave(domains[i], flags[i]) < 0)
> > > + ret = -1;
> > What happens if a guest managed to exit itself after the initial
> > virConnectListAllDomains() and this point? The
> > virDomainManagedSave
> > will fail, but that is not an error since a stopped guest has
> > nothing to save after all. Do you need to be checking specific
> > error codes here before treating this as a failure?
> I guess we could handle that - though the caller will ignore all
> the errors anyway :-)
At this point, I think you've answered my questions. I'm not sure
if you made any minor tweaks based on my feedback, but it looks like
this patch is ready to push and any future work can be in followup
patches (since we are already planning for followups related to
More information about the libvir-list