[libvirt] [RFC] Simplifying usage of {Enter, Exit}Monitor and {Begin/End}Job

Thu Oct 22 12:52:29 UTC 2015

On Thu, Oct 22, 2015 at 01:47:28PM +0200, Jiri Denemark wrote:
> Hi all.
> 
> Looking at the qemu driver code, we have a lot of code with the
> following pattern:
> 
>     if (doSomething() < 0)
>         goto cleanup;
> 
>     if (qemuDomainObjBeginJob() < 0)
>         goto cleanup;
> 
>     if (doSomethingElse() < 0)
>         goto endjob;
> 
>     qemuDomainObjEnterMonitor();
> 
>     if (qemuMonitorSomething() < 0) {
>         qemuDomainObjExitMonitor();
>         goto endjob;
>     }
> 
>     if (qemuMonitorSomethingElse() < 0) {
>         qemuDomainObjExitMonitor();
>         goto endjob;
>     }
> 
>     if (qemuDomainObjExitMonitor() < 0)
>         goto endjob;
> 
>     ...
> 
>     ret = 0;
> 
>  endjob:
>     qemuDomainObjEndJob();
> 
>  cleanup:
>     ...
>     return ret;
> 
> Sometimes qemuDomainObjExitMonitor is in its own label instead of being
> explicitly called on every single error path. Sometimes
> qemuDomainObjEndJob is called explicitly followed by goto cleanup. In
> either case, it's pretty easy to get this wrong. It's too easy to jump
> to a wrong label (especially when moving code around) or forget to call
> the appropriate exit function before jumping to a label.
> 
> So what if we make the code more robust by changing who needs to track
> whether we are in a monitor or have a job. Now it's the caller that
> tracks it while I think we could teach the job/monitor APIs themselves
> to track the state:
> 
>     bool inJob = false;
>     bool inMonitor = false;
> 
>     if (doSomething() < 0)
>         goto cleanup;
> 
>     if (qemuDomainObjBeginJob(..., &inJob) < 0)
>         goto cleanup;
> 
>     if (doSomethingElse() < 0)
>         goto cleanup;
> 
>     qemuDomainObjEnterMonitor(..., &inMonitor);
> 
>     if (qemuMonitorSomething() < 0)
>         goto cleanup;
> 
>     if (qemuMonitorSomethingElse() < 0)
>         goto cleanup;
> 
>     if (qemuDomainObjExitMonitor(..., &inMonitor) < 0)
>         goto cleanup;
> 
>     ...
> 
>     ret = 0;
> 
>  cleanup:
>     if (qemuDomainObjExitMonitor(..., &inMonitor) < 0)
>         ret = -1;
>     qemuDomainObjEndJob(..., &inJob);
>     ...
>     return ret;
> 
> 
> It's not a lot shorter or longer but it saves us from jumping to
> different labels and it makes sure we always exit the monitor and end
> the job.
> 
> So is it worth the effort or do you thing it's even worse than before or
> do you have any other ideas?

On a related topic, we don't have great error reporting in the (usually
unlikely) scenario that we get a stuck job / timeout. I've long thought
it could be desirable to record some metadata when we start jobs, such
as the __FUNC__ of the method which started the job, so when we report
an error we can include that info as a diagnostic aid. This would
have to be against the qemuDomainObjPrivPtr struct. THis makes me
think that using the separate bool inJob/inMonitor stack variables
is not required.

We could just add

   int threadid;
   bool inJob;
   bool inMonitor;
   const char *jobfunc;

to qemuDomainObjPrivPtr. That way you don't need to modify the
Enter/Exit functions to add extra arguments - we just track
everything internally. When exiting, we'd compare against the
threadid, to make sure we don't accidentally relaase a different
thread's job.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|