[libvirt] question about job handling across migration protocol phases

Jiri Denemark jdenemar at redhat.com
Tue Aug 28 11:14:16 UTC 2018


On Fri, Aug 24, 2018 at 14:40:08 -0600, Jim Fehlig wrote:
> While investigating a bug [1] found by Xen's osstest I realized I don't quite 
> understand how to handle modify jobs (e.g. BeginJob/EndJob) on virDomainObj 
> across the various phases of V3 migration protocol. E.g. on the src host the 
> Begin, Perform, and Confirm phases are performed. Should a modify job start 
> (BeginJob) in the Begin phase and stop (EndJob) in the Confirm phase? Or should 
> each phase, if necessary, do BeginJob/EndJob? Same question for dst host. IMO 
> the job should be held across the phases on each host, preventing any 
> modifications during the overall migration process.

Right, the first phase (Begin on the source and Prepare on the
destination) should acquire the job and it should be held until the end
of the migration (Confirm/Finish) to make sure nothing changes during
the migration. In the QEMU driver, we have several helpers around the
generic job APIs:

    qemuMigrationJobStart
        - used at the beginning of migration

    qemuMigrationJobSetPhase
        - called at the beginning of each migration phase except the
          first one (the first one calls qemuMigrationJobStart)

    qemuMigrationJobContinue
        - called at the end of each phase except for the last one (which
          calls qemuMigrationJobFinish)

    qemuMigrationJobFinish
        - called at the end of migration
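
As a rough illustration of the call order described above, here is a toy
model of the source side (Begin, Perform, Confirm). It is only a sketch and
not the QEMU driver's code: every toy_* name, the struct, and the phase enum
are hypothetical, and the real helpers have different signatures and do much
more. The only point is that the job acquired in the first phase stays held
between the API calls, so a competing modification is rejected until the
last phase releases it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these types or functions exist in libvirt. */
enum toy_phase {
    TOY_PHASE_NONE,
    TOY_PHASE_BEGIN,
    TOY_PHASE_PERFORM,
    TOY_PHASE_CONFIRM
};

struct toy_domain {
    bool job_active;       /* a migration job is currently held */
    enum toy_phase phase;  /* last phase recorded on that job */
};

/* First phase: acquire the job; fails if some job is already held. */
int toy_job_start(struct toy_domain *d)
{
    if (d->job_active)
        return -1;
    d->job_active = true;
    d->phase = TOY_PHASE_BEGIN;
    return 0;
}

/* Start of every later phase: record the new phase on the held job. */
void toy_job_set_phase(struct toy_domain *d, enum toy_phase p)
{
    assert(d->job_active);   /* the job must have survived the last phase */
    assert(p > d->phase);    /* phases only move forward */
    d->phase = p;
}

/* End of every phase but the last: return to the caller, job still held. */
void toy_job_continue(struct toy_domain *d)
{
    assert(d->job_active);
}

/* End of the last phase: release the job. */
void toy_job_finish(struct toy_domain *d)
{
    assert(d->job_active);
    d->job_active = false;
    d->phase = TOY_PHASE_NONE;
}

/* A competing modify request: rejected while the migration job is held. */
int toy_try_modify(const struct toy_domain *d)
{
    return d->job_active ? -1 : 0;
}

/* Source side of a migration; counts modify attempts rejected in between. */
int toy_source_migration(struct toy_domain *d, int *rejected)
{
    *rejected = 0;

    /* Begin phase */
    if (toy_job_start(d) < 0)
        return -1;
    toy_job_continue(d);
    if (toy_try_modify(d) < 0)
        (*rejected)++;

    /* Perform phase */
    toy_job_set_phase(d, TOY_PHASE_PERFORM);
    toy_job_continue(d);
    if (toy_try_modify(d) < 0)
        (*rejected)++;

    /* Confirm phase */
    toy_job_set_phase(d, TOY_PHASE_CONFIRM);
    toy_job_finish(d);
    return 0;
}
```

In this model, both modify attempts made between phases are rejected, and a
modify attempt after the Confirm phase succeeds because the job has been
released.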

> Although I do worry about orphaned jobs, e.g. a missed EndJob caused
> by some obscure error in the migration machinery.

Well, this could happen even if the job was acquired by each step
separately. I don't think that spanning the job over several APIs makes
the situation significantly worse.

Jirka



