[libvirt] libvirt disk labels and QEMU drive host alias generation

John Ferlan jferlan at redhat.com
Tue Jun 14 12:30:50 UTC 2016



On 06/14/2016 06:50 AM, Kashyap Chamarthy wrote:
> Summary
> -------
> 
> It seems like libvirt is generating the same QEMU drive host
> alias ('drive-virtio-disk%d') for different two disk labels (vdb, vdb1).
> 
> [Refer the contextual root cause analysis discussion at the end with
> Laine & Peter.]
> 
> Let's take a quick example to demonstrate the issue.
> 
> On a guest that is shut down, attach a couplee of disks with labels
> 'vdb', and 'vdb1'
> 
>     $ sudo virsh attach-disk cvm1 \
>          /export/vmimages/1.raw vdb --config
> 
>     $ sudo virsh attach-disk cvm1 \
>          /export/vmimages/1.raw vdb1 --config
>  
>     $ virsh domblklist cvm1
>     Target     Source
>     ------------------------------------------------
>     vda        /var/lib/libvirt/images/cirros-0.3.3-x86_64-disk.img
>     vdb        /export/vmimages/1.raw
>     vdb1       /export/vmimages/1.raw
> 
> Which results in guest XML:
> 
> 	[...]
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='raw'/>
>       <source file='/export/vmimages/1.raw'/>
>       <target dev='vdb' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
>     </disk>
>     <disk type='file' device='disk'>
>       <driver name='qemu' type='raw'/>
>       <source file='/export/vmimages/1.raw'/>
>       <target dev='vdb1' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
>     </disk>
> 	[...]
> 
> Sarting the guest (error message manually wrapped) fails, as 
> 
>     $ virsh start cvm1
>     error: Failed to start domain cvm1
>     error: internal error: process exited while connecting to monitor:
>     qemu-system-x86_64: -drive
>     file=/export/vmimages/1.raw,format=raw,if=none,id=drive-virtio-disk1: Duplicate
>     ID 'drive-virtio-disk1' for drive
> 
> It results in QEMU CLI:
> 
> 	-drive file=/export/vmimages/1.raw,format=raw,if=none,id=drive-virtio-disk1 \
>  	-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk1,id=virtio-disk1 \
> 	-drive file=/export/vmimages/1.raw,format=raw,if=none,id=drive-virtio-disk1 \
> 	-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x8,drive=drive-virtio-disk1,id=virtio-disk1 \
> 
> 
> Root cause analysis discussion from IRC from Friday
> ---------------------------------------------------
> 
> [laine] 
> 
>     The problem is that the "alias" libvirt generates based on the target
>     device name should be unique, but it isn't.
>     
>     I can see why it's doing that. When putting the index of the disk it's
>     processing within the entries into drive-virtio-disk%d, it learns the
>     index by calling virDiskNameToIndex().
>     
>     It's going through a list of disks, 0, 1, 2, 3, etc creating the
>     alias.  But at each index, it calls this other function to learn the
>     index, and that function just searches for the first entry that matches
>     the name.
>     
>     If they *really* don't know the index (because they only have a
>     pointer to the entry) they should just search for the entry with the
>     matching *address in memory*
> 
>     Haven't stepped through it carefully, but it looks like "vdb" and
>     "vdb1" lead to the same id, and also "vdb" and "sdb"
> 
> [pkrempa] 
> 
>     The alias generator calls virDiskNameToIndex() which calls
>     virDiskNameParse().  virDiskNameParse() parses the name including
>     the partition number and returns it.  virDiskNameToIndex() discards
>     the partition number and returns the disk index.
> 
>     If we accept both 'sdb' and 'sdb1' but generate the same alias then
>     that's a bug.
> 

oh - this brings back some shorter term bad memories.

This all gets even more complicated with <hostdev>'s. Try to follow both
virDomainDiskDefAssignAddress and virDomainHostdevAssignAddress
especially when the <address> is not supplied.

I do agree with Dan - there should have been some failure on the second
command since the source file is duplicated. It seems the only check is
for duplicated target dev (see virDomainDiskDefCheckDuplicateInfo). A
quick scan finds no disk->src->path comparison for FILE type - not sure
a vanilla comparison would work for all "src->type"'s...

Beyond that - if one just "reads" the comments for virDiskNameToIndex
(and virDiskNameParse) they may assume that the code is attempting to
"serially assign an address" based on the target name. Thus "vdb" would
equate to the "2nd" disk while "vdb1" would equate to the "27th" disk.
However, that would only be "true" if the "partition" parameter was
supplied (e.g. a disk partition). If it's not (which is the more general
case), then the partition number is "dropped":

   /* Count the trailing digits.  */
    size_t n_digits = strspn(ptr, "0123456789");
    if (ptr[n_digits] != '\0')
        return -1;

    *disk = idx;

If the "digit" mattered, then the "n_digits" would have been used to
alter "idx" by "idx += *ptr - '0'; similar to how the 'a' math was done.
The order of the code perhaps tricks the eyes into believing that the
idx was adjusted.

In any case, for a non "disk partition" case, the second target perhaps
should have been "sdbb" and not "sdb1" (look at tests/utiltest.c).

Furthermore, maybe the "solution" for this is the XML catching that the
supplied target has a numerically ending value, but does that work for
all cases? If one looks at :

tests/qemuxml2argvdata/qemuxml2argv-pci-many.xml

one will see usage of "vdaa", "vdab", etc. and not "vda1", "vda2", etc.
I did not dig for other tests for many disks per controller.

Still even with all that "resolved", there is another issue. There's no
code currently that makes the "post parse" check if there was some disk
or hostdev somehow assigned the same address. I believe this is
something determined during review of the series:

http://www.redhat.com/archives/libvir-list/2015-July/msg00870.html

which went into August, but perhaps most relevant is patch 12:

http://www.redhat.com/archives/libvir-list/2015-August/msg00058.html

Based on that I started writing a patch that does a very nasty post
parse address duplication check for both "disks" and "hostdevs";
however, I believe I was "waiting for" something that allowed us to call
it before domain start. IOW: Before what now is Peter's domain
configuration validation series (essentially through commit id
'5972f185e').

The problem being that prior Peter's series, the only way we could stop
something with a duplicate address was at startup time. Now with Peter's
series, it would be possible to check for duplicates prior to start and
actually list them before actually attempting the start.

John

Hope this all makes sense - lots of distractions at home this morning...




More information about the libvir-list mailing list