Arbitrary truncating of SCSI disk serial number

Daniel P. Berrangé berrange at redhat.com
Thu Jun 3 14:51:45 UTC 2021


On Thu, Jun 03, 2021 at 04:37:49PM +0200, Peter Krempa wrote:
> Hi,
> 
> recently I've got a report that an upgrade of libvirt (and qemu) caused
> a guest-visible change in the SCSI disk identification when a very long
> serial number is used.
> 
> I've traced it back to the point where libvirt started to use the
> 'device_id=' property of the SCSI disk to pass in the serial if one is
> configured, and the alias of the disk if it is not.
> 
> https://gitlab.com/libvirt/libvirt/-/commit/a1dce96236f6d35167924fa7e6a70f58f394b23c
> 
> The change is caused by the fact that when serial is configured via the
> 'serial=' property it's being silently truncated.
> 
> Now there are two distinct VPD pages which report the serial number:
> 
> 0x83 - device identification
> 
>  This one used to report only the device alias in the beginning but
>  starting from qemu commit:
> 
>    commit fd9307912d0a2ffa0310f9e20935d96d5af0a1ca
>    Author: Paolo Bonzini <pbonzini at redhat.com>
>    Date:   Fri Mar 16 19:12:43 2012 +0100
> 
>        scsi: copy serial number into VPD page 0x83
> 
>        Currently QEMU passes the qdev device id to the guest in an ASCII-string
>        designator in page 0x83.  While this is fine, it does not match what
>        real hardware does; usually the ASCII-string designator there hosts
>        another copy of the serial number (there can be other designators,
>        for example with a world-wide name).  Do the same for QEMU SCSI
>        disks.
> 
>        ATAPI does not support VPD pages, so it does not matter there.
>        Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
> 
>  it reports the serial number instead of the device alias when the
>  serial is configured. Now this historically copied the IDE(?) limit of
>  20 characters.
> 
>  Now with the change to use 'device_id', which overrides this behavior,
>  the length of the reported value is limited to the technical limit of
>  255-8, which creates the problem.
> 
>  Libvirt uses 'device_id' because when -blockdev is used the disk alias
>  which was configured via the -drive is no longer configured and thus
>  would be missing.
> 
>  Libvirt also (unfortunately in this case I'd say) started to pass the
>  serial number via this property.
> 
> 0x80 - device serial (optional, only when serial is configured)
> 
>  This one started similarly to the 0x83 page to report the serial
>  truncated to 20, but later in commit:
> 
>    commit 48b6206305b8d56524ac2ee347b68e6e0a528559
>    Author: Rony Weng <ronyweng at synology.com>
>    Date:   Mon Aug 29 15:52:18 2016 +0800
> 
>        scsi-disk: change disk serial length from 20 to 36
> 
>        Openstack Cinder assigns volume a 36 characters uuid as serial.
>        QEMU will shrinks the uuid to 20 characters, which does not match
>        the original uuid.
> 
>        Note that there is no limit to the length of the serial number in
>        the SCSI spec.  20 was copy-pasted from virtio-blk which in turn was
>        copy-pasted from ATA; 36 is even more arbitrary.  However, bumping it
>        up too much might cause issues (e.g. 252 seems to make sense because
>        then the maximum amount of returned data is 256; but who knows there's
>        no off-by-one somewhere for such a nicely rounded number).
> 
>        Signed-off-by: Rony Weng <ronyweng at synology.com>
>        Message-Id: <1472457138-23386-1-git-send-email-ronyweng at synology.com>
>        Cc: qemu-stable at nongnu.org
>        Signed-off-by: Paolo Bonzini <pbonzini at redhat.com>
> 
>   qemu actually changed the silent truncation to another arbitrary value
>   of 36, which is the length of a UUID.
> 
>   Thus qemu isn't innocent either in this regard.
> 
> Now, since the above-mentioned libvirt commit is contained in
> libvirt-5.1 (and qemu-4.0 adds support for 'device_id'), reverting to
> truncation to 20 characters would IMO also be considered a regression,
> given that there are users who changed qemu to lessen the truncation.
> 
> As such, I don't think libvirt should revert to using the truncated
> serial, despite the ABI change.
> 
> On the other hand QEMU should IMO:
> 1) unify the truncation to a single length; preferably the technical
>    limit
> 2) add the possibility to report an error when the serial is too long
>    (libvirt can accept a new property for example)
> 
> I'm open to other suggestions though.

Feels like we're essentially doomed in every scenario from an ABI compat
POV.  The best we can do I think is to document in libvirt what the
various limits are. Then say if you provide a value below the limit,
ABI stability is ensured, but if you go above the documented limit,
behaviour is undefined.
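
As a rough illustration of what such documentation could boil down to, the
sketch below records the limits quoted earlier in this thread (20, 36 and
255-8 characters) and checks whether a serial stays below all of them. The
names are made up for this example; they are not libvirt or QEMU APIs, and
treating the smallest value as "the documented limit" is an assumption.

```python
# Limits quoted in this thread; the keys are illustrative labels only.
SERIAL_LIMITS = {
    "historical limit (copied from ATA / virtio-blk)": 20,
    "VPD 0x80 since the 2016 scsi-disk change": 36,
    "VPD 0x83 when passed via device_id": 255 - 8,
}


def abi_stable_limit(limits=SERIAL_LIMITS):
    """Largest length that none of the listed truncations would touch."""
    return min(limits.values())


def is_abi_stable(serial: str) -> bool:
    """True if no limit listed above would ever shorten this serial."""
    return len(serial) <= abi_stable_limit()
```

Under these assumptions a 20 character serial is safe everywhere, while a
36 character UUID is reported differently depending on version and page.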

I agree that reporting an error in QEMU is more desirable than silently
truncating. It would have shown the mgmt app that it was supplying a value
that was too large, and the app could then have truncated it itself. Then,
when QEMU later raised the limit, it would not have been an ABI regression.
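
To make the difference concrete, here is a minimal sketch contrasting the
two behaviours. It is not QEMU code: the function names, the exception and
the example serial are hypothetical, and only the limits 20 and 36 come
from the thread.

```python
class SerialTooLongError(ValueError):
    """Raised instead of silently shortening an over-long serial."""


def truncate_silently(serial: str, limit: int) -> str:
    # Behaviour described in the thread: excess characters are dropped.
    return serial[:limit]


def check_serial(serial: str, limit: int) -> str:
    # Alternative discussed here: refuse the value, so the caller has to
    # choose (and keep using) its own shortened serial.
    if len(serial) > limit:
        raise SerialTooLongError(
            f"serial is {len(serial)} characters, limit is {limit}"
        )
    return serial


uuid_serial = "4a5b6c7d-1111-2222-3333-444455556666"  # 36 characters

# Silent truncation: raising the limit from 20 to 36 changes what the
# guest sees for the very same configuration.
before = truncate_silently(uuid_serial, 20)
after = truncate_silently(uuid_serial, 36)

# Error reporting: the app truncated to 20 itself, so a later limit of 36
# leaves the guest-visible value unchanged.
app_serial = uuid_serial[:20]
stable = check_serial(app_serial, 36)
```

With silent truncation `before` and `after` differ; with the error path the
value the app settled on is passed through untouched by the higher limit.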

QEMU could start off with a deprecation warning for over-long serials,
and turn it into a hard error after 2 releases.
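
A sketch of how such a staged transition could look, written in Python
purely for illustration. The function, its parameters and the release
counter are assumptions for this example and do not correspond to any
existing QEMU interface.

```python
import warnings

HARD_ERROR_AFTER_RELEASES = 2  # "after 2 releases", as suggested above


def validate_serial(serial: str, limit: int,
                    releases_since_deprecation: int) -> str:
    """Warn about over-long serials first, reject them later."""
    if len(serial) <= limit:
        return serial
    if releases_since_deprecation >= HARD_ERROR_AFTER_RELEASES:
        # Deprecation period is over: refuse the configuration.
        raise ValueError(f"serial longer than {limit} characters")
    # Deprecation period: keep the old truncating behaviour, but say so.
    warnings.warn(
        f"serial longer than {limit} characters is deprecated and will "
        "become an error",
        DeprecationWarning,
        stacklevel=2,
    )
    return serial[:limit]
```

During the warning phase existing guests keep starting with the truncated
value; only once the period has elapsed does the same input fail.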

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



