[Libvir] PATCH: 0/7 Implementation of storage APIs
jim at meyering.net
Wed Oct 31 13:16:26 UTC 2007
"Daniel P. Berrange" <berrange at redhat.com> wrote:
> Since the previous discussions didn't really end up anywhere conclusive
> I decided it would be better to have a crack at getting some working code
> to illustrate my ideas. Thus, the following series of 7 patches provide
It's taken me a while just to digest all of this.
> Some open questions
> - Is it worth bothering with a UUID for individual storage volumes. It
> is possible to construct a globally unique identifier for most volumes,
> by combining various bits of metadata we have such as device unique ID,
> inode, iSCSI target & LUN, etc. There isn't really any UUID that fits
> into the classic libvirt 16 byte UUID. I've implemented (randomly
> generated) UUIDs for the virStorageVolPtr object, but I'm inclined
> to remove them, since it's not much use if they change each time the
> libvirtd daemon is restarted.
> The 'name' field provides a unique identifier scoped to the storage
> pool. I think we could add a 'char *key' field, as an arbitrary opaque
> string, forming a globally unique identifier for the volume. This
> would serve the same purpose as UUID, but without the 16 byte constraint
> which we can't usefully provide.
That sounds good. And with it being opaque, no one will
be tempted (or able) to rely on it.
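
To make sure I'm reading the proposal right, here is a rough sketch of
a name/key pair. Everything in it is made up for illustration (the
struct, the function names, the "file:" key layout); none of it comes
from the patches. It assumes a file-backed volume whose key is derived
from its device and inode numbers, and that clients only ever compare
keys for equality.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical volume record; not the actual libvirt structure. */
struct vol_sketch {
    char *name;  /* unique only within its storage pool */
    char *key;   /* opaque string, intended to be globally unique */
};

/* Build a key for a file-backed volume from its device and inode numbers.
 * The "file:" prefix and the layout are illustrative only; callers must
 * not parse the result. Returns a malloc'd string, or NULL on failure. */
static char *vol_key_from_file(unsigned long long dev, unsigned long long ino)
{
    int len = snprintf(NULL, 0, "file:%llx:%llx", dev, ino);
    if (len < 0)
        return NULL;

    char *key = malloc((size_t)len + 1);
    if (key == NULL)
        return NULL;

    snprintf(key, (size_t)len + 1, "file:%llx:%llx", dev, ino);
    return key;
}

/* The only operation a client should perform on keys: equality. */
static int vol_key_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

Unlike a random UUID, a key built this way stays the same across
daemon restarts as long as the underlying metadata does.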
> - For the local directory backend, I've got the ability to choose
> between file formats on a per-volume basis. eg, /var/lib/xen/images can
> contain a mix of raw, qcow, vmdk, etc files. This is nice & flexible
> for the user, but a more complex thing to implement, since it means
> we have to probe each volume and try & figure out its format each
Have there been requests for this feature?
The probe-and-recognize part doesn't sound too hard, but if
a large majority of use cases have homogeneous volumes-per-pool,
then for starters at least, maybe we can avoid the complexity.
A possible compromise (albeit ugly), _if_ we can dictate naming policy:
let part of a volume name (suffix, substring, component, whatever)
tell libvirtd its type. As I said, ugly, and hence probably not
worth considering, but I had to say it :-)
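
For the record, the naming-policy idea would amount to something like
the sketch below. The enum, the function name and the particular
suffixes are all hypothetical; it assumes the type lives in the final
dot-separated suffix of the volume name and that anything unrecognized
is treated as raw.

```c
#include <string.h>

/* Hypothetical format tags; not taken from the patches. */
enum vol_format { VOL_FMT_RAW, VOL_FMT_QCOW, VOL_FMT_VMDK };

/* Map a volume name to a format by looking only at its last suffix.
 * No file contents are read, so listing volumes needs no probing. */
static enum vol_format vol_format_from_name(const char *name)
{
    const char *dot = strrchr(name, '.');

    if (dot == NULL)
        return VOL_FMT_RAW;          /* no suffix: assume raw */
    if (strcmp(dot, ".qcow") == 0)
        return VOL_FMT_QCOW;
    if (strcmp(dot, ".vmdk") == 0)
        return VOL_FMT_VMDK;
    return VOL_FMT_RAW;              /* unknown suffix: assume raw */
}
```

The obvious weakness is that a renamed or misnamed file is silently
misclassified, which is part of why I call it ugly.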
> time we list volumes. If we constrained the choice between formats
> to be at the pool level instead of the volume level we could avoid
> probing & thus simplify the code. This is what XenAPI does.
> - If creating non-sparse files, it can take a very long time to do the
> I/O to fill in the image. In virt-install/virt-manager we have nice
> incremental progress display. The API I've got currently though is
> blocking. This blocks the calling application. It also blocks the
> entire libvirtd daemon since we are single process. There are a couple
> of ways we can address this:
> 1 Allow the libvirtd daemon to serve each client connection in
> a separate thread. We'd need to add some mutex locking to the
> QEMU driver and the storage driver to handle this. It would have
> been nice to avoid threads, but I don't think we can much longer.
> 2 For long running operations, spawn off a worker thread (or
> process) to perform the operation. Don't send a reply to the RPC
> message, instead just put the client on a 'wait queue', and get
> on with serving other clients. When the worker thread completes,
> send the RPC reply to the original client.
> 3 Having the virStorageVolCreate() method return immediately,
> giving back the client app some kind of 'job id'. The client app
> can poll on another API virStorageVolJob() method to determine
> how far the task has progressed. The implementation in the
> driver would have to spawn a worker thread to do the actual
> long operation.
I like the idea of spawning off a thread for a very precise
and limited-scope task.
On first reading, I preferred your #2 worker-thread-based solution.
Then, client apps simply wait -- i.e., don't have to poll.
But we'd still need another interface for progress feedback, so #3
starts to look better: client progress feedback might come almost
for free, while polling to check for completion.
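
Here is roughly how I picture #3, as a sketch only. All names are
invented for illustration and are not the proposed API: the "create"
call hands back a job handle (standing in for the job id) right away,
a pthread does the long-running fill, and the client polls a progress
function. The actual disk I/O is replaced by a comment, and it assumes
C11 atomics are available.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical job record; none of these names are libvirt API. */
struct vol_job {
    pthread_t thread;
    unsigned long long total;   /* bytes to allocate */
    atomic_ullong done;         /* bytes written so far */
    atomic_int finished;        /* set to 1 when the worker exits */
};

/* Worker thread: fills the image chunk by chunk, publishing progress. */
static void *vol_job_worker(void *opaque)
{
    struct vol_job *job = opaque;
    const unsigned long long chunk = 1024 * 1024;
    unsigned long long done = 0;

    while (done < job->total) {
        unsigned long long n = job->total - done;
        if (n > chunk)
            n = chunk;
        /* A real worker would write n bytes of the image here. */
        done += n;
        atomic_store(&job->done, done);
    }
    atomic_store(&job->finished, 1);
    return NULL;
}

/* Start the fill and return immediately with a job handle. */
static struct vol_job *vol_create_async(unsigned long long total)
{
    struct vol_job *job = malloc(sizeof(*job));
    if (job == NULL)
        return NULL;

    job->total = total;
    atomic_init(&job->done, 0);
    atomic_init(&job->finished, 0);
    if (pthread_create(&job->thread, NULL, vol_job_worker, job) != 0) {
        free(job);
        return NULL;
    }
    return job;
}

/* Poll the job: returns percent complete and sets *finished.
 * 'finished' is read first so that finished == 1 implies 100%. */
static int vol_job_progress(struct vol_job *job, int *finished)
{
    *finished = atomic_load(&job->finished);
    unsigned long long done = atomic_load(&job->done);
    return job->total ? (int)(done * 100 / job->total) : 100;
}

/* Wait for the worker to exit, then release the job record. */
static void vol_job_free(struct vol_job *job)
{
    pthread_join(job->thread, NULL);
    free(job);
}
```

A client would call vol_create_async(), then call vol_job_progress()
from its UI loop until finished is set, updating a progress bar with
the returned percentage. The same poll answers both "how far along?"
and "is it done?", which is the "almost for free" part.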
> Possibly we can allow creation to be async or blocking by
> making use of the 'flags' field to virStorageVolCreate() method,
> eg VIR_STORAGE_ASYNC. If we make async behaviour optional, we
> still need some work in the daemon to avoid blocking all clients.
> This problem will also impact us when we add cloning of existing
> volumes. It already sort of hits us when saving & restoring VMs
> if they have large amounts of memory. So perhaps we need to go
> for the general solution of making the daemon threaded per client
> connection. The ASYNC flag may still be useful anyway to get the
> incremental progress feedback in the UI.
Could we just treat that as another type of task to hand out to
a worker thread?
Otherwise, this (#1) sounds a lot more invasive, but that's just my
relatively uninformed impression.