[libvirt] [Qemu-devel] QMP: RFC: I/O error info & query-stop-reason

Anthony Liguori anthony at codemonkey.ws
Thu Jun 2 20:03:52 UTC 2011


On 06/02/2011 02:13 PM, Luiz Capitulino wrote:
> On Thu, 02 Jun 2011 13:33:52 -0500
> Anthony Liguori<anthony at codemonkey.ws>  wrote:
>
>> On 06/02/2011 01:09 PM, Luiz Capitulino wrote:
>>> On Thu, 02 Jun 2011 13:00:04 -0500
>>> Anthony Liguori<anthony at codemonkey.ws>   wrote:
>>>
>>>> On 06/02/2011 12:57 PM, Luiz Capitulino wrote:
>>>>> On Wed, 01 Jun 2011 16:35:03 -0500
>>>>> Anthony Liguori<anthony at codemonkey.ws>    wrote:
>>>>>
>>>>>> On 06/01/2011 04:12 PM, Luiz Capitulino wrote:
>>>>>>> Hi there,
>>>>>>>
>>>>>>> There are people who want to use QMP for thin provisioning. That is, the VM is
>>>>>>> started with a small storage and when a no space error is triggered, more space
>>>>>>> is allocated and the VM is put to run again.
>>>>>>>
>>>>>>> QMP has two limitations that prevent people from doing this today:
>>>>>>>
>>>>>>> 1. The BLOCK_IO_ERROR doesn't contain error information
>>>>>>>
>>>>>>> 2. Considering we solve item 1, we still have to provide a way for clients
>>>>>>>        to query why a VM stopped. This is needed because clients may miss the
>>>>>>>        BLOCK_IO_ERROR event or may connect to the VM while it's already stopped
>>>>>>>
>>>>>>> A proposal to solve both problems follows.
>>>>>>>
>>>>>>> A. BLOCK_IO_ERROR information
>>>>>>> -----------------------------
>>>>>>>
>>>>>>> We already have discussed this a lot, but didn't reach a consensus. My solution
>>>>>>> is quite simple: to add a stringified errno name to the BLOCK_IO_ERROR event,
>>>>>>> for example (see the "reason" key):
>>>>>>>
>>>>>>> { "event": "BLOCK_IO_ERROR",
>>>>>>>        "data": { "device": "ide0-hd1",
>>>>>>>                  "operation": "write",
>>>>>>>                  "action": "stop",
>>>>>>>                  "reason": "enospc" } }
>>>>>>
>>>>>> You can call the reason whatever you want, but don't call it stringified
>>>>>> errno name :-)
>>>>>>
>>>>>> In fact, just make reason "no space".
>>>>>
>>>>> You mean, we should do:
>>>>>
>>>>>      "reason": "no space"
>>>>>
>>>>> Or that we should make it a boolean, like:
>>>>>
>>>>>     "no space": true
>>>>
>>>>
>>>> Do we need reason in BLOCK_IO_ERROR if query-block returns this information?
>>>
>>> True, no.
>>>
>>>>> I'm ok with either way. But in case you meant the second one, I guess
>>>>> we should make "reason" a dictionary so that we can group related
>>>>> information when we extend the field, for example:
>>>>>
>>>>>     "reason": { "no space": false, "no permission": true }
>>>>
>>>> Why would we ever have "no permission"?
>>
>> Why did it happen?  It's not clear to me when read/write would return
>> EPERM.  open() should fail.  In fact, EPERM is not mentioned in man 2 read.
>
> Actually, the error was an EACCES which might sound more bizarre :)
>
> What happened was that the device file in question had its permission
> changed during VM execution due to a bug somewhere else. I'm not sure if
> the error was returned in a read() or write() (Kevin might have more details).

Strange, EACCES should only happen on open().  Is it possible that 
somehow a reopen was happening?
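
To illustrate the point about permissions being checked at open() time, here is a minimal Python sketch. It assumes a local POSIX filesystem and an unprivileged user; other storage backends may behave differently. The function name `permission_check_demo` is made up for this example and is not part of QEMU.

```python
import errno
import os
import tempfile


def permission_check_demo():
    """Return (data_read_after_chmod, errno_from_reopen_or_None)."""
    fd, path = tempfile.mkstemp()          # opened read/write by the owner
    try:
        os.write(fd, b"guest data")
        os.chmod(path, 0)                  # revoke all permissions; fd stays open
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)             # the already-open descriptor is still usable
        try:
            fd2 = os.open(path, os.O_RDWR) # a fresh open re-checks permissions
        except OSError as exc:
            reopen_errno = exc.errno       # typically errno.EACCES for a normal user
        else:
            os.close(fd2)
            reopen_errno = None            # e.g. when running as root
    finally:
        os.close(fd)
        os.unlink(path)
    return data, reopen_errno


if __name__ == "__main__":
    data, err = permission_check_demo()
    print("read after chmod:", data)
    print("reopen errno:", errno.errorcode.get(err, err))
```

If the reopen fails while the read on the old descriptor succeeds, that matches the guess that a reopen was involved.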

> This is a bit extreme and I'd agree it's arguable whether or not we should
> report EACCES, but I had this in mind and ended up mentioning it...

If we can't explain why an error would occur, we shouldn't make it part 
of the protocol :-)

> Maybe libvirt guys could provide more input wrt the error reason usage.
> If we don't have valid use cases for other errors, then I'll agree that
> providing only "no space" is enough.

Definitely!  Adding libvirt to the CC to help encourage their input.
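
For reference, a rough Python sketch of how a management client might act on the event as proposed in the quoted text. The "reason" key and its "enospc" value are only the proposal under discussion, not an existing part of the protocol, and `handle_event`, `grow_storage` and `send_command` are hypothetical names invented for this example. The only existing QMP command used is `cont`.

```python
import json

# Assumption: the value the proposed "reason" key would carry for a
# no-space error. The thread has not settled on a spelling.
NO_SPACE_REASON = "enospc"


def handle_event(line, grow_storage, send_command):
    """Handle one QMP event line; return True if the VM was resumed.

    grow_storage(device) and send_command(dict) are caller-supplied
    callbacks (hypothetical, for illustration only).
    """
    event = json.loads(line)
    if event.get("event") != "BLOCK_IO_ERROR":
        return False
    data = event.get("data", {})
    if data.get("action") == "stop" and data.get("reason") == NO_SPACE_REASON:
        grow_storage(data["device"])         # allocate more space first
        send_command({"execute": "cont"})    # then let the guest run again
        return True
    return False


if __name__ == "__main__":
    sample = json.dumps({
        "event": "BLOCK_IO_ERROR",
        "data": {"device": "ide0-hd1", "operation": "write",
                 "action": "stop", "reason": "enospc"},
    })
    resumed = handle_event(sample,
                           grow_storage=lambda dev: print("growing", dev),
                           send_command=lambda cmd: print("sending", cmd))
    print("resumed:", resumed)
```

A client that missed the event would still need some query command to learn why the VM is stopped, which is the second half of the proposal.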

Regards,

Anthony Liguori




