[libvirt] [PATCH v3 1/6] virfdstream: Check for thread error more frequently

Tue Jun 13 14:07:10 UTC 2017

On 06/13/2017 09:26 AM, Michal Privoznik wrote:
> On 06/13/2017 02:35 PM, John Ferlan wrote:
>>
>>
>> On 06/13/2017 07:31 AM, Michal Privoznik wrote:
>>> On 06/12/2017 11:26 PM, John Ferlan wrote:
>>>>
>>>>
>>>> On 06/05/2017 04:22 AM, Michal Privoznik wrote:
>>>>> When the I/O thread quits (e.g. due to an I/O error, lseek()
>>>>> error, whatever), any subsequent virFDStream API should return
>>>>> error too. Moreover, when invoking stream event callback, we must
>>>>> set the VIR_STREAM_EVENT_ERROR flag so that the callback knows
>>>>> something bad happened.
>>>>>
>>>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>>>> ---
>>>>>  src/util/virfdstream.c | 15 ++++++++++++---
>>>>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>>>>
>>>>
>>>>> diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c
>>>>> index 7ee58be13..cd24757e6 100644
>>>>> --- a/src/util/virfdstream.c
>>>>> +++ b/src/util/virfdstream.c
>>>>> @@ -312,6 +312,9 @@ static void virFDStreamEvent(int watch ATTRIBUTE_UNUSED,
>>>>>          return;
>>>>>      }
>>>>>  
>>>>> +    if (fdst->threadErr)
>>>>> +        events |= VIR_STREAM_EVENT_ERROR;
>>>>> +
>>>>
>>>> This hunk almost feels separable... That is, it's updating the events
>>>> (VIR_STREAM_EVENT_ERROR) in the callback function feels different than
>>>> checking for threadErr more often, but I'll leave it up to you to decide
>>>> to split more or not.
>>>
>>> I don't think that separation is needed. The commit says "Check for
>>> thread error more frequently". Every check comes from an action. In
>>> majority of the cases the action is reporting an error. In some cases it
>>> means setting a flag, a boolean that causes error to be reported later
>>> in the process.
>>>
>>
>> "almost" - hand grenades, atomic warfare, etc. ;-)
>>
>> I'm fine with keeping as is.
>>
>>>>
>>>>>      cb = fdst->cb;
>>>>>      cbopaque = fdst->opaque;
>>>>>      ff = fdst->ff;
>>>>> @@ -787,10 +790,10 @@ static int virFDStreamWrite(virStreamPtr st, const char *bytes, size_t nbytes)
>>>>>      if (fdst->thread) {
>>>>>          char *buf;
>>>>>  
>>>>> -        if (fdst->threadQuit) {
>>>>> +        if (fdst->threadQuit || fdst->threadErr) {
>>>>>              virReportSystemError(EBADF, "%s",
>>>>>                                   _("cannot write to stream"));
>>>>
>>>> Would this (and the subsequent similar check) possibly overwrite
>>>> something from a threadErr with this error message?
>>>
>>> What do you mean? In case of thread error, threadErr is set to errno. So
>>> we should do:
>>>
>>> virReportSystemError(fdst->threadErr, ...);
>>>
>>> instead of EBADF? Well, look for the very next patch.
>>>
>>
>> As you saw in the next patch, it was too easy for me to be lost in the
>> weeds of which error is which...
>>
>> The comment was more towards how virFDStreamThread would set threadErr
>> to errno after:
>>
>>   1. virCondWait fails in which case a virReportSystemError was issued
>>   2. if (got < 0) on *DoRead and *DoWrite where I didn't chase
>> throughout the calls, but there were virReportSystemError calls already
>> made.
>>
>> so the question in my mind was on the '|| fdst->threadErr' would a
>> message already be generated?
> 
> Yes.
> 

In which case, then wouldn't this message overwrite the one already
there. Is that what we want "at this point".

I guess it's always been in the back of my mind that we want to display
the specific message that caused the original error rather than a
perhaps more generic message from some calling function. It seems we
have algorithms that take special care not to overwrite some
pre-existing error message, but wouldn't this path/change now generate
for display the more generic message.

Of course you probably have patch2 in mind where you grab the error
message from the other thread and overwrite the "last message" so what
gets displayed is that error instead of "cannot write to stream".

>> or is the error message for ->thread
>> thread as opposed to this processing thread?
> 
> What do you mean?
> 

Getting lost in the weeds of multiple patches while trying to consider
the singularity of "this" patch.

I think after patch 2 what I'm thinking about won't matter since
whatever message caused fdst->thread to fail will be set, true?

>>
>>>>
> 
>>>>
>>>>>      if (fdst->thread) {
>>>>>          /* Things are a bit complicated here. If FDStream is in a
>>>>>           * read mode, then if the message at the queue head is
>>>>> @@ -1018,6 +1024,9 @@ virFDStreamInData(virStreamPtr st,
>>>>>  
>>>>>      virObjectLock(fdst);
>>>>>  
>>>>> +    if (fdst->threadErr)
>>>>> +        goto cleanup;
>>>>> +
>>>>
>>>> This one in particular because threadQuit is checked in the subsequent
>>>> code similar to virFDStreamRead made me ponder a bit more, but I'm not
>>>> sure I could come to a conclusion...
>>>
>>> This is slightly different. I mean, I can move the check, but this is
>>> sort of a special API in sense that if the I/O thread exits and the msg
>>> queue is empty we will signal EOF to the caller. The other three APIs do
>>> change state of the stream - they change the position in the stream.
>>> This is a query API and as such can be used to query "are we at EOF
>>> yet?". That is the reason why this API doesn't fail on threadQuit,
>>> rather than return EOF (in a specific way as described in
>>> virStreamInData description).
>>>
>>
>> Makes sense when you consider the totality of things. The new API's
>> treat threadErr as different than threadQuit, while the old API's treat
>> them the same which is what I was trying to consider.
>>
>> I kept going back to that error message being generated for the earlier
>> two calls and wondering what, if any, downside there is to doing that.
> 
> There is none. It's just that I can use already existing piece of code
> and append my new check into it.
> 
> Michal
> 

While yes, error messages get logged somewhere, isn't it only the "last"
one that gets displayed to/for the "user"? I think that's what I'm hung
up on.

John