[Libguestfs] [nbdkit PATCH] retry: Tweak error message
Eric Blake
eblake at redhat.com
Wed Apr 1 13:46:53 UTC 2020
On 4/1/20 5:12 AM, Richard W.M. Jones wrote:
> On Tue, Mar 31, 2020 at 04:00:22PM -0500, Eric Blake wrote:
>> The retry filter defaults to 5 retries, but when run with verbose
>> details produces some confusing output:
>>
>> $ rm -f /tmp/inject; (sleep 5s; touch /tmp/inject)& ./nbdkit -f -v -U - \
>> null 1G --filter=retry --filter=noextents --filter=error error-rate=100% \
>> error-file=/tmp/inject --filter=delay rdelay=1 \
>> --run 'qemu-img convert $nbd out.img'
>> ...
>> nbdkit: null[1]: debug: noextents: pread count=2097152 offset=14680064
>> nbdkit: null[1]: debug: error: pread count=2097152 offset=14680064
>> nbdkit: null[1]: error: injecting EIO error into pread
>> nbdkit: null[1]: debug: retry 6: original errno = 5
>> nbdkit: null[1]: debug: could not recover after 5 retries
>>
>> again:
>> /* Log the original errno since it will be lost when we retry. */
>> - nbdkit_debug ("retry %d: original errno = %d", data->retry+1, *err);
>> + nbdkit_debug ("retry attempt %d: original errno = %d", data->retry, *err);
>>
>> if (data->retry >= retries) {
>> nbdkit_debug ("could not recover after %d retries", retries);
>
> I think adding 1 was correct, so that the retries count from 1.
But the problem is the timing of when the messages are printed. And
remember, this code is only reached after a failed attempt, not on
success. Pre-patch, we have:
initial try succeeds - no message
or (here, with retries limited to 2):
initial try fails:
retry 1: original error
delay
retry 1 fails:
retry 2: original error
delay
retry 2 fails:
retry 3: original error
could not recover after 2 retries
>
> How about moving the message to make it conditional instead,
> like this?
>
> again:
> /* Log the original errno since it will be lost when we retry. */
> if (data->retry < retries)
> nbdkit_debug ("retry %d: original errno = %d", data->retry+1, *err);
> else {
> nbdkit_debug ("could not recover after %d retries", retries);
> ...
with your suggestion, we'd have:
initial try fails:
retry 1: original error
delay
retry 1 fails:
retry 2: original error
delay
retry 2 fails:
could not recover after 2 retries
where we lost the log of the errno of retry 2. Maybe better would be:
initial try fails:
attempt 0 failed with errno %d, starting delay for retry 1
delay
retry 1 fails:
attempt 1 failed with errno %d, starting delay for retry 2
delay
retry 2 fails:
attempt 2 failed with errno %d, could not recover
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org
More information about the Libguestfs
mailing list