[Libguestfs] libnbd: When are callbacks freed

Eric Blake eblake at redhat.com
Thu Jul 13 18:37:21 UTC 2023


On Thu, Jul 13, 2023 at 04:18:03PM +0000, Tage Johansson wrote:
> 
> > > So is there any safe way to get some description of the error from a
> > > completion callback apart from a non-zero number? It isn't too
> > > helpful to report to the user that the read operation failed with -1.
> > As I recall, from the callback, no.
> > 
> > The error is not lost however, it will be returned by the API call
> > itself.  eg. If you're in nbd_aio_opt_list -> callback (error) then
> > nbd_aio_opt_list will return -1 and at that point you can use
> > nbd_get_error to report the error.
> 
> 
> I don't understand. If I call `nbd_aio_opt_list()` with a completion
> callback, `nbd_aio_opt_list()` will return immediately, maybe with a
> successful result. But the command is not complete until
> `nbd_aio_is_connecting()` returns `false`, so the completion callback may be
> invoked with an error after `nbd_aio_opt_list()` has returned.

When you call nbd_aio_opt_list(), here are several possible scenarios
that can[*] happen (not exhaustive, but should be a good starting
point for thinking about it):

1. pre-transmission early failure
- call nbd_aio_opt_list
  - early failure detected, nothing queued to send to server
  - call list_callback.free if provided
  - call completion_callback.free if provided
  - nbd_aio_opt_list returns -1
- nbd_aio_is_negotiating is unchanged (the early failure did not
  change the state, no .callback was invoked, and each .free ran
  before nbd_aio_opt_list returned)
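
The ownership rule in scenario 1 can be modelled with a small
self-contained sketch.  Everything here is a local stand-in: the
mock_callback struct only mimics the three-field shape (.callback,
.user_data, .free) of a libnbd callback struct, and
mock_submit_early_failure is a hypothetical function that behaves the
way scenario 1 describes, not a libnbd call.  The point it tries to
show is that the caller never frees user_data directly, even when the
call returns -1, because .free has already run.

```c
#include <stdlib.h>

/* Local stand-in with the same three-field shape as a libnbd callback
 * struct; it is NOT the real libnbd type. */
struct mock_callback {
    int (*callback)(void *user_data, int *error);
    void *user_data;
    void (*free)(void *user_data);
};

int free_calls;      /* how many times a .free hook ran */
int callback_calls;  /* how many times a .callback hook ran */

/* .callback hook: only counts invocations. */
int count_callback(void *user_data, int *error)
{
    (void)user_data;
    (void)error;
    callback_calls++;
    return 0;
}

/* .free hook: releases the heap-allocated user_data exactly once. */
void release_user_data(void *user_data)
{
    free_calls++;
    free(user_data);
}

/* Hypothetical model of scenario 1: nothing is queued, both .free
 * hooks run, no .callback runs, and the call reports failure. */
int mock_submit_early_failure(struct mock_callback list_cb,
                              struct mock_callback completion_cb)
{
    if (list_cb.free)
        list_cb.free(list_cb.user_data);
    if (completion_cb.free)
        completion_cb.free(completion_cb.user_data);
    return -1;
}

/* Caller side: ownership of user_data passes to the callbacks, so the
 * caller does not free it, even on the -1 path. */
int demo_early_failure(void)
{
    struct mock_callback list_cb = {
        count_callback, malloc(16), release_user_data
    };
    struct mock_callback completion_cb = {
        count_callback, malloc(16), release_user_data
    };
    return mock_submit_early_failure(list_cb, completion_cb);
}
```

After demo_early_failure() returns -1, both allocations have been
released by the .free hooks and neither .callback has run.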

2. command succeeds after blocking
- nbd_aio_is_negotiating is true (otherwise we were in scenario 1)
- call nbd_aio_opt_list
  - no early failure detected, command is queued via nbd_internal_run()
    - state machine changes to connecting instead of negotiating
    - advance through states to send the request, hit EAGAIN waiting to
      read the server's reply
    - nbd_internal_run returns 0
  - nbd_aio_opt_list returns 0
- nbd_aio_is_negotiating is false, you must manually move the state
  machine along once you have data (by nbd_aio_poll,
  nbd_aio_notify_read, ...)
  - state machine reaches server responses
  - list_callback.callback called if server's reply includes a name
  - state machine gets to end of server's message
  - call completion_callback.callback if provided, with error 0
  - state set back to negotiating
  - call list_callback.free (if provided)
  - call completion_callback.free (if provided)
  - nbd_aio_poll/nbd_aio_notify_write/... returns 0
- nbd_aio_is_negotiating is true again (the callbacks ran after the
  nbd_aio_opt_list call had already returned)

3. server failure not detected until after blocking
- nbd_aio_is_negotiating is true (otherwise we were in scenario 1)
- call nbd_aio_opt_list
  - no early failure detected, command is queued via nbd_internal_run()
    - state machine changes to connecting instead of negotiating
    - advance through states to send the request, hit EAGAIN waiting to
      read the server's reply
    - nbd_internal_run returns 0
  - nbd_aio_opt_list returns 0
- nbd_aio_is_negotiating is false, you must manually move the state
  machine along once you have data (by nbd_aio_poll,
  nbd_aio_notify_read, ...)
  - state machine reaches server responses
  - list_callback.callback called if server's reply includes a name
  - state machine gets to end of server's message
  - call completion_callback.callback if provided, with error reported
    by the server
  - state set back to negotiating
  - call list_callback.free (if provided)
  - call completion_callback.free (if provided)
  - nbd_aio_poll/nbd_aio_notify_write/... returns 0
- nbd_aio_is_negotiating is true again (you had to use a completion
  callback to know about the server failure)

4. transmission failure after blocking
- nbd_aio_is_negotiating is true (otherwise we were in scenario 1)
- call nbd_aio_opt_list
  - no early failure detected, command is queued via nbd_internal_run()
    - state machine changes to connecting instead of negotiating
    - advance through states to send the request, hit EAGAIN waiting to
      read the server's reply
    - nbd_internal_run returns 0
  - nbd_aio_opt_list returns 0
- nbd_aio_is_negotiating is false, you must manually move the state
  machine along once you have data (by nbd_aio_poll,
  nbd_aio_notify_read, ...)
  - state machine sees server is no longer connected
  - state set to DEAD
  - call completion_callback.callback if provided, with error ENOTCONN
  - call list_callback.free (if provided)
  - call completion_callback.free (if provided)
  - nbd_aio_poll/nbd_aio_notify_write/... returns -1
- nbd_aio_is_negotiating is false (the completion callback ran after
  nbd_aio_opt_list had already returned success)

5. server failure detected without blocking (unlikely, but possible
under gdb or heavy system load)
- nbd_aio_is_negotiating is true (otherwise we were in scenario 1)
- call nbd_aio_opt_list
  - no early failure detected, command is queued via nbd_internal_run()
    - state machine changes to connecting instead of negotiating
    - advance through states to send the request and read the server's
      reply
    - list_callback.callback might be called if server's reply includes
      a name
    - state machine gets to end of server's message
    - call completion_callback.callback if provided, with error set to
      the server's error
    - state set back to negotiating
    - call list_callback.free (if provided)
    - call completion_callback.free (if provided)
    - nbd_internal_run returns 0
  - nbd_aio_opt_list returns 0
- nbd_aio_is_negotiating is once again true (.free ran before
  nbd_aio_opt_list returned, and you can directly issue another
  nbd_aio_opt_* command)
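
Because the completion callback may run either inside the submitting
call (scenario 5) or later while the state machine is being driven
(scenarios 2 to 4 and 6), one way for a caller to cope is to have the
callback record the result in caller-owned memory and to test a
"done" flag rather than assume anything about timing.  The sketch
below is only a model: mock_handle, mock_submit and mock_drive are
hypothetical stand-ins for the library, and only the callback
signature, int (*)(void *user_data, int *error), is meant to match the
shape of a libnbd completion callback.

```c
#include <errno.h>
#include <stddef.h>

/* Caller-owned record that outlives the callbacks. */
struct cmd_status {
    int done;  /* set once the completion callback has run */
    int err;   /* errno-style value seen by the callback, 0 on success */
};

/* Completion callback: copy the error out for the caller to inspect. */
int record_completion(void *user_data, int *error)
{
    struct cmd_status *st = user_data;
    st->done = 1;
    st->err = *error;
    return 0;
}

/* Hypothetical stand-in for the library: one pending command slot. */
struct mock_handle {
    int (*pending_cb)(void *, int *);
    void *pending_data;
    int pending_err;
};

/* Process the pending reply, if any, invoking the completion callback. */
int mock_drive(struct mock_handle *h)
{
    if (h->pending_cb) {
        int err = h->pending_err;
        h->pending_cb(h->pending_data, &err);
        h->pending_cb = NULL;
    }
    return 0;
}

/* Queue a command.  If inline_reply is nonzero the reply is handled
 * before returning (as in scenario 5); otherwise it stays pending
 * until mock_drive is called.  Returns 0 in both cases. */
int mock_submit(struct mock_handle *h, int (*cb)(void *, int *),
                void *data, int server_err, int inline_reply)
{
    h->pending_cb = cb;
    h->pending_data = data;
    h->pending_err = server_err;
    if (inline_reply)
        mock_drive(h);
    return 0;
}

/* Caller: correct whether or not completion already happened inline. */
int run_command(int server_err, int inline_reply)
{
    struct mock_handle h = { NULL, NULL, 0 };
    struct cmd_status st = { 0, 0 };

    if (mock_submit(&h, record_completion, &st, server_err,
                    inline_reply) == -1)
        return -1;
    while (!st.done)
        mock_drive(&h);
    return st.err;
}
```

In this model run_command(ENOSPC, 0) and run_command(ENOSPC, 1) both
return ENOSPC; the caller code is identical for the deferred and the
inline case.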

6. list callback requests command failure despite server success
- nbd_aio_is_negotiating is true (otherwise we were in scenario 1)
- call nbd_aio_opt_list
  - no early failure detected, command is queued via nbd_internal_run()
    - state machine changes to connecting instead of negotiating
    - advance through states to send the request, hit EAGAIN waiting to
      read the server's reply
    - nbd_internal_run returns 0
  - nbd_aio_opt_list returns 0
- nbd_aio_is_negotiating is false, you must manually move the state
  machine along once you have data (by nbd_aio_poll,
  nbd_aio_notify_read, ...)
  - state machine reaches server responses
  - list_callback.callback called if server's reply includes a name,
    and callback sets err and returns -1
  - state machine gets to end of server's success message
  - call completion_callback.callback if provided, with error from the
    earlier callback
  - state set back to negotiating
  - call list_callback.free (if provided)
  - call completion_callback.free (if provided)
  - nbd_aio_poll/nbd_aio_notify_write/... returns 0
- nbd_aio_is_negotiating is true again

[*] possibly modulo patches I'm about to post, if that isn't already
the case...

> 
> 
> Also, does the value of `err` (passed into a callback) have any meaning like a
> known errno value or something else that can be converted to a description?
> Or is it just an arbitrary non zero integer?

It is the POSIX errno value, either translated from the server's
response (e.g., if the server fails NBD_CMD_WRITE with NBD_ENOSPC,
you'll get your local system's value of ENOSPC even if it differs from
the NBD protocol's value), or injected directly by libnbd (e.g., we
have several spots where we inject EPROTO for cases where libnbd has
detected that the server is not compliant with the NBD protocol, even
though there is no NBD_EPROTO error in the protocol).  If 'err' is 0,
the overall command succeeded (you know the server did the command).
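
Since the value is an ordinary errno, the standard C strerror()
function can turn it into a description.  A minimal sketch follows;
describe_error is a hypothetical helper name, not part of libnbd, and
the message format is just one possible choice.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Format an errno-style value, such as the one delivered through the
 * error pointer of a completion callback, into a readable message.
 * 'op' names the operation being reported.  Returns what snprintf
 * returns. */
int describe_error(char *buf, size_t len, const char *op, int err)
{
    if (err == 0)
        return snprintf(buf, len, "%s: success", op);
    return snprintf(buf, len, "%s: %s (errno %d)", op,
                    strerror(err), err);
}
```

For example, describe_error(buf, sizeof buf, "write", ENOSPC) fills
buf with a message built from the local system's text for ENOSPC.
Note that strerror() is not required to be thread-safe, so a threaded
program may prefer strerror_r().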

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org