[libvirt] [go PATCH 00/37] Fix error reporting thread safety wrt goroutine rescheduling

Daniel P. Berrangé berrange at redhat.com
Mon Jul 23 15:58:43 UTC 2018


This series has been tested by the KubeVirt devs who confirmed it fixes
the problems they see, so I'll push it shortly.

On Mon, Jul 16, 2018 at 02:23:46PM +0100, Daniel P. Berrangé wrote:
> The Libvirt C API provides the virGetLastError() function to let callers
> aquire the error information after a function fails. This function uses
> a thread local, and so must be called in the same OS thread as the
> function that  failed originally.
> eg you could call
> 
>     char *xml = virDomainGetXMLDesc(dom);
>     if (!xml) {
>         virErrorPtr err = virGetLastError();
>         ....do stuff with err...
>     }
> 
> This is all fine, but there is a subtle problem that was overlooked when
> the Go bindings were first created. Specifically a native C API call is
> a goroutine re-scheduling point. So when the Go code does
> 
>     var xml = C.virDomainGetXMLDesc(dom);
>     if (xml == nil) {
>         C.virErrorPtr err = C.virGetLastError();
>         ....do stuff with err...
>     }
> 
> All that code runs in the same goroutine, but at the call entry to
> C.virGetLastError, the goroutine might get switched to a different
> OS level thread. As a result virGetLastError() will return either no
> error at all, or an error from a completely different libvirt API call.
> 
> We need to prevent the OS level thread being changed in between the call
> to the real function and the virGetLastError() function.
> 
> Naively you might think we could put a LockOSThread() / UnlockOSThread()
> call around this block of Go code, but that is a very bad idea in
> reality. Until Go 1.10, the LockOSThread() calls did not ref count, so
> if some other code has already locked the thread, when libvirt called
> UnlockOSThread it could do bad things. In addition, after calling
> UnlockOSThread() the Go runtime doesn't trust the OS thread state
> anymore, so will terminate the thread and spawn a new one. IOW using
> LockOSThread() would mean every single libvirt API call would create and
> destroy a new thread which is horrible for performance.
> 
> Thus this patch series takes a different approach. We create a wrapper
> function for every C API exposed by libvirt, that has a 'virErrorPtr'
> parameter. So the Go code can do
> 
>      var C.virErrorPtr err
>      var xml = C.virDomainGetXMLDescWrapper(dom, &err)
>      if (xml == nil) {
>          ...do stuff with err...
>      }
> 
> The wrapper function is responsible for calling virGetLastError() and
> since this is C code, we're guaranteed its all in the same OS level
> thread.
> 
> Daniel P. Berrangé (37):
>   error: add helper for converting libvirt to go error objects
>   storage volume: add missin blank line
>   Rename *cfuncs.{go,h} to *wrapper.{go,h}
>   Use "Wrapper" or "Helper" as suffix for C functions
>   Change "Compat" suffix to "Wrapper" to have standard naming scheme
>   connect: move wrapper functions out of compat header
>   network: move wrapper functions out of compat header
>   nwfilter binding: move wrapper functions out of compat header
>   node device: move wrapper functions out of compat header
>   secret: move wrapper functions out of compat header
>   stream: move wrapper functions out of compat header
>   storage volume: move wrapper functions out of compat header
>   storage pool: move wrapper functions out of compat header
>   qemu: move wrapper functions out of compat header
>   lxc: move wrapper functions out of compat header
>   domain: move wrapper functions out of compat header
>   make the XXX_wrapper.h header files self-contained
>   Add XXX_wrapper.{h,go} for every remaining file
>   Standardize formatting in all wrapper headers
>   storage vol: fix error reporting thread safety
>   storage pool: fix error reporting thread safety
>   stream: fix error reporting thread safety
>   secret: fix error reporting thread safety
>   nwfilter: fix error reporting thread safety
>   nwfilter binding: fix error reporting thread safety
>   node device: fix error reporting thread safety
>   network: fix error reporting thread safety
>   interface: fix error reporting thread safety
>   domain snapshot: fix error reporting thread safety
>   connect: fix error reporting thread safety
>   domain: fix error reporting thread safety
>   qemu: fix error reporting thread safety
>   lxc: fix error reporting thread safety
>   events: fix error reporting thread safety
>   error: remove GetLastError() function
>   error: make GetNotImplementedError private
>   connect: add missing references on domain object in stats records
> 
>  api_test.go                                   |    5 +-
>  callbacks.go                                  |    4 +-
>  callbacks_cfuncs.go => callbacks_wrapper.go   |    6 +-
>  callbacks_cfuncs.h => callbacks_wrapper.h     |    8 +-
>  connect.go                                    |  741 +++---
>  connect_cfuncs.h                              |   34 -
>  connect_compat.go                             |  206 --
>  connect_compat.h                              |   81 -
>  connect_wrapper.go                            | 1766 +++++++++++++
>  connect_wrapper.h                             |  730 ++++++
>  domain.go                                     | 1083 ++++----
>  domain_compat.go                              |  384 ---
>  domain_compat.h                               |  141 -
>  domain_events.go                              |  235 +-
>  domain_events_cfuncs.h                        |  124 -
>  ...ents_cfuncs.go => domain_events_wrapper.go |   85 +-
>  domain_events_wrapper.h                       |  207 ++
>  domain_snapshot.go                            |   65 +-
>  domain_snapshot_wrapper.go                    |  215 ++
>  domain_snapshot_wrapper.h                     |  102 +
>  domain_wrapper.go                             | 2330 +++++++++++++++++
>  domain_wrapper.h                              |  986 +++++++
>  error.go                                      |   17 +-
>  error_test.go                                 |   44 -
>  events.go                                     |   45 +-
>  events_cfuncs.h                               |   39 -
>  events_cfuncs.go => events_wrapper.go         |   89 +-
>  events_wrapper.h                              |   82 +
>  interface.go                                  |   48 +-
>  interface_wrapper.go                          |  158 ++
>  interface_wrapper.h                           |   76 +
>  lxc.go                                        |   27 +-
>  lxc_compat.go                                 |   51 -
>  lxc_wrapper.go                                |  101 +
>  lxc_compat.h => lxc_wrapper.h                 |   33 +-
>  network.go                                    |   88 +-
>  network_compat.h                              |   10 -
>  network_events.go                             |   23 +-
>  network_events_cfuncs.h                       |   38 -
>  ...nts_cfuncs.go => network_events_wrapper.go |   41 +-
>  network_compat.go => network_events_wrapper.h |   51 +-
>  network_wrapper.go                            |  267 ++
>  network_wrapper.h                             |  119 +
>  node_device.go                                |   66 +-
>  node_device_compat.go                         |   46 -
>  node_device_compat.h                          |    3 -
>  node_device_events.go                         |   32 +-
>  node_device_events_cfuncs.h                   |   40 -
>  ...cfuncs.go => node_device_events_wrapper.go |   46 +-
>  ..._cfuncs.go => node_device_events_wrapper.h |   73 +-
>  node_device_wrapper.go                        |  184 ++
>  node_device_wrapper.h                         |   88 +
>  nwfilter.go                                   |   38 +-
>  nwfilter_binding.go                           |   46 +-
>  nwfilter_binding_compat.h                     |   11 -
>  ...g_compat.go => nwfilter_binding_wrapper.go |   66 +-
>  nwfilter_binding_wrapper.h                    |   60 +
>  nwfilter_wrapper.go                           |  122 +
>  nwfilter_wrapper.h                            |   65 +
>  qemu.go                                       |   43 +-
>  qemu_cfuncs.go                                |   64 -
>  qemu_cfuncs.h                                 |   39 -
>  qemu_compat.go                                |   48 -
>  qemu_compat.h                                 |    3 -
>  qemu_wrapper.go                               |  133 +
>  qemu_wrapper.h                                |   78 +
>  secret.go                                     |   54 +-
>  secret_compat.h                               |    3 -
>  secret_events.go                              |   34 +-
>  secret_events_cfuncs.h                        |   40 -
>  ...ents_cfuncs.go => secret_events_wrapper.go |   45 +-
>  secret_compat.go => secret_events_wrapper.h   |   41 +-
>  secret_wrapper.go                             |  175 ++
>  secret_wrapper.h                              |   86 +
>  storage_pool.go                               |  121 +-
>  storage_pool_compat.h                         |    4 -
>  storage_pool_events.go                        |   34 +-
>  storage_pool_events_cfuncs.h                  |   40 -
>  ...funcs.go => storage_pool_events_wrapper.go |   46 +-
>  ...compat.go => storage_pool_events_wrapper.h |   43 +-
>  storage_pool_wrapper.go                       |  343 +++
>  storage_pool_wrapper.h                        |  150 ++
>  storage_volume.go                             |   86 +-
>  storage_volume_compat.go                      |   47 -
>  storage_volume_compat.h                       |    4 -
>  storage_volume_wrapper.go                     |  249 ++
>  storage_volume_wrapper.h                      |  116 +
>  stream.go                                     |   95 +-
>  stream_cfuncs.go                              |  132 -
>  stream_cfuncs.h                               |   37 -
>  stream_compat.go                              |   69 -
>  stream_compat.h                               |   13 -
>  stream_wrapper.go                             |  331 +++
>  stream_wrapper.h                              |  120 +
>  94 files changed, 11619 insertions(+), 3318 deletions(-)
>  rename callbacks_cfuncs.go => callbacks_wrapper.go (90%)
>  rename callbacks_cfuncs.h => callbacks_wrapper.h (87%)
>  delete mode 100644 connect_cfuncs.h
>  delete mode 100644 connect_compat.go
>  create mode 100644 connect_wrapper.go
>  create mode 100644 connect_wrapper.h
>  delete mode 100644 domain_compat.go
>  delete mode 100644 domain_events_cfuncs.h
>  rename domain_events_cfuncs.go => domain_events_wrapper.go (75%)
>  create mode 100644 domain_events_wrapper.h
>  create mode 100644 domain_snapshot_wrapper.go
>  create mode 100644 domain_snapshot_wrapper.h
>  create mode 100644 domain_wrapper.go
>  create mode 100644 domain_wrapper.h
>  delete mode 100644 error_test.go
>  delete mode 100644 events_cfuncs.h
>  rename events_cfuncs.go => events_wrapper.go (72%)
>  create mode 100644 events_wrapper.h
>  create mode 100644 interface_wrapper.go
>  create mode 100644 interface_wrapper.h
>  delete mode 100644 lxc_compat.go
>  create mode 100644 lxc_wrapper.go
>  rename lxc_compat.h => lxc_wrapper.h (53%)
>  delete mode 100644 network_events_cfuncs.h
>  rename network_events_cfuncs.go => network_events_wrapper.go (59%)
>  rename network_compat.go => network_events_wrapper.h (57%)
>  create mode 100644 network_wrapper.go
>  create mode 100644 network_wrapper.h
>  delete mode 100644 node_device_compat.go
>  delete mode 100644 node_device_events_cfuncs.h
>  rename node_device_events_cfuncs.go => node_device_events_wrapper.go (59%)
>  rename connect_cfuncs.go => node_device_events_wrapper.h (52%)
>  create mode 100644 node_device_wrapper.go
>  create mode 100644 node_device_wrapper.h
>  rename nwfilter_binding_compat.go => nwfilter_binding_wrapper.go (54%)
>  create mode 100644 nwfilter_binding_wrapper.h
>  create mode 100644 nwfilter_wrapper.go
>  create mode 100644 nwfilter_wrapper.h
>  delete mode 100644 qemu_cfuncs.go
>  delete mode 100644 qemu_cfuncs.h
>  delete mode 100644 qemu_compat.go
>  create mode 100644 qemu_wrapper.go
>  create mode 100644 qemu_wrapper.h
>  delete mode 100644 secret_events_cfuncs.h
>  rename secret_events_cfuncs.go => secret_events_wrapper.go (61%)
>  rename secret_compat.go => secret_events_wrapper.h (54%)
>  create mode 100644 secret_wrapper.go
>  create mode 100644 secret_wrapper.h
>  delete mode 100644 storage_pool_events_cfuncs.h
>  rename storage_pool_events_cfuncs.go => storage_pool_events_wrapper.go (60%)
>  rename storage_pool_compat.go => storage_pool_events_wrapper.h (51%)
>  create mode 100644 storage_pool_wrapper.go
>  create mode 100644 storage_pool_wrapper.h
>  delete mode 100644 storage_volume_compat.go
>  create mode 100644 storage_volume_wrapper.go
>  create mode 100644 storage_volume_wrapper.h
>  delete mode 100644 stream_cfuncs.go
>  delete mode 100644 stream_cfuncs.h
>  delete mode 100644 stream_compat.go
>  create mode 100644 stream_wrapper.go
>  create mode 100644 stream_wrapper.h
> 
> -- 
> 2.17.1
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvir-list mailing list