[PATCH 08/11] docs: Convert 'internals/rpc' page to RST and move it to 'kbase/internals'

Peter Krempa pkrempa at redhat.com
Thu Apr 7 14:00:30 UTC 2022


Signed-off-by: Peter Krempa <pkrempa at redhat.com>
---
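
Note for reviewers: the page being converted describes each packet as a
32-bit length word (which counts itself), a 6-field header, and then the
payload. The following is a minimal Python sketch of that layout, assuming
the big-endian 32-bit integers that XDR uses. The names encode_packet,
decode_packet and MAX_PACKET_LEN are hypothetical, not libvirt APIs, and the
limit value is only illustrative.

```python
import struct

# Assumption: XDR integers are 4 bytes, big-endian.
LEN = struct.Struct(">I")          # total packet length, includes itself
HEADER = struct.Struct(">IIiiIi")  # program, version, procedure, type, serial, status

# Hypothetical size limit, chosen only for this sketch.
MAX_PACKET_LEN = 16 * 1024 * 1024


def encode_packet(program, version, procedure, type_, serial, status, payload=b""):
    """Build one framed packet: length word + header + payload."""
    body = HEADER.pack(program, version, procedure, type_, serial, status) + payload
    return LEN.pack(LEN.size + len(body)) + body


def decode_packet(buf):
    """Split one packet off the front of buf.

    Returns (header_fields, payload, remaining_bytes). The length word is
    checked against the limits before any variable length data is touched.
    """
    if len(buf) < LEN.size:
        raise ValueError("length word incomplete")
    (total,) = LEN.unpack_from(buf, 0)
    if total < LEN.size + HEADER.size or total > MAX_PACKET_LEN:
        raise ValueError("packet length out of range")
    if len(buf) < total:
        raise ValueError("packet incomplete")
    header = HEADER.unpack_from(buf, LEN.size)
    payload = buf[LEN.size + HEADER.size:total]
    return header, payload, buf[total:]
```

With program=8, version=1, procedure=3 and 10 bytes of arguments this yields
the 38 byte call packet used in the "Method call" wire example of the page.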
 docs/api.rst                     |   2 +-
 docs/docs.rst                    |   3 -
 docs/internals/meson.build       |   1 -
 docs/internals/rpc.html.in       | 914 -------------------------------
 docs/kbase/index.rst             |   3 +
 docs/kbase/internals/meson.build |   1 +
 docs/kbase/internals/rpc.rst     | 781 ++++++++++++++++++++++++++
 7 files changed, 786 insertions(+), 919 deletions(-)
 delete mode 100644 docs/internals/rpc.html.in
 create mode 100644 docs/kbase/internals/rpc.rst

diff --git a/docs/api.rst b/docs/api.rst
index d9f01fb403..325b9b840c 100644
--- a/docs/api.rst
+++ b/docs/api.rst
@@ -219,7 +219,7 @@ Daemon and Remote Access

 Access to libvirt drivers is primarily handled by the libvirtd daemon
 through the `remote <remote.html>`__ driver via an
-`RPC <internals/rpc.html>`__. Some hypervisors do support client-side
+`RPC <kbase/internals/rpc.html>`__. Some hypervisors do support client-side
 connections and responses, such as Test, OpenVZ, VMware, VirtualBox
 (vbox), ESX, Hyper-V, Xen, and Virtuozzo. The libvirtd daemon service is
 started on the host at system boot time and can also be restarted at any
diff --git a/docs/docs.rst b/docs/docs.rst
index 3387dacce8..0a698913be 100644
--- a/docs/docs.rst
+++ b/docs/docs.rst
@@ -154,9 +154,6 @@ Project development
 `API extensions <api_extension.html>`__
    Adding new public libvirt APIs

-`RPC protocol & APIs <internals/rpc.html>`__
-   RPC protocol information and API / dispatch guide
-
 `Functional testing <testsuites.html>`__
    Testing libvirt with
    `TCK test suite <testtck.html>`__ and
diff --git a/docs/internals/meson.build b/docs/internals/meson.build
index 68a2e70a3d..cbf0623c08 100644
--- a/docs/internals/meson.build
+++ b/docs/internals/meson.build
@@ -1,5 +1,4 @@
 internals_in_files = [
-  'rpc',
 ]

 html_xslt_gen_install_dir = docs_html_dir / 'internals'
diff --git a/docs/internals/rpc.html.in b/docs/internals/rpc.html.in
deleted file mode 100644
index ceb7dba5f2..0000000000
--- a/docs/internals/rpc.html.in
+++ /dev/null
@@ -1,914 +0,0 @@
-<?xml version="1.0" encoding="UTF-8"?>
-<!DOCTYPE html>
-<html xmlns="http://www.w3.org/1999/xhtml">
-  <body>
-    <h1>libvirt RPC infrastructure</h1>
-
-    <ul id="toc"></ul>
-
-    <p>
-      libvirt includes a basic protocol and code to implement
-      an extensible, secure client/server RPC service. This was
-      originally designed for communication between the libvirt
-      client library and the libvirtd daemon, but the code is
-      now isolated to allow reuse in other areas of libvirt code.
-      This document provides an overview of the protocol and
-      structure / operation of the internal RPC library APIs.
-    </p>
-
-
-    <h2><a id="protocol">RPC protocol</a></h2>
-
-    <p>
-      libvirt uses a simple, variable length, packet based RPC protocol.
-      All structured data within packets is encoded using the
-      <a href="https://en.wikipedia.org/wiki/External_Data_Representation">XDR standard</a>
-      as currently defined by <a href="https://tools.ietf.org/html/rfc4506">RFC 4506</a>.
-      On any connection running the RPC protocol, there can be multiple
-      programs active, each supporting one or more versions. A program
-      defines a set of procedures that it supports. The procedures can
-      support call+reply method invocation, asynchronous events,
-      and generic data streams. Method invocations can be overlapped,
-      so waiting for a reply to one will not block the receipt of the
-      reply to another outstanding method. The protocol was loosely
-      inspired by the design of SunRPC. The definition of the RPC
-      protocol is in the file <code>src/rpc/virnetprotocol.x</code>
-      in the libvirt source tree.
-    </p>
-
-    <h3><a href="protocolframing">Packet framing</a></h3>
-
-    <p>
-      On the wire, there is no explicit packet framing marker. Instead
-      each packet is preceded by an unsigned 32-bit integer giving
-      the total length of the packet in bytes. This length includes
-      the 4-bytes of the length word itself. Conceptually the framing
-      looks like this:
-    </p>
-
-<pre>
-|~~~   Packet 1   ~~~|~~~   Packet 2   ~~~|~~~  Packet 3    ~~~|~~~
-
-+-------+------------+-------+------------+-------+------------+...
-| n=U32 | (n-4) * U8 | n=U32 | (n-4) * U8 | n=U32 | (n-4) * U8 |
-+-------+------------+-------+------------+-------+------------+...
-
-|~ Len ~|~   Data   ~|~ Len ~|~   Data   ~|~ Len ~|~   Data   ~|~
-
-</pre>
-
-    <h3><a href="protocoldata">Packet data</a></h3>
-
-    <p>
-      The data in each packet is split into two parts, a short
-      fixed length header, followed by a variable length payload.
-      So a packet from the illustration above is more correctly
-      shown as
-    </p>
-
-<pre>
-
-+-------+-------------+---------------....---+
-| n=U32 | 6*U32       | (n-(7*4))*U8         |
-+-------+-------------+---------------....---+
-
-|~ Len ~|~  Header   ~|~  Payload     ....  ~|
-</pre>
-
-
-    <h3><a href="protocolheader">Packet header</a></h3>
-    <p>
-      The header contains 6 fields, encoded as signed/unsigned 32-bit
-      integers.
-    </p>
-
-    <pre>
-+---------------+
-| program=U32   |
-+---------------+
-| version=U32   |
-+---------------+
-| procedure=S32 |
-+---------------+
-| type=S32      |
-+---------------+
-| serial=U32    |
-+---------------+
-| status=S32    |
-+---------------+
-    </pre>
-
-    <dl>
-      <dt><code>program</code></dt>
-      <dd>
-        This is an arbitrarily chosen number that will uniquely
-        identify the "service" running over the stream.
-      </dd>
-      <dt><code>version</code></dt>
-      <dd>
-        This is the version number of the program, by convention
-        starting from '1'. When an incompatible change is made
-        to a program, the version number is incremented. Ideally
-        both versions will then be supported on the wire in
-        parallel for backwards compatibility.
-      </dd>
-      <dt><code>procedure</code></dt>
-      <dd>
-        This is an arbitrarily chosen number that will uniquely
-        identify the method call, or event associated with the
-        packet. By convention, procedure numbers start from 1
-        and are assigned monotonically thereafter.
-      </dd>
-      <dt><code>type</code></dt>
-      <dd>
-        <p>
-        This can be one of the following enumeration values
-        </p>
-        <ol>
-          <li>call: invocation of a method call</li>
-          <li>reply: completion of a method call</li>
-          <li>event: an asynchronous event</li>
-          <li>stream: control info or data from a stream</li>
-        </ol>
-      </dd>
-      <dt><code>serial</code></dt>
-      <dd>
-        This is a number that starts from 1 and increases
-        each time a method call packet is sent. A reply or
-        stream packet will have a serial number matching the
-        original method call packet serial. Events always
-        have the serial number set to 0.
-      </dd>
-      <dt><code>status</code></dt>
-      <dd>
-        <p>
-        This can be one of the following enumeration values
-        </p>
-        <ol>
-          <li>ok: a normal packet. This is always set for method calls or events.
-            For replies it indicates successful completion of the method. For
-            streams it indicates confirmation of the end of file on the stream.</li>
-          <li>error: for replies this indicates that the method call failed
-            and error information is being returned. For streams this indicates
-            that not all data was sent and the stream has aborted</li>
-          <li>continue: for streams this indicates that further data packets
-            will be following</li>
-        </ol>
-      </dd>
-    </dl>
-
-    <h3><a href="protocolpayload">Packet payload</a></h3>
-
-    <p>
-      The payload of a packet will vary depending on the <code>type</code>
-      and <code>status</code> fields from the header.
-    </p>
-
-    <ul>
-      <li>type=call: the in parameters for the method call, XDR encoded</li>
-      <li>type=call-with-fds: number of file handles, then the in parameters for the method call, XDR encoded, followed by the file handles</li>
-      <li>type=reply+status=ok: the return value and/or out parameters for the method call, XDR encoded</li>
-      <li>type=reply+status=error: the error information for the method, a virErrorPtr XDR encoded</li>
-      <li>type=reply-with-fds+status=ok: number of file handles, the return value and/or out parameters for the method call, XDR encoded, followed by the file handles</li>
-      <li>type=reply-with-fds+status=error: number of file handles, the error information for the method, a virErrorPtr XDR encoded, followed by the file handles</li>
-      <li>type=event: the parameters for the event, XDR encoded</li>
-      <li>type=stream+status=ok: no payload</li>
-      <li>type=stream+status=error: the error information for the method, a virErrorPtr XDR encoded</li>
-      <li>type=stream+status=continue: the raw bytes of data for the stream. No XDR encoding</li>
-    </ul>
-
-    <p>
-      With the two packet types that support passing file descriptors, in
-      between the header and the payload there will be a 4-byte integer
-      specifying the number of file descriptors which are being sent.
-      The actual file handles are sent after the payload has been sent.
-      Each file handle has a single dummy byte transmitted as a carrier
-      for the out of band file descriptor. While the sender should always
-      send '\0' as the dummy byte value, the receiver ought to ignore the
-      value for the sake of robustness.
-    </p>
-
-    <p>
-      For the exact payload information for each procedure, consult the XDR protocol
-      definition for the program+version in question
-    </p>
-
-    <h3><a id="wireexamples">Wire examples</a></h3>
-
-    <p>
-      The following diagrams illustrate some example packet exchanges
-      between a client and server
-    </p>
-
-    <h4><a id="wireexamplescall">Method call</a></h4>
-
-    <p>
-      A single method call and successful
-      reply, for a program=8, version=1, procedure=3, with 10 bytes worth
-      of input args, and 4 bytes worth of return values. The overall input
-      packet length is 4 + 24 + 10 == 38, and output packet length 32
-    </p>
-
-    <pre>
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call)
-       +--+-----------------------+-----------+
-
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
-       +--+-----------------------+--------+
-    </pre>
-
-    <h4><a id="wireexamplescallerr">Method call with error</a></h4>
-
-    <p>
-      An unsuccessful method call will instead return an error object
-    </p>
-
-    <pre>
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S   (call)
-       +--+-----------------------+-----------+
-
-       +--+-----------------------+--------------------------+
-C <--  |48| 8 | 1 | 3 | 1 | 1 | 1 | .o.oOo.o.oOo.o.oOo.o.oOo |  <-- S  (error)
-       +--+-----------------------+--------------------------+
-    </pre>
-
-    <h4><a id="wireexamplescallup">Method call with upload stream</a></h4>
-
-    <p>
-      A method call which also involves uploading some data over
-      a stream will result in
-    </p>
-
-    <pre>
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call)
-       +--+-----------------------+-----------+
-
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
-       +--+-----------------------+--------+
-
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       ...
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+
-C -->  |24| 8 | 1 | 3 | 3 | 1 | 0 | --> S  (stream finish)
-       +--+-----------------------+
-       +--+-----------------------+
-C <--  |24| 8 | 1 | 3 | 3 | 1 | 0 | <-- S  (stream finish)
-       +--+-----------------------+
-    </pre>
-
-    <h4><a id="wireexamplescallbi">Method call bidirectional stream</a></h4>
-
-    <p>
-      A method call which also involves a bi-directional stream will
-      result in
-    </p>
-
-    <pre>
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call)
-       +--+-----------------------+-----------+
-
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
-       +--+-----------------------+--------+
-
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       ..
-       +--+-----------------------+-------------....-------+
-C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
-       +--+-----------------------+-------------....-------+
-       +--+-----------------------+
-C -->  |24| 8 | 1 | 3 | 3 | 1 | 0 | --> S  (stream finish)
-       +--+-----------------------+
-       +--+-----------------------+
-C <--  |24| 8 | 1 | 3 | 3 | 1 | 0 | <-- S  (stream finish)
-       +--+-----------------------+
-    </pre>
-
-
-    <h4><a id="wireexamplescallmany">Method calls overlapping</a></h4>
-    <pre>
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call 1)
-       +--+-----------------------+-----------+
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 2 | 0 | .o.oOo.o. |  --> S  (call 2)
-       +--+-----------------------+-----------+
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 2 | 0 | .o.oOo |  <-- S  (reply 2)
-       +--+-----------------------+--------+
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 3 | 0 | .o.oOo.o. |  --> S  (call 3)
-       +--+-----------------------+-----------+
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 3 | 0 | .o.oOo |  <-- S  (reply 3)
-       +--+-----------------------+--------+
-       +--+-----------------------+-----------+
-C -->  |38| 8 | 1 | 3 | 0 | 4 | 0 | .o.oOo.o. |  --> S  (call 4)
-       +--+-----------------------+-----------+
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply 1)
-       +--+-----------------------+--------+
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 4 | 0 | .o.oOo |  <-- S  (reply 4)
-       +--+-----------------------+--------+
-    </pre>
-
-    <h4><a id="wireexamplescallfd">Method call with passed FD</a></h4>
-
-    <p>
-      A single method call with 2 passed file descriptors and successful
-      reply, for a program=8, version=1, procedure=3, with 10 bytes worth
-      of input args, and 4 bytes worth of return values. The number of
-      file descriptors is encoded as a 32-bit int. Each file descriptor
-      then has a 1 byte dummy payload. The overall input
-      packet length is 4 + 24 + 4 + 2 + 10 == 44, and output packet length 32.
-    </p>
-
-    <pre>
-       +--+-----------------------+---------------+-------+
-C -->  |44| 8 | 1 | 3 | 0 | 1 | 0 | 2 | .o.oOo.o. | 0 | 0 |  --> S  (call)
-       +--+-----------------------+---------------+-------+
-
-       +--+-----------------------+--------+
-C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
-       +--+-----------------------+--------+
-    </pre>
-
-
-    <h2><a id="security">RPC security</a></h2>
-
-    <p>
-      There are various things to consider to ensure an implementation
-      of the RPC protocol can be satisfactorily secured
-    </p>
-
-    <h3><a id="securitytls">Authentication/encryption</a></h3>
-
-    <p>
-      The basic RPC protocol does not define or require any specific
-      authentication/encryption capabilities. A generic solution to
-      providing encryption for the protocol is to run the protocol
-      over a TLS encrypted data stream. x509 certificate checks can
-      be done to form a crude authentication mechanism. It is also
-      possible for an RPC program to negotiate an encryption /
-      authentication capability, such as SASL, which may then also
-      provide per-packet data encryption. Finally the protocol data
-      stream can of course be tunnelled over transports such as SSH.
-    </p>
-
-    <h3><a id="securitylimits">Data limits</a></h3>
-
-    <p>
-      Although the protocol itself defines many arbitrary sized data values in the
-      payloads, to avoid denial of service attacks there are a number of size limit
-      checks prior to encoding or decoding data. There is a limit on the maximum
-      size of a single RPC message, a limit on the maximum string length, and limits
-      on any other parameter which uses a variable length array. These limits can
-      be raised, subject to agreement between client/server, without otherwise
-      breaking compatibility of the RPC data on the wire.
-    </p>
-
-    <h3><a id="securityvalidate">Data validation</a></h3>
-
-    <p>
-      It is important that all data be fully validated before performing
-      any actions based on the data. When reading an RPC packet, the
-      first four bytes must be read and the max packet size limit validated,
-      before any attempt is made to read the variable length packet data.
-      After a complete packet has been read, the header must be decoded
-      and all 6 fields fully validated, before attempting to dispatch
-      the payload. Once dispatched, the payload can be decoded and passed
-      on to the appropriate API for execution. The RPC code must not take
-      any action based on the payload, since it has no way to validate
-      the semantics of the payload data. It must delegate this to the
-      execution API (e.g. corresponding libvirt public API).
-    </p>
-
-    <h2><a id="internals">RPC internal APIs</a></h2>
-
-    <p>
-      The generic internal RPC library code lives in the <code>src/rpc/</code>
-      directory of the libvirt source tree. Unless otherwise noted, the
-      objects are all threadsafe. The core object types and their
-      purposes are:
-    </p>
-
-    <h3><a id="apioverview">Overview of RPC objects</a></h3>
-
-    <p>
-      The following is a high level overview of the role of each
-      of the main RPC objects
-    </p>
-
-    <dl>
-      <dt><code>virNetSASLContext *</code> (virnetsaslcontext.h)</dt>
-      <dd>The virNetSASLContext APIs maintain SASL state for a network
-        service (server or client). This is primarily used on the server
-        to provide an access control list of SASL usernames permitted as
-        clients.
-      </dd>
-
-      <dt><code>virNetSASLSession *</code> (virnetsaslcontext.h)</dt>
-      <dd>The virNetSASLSession APIs maintain SASL state for a single
-        network connection (socket). This is used to perform the multi-step
-        SASL handshake and perform encryption/decryption of data once
-        authenticated, via integration with virNetSocket.
-      </dd>
-
-      <dt><code>virNetTLSContext *</code> (virnettlscontext.h)</dt>
-      <dd>The virNetTLSContext APIs maintain TLS state for a network
-        service (server or client). This is primarily used on the server
-        to provide an access control list of x509 distinguished names, as
-        well as diffie-hellman keys. It can also do validation of
-        x509 certificates prior to initiating a connection, in order
-        to improve detection of configuration errors.
-      </dd>
-
-      <dt><code>virNetTLSSession *</code> (virnettlscontext.h)</dt>
-      <dd>The virNetTLSSession APIs maintain TLS state for a single
-        network connection (socket). This is used to perform the multi-step
-        TLS handshake and perform encryption/decryption of data once
-        authenticated, via integration with virNetSocket.
-      </dd>
-
-      <dt><code>virNetSocket *</code> (virnetsocket.h)</dt>
-      <dd>The virNetSocket APIs provide a higher level wrapper around
-        the raw BSD sockets and getaddrinfo APIs. They allow for creation
-        of both server and client sockets. Data transports supported are
-        TCP, UNIX, SSH tunnel or external command tunnel. Internally the
-        TCP socket impl uses the getaddrinfo APIs to ensure correct
-        protocol-independent behaviour, thus supporting both IPv4 and IPv6.
-        The socket APIs can be associated with a virNetSASLSession * or
-        virNetTLSSession * object to allow seamless encryption/decryption
-        of all writes and reads. For UNIX sockets it is possible to obtain
-        the remote client user ID and process ID. Integration with the
-        libvirt event loop also allows use of callbacks for notification
-        of various I/O conditions
-      </dd>
-
-      <dt><code>virNetMessage *</code> (virnetmessage.h)</dt>
-      <dd>The virNetMessage APIs provide a wrapper around the libxdr
-        API calls, to facilitate processing and creation of RPC
-        packets. There are convenience APIs for encoding/decoding the
-        packet headers, encoding/decoding the payload using an XDR
-        filter, encoding/decoding a raw payload (for streams), and
-        encoding a virErrorPtr object. There is also a means to
-        add to/serve from a linked-list queue of messages.</dd>
-
-      <dt><code>virNetClient *</code> (virnetclient.h)</dt>
-      <dd>The virNetClient APIs provide a way to connect to a
-        remote server and run one or more RPC protocols over
-        the connection. Connections can be made over TCP, UNIX
-        sockets, SSH tunnels, or external command tunnels. There
-        is support for both TLS and SASL session encryption.
-        The client also supports management of multiple data streams
-        over each connection. Each client object can be used from
-        multiple threads concurrently, with method calls/replies
-        being interleaved on the wire as required.
-      </dd>
-
-      <dt><code>virNetClientProgram *</code> (virnetclientprogram.h)</dt>
-      <dd>The virNetClientProgram APIs are used to register a
-        program+version with the connection. This then enables
-        invocation of method calls, receipt of asynchronous
-        events and use of data streams, within that program+version.
-        When created a set of callbacks must be supplied to take
-        care of dispatching any incoming asynchronous events.
-      </dd>
-
-      <dt><code>virNetClientStream *</code> (virnetclientstream.h)</dt>
-      <dd>The virNetClientStream APIs are used to control transmission and
-        receipt of data over a stream active on a client. Streams provide
-        a low latency, unlimited length, bi-directional raw data exchange
-        mechanism layered over the RPC connection
-      </dd>
-
-      <dt><code>virNetServer *</code> (virnetserver.h)</dt>
-      <dd>The virNetServer APIs are used to manage a network server. A
-        server exposes one or more programs, over one or more services.
-        It manages multiple client connections invoking multiple RPC
-        calls in parallel, with dispatch across multiple worker threads.
-      </dd>
-
-      <dt><code>virNetDaemon *</code> (virnetdaemon.h)</dt>
-      <dd>The virNetDaemon APIs are used to manage a daemon process. A
-        daemon is a process that might expose one or more servers.  It
-        handles most process-related details; network-related details
-        should be part of the underlying server.
-      </dd>
-
-      <dt><code>virNetServerClient *</code> (virnetserverclient.h)</dt>
-      <dd>The virNetServerClient APIs are used to manage I/O related
-        to a single client network connection. It handles initial
-        validation and routing of incoming RPC packets, and transmission
-        of outgoing packets.
-      </dd>
-
-      <dt><code>virNetServerProgram *</code> (virnetserverprogram.h)</dt>
-      <dd>The virNetServerProgram APIs are used to provide the implementation
-        of a single program/version set. Primarily this includes a set of
-        callbacks used to actually invoke the APIs corresponding to
-        program procedure numbers. It is responsible for all the serialization
-        of payloads to/from XDR.</dd>
-
-      <dt><code>virNetServerService *</code> (virnetserverservice.h)</dt>
-      <dd>The virNetServerService APIs are used to connect the server to
-        one or more network protocols. A single service may involve multiple
-        sockets (i.e. both IPv4 and IPv6). A service also has an associated
-        authentication policy for incoming clients.
-      </dd>
-    </dl>
-
-    <h3><a id="apiclientdispatch">Client RPC dispatch</a></h3>
-
-    <p>
-      The client RPC code must allow for multiple overlapping RPC method
-      calls to be invoked, transmission and receipt of data for multiple
-      streams and receipt of asynchronous events. Understandably this
-      involves coordination of multiple threads.
-    </p>
-
-    <p>
-      The core requirement in the client dispatch code is that only
-      one thread is allowed to be performing I/O on the socket at
-      any time. This thread is said to be "holding the buck". When
-      any other thread comes along and needs to do I/O it must place
-      its packets on a queue and delegate processing of them to the
-      thread that has the buck. This thread will send out the method
-      call, and if it sees a reply will pass it back to the waiting
-      thread. If the other thread's reply hasn't arrived, by the time
-      the main thread has got its own reply, then it will transfer
-      responsibility for I/O to the thread that has been waiting the
-      longest. It is said to be "passing the buck" for I/O.
-    </p>
-
-    <p>
-      When no thread is performing any RPC method call, or sending
-      stream data there is still a need to monitor the socket for
-      incoming I/O related to asynchronous events, or stream data
-      receipt. For this task, a watch is registered with the event
-      loop which triggers whenever the socket is readable. This
-      watch is automatically disabled whenever any other thread
-      grabs the buck, and re-enabled when the buck is released.
-    </p>
-
-    <h4><a id="apiclientdispatchex1">Example with buck passing</a></h4>
-
-    <p>
-      In the first example, a second thread issues an API call
-      while the first thread holds the buck. The reply to the
-      first call arrives first, so the buck is passed to the
-      second thread.
-    </p>
-
-    <pre>
-        Thread-1
-           |
-           V
-       Call API1()
-           |
-           V
-       Grab Buck
-           |           Thread-2
-           V              |
-       Send method1       V
-           |          Call API2()
-           V              |
-        Wait I/O          V
-           |<--------Queue method2
-           V              |
-       Send method2       V
-           |          Wait for buck
-           V              |
-        Wait I/O          |
-           |              |
-           V              |
-       Recv reply1        |
-           |              |
-           V              |
-       Pass the buck----->|
-           |              V
-           V           Wait I/O
-       Return API1()      |
-                          V
-                      Recv reply2
-                          |
-                          V
-                     Release the buck
-                          |
-                          V
-                      Return API2()
-    </pre>
-
-    <h4><a id="apiclientdispatchex2">Example without buck passing</a></h4>
-
-    <p>
-      In this second example, a second thread issues an API call
-      which is sent and replied to, before the first thread's
-      API call has completed. The first thread thus notifies
-      the second that its reply is ready, and there is no need
-      to pass the buck
-    </p>
-
-    <pre>
-        Thread-1
-           |
-           V
-       Call API1()
-           |
-           V
-       Grab Buck
-           |           Thread-2
-           V              |
-       Send method1       V
-           |          Call API2()
-           V              |
-        Wait I/O          V
-           |<--------Queue method2
-           V              |
-       Send method2       V
-           |          Wait for buck
-           V              |
-        Wait I/O          |
-           |              |
-           V              |
-       Recv reply2        |
-           |              |
-           V              |
-      Notify reply2------>|
-           |              V
-           V          Return API2()
-        Wait I/O
-           |
-           V
-       Recv reply1
-           |
-           V
-     Release the buck
-           |
-           V
-       Return API1()
-    </pre>
-
-    <h4><a id="apiclientdispatchex3">Example with async events</a></h4>
-
-    <p>
-      In this example, only one thread is present and it has to
-      deal with some async events arriving. The events are actually
-      dispatched to the application from the event loop thread
-    </p>
-
-    <pre>
-        Thread-1
-           |
-           V
-       Call API1()
-           |
-           V
-       Grab Buck
-           |
-           V
-       Send method1
-           |
-           V
-        Wait I/O
-           |          Event thread
-           V              ...
-       Recv event1         |
-           |               V
-           V          Wait for timer/fd
-       Queue event1        |
-           |               V
-           V           Timer fires
-        Wait I/O           |
-           |               V
-           V           Emit event1
-       Recv reply1         |
-           |               V
-           V          Wait for timer/fd
-       Return API1()       |
-                          ...
-    </pre>
-
-    <h3><a id="apiserverdispatch">Server RPC dispatch</a></h3>
-
-    <p>
-      The RPC server code must support receipt of incoming RPC requests from
-      multiple client connections, and parallel processing of all RPC
-      requests, even many from a single client. This goal is achieved through
-      a combination of event driven I/O, and multiple processing threads.
-    </p>
-
-    <p>
-      The main libvirt event loop thread is responsible for performing all
-      socket I/O. It will read incoming packets from clients and will
-      transmit outgoing packets to clients. It will handle the I/O to/from
-      streams associated with client API calls. When doing client I/O it
-      will also pass the data through any applicable encryption layer
-      (through use of the virNetSocket / virNetTLSSession and virNetSASLSession
-      integration). What is paramount is that the event loop thread never
-      do any task that can take a non-trivial amount of time.
-    </p>
-
-    <p>
-      When reading packets, the event loop will first read the 4 byte length
-      word. This is validated to make sure it does not exceed the maximum
-      permissible packet size, and the client is set to allow receipt of the
-      rest of the packet data. Once a complete packet has been received, the
-      next step is to decode the RPC header. The header is validated to
-      ensure the request is sensible, ie the server should not receive a
-      method reply from a client. If the client has not yet authenticated,
-      an access control list check is also performed to make sure the procedure
-      is one of those allowed prior to auth. If the packet is a method
-      call, it will be placed on a global processing queue. The event loop
-      thread is now done with the packet for the time being.
-    </p>
-
-    <p>
-      The server has a pool of worker threads, which wait for method call
-      packets to be queued. One of them will grab the new method call off
-      the queue for processing. The first step is to decode the payload of
-      the packet to extract the method call arguments. The worker does not
-      attempt to do any semantic validation of the arguments, except to make
-      sure the size of any variable length fields is below defined limits.
-    </p>
-
-    <p>
-      The worker now invokes the libvirt API call that corresponds to the
-      procedure number in the packet header. The worker is thus kept busy
-      until the API call completes. The implementation of the API call
-      is responsible for doing semantic validation of parameters and any
-      MAC security checks on the objects affected.
-    </p>
-
-    <p>
-      Once the API call has completed, the worker thread will take the
-      return value and output parameters, or error object and encode
-      them into a reply packet. Again it does not attempt to do any
-      semantic validation of output data, aside from variable length
-      field limit checks. The worker thread puts the reply packet on
-      the transmission queue for the client. The worker is now finished
-      and goes back to wait for another incoming method call.
-    </p>
-
-    <p>
-      The main event loop is back in charge and when the client socket
-      becomes writable, it will start sending the method reply packet
-      back to the client.
-    </p>
-
-    <p>
-      At any time the libvirt connection object can emit asynchronous
-      events. These are handled by callbacks in the main event thread.
-      The callback will simply encode the event parameters into a new
-      data packet and place the packet on the client transmission
-      queue.
-    </p>
-
-    <p>
-      Incoming and outgoing stream packets are also directly handled
-      by the main event thread. When an incoming stream packet is
-      received, instead of placing it in the global dispatch queue
-      for the worker threads, it is sidetracked into a per-stream
-      processing queue. When the stream becomes writable, queued
-      incoming stream packets will be processed, passing their data
-      payload on the stream. Conversely when the stream becomes
-      readable, chunks of data will be read from it, encoded into
-      new outgoing packets, and placed on the client's transmit
-      queue.
-    </p>
-
-    <h4><a id="apiserverdispatchex1">Example with overlapping methods</a></h4>
-
-    <p>
-      This example illustrates processing of two incoming methods with
-      overlapping execution
-    </p>
-
-    <pre>
-   Event thread    Worker 1       Worker 2
-       |               |              |
-       V               V              V
-    Wait I/O       Wait Job       Wait Job
-       |               |              |
-       V               |              |
-   Recv method1        |              |
-       |               |              |
-       V               |              |
-   Queue method1       V              |
-       |          Serve method1       |
-       V               |              |
-    Wait I/O           V              |
-       |           Call API1()        |
-       V               |              |
-   Recv method2        |              |
-       |               |              |
-       V               |              |
-   Queue method2       |              V
-       |               |         Serve method2
-       V               V              |
-    Wait I/O      Return API1()       V
-       |               |          Call API2()
-       |               V              |
-       V         Queue reply1         |
-   Send reply1         |              |
-       |               V              V
-       V           Wait Job       Return API2()
-    Wait I/O           |              |
-       |              ...             V
-       V                          Queue reply2
-   Send reply2                        |
-       |                              V
-       V                          Wait Job
-    Wait I/O                          |
-       |                             ...
-      ...
-    </pre>
-
-    <h4><a id="apiserverdispatchex2">Example with stream data</a></h4>
-
-    <p>
-      This example illustrates processing of stream data
-    </p>
-
-    <pre>
-   Event thread
-       |
-       V
-    Wait I/O
-       |
-       V
-   Recv stream1
-       |
-       V
-   Queue stream1
-       |
-       V
-    Wait I/O
-       |
-       V
-   Recv stream2
-       |
-       V
-   Queue stream2
-       |
-       V
-    Wait I/O
-       |
-       V
-   Write stream1
-       |
-       V
-   Write stream2
-       |
-       V
-    Wait I/O
-       |
-      ...
-    </pre>
-
-  </body>
-</html>
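
The server dispatch flow described in the page above (the event loop queues decoded method calls, a worker picks one up, invokes the API, and queues the reply) can be sketched roughly as follows. This is a minimal illustration in Python, not libvirt's C implementation: `run_dispatch`, `handlers`, the tuple layouts, and the "ok"/"error" strings are all hypothetical names chosen for this sketch.

```python
import queue
import threading


def run_dispatch(calls, handlers, workers=2):
    """Run (serial, procedure, args) calls on worker threads; return replies by serial."""
    jobs = queue.Queue()      # stands in for the global processing queue
    replies = queue.Queue()   # stands in for a client's transmission queue

    def worker():
        while True:
            job = jobs.get()
            if job is None:   # sentinel: no more work for this worker
                return
            serial, procedure, args = job
            try:
                # Semantic validation is left to the API being invoked.
                replies.put((serial, "ok", handlers[procedure](*args)))
            except Exception as exc:
                replies.put((serial, "error", str(exc)))

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for call in calls:        # the "event loop" queues each decoded method call
        jobs.put(call)
    for _ in threads:         # one sentinel per worker so every thread exits
        jobs.put(None)
    for t in threads:
        t.join()

    out = {}
    while not replies.empty():
        serial, status, value = replies.get()
        out[serial] = (status, value)
    return out


# Two hypothetical procedures: one succeeds, one fails inside the "API".
handlers = {1: lambda a, b: a + b, 2: lambda: 1 // 0}
result = run_dispatch([(1, 1, (2, 3)), (2, 2, ())], handlers)
```

Replies are keyed by serial here because, as the page notes, overlapping calls may complete in any order; the serial is what lets the client match a reply to its call.
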
diff --git a/docs/kbase/index.rst b/docs/kbase/index.rst
index 2125bf4252..31711d908b 100644
--- a/docs/kbase/index.rst
+++ b/docs/kbase/index.rst
@@ -94,3 +94,6 @@ Internals

 `Lock managers <internals/locking.html>`__
    Use lock managers to protect disk content
+
+`RPC protocol & APIs <internals/rpc.html>`__
+   RPC protocol information and API / dispatch guide
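
The packet layout documented by the RPC page indexed here (a 4-byte length word that counts itself, a header of six 32-bit fields, then the payload) can be sketched as follows. This is a minimal sketch in Python rather than libvirt's C code; `encode_packet`, `decode_packet`, and `MAX_PACKET_LEN` are hypothetical names, and the size cap is an arbitrary value picked only for illustration. XDR integers are big-endian, which is what the `>` in the format strings expresses.

```python
import struct

LEN_FMT = ">I"            # XDR unsigned int: 4 bytes, big-endian
HEADER_FMT = ">IIiiIi"    # program, version, procedure, type, serial, status
HEADER_LEN = struct.calcsize(HEADER_FMT)   # 6 * 4 = 24 bytes
MAX_PACKET_LEN = 1 << 20  # hypothetical cap, chosen only for this sketch


def encode_packet(program, version, procedure, type_, serial, status, payload=b""):
    """Return length word + header + payload; the length counts itself."""
    header = struct.pack(HEADER_FMT, program, version, procedure, type_, serial, status)
    total = 4 + HEADER_LEN + len(payload)
    return struct.pack(LEN_FMT, total) + header + payload


def decode_packet(data):
    """Split one complete packet into (header fields tuple, payload bytes)."""
    (total,) = struct.unpack_from(LEN_FMT, data, 0)
    # Validate the length word before trusting anything that follows it.
    if total < 4 + HEADER_LEN or total > MAX_PACKET_LEN:
        raise ValueError("bad packet length: %d" % total)
    if len(data) < total:
        raise ValueError("incomplete packet")
    header = struct.unpack_from(HEADER_FMT, data, 4)
    payload = data[4 + HEADER_LEN:total]
    return header, payload


# The "Method call" wire example: program=8, version=1, procedure=3,
# type=call (0), serial=1, status=ok (0), with 10 bytes of arguments.
call = encode_packet(8, 1, 3, 0, 1, 0, b"\x00" * 10)
```

Checking the length word first mirrors the validation order the page asks for: the size limit is enforced before any variable length data is read.
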
diff --git a/docs/kbase/internals/meson.build b/docs/kbase/internals/meson.build
index 8195d7caf0..879c4b2de8 100644
--- a/docs/kbase/internals/meson.build
+++ b/docs/kbase/internals/meson.build
@@ -4,6 +4,7 @@ docs_kbase_internals_files = [
   'incremental-backup',
   'locking',
   'migration',
+  'rpc',
 ]


diff --git a/docs/kbase/internals/rpc.rst b/docs/kbase/internals/rpc.rst
new file mode 100644
index 0000000000..02bc880044
--- /dev/null
+++ b/docs/kbase/internals/rpc.rst
@@ -0,0 +1,781 @@
+==========================
+libvirt RPC infrastructure
+==========================
+
+.. contents::
+
+libvirt includes a basic protocol and code to implement an extensible, secure
+client/server RPC service. This was originally designed for communication
+between the libvirt client library and the libvirtd daemon, but the code is now
+isolated to allow reuse in other areas of libvirt code. This document provides
+an overview of the protocol and structure / operation of the internal RPC
+library APIs.
+
+RPC protocol
+------------
+
+libvirt uses a simple, variable length, packet based RPC protocol. All
+structured data within packets is encoded using the `XDR
+standard <https://en.wikipedia.org/wiki/External_Data_Representation>`__ as
+currently defined by `RFC 4506 <https://tools.ietf.org/html/rfc4506>`__. On any
+connection running the RPC protocol, there can be multiple programs active, each
+supporting one or more versions. A program defines a set of procedures that it
+supports. The procedures can support call+reply method invocation, asynchronous
+events, and generic data streams. Method invocations can be overlapped, so
+waiting for a reply to one will not block the receipt of the reply to another
+outstanding method. The protocol was loosely inspired by the design of SunRPC.
+The definition of the RPC protocol is in the file ``src/rpc/virnetprotocol.x``
+in the libvirt source tree.
+
+`Packet framing <protocolframing>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+On the wire, there is no explicit packet framing marker. Instead each packet is
+preceded by an unsigned 32-bit integer giving the total length of the packet in
+bytes. This length includes the 4 bytes of the length word itself. Conceptually
+the framing looks like this:
+
+::
+
+   |~~~   Packet 1   ~~~|~~~   Packet 2   ~~~|~~~  Packet 3    ~~~|~~~
+
+   +-------+------------+-------+------------+-------+------------+...
+   | n=U32 | (n-4) * U8 | n=U32 | (n-4) * U8 | n=U32 | (n-4) * U8 |
+   +-------+------------+-------+------------+-------+------------+...
+
+   |~ Len ~|~   Data   ~|~ Len ~|~   Data   ~|~ Len ~|~   Data   ~|~
+
+`Packet data <protocoldata>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The data in each packet is split into two parts, a short fixed length header,
+followed by a variable length payload. So a packet from the illustration above
+is more correctly shown as
+
+::
+
+
+   +-------+-------------+---------------....---+
+   | n=U32 | 6*U32       | (n-(7*4))*U8         |
+   +-------+-------------+---------------....---+
+
+   |~ Len ~|~  Header   ~|~  Payload     ....  ~|
+
+`Packet header <protocolheader>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The header contains 6 fields, encoded as signed/unsigned 32-bit integers.
+
+::
+
+   +---------------+
+   | program=U32   |
+   +---------------+
+   | version=U32   |
+   +---------------+
+   | procedure=S32 |
+   +---------------+
+   | type=S32      |
+   +---------------+
+   | serial=U32    |
+   +---------------+
+   | status=S32    |
+   +---------------+
+
+``program``
+   This is an arbitrarily chosen number that will uniquely identify the
+   "service" running over the stream.
+``version``
+   This is the version number of the program, by convention starting from '1'.
+   When an incompatible change is made to a program, the version number is
+   incremented. Ideally both versions will then be supported on the wire in
+   parallel for backwards compatibility.
+``procedure``
+   This is an arbitrarily chosen number that will uniquely identify the method
+   call, or event associated with the packet. By convention, procedure numbers
+   start from 1 and are assigned monotonically thereafter.
+``type``
+   This can be one of the following enumeration values
+
+   #. call: invocation of a method call
+   #. reply: completion of a method call
+   #. event: an asynchronous event
+   #. stream: control info or data from a stream
+
+``serial``
+   This is a number that starts from 1 and increases each time a method call
+   packet is sent. A reply or stream packet will have a serial number matching
+   the original method call packet serial. Events always have the serial number
+   set to 0.
+``status``
+   This can be one of the following enumeration values
+
+   #. ok: a normal packet. This is always set for method calls or events. For
+      replies it indicates successful completion of the method. For streams it
+      indicates confirmation of the end of file on the stream.
+   #. error: for replies this indicates that the method call failed and error
+      information is being returned. For streams this indicates that not all
+      data was sent and the stream has aborted
+   #. continue: for streams this indicates that further data packets will be
+      following
+
+`Packet payload <protocolpayload>`__
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The payload of a packet will vary depending on the ``type`` and ``status``
+fields from the header.
+
+-  type=call: the in parameters for the method call, XDR encoded
+-  type=call-with-fds: number of file handles, then the in parameters for the
+   method call, XDR encoded, followed by the file handles
+-  type=reply+status=ok: the return value and/or out parameters for the method
+   call, XDR encoded
+-  type=reply+status=error: the error information for the method, a virErrorPtr
+   XDR encoded
+-  type=reply-with-fds+status=ok: number of file handles, the return value
+   and/or out parameters for the method call, XDR encoded, followed by the file
+   handles
+-  type=reply-with-fds+status=error: number of file handles, the error
+   information for the method, a virErrorPtr XDR encoded, followed by the file
+   handles
+-  type=event: the parameters for the event, XDR encoded
+-  type=stream+status=ok: no payload
+-  type=stream+status=error: the error information for the method, a virErrorPtr
+   XDR encoded
+-  type=stream+status=continue: the raw bytes of data for the stream. No XDR
+   encoding
+
+With the two packet types that support passing file descriptors, in between the
+header and the payload there will be a 4-byte integer specifying the number of
+file descriptors which are being sent. The actual file handles are sent after
+the payload has been sent. Each file handle has a single dummy byte transmitted
+as a carrier for the out of band file descriptor. While the sender should always
+send '\0' as the dummy byte value, the receiver ought to ignore the value for
+the sake of robustness.
+
+For the exact payload information for each procedure, consult the XDR protocol
+definition for the program+version in question.
+
+Wire examples
+~~~~~~~~~~~~~
+
+The following diagrams illustrate some example packet exchanges between a client
+and server.
+
+Method call
+^^^^^^^^^^^
+
+A single method call and successful reply, for a program=8, version=1,
+procedure=3, with 10 bytes worth of input args, and 4 bytes worth of return
+values. The overall input packet length is 4 + 24 + 10 == 38, and output packet
+length 4 + 24 + 4 == 32.
+
+::
+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call)
+          +--+-----------------------+-----------+
+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
+          +--+-----------------------+--------+
+
+Method call with error
+^^^^^^^^^^^^^^^^^^^^^^
+
+An unsuccessful method call will instead return an error object
+
+::
+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S   (call)
+          +--+-----------------------+-----------+
+
+          +--+-----------------------+--------------------------+
+   C <--  |48| 8 | 1 | 3 | 1 | 1 | 1 | .o.oOo.o.oOo.o.oOo.o.oOo |  <-- S  (error)
+          +--+-----------------------+--------------------------+
+
+Method call with upload stream
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A method call which also involves uploading some data over a stream will result
+in
+
+::
+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call)
+          +--+-----------------------+-----------+
+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
+          +--+-----------------------+--------+
+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          ...
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+
+   C -->  |28| 8 | 1 | 3 | 3 | 1 | 0 | --> S  (stream finish)
+          +--+-----------------------+
+          +--+-----------------------+
+   C <--  |28| 8 | 1 | 3 | 3 | 1 | 0 | <-- S  (stream finish)
+          +--+-----------------------+
+
+Method call bidirectional stream
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A method call which also involves a bi-directional stream will result in
+
+::
+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call)
+          +--+-----------------------+-----------+
+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
+          +--+-----------------------+--------+
+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C <--  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  <-- S  (stream data down)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          ...
+          +--+-----------------------+-------------....-------+
+   C -->  |38| 8 | 1 | 3 | 3 | 1 | 2 | .o.oOo.o.oOo....o.oOo. |  --> S  (stream data up)
+          +--+-----------------------+-------------....-------+
+          +--+-----------------------+
+   C -->  |28| 8 | 1 | 3 | 3 | 1 | 0 | --> S  (stream finish)
+          +--+-----------------------+
+          +--+-----------------------+
+   C <--  |28| 8 | 1 | 3 | 3 | 1 | 0 | <-- S  (stream finish)
+          +--+-----------------------+
+
+Method calls overlapping
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+::
+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 1 | 0 | .o.oOo.o. |  --> S  (call 1)
+          +--+-----------------------+-----------+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 2 | 0 | .o.oOo.o. |  --> S  (call 2)
+          +--+-----------------------+-----------+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 2 | 0 | .o.oOo |  <-- S  (reply 2)
+          +--+-----------------------+--------+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 3 | 0 | .o.oOo.o. |  --> S  (call 3)
+          +--+-----------------------+-----------+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 3 | 0 | .o.oOo |  <-- S  (reply 3)
+          +--+-----------------------+--------+
+          +--+-----------------------+-----------+
+   C -->  |38| 8 | 1 | 3 | 0 | 4 | 0 | .o.oOo.o. |  --> S  (call 4)
+          +--+-----------------------+-----------+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply 1)
+          +--+-----------------------+--------+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 4 | 0 | .o.oOo |  <-- S  (reply 4)
+          +--+-----------------------+--------+
+
+Method call with passed FD
+^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+A single method call with 2 passed file descriptors and successful reply, for a
+program=8, version=1, procedure=3, with 10 bytes worth of input args, and 4
+bytes worth of return values. The number of file descriptors is encoded as a
+32-bit int. Each file descriptor then has a 1 byte dummy payload. The overall
+input packet length is 4 + 24 + 4 + 2 + 10 == 44, and output packet length 32.
+
+::
+
+          +--+-----------------------+---------------+-------+
+   C -->  |44| 8 | 1 | 3 | 4 | 1 | 0 | 2 | .o.oOo.o. | 0 | 0 |  --> S  (call)
+          +--+-----------------------+---------------+-------+
+
+          +--+-----------------------+--------+
+   C <--  |32| 8 | 1 | 3 | 1 | 1 | 0 | .o.oOo |  <-- S  (reply)
+          +--+-----------------------+--------+
+
+RPC security
+------------
+
+There are various things to consider to ensure an implementation of the RPC
+protocol can be satisfactorily secured.
+
+Authentication/encryption
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The basic RPC protocol does not define or require any specific
+authentication/encryption capabilities. A generic solution to providing
+encryption for the protocol is to run the protocol over a TLS encrypted data
+stream. x509 certificate checks can be done to form a crude authentication
+mechanism. It is also possible for an RPC program to negotiate an encryption /
+authentication capability, such as SASL, which may then also provide per-packet
+data encryption. Finally the protocol data stream can of course be tunnelled
+over transports such as SSH.
+
+Data limits
+~~~~~~~~~~~
+
+Although the protocol itself defines many arbitrary sized data values in the
+payloads, to avoid denial of service attacks there are a number of size limit
+checks prior to encoding or decoding data. There is a limit on the maximum size
+of a single RPC message, a limit on the maximum string length, and limits on
+any other parameter which uses a variable length array. These limits can be
+raised, subject to agreement between client/server, without otherwise breaking
+compatibility of the RPC data on the wire.
+
+Data validation
+~~~~~~~~~~~~~~~
+
+It is important that all data be fully validated before performing any actions
+based on the data. When reading an RPC packet, the first four bytes must be read
+and the max packet size limit validated, before any attempt is made to read the
+variable length packet data. After a complete packet has been read, the header
+must be decoded and all 6 fields fully validated, before attempting to dispatch
+the payload. Once dispatched, the payload can be decoded and passed on to the
+appropriate API for execution. The RPC code must not take any action based on
+the payload, since it has no way to validate the semantics of the payload data.
+It must delegate this to the execution API (e.g. corresponding libvirt public
+API).
+
+RPC internal APIs
+-----------------
+
+The generic internal RPC library code lives in the ``src/rpc/`` directory of the
+libvirt source tree. Unless otherwise noted, the objects are all threadsafe. The
+core object types and their purposes are:
+
+Overview of RPC objects
+~~~~~~~~~~~~~~~~~~~~~~~
+
+The following is a high level overview of the role of each of the main RPC
+objects.
+
+``virNetSASLContext *`` (virnetsaslcontext.h)
+   The virNetSASLContext APIs maintain SASL state for a network service (server
+   or client). This is primarily used on the server to provide an access control
+   list of SASL usernames permitted as clients.
+``virNetSASLSession *`` (virnetsaslcontext.h)
+   The virNetSASLSession APIs maintain SASL state for a single network
+   connection (socket). This is used to perform the multi-step SASL handshake
+   and perform encryption/decryption of data once authenticated, via integration
+   with virNetSocket.
+``virNetTLSContext *`` (virnettlscontext.h)
+   The virNetTLSContext APIs maintain TLS state for a network service (server or
+   client). This is primarily used on the server to provide an access control
+   list of x509 distinguished names, as well as diffie-hellman keys. It can also
+   do validation of x509 certificates prior to initiating a connection, in order
+   to improve detection of configuration errors.
+``virNetTLSSession *`` (virnettlscontext.h)
+   The virNetTLSSession APIs maintain TLS state for a single network connection
+   (socket). This is used to perform the multi-step TLS handshake and perform
+   encryption/decryption of data once authenticated, via integration with
+   virNetSocket.
+``virNetSocket *`` (virnetsocket.h)
+   The virNetSocket APIs provide a higher level wrapper around the raw BSD
+   sockets and getaddrinfo APIs. They allow for creation of both server and
+   client sockets. Data transports supported are TCP, UNIX, SSH tunnel or
+   external command tunnel. Internally the TCP socket impl uses the getaddrinfo
+   APIs to ensure correct protocol-independent behaviour, thus supporting
+   both IPv4 and IPv6. The socket APIs can be associated with a
+   ``virNetSASLSession *`` or ``virNetTLSSession *`` object to allow seamless
+   encryption/decryption of all writes and reads. For UNIX sockets it is
+   possible to obtain the remote client user ID and process ID. Integration with
+   the libvirt event loop also allows use of callbacks for notification of
+   various I/O conditions.
+``virNetMessage *`` (virnetmessage.h)
+   The virNetMessage APIs provide a wrapper around the libxdr API calls, to
+   facilitate processing and creation of RPC packets. There are convenience APIs
+   for encoding/decoding the packet headers, encoding/decoding the payload using
+   an XDR filter, encoding/decoding a raw payload (for streams), and encoding a
+   virErrorPtr object. There is also a means to add to/serve from a linked-list
+   queue of messages.
+``virNetClient *`` (virnetclient.h)
+   The virNetClient APIs provide a way to connect to a remote server and run one
+   or more RPC protocols over the connection. Connections can be made over TCP,
+   UNIX sockets, SSH tunnels, or external command tunnels. There is support for
+   both TLS and SASL session encryption. The client also supports management of
+   multiple data streams over each connection. Each client object can be used
+   from multiple threads concurrently, with method calls/replies being
+   interleaved on the wire as required.
+``virNetClientProgram *`` (virnetclientprogram.h)
+   The virNetClientProgram APIs are used to register a program+version with the
+   connection. This then enables invocation of method calls, receipt of
+   asynchronous events and use of data streams, within that program+version.
+   When created a set of callbacks must be supplied to take care of dispatching
+   any incoming asynchronous events.
+``virNetClientStream *`` (virnetclientstream.h)
+   The virNetClientStream APIs are used to control transmission and receipt of
+   data over a stream active on a client. Streams provide a low latency,
+   unlimited length, bi-directional raw data exchange mechanism layered over the
+   RPC connection.
+``virNetServer *`` (virnetserver.h)
+   The virNetServer APIs are used to manage a network server. A server exposes
+   one or more programs, over one or more services. It manages multiple client
+   connections invoking multiple RPC calls in parallel, with dispatch across
+   multiple worker threads.
+``virNetDaemon *`` (virnetdaemon.h)
+   The virNetDaemon APIs are used to manage a daemon process. A daemon is a
+   process that might expose one or more servers. It handles most
+   process-related details; network-related details belong in the underlying
+   server.
+``virNetServerClient *`` (virnetserverclient.h)
+   The virNetServerClient APIs are used to manage I/O related to a single client
+   network connection. It handles initial validation and routing of incoming RPC
+   packets, and transmission of outgoing packets.
+``virNetServerProgram *`` (virnetserverprogram.h)
+   The virNetServerProgram APIs are used to provide the implementation of a
+   single program/version set. Primarily this includes a set of callbacks used
+   to actually invoke the APIs corresponding to program procedure numbers. It is
+   responsible for all the serialization of payloads to/from XDR.
+``virNetServerService *`` (virnetserverservice.h)
+   The virNetServerService APIs are used to connect the server to one or more
+   network protocols. A single service may involve multiple sockets (i.e. both
+   IPv4 and IPv6). A service also has an associated authentication policy for
+   incoming clients.
+
+Client RPC dispatch
+~~~~~~~~~~~~~~~~~~~
+
+The client RPC code must allow for multiple overlapping RPC method calls to be
+invoked, transmission and receipt of data for multiple streams and receipt of
+asynchronous events. Understandably this involves coordination of multiple
+threads.
+
+The core requirement in the client dispatch code is that only one thread is
+allowed to perform I/O on the socket at any time. This thread is said to be
+"holding the buck". When any other thread comes along and needs to do I/O, it
+must place its packets on a queue and delegate processing of them to the thread
+that has the buck. The buck holder will send out the queued method call and, if
+it sees the reply, will pass it back to the waiting thread. If the other
+thread's reply has not arrived by the time the buck holder has received its own
+reply, the buck holder transfers responsibility for I/O to the thread that has
+been waiting the longest. It is said to be "passing the buck" for I/O.
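+
+The buck-passing rule described above can be sketched as a small Python model.
+This is an illustration only, not libvirt's client code: ``BuckClient``,
+``Call`` and ``FakeTransport`` are hypothetical names, the transport is an
+in-memory queue that echoes a reply for every request, and a ``None`` item on
+that queue is an assumed stand-in for whatever mechanism wakes the buck holder
+when another thread queues a call.
+

```python
import queue
import threading
from collections import deque


class FakeTransport:
    """Hypothetical stand-in for the socket: echoes a reply per request."""

    def __init__(self):
        self.inbound = queue.Queue()
        self.sent = []

    def send(self, serial):
        self.sent.append(serial)
        self.inbound.put((serial, f"reply-{serial}"))

    def wakeup(self):
        # Interrupts a blocked recv() so the buck holder rescans the queue.
        self.inbound.put(None)

    def recv(self):
        return self.inbound.get()


class Call:
    def __init__(self, serial, lock):
        self.serial = serial
        self.reply = None
        self.sent = False
        self.have_buck = False
        self.cond = threading.Condition(lock)


class BuckClient:
    def __init__(self, transport):
        self.transport = transport
        self.lock = threading.Lock()
        self.calls = deque()      # outstanding calls, oldest first
        self.buck_held = False

    def call(self, serial):
        me = Call(serial, self.lock)
        with self.lock:
            self.calls.append(me)
            if self.buck_held:
                # Delegate I/O to the buck holder and sleep until we get
                # either our reply or the buck itself.
                self.transport.wakeup()
                while me.reply is None and not me.have_buck:
                    me.cond.wait()
                if me.reply is not None:
                    return me.reply
            else:
                self.buck_held = True  # grab the buck
        return self._io_loop(me)

    def _io_loop(self, me):
        while True:
            with self.lock:
                unsent = [c for c in self.calls if not c.sent]
                for c in unsent:
                    c.sent = True
            for c in unsent:
                self.transport.send(c.serial)
            item = self.transport.recv()
            if item is None:
                continue  # woken up: another thread queued a call
            serial, payload = item
            with self.lock:
                target = next(c for c in self.calls if c.serial == serial)
                self.calls.remove(target)
                target.reply = payload
                if target is not me:
                    target.cond.notify()  # hand the reply to its owner
                    continue
                if self.calls:
                    # Pass the buck to the longest-waiting thread.
                    self.calls[0].have_buck = True
                    self.calls[0].cond.notify()
                else:
                    self.buck_held = False  # release the buck
                return me.reply
```

+
+Calling ``BuckClient(FakeTransport()).call(n)`` from several threads at once
+exercises both paths: some threads receive their reply from the current buck
+holder, while others are handed the buck and finish the I/O themselves.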
+
+When no thread is performing any RPC method call or sending stream data, there
+is still a need to monitor the socket for incoming I/O related to asynchronous
+events or stream data receipt. For this task, a watch is registered with the
+event loop which triggers whenever the socket is readable. This watch is
+automatically disabled whenever any other thread grabs the buck, and re-enabled
+when the buck is released.
+
+Example with buck passing
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In the first example, a second thread issues an API call while the first thread
+holds the buck. The reply to the first call arrives first, so the buck is passed
+to the second thread.
+
+::
+
+           Thread-1
+              |
+              V
+          Call API1()
+              |
+              V
+          Grab Buck
+              |           Thread-2
+              V              |
+          Send method1       V
+              |          Call API2()
+              V              |
+           Wait I/O          V
+              |<--------Queue method2
+              V              |
+          Send method2       V
+              |          Wait for buck
+              V              |
+           Wait I/O          |
+              |              |
+              V              |
+          Recv reply1        |
+              |              |
+              V              |
+          Pass the buck----->|
+              |              V
+              V           Wait I/O
+          Return API1()      |
+                             V
+                         Recv reply2
+                             |
+                             V
+                        Release the buck
+                             |
+                             V
+                         Return API2()
+
+Example without buck passing
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In this second example, a second thread issues an API call which is sent and
+replied to before the first thread's API call has completed. The first thread
+thus notifies the second that its reply is ready, and there is no need to pass
+the buck.
+
+::
+
+           Thread-1
+              |
+              V
+          Call API1()
+              |
+              V
+          Grab Buck
+              |           Thread-2
+              V              |
+          Send method1       V
+              |          Call API2()
+              V              |
+           Wait I/O          V
+              |<--------Queue method2
+              V              |
+          Send method2       V
+              |          Wait for buck
+              V              |
+           Wait I/O          |
+              |              |
+              V              |
+          Recv reply2        |
+              |              |
+              V              |
+         Notify reply2------>|
+              |              V
+              V          Return API2()
+           Wait I/O
+              |
+              V
+          Recv reply1
+              |
+              V
+        Release the buck
+              |
+              V
+          Return API1()
+
+Example with async events
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+In this example, only one thread is present and it has to deal with some async
+events arriving. The events are actually dispatched to the application from the
+event loop thread.
+
+::
+
+           Thread-1
+              |
+              V
+          Call API1()
+              |
+              V
+          Grab Buck
+              |
+              V
+          Send method1
+              |
+              V
+           Wait I/O
+              |          Event thread
+              V              ...
+          Recv event1         |
+              |               V
+              V          Wait for timer/fd
+          Queue event1        |
+              |               V
+              V           Timer fires
+           Wait I/O           |
+              |               V
+              V           Emit event1
+          Recv reply1         |
+              |               V
+              V          Wait for timer/fd
+          Return API1()       |
+                             ...
+
+Server RPC dispatch
+~~~~~~~~~~~~~~~~~~~
+
+The RPC server code must support receipt of incoming RPC requests from multiple
+client connections, and parallel processing of all RPC requests, even many from
+a single client. This goal is achieved through a combination of event driven
+I/O, and multiple processing threads.
+
+The main libvirt event loop thread is responsible for performing all socket I/O.
+It will read incoming packets from clients and will transmit outgoing packets to
+clients. It will handle the I/O to/from streams associated with client API
+calls. When doing client I/O it will also pass the data through any applicable
+encryption layer (through use of the virNetSocket / virNetTLSSession and
+virNetSASLSession integration). It is paramount that the event loop thread
+never performs any task that can take a non-trivial amount of time.
+
+When reading packets, the event loop will first read the 4 byte length word.
+This is validated to make sure it does not exceed the maximum permissible packet
+size, and the client is set to allow receipt of the rest of the packet data.
+Once a complete packet has been received, the next step is to decode the RPC
+header. The header is validated to ensure the request is sensible, i.e. the
+server should not receive a method reply from a client. If the client has not
+yet authenticated, an access control list check is also performed to make sure
+the procedure is one of those allowed prior to auth. If the packet is a method
+call, it will be placed on a global processing queue. The event loop thread is
+now done with the packet for the time being.
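+
+The checks in the paragraph above can be sketched in Python as follows. This is
+a rough model under stated assumptions rather than the server's actual code:
+the size limit, the numeric message type values, the header layout (six 4-byte
+big-endian fields), the convention that the length word counts itself, and the
+names ``check_length_word`` and ``route_packet`` are all chosen for
+illustration.
+

```python
import struct

MAX_PACKET = 16 * 1024 * 1024  # assumed limit, for illustration only
LEN_WORD = 4                   # size of the leading length word in bytes

# Assumed header layout: program, version, procedure, type, serial, status.
HEADER = struct.Struct(">IIiiIi")

# Assumed numeric values for the message types.
CALL, REPLY, MESSAGE, STREAM = 0, 1, 2, 3


class PacketError(Exception):
    """Raised when a packet fails validation."""


def check_length_word(data):
    """Validate the 4 byte length word; return the bytes still to be read."""
    (length,) = struct.unpack(">I", data[:LEN_WORD])
    if length < LEN_WORD + HEADER.size:
        raise PacketError("packet too short to hold a header")
    if length > MAX_PACKET:
        raise PacketError("packet exceeds maximum permissible size")
    return length - LEN_WORD


def route_packet(body, authenticated, pre_auth_procs):
    """Decode the header and decide which queue the packet belongs on."""
    prog, vers, proc, mtype, serial, status = HEADER.unpack_from(body)
    if mtype not in (CALL, STREAM):
        # e.g. a server should never receive a method reply from a client
        raise PacketError("unexpected message type from client")
    if not authenticated and proc not in pre_auth_procs:
        raise PacketError("procedure not allowed before authentication")
    return "dispatch-queue" if mtype == CALL else "stream-queue"
```
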
+
+The server has a pool of worker threads, which wait for method call packets to
+be queued. One of them will grab the new method call off the queue for
+processing. The first step is to decode the payload of the packet to extract the
+method call arguments. The worker does not attempt to do any semantic validation
+of the arguments, except to make sure the size of any variable length fields is
+below defined limits.
+
+The worker now invokes the libvirt API call that corresponds to the procedure
+number in the packet header. The worker is thus kept busy until the API call
+completes. The implementation of the API call is responsible for doing semantic
+validation of parameters and any MAC security checks on the objects affected.
+
+Once the API call has completed, the worker thread will take the return value
+and output parameters, or the error object, and encode them into a reply
+packet. Again it does not attempt to do any semantic validation of output data,
+aside from variable length field limit checks. The worker thread puts the reply
+packet on the transmission queue for the client. The worker is now finished and
+goes back to wait for another incoming method call.
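+
+The worker life cycle described above (wait for a job, invoke the API, queue
+the reply, wait again) can be modelled with a short Python sketch. The names
+``worker``, ``start_pool`` and ``stop_pool``, the tuple layout of a job and the
+``None`` shutdown sentinel are assumptions made for this example; payload
+decoding and encoding are omitted.
+

```python
import queue
import threading


def worker(jobs, handlers):
    """Serve queued method calls until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            break
        serial, proc, args, tx_queue = job
        try:
            # Invoke the API matching the procedure number; the worker
            # stays busy until the call completes.
            reply = ("ok", handlers[proc](*args))
        except Exception as exc:
            # An error object is sent back instead of a return value.
            reply = ("error", str(exc))
        # Queue the reply for transmission by the event loop thread.
        tx_queue.put((serial, reply))


def start_pool(jobs, handlers, count):
    """Start `count` worker threads sharing one global job queue."""
    threads = [
        threading.Thread(target=worker, args=(jobs, handlers))
        for _ in range(count)
    ]
    for t in threads:
        t.start()
    return threads


def stop_pool(jobs, threads):
    """Ask every worker to exit and wait for them to finish."""
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
```

+
+In this model the event loop side only ever calls ``jobs.put(...)`` and later
+drains each client's transmit queue, so it never blocks on an API call.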
+
+The main event loop is back in charge and when the client socket becomes
+writable, it will start sending the method reply packet back to the client.
+
+At any time the libvirt connection object can emit asynchronous events. These
+are handled by callbacks in the main event thread. The callback will simply
+encode the event parameters into a new data packet and place the packet on the
+client transmission queue.
+
+Incoming and outgoing stream packets are also directly handled by the main event
+thread. When an incoming stream packet is received, instead of placing it in the
+global dispatch queue for the worker threads, it is sidetracked into a
+per-stream processing queue. When the stream becomes writable, queued incoming
+stream packets will be processed, writing their data payload to the stream.
+Conversely, when the stream becomes readable, chunks of data will be read from
+it, encoded into new outgoing packets, and placed on the client's transmit
+queue.
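+
+A minimal Python sketch of the per-stream queueing described above follows.
+``StreamRelay`` and its method names are hypothetical, the chunk size is
+arbitrary, and the sketch assumes the stream accepts every queued payload in
+full once it is writable, which a real implementation cannot rely on.
+

```python
from collections import deque


class StreamRelay:
    """Toy model of one stream's queues inside the event loop thread."""

    def __init__(self, chunk_size=4):
        self.rx = deque()  # incoming stream packets awaiting a writable stream
        self.tx = deque()  # outgoing packets on the client's transmit queue
        self.chunk_size = chunk_size

    def on_packet(self, payload):
        # Incoming stream packets bypass the global dispatch queue.
        self.rx.append(payload)

    def on_stream_writable(self, stream_write):
        # Drain queued packets, in arrival order, into the stream.
        while self.rx:
            stream_write(self.rx.popleft())

    def on_stream_readable(self, data):
        # Split data read from the stream into outgoing packets.
        for i in range(0, len(data), self.chunk_size):
            self.tx.append(data[i:i + self.chunk_size])
```
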
+
+Example with overlapping methods
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+This example illustrates processing of two incoming methods with overlapping
+execution.
+
+::
+
+      Event thread    Worker 1       Worker 2
+          |               |              |
+          V               V              V
+       Wait I/O       Wait Job       Wait Job
+          |               |              |
+          V               |              |
+      Recv method1        |              |
+          |               |              |
+          V               |              |
+      Queue method1       V              |
+          |          Serve method1       |
+          V               |              |
+       Wait I/O           V              |
+          |           Call API1()        |
+          V               |              |
+      Recv method2        |              |
+          |               |              |
+          V               |              |
+      Queue method2       |              V
+          |               |         Serve method2
+          V               V              |
+       Wait I/O      Return API1()       V
+          |               |          Call API2()
+          |               V              |
+          V         Queue reply1         |
+      Send reply1         |              |
+          |               V              V
+          V           Wait Job       Return API2()
+       Wait I/O           |              |
+          |              ...             V
+          V                          Queue reply2
+      Send reply2                        |
+          |                              V
+          V                          Wait Job
+       Wait I/O                          |
+          |                             ...
+         ...
+
+Example with stream data
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+This example illustrates processing of stream data.
+
+::
+
+      Event thread
+          |
+          V
+       Wait I/O
+          |
+          V
+      Recv stream1
+          |
+          V
+      Queue stream1
+          |
+          V
+       Wait I/O
+          |
+          V
+      Recv stream2
+          |
+          V
+      Queue stream2
+          |
+          V
+       Wait I/O
+          |
+          V
+      Write stream1
+          |
+          V
+      Write stream2
+          |
+          V
+       Wait I/O
+          |
+         ...
-- 
2.35.1


