[Libvir] Remote patch, 2007-02-28

Tue Mar 6 05:19:31 UTC 2007

On Mon, Mar 05, 2007 at 09:49:42AM +0000, Mark McLoughlin wrote:
> On Thu, 2007-03-01 at 13:56 +0000, Richard W.M. Jones wrote:
> > Do you have some actual concrete problems with SunRPC?  For me it solves 
> > the problem of marshalling complicated data structures, including all 
> > the error infrastructure, over the wire (see src/remote_rpc.x).  It is 
> > very trim and efficient.  It demonstrably allows us to run over TLS, 
> > Unix domain sockets, and ssh.  It handles versioning for us.
> > 
> > On the other hand, coding with it is grotty to say the least.
> > 
> > But we definitely shouldn't publish the SunRPC interface or allow others 
> > to interoperate with it, so that we can keep our options open in future.
> 
> 	So, thoughts on the SunRPC stuff:
> 
>   - IMHO, we're never going to encourage people to use the SunRPC 
>     interface directly, but at some point we may really want to expose 
>     the remote interface directly and so we'll move to another
>     transport.

I'm not sure what you mean by 'expose the remote interface directly' ?
Do you mean allow arbitrary non-libvirt clients to speak to the server
daemon directly, or something else ?

>   - I'm not sure the libvirt API is really well designed for remote 
>     use, so I'm not sure that mapping the API calls to RPC calls is the 
>     best approach.

I agree some of the current APIs are too granular to be optimal in
terms of network traffic. In fact they're already sub-optimal even
in the local only case...

>     e.g. to iterate over the list of domain UUIDs, you need to 
>     ListDomains(), and then for each of them LookupByID() and 
>     GetUUID(). That API might be fine for apps, but libvirt doesn't 
>     necessarily need to map the API calls exactly to RPC calls - e.g. 
>     you could have a ListDomains() RPC call return id/name/uuid tuples 
>     and make LookupByID() and GetUUID() not involve a network 
>     roundtrip[1].

Even speaking to local XenD this sequence of calls is a PITA. In
virt-manager all internal tracking is based off UUID, but the API
to list domains gives me ids (for active VMs) or names (for inactive
domains). Translating to UUIDs is not nearly as lightweight as I'd
like thanks to some horrible aspects of XenD, requiring many more
RPC requests to XenD than we technically need. If we had a ListDomains
public API method returning a tuple of (name,id,uuid), even the local
only case would be much more efficient, requiring only a single
HTTP call to XenD, instead of O(n).
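
To make the request counts concrete, here is a rough Python sketch. FakeXend
and all of its methods are hypothetical stand-ins invented for illustration,
not the XenD or libvirt API; the only thing it models is how many requests
each approach makes.

```python
import uuid


class FakeXend:
    """Hypothetical stand-in for a management daemon; counts requests."""

    def __init__(self, names):
        # id -> (name, uuid); ids start at 1 purely for illustration.
        self.domains = {
            i: (name, uuid.uuid5(uuid.NAMESPACE_DNS, name))
            for i, name in enumerate(names, start=1)
        }
        self.requests = 0

    def list_ids(self):
        self.requests += 1
        return sorted(self.domains)

    def lookup_uuid(self, dom_id):
        self.requests += 1
        return self.domains[dom_id][1]

    def list_domains(self):
        """Assumed batch call: (name, id, uuid) tuples in one request."""
        self.requests += 1
        return [(name, i, u) for i, (name, u) in sorted(self.domains.items())]


def uuids_per_domain(server):
    # One request for the id list, then one more per domain: O(n) requests.
    return [server.lookup_uuid(i) for i in server.list_ids()]


def uuids_batched(server):
    # A single request regardless of the number of domains.
    return [u for _name, _id, u in server.list_domains()]


names = ["web", "db", "mail"]
slow, fast = FakeXend(names), FakeXend(names)
assert uuids_per_domain(slow) == uuids_batched(fast)
print(slow.requests, fast.requests)  # 4 1
```

With three domains the per-domain path costs four requests and the batched
path costs one, and the gap grows linearly with the number of domains.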

>     One problem with that is that in order for the remote driver to 
>     know when to invalidate the ListDomains() cache, it needs 
>     asynchronous notification of domain creation. Which I think we want 
>     in the API long term, anyway.
> 
>     Maybe you could argue that all this is orthogonal to the transport 
>     question - "XDR is just a marshaling format" - but I'm not 
>     convinced, especially wrt. the async notification aspect. Also, 
>     AFAIR "we can just map the library calls to RPC calls" was one of 
>     the motivations for using SunRPC, so ...

We do this same mapping library calls to wire messages in the QEMU 
daemon, and libvirt proxy too. There isn't anything in SunRPC that
says '1 public API == 1 RPC call'. Since this is all internal impl
details, we can easily have a M-N model for public APIs <=> RPC
calls, regardless of what wire protocol, or network API we choose.

That all said, I'm not sure we'd want to do an internal M-N model
because it is going to require the maintenance of a lot of state
internal to libvirt. The cache invalidation issues that implies
are non-trivial, and are quite possibly better answered by the
end application. We've already hit issues with the existing libvirt
caching of mere virDomainPtr objects in certain cases, so I'm not
all that enthusiastic about adding more complex caching.
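
As a rough illustration of the invalidation problem, the Python sketch below
compares a caching driver that hears about domain creation with one that
does not. Server, CachingDriver and the callback-style notification are all
hypothetical; nothing here reflects how libvirt is actually implemented.

```python
class Server:
    """Hypothetical daemon holding the authoritative domain list."""

    def __init__(self):
        self.domains = ["web"]
        self.listeners = []

    def create(self, name):
        self.domains.append(name)
        # Asynchronous notification, modelled here as a plain callback.
        for notify in self.listeners:
            notify()


class CachingDriver:
    """Hypothetical client-side driver that caches the domain list."""

    def __init__(self, server, subscribe):
        self.server = server
        self.cache = None
        if subscribe:
            server.listeners.append(self.invalidate)

    def invalidate(self):
        self.cache = None

    def list_domains(self):
        if self.cache is None:
            # The only place a "network" fetch happens.
            self.cache = list(self.server.domains)
        return self.cache


server = Server()
blind = CachingDriver(server, subscribe=False)
notified = CachingDriver(server, subscribe=True)
blind.list_domains()
notified.list_domains()  # both caches now hold ["web"]

server.create("db")
print(blind.list_domains())     # ['web'] (stale)
print(notified.list_domains())  # ['web', 'db']
```

Even this toy version needs a notification channel plus per-driver state,
which is the kind of bookkeeping an application may be better placed to own.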

So I'm all for adding in ways to let us get info about all domains
in batch calls & think we should seriously consider exposing batch
calls to the end applications. It'll give them a lot of flexibility
in how to interact with libvirt in the most efficient manner for
their application model, while keeping the internals of libvirt
free of too much caching/state.

>   - Yes, it's ugly code and even though you say it's done, code is 
>     never really done. Especially here where there are lots of stuff 
>     we're not sure we've gotten right - encryption, authentication, 
>     mapping the library calls to RPC calls, async notification etc.
>     I think people might avoid hacking on this code, and that won't 
>     help it evolve.

>   - Similarly, some people would consider SunRPC an old, legacy, crufty 
>     protocol. RPC systems is one of those high-fashion areas where 
>     people hold opinions which aren't necessarily terribly logical, and
>     so I think SunRPC will turn off hackers who might otherwise be 
>     interested.

I'm not inclined to pay attention to fashion, otherwise I'd be writing
webapps in Ruby using XML-RPC ;-) Seriously though, I think we do need 
to consider this point as one aspect of the 'long term ease of maintenance'
criteria, i.e. fewer hackers == harder to maintain. We can't decide on
that criterion alone though.

> 	At the same time, though, I can sympathise with "look, we've written
> all this code and it works fine, so let's just go with it". Perhaps
> that's the right approach, and I'm just being a party pooper.

I'm seeing at least 3 core issues wrt the question of remote management:

 - Wire format  - single most important aspect wrt compatibility
                  because once a libvirt client is released to the wild
                  we need to keep compatibility for a non-trivial amount
                  of time.

 - Internal API - basically what we're using inside the driver/daemon.
                  Within the constraints of the wire format decision, we 
                  can change this at will since the use of the RPC API
                  is internal to the libvirt codebase

 - API efficiency - the question of how/whether we batch up common 
                  operations to improve the efficiency of the network
                  implementation. Whether any batching is private to
                  the libvirt internals, or placed in the public API

So how should we move forward in this whole area ? We've had many mailing
list discussions about remote management & lots of code from the proxy
to QEMU daemon, to the generic remote daemon, but there still seem to be
very different views of how this works from even the most basic starting
points.

My overriding concern is that we don't release anything until we're
confident we can support it long term without breaking compatibility
with future releases, i.e. at the very least old clients should always be
able to talk to new servers. Arguably new clients ought to be able to
talk to old servers too - albeit with possibly reduced functionality.

That clearly means we need stability in the wire format from day-1.
That puts some constraints on our internal API - but it is still
internal so it could be re-written completely if desired, provided
we keep wire format.  API efficiency feels like one of those issues
we can evolve over time - if we have a wire format that lets us do
versioning, we can add new RPC calls at will without breaking old
ones.
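
A minimal Python sketch of that idea follows, assuming a message header of
program, version and procedure numbers packed as big-endian 32-bit integers
(the way XDR encodes unsigned ints). The program number, procedure numbers
and error strings are made up for illustration and are not taken from
src/remote_rpc.x.

```python
import struct

# Assumed header layout, for illustration only: program, version, procedure
# as three big-endian unsigned 32-bit integers.
HEADER = struct.Struct(">III")
PROGRAM = 0x20000001  # hypothetical program number


def make_server(procedures):
    """Build a dispatcher over a {procedure_number: handler} table."""

    def dispatch(message):
        program, version, proc = HEADER.unpack_from(message)
        if program != PROGRAM or version != 1:
            return "ERR_MISMATCH"
        handler = procedures.get(proc)
        if handler is None:
            # Unknown procedure: report it rather than breaking the session.
            return "ERR_PROC_UNAVAIL"
        return handler()

    return dispatch


def call(server, proc):
    return server(HEADER.pack(PROGRAM, 1, proc))


old_procs = {1: lambda: "ids"}
new_procs = dict(old_procs)
new_procs[2] = lambda: "tuples"  # hypothetical batch call added later

old_server = make_server(old_procs)
new_server = make_server(new_procs)

print(call(new_server, 1))  # old client, new server: ids
print(call(old_server, 2))  # new client, old server: ERR_PROC_UNAVAIL
```

Existing procedure numbers keep their meaning, so an old client is unaffected
by additions, and a new client that gets the "unavailable" error from an old
server can fall back to the older calls with reduced functionality.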

> [1] - Again, that's the kind of optimisation I think is really useful
> rather than "SunRPC is faster than XML-RPC"

I wouldn't prioritise one particular optimization over the other. The
efficiency of the general on-the-wire transmission & data (de-)marshalling
routines, is just as important to consider as the design of the APIs being
run over the transport. We really need to consider both the wire protocol
and the possibility of 'bulk operation' APIs - a decision on one of these
shouldn't significantly impact the decision wrt the other since they're
at different layers of the comms stack. 

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|