[Libvir] Remote patch, 2007-02-28

Thu Mar 1 13:56:31 UTC 2007

Mark McLoughlin wrote:
> Hey,
> 	Btw, I'm really becoming quite convinced we'll evenutally regret using
> SunRPC if we stick with it.

Morning Mark, thanks for taking the time to look at the patch in detail.

Do you have some actual concrete problems with SunRPC?  For me it solves 
the problem of marshalling complicated data structures, including all 
the error infrastructure, over the wire (see src/remote_rpc.x).  It is 
very trim and efficient.  It demonstrably allows us to run over TLS, 
Unix domain sockets, and ssh.  It handles versioning for us.

On the other hand, coding with it is grotty to say the least.

But we definitely shouldn't publish the SunRPC interface or allow others 
to interoperate with it, so that we can keep our options open in future.

Some more comments and questions below.

> On Wed, 2007-02-28 at 21:03 +0000, Richard W.M. Jones wrote
> 
>> +AC_PATH_PROG(LOGGER, logger)
>>  
>> +AC_DEFINE_UNQUOTED(LOGGER, "$LOGGER",
>> +      [Define the location of the external 'logger' program, or
>> +       undefine to disable use of external 'logger'.  This is
>> +       used by libvirtd to write to syslog.])
> 
> 	You may have explained this before, but why not use syslog(3)?

Because SunRPC code has a pleasant tendancy to write messages to stderr.

>> +dnl Availability of various common functions.
>> +AC_CHECK_FUNCS([lrand48_r])
> 
> 	I've little interest in random number generation, so it might just be
> me, but do we really need this?

It's required because the SunRPC client needs semi-random IDs for 
outgoing messages.  See src/sunrpc/create_xid.c for the details.

>> @@ -233,6 +243,14 @@
>>  AC_SUBST(LIBXML_CONFIG)
>>  AC_SUBST(LIBXML_MIN_VERSION)
>>  
>> +dnl GnuTLS library
>> +AC_CHECK_LIB(gnutls, gnutls_handshake,
>> +       [],
>> +       [AC_MSG_ERROR([gnutls library not found])])
> 
> 	You should check for the headers too with AC_CHECK_HEADER() - I don't
> think AC_CHECK_LIB() will pass without the -devel package installed, but
> just in case ...

Yes, you're right.  I'll fix this in the next iteration.

>> +dnl /dev/urandom
>> +AC_CHECK_FILES([/dev/urandom])
> 
> 	You don't seem to use this?

Ah indeed.  I _used_ to use it, but then removed that code.  I'll kill 
this in the next iteration, although IIRC you'll be wanting /dev/urandom 
for something?

>> diff -urN --exclude=CVS --exclude=.git --exclude='*.pem' --exclude=demoCA libvirt-cvs/.gitignore libvirt-remote/.gitignore
>> --- libvirt-cvs/.gitignore      1970-01-01 01:00:00.000000000 +0100
>> +++ libvirt-remote/.gitignore   2007-02-26 14:43:51.000000000 +0000
> 
> 	Add this to your --exclude
>  
>> -virConfPtr virConfNew(void)
>> +virConfPtr
>> +_virConfNew(void)
> 
> 	Is git not making it easy enough for you to manage these patches
> individually? Why can't you get the remote patch from git without the
> other patches included?
> 
> 	(Honest question - I briefly tried to do this with git in the past and
> failed. That's why I use quilt.)

I'm quite sure that git could do this, but my level of git knowledge is 
pretty small at the moment, so I'm just using it as a kind of "local 
CVS".  I did supply some patches separately yesterday, so if those go in 
to the main libvirt tree then they will disappear from future releases 
of the remote patch.

>> +remote_open_ret *
>> +remote_open_1_svc (char **name, struct svc_req *req)
> 
> 	Weird, why is this passed a char**? Can we be sure it's never NULL?

That's the way SunRPC (in glibc at least) works.  Yes, you can be sure 
it's never NULL in this instance.

>> +        /* Log connection name. */
>> +        fprintf (stderr, "libvirtd (%d): open \"%s\"\n", getpid (), real_name);
> 
> 	logging this stuff to stderr isn't much good for us.
> 
> 	(Uggh, I see now ... this goes to the logger?)

Indeed.  Unless the -d (debug) option is passed on the command line in 
which case it goes to stderr.

>> +remote_close_ret *
>> +remote_close_1_svc (void *vp __attribute__((unused)),
>> +                        struct svc_req *req)
>> +{
>> +    static struct remote_close_ret ret;
> 
> 	static?

Again, that's the way SunRPC servers work.  There is no concurrency 
problem here, because SunRPC servers don't run concurrently (at least 
not the ones you can create using glibc).  If we want concurrency, then 
we prefork.

>> +remote_type_ret *
>> +remote_type_1_svc (void *vp __attribute__((unused)),
> 
> 	Use ATTRIBUTE_UNUSED

This is a bug.  ATTRIBUTE_UNUSED isn't in the collection of headers I'm 
using, but needs to be added.  I'll get this fixed in the next rev.

>> +
>> +// Just a number large enough to use for validation of the
>> +// externally produced maxids.
>> +#define MAX_DOMAINS 10000
> 
> 	Uggh.

Yes, not sure how to do this right.  Since the maxids comes from an 
external process, but leads to an array being allocated inside libvirtd, 
there is a very definite route to a DoS attack if the maxids parameter 
is allowed to be very large.  But how large is large?

>> +remote_listDomains_ret *
>> +remote_listdomains_1_svc (int *maxids,
>> +                          struct svc_req *req)
>> +{
>> +    static struct remote_listDomains_ret ret;
>> +    virConnectPtr conn = lookup_connection (req);
>> +
>> +    if (!conn) {
>> +        ret.status = -1;
>> +        bad_connection_error (&ret.remote_listDomains_ret_u.err);
>> +           // Sanity-check maxids before allocating the on-stack array.
>> +    } else if (*maxids < 0 || *maxids > MAX_DOMAINS) {
>> +        ret.status = -1;
>> +        out_of_memory_error (&ret.remote_listDomains_ret_u.err);
>> +    } else {
>> +        int ids[*maxids];
>> +        int len = virConnectListDomains (conn, ids, *maxids);
>> +
>> +        if (len >= 0) {
>> +            ret.status = 0;
>> +            ret.remote_listDomains_ret_u.ids.ids_len = len;
>> +            ret.remote_listDomains_ret_u.ids.ids_val = ids;
> 
> 	Um, how does that work? You've allocated an array on the stack, with a
> C99 idiom I thought we were avoiding, and you're going to continue using
> that array after returning from the function?

Um .. yes, that'll be a bug then.  Well spotted.

>> +        virDomainPtr dom = virDomainCreateLinux (conn,
>> +                                                 args->xmlDesc, args->flags);
>> +
>> +        if (dom) {
>> +            ret.status = 0;
>> +            // XXX No idea if this is the right thing to do.
>> +            ret.remote_domainCreateLinux_ret_u.domain =
>> +                malloc (sizeof *ret.remote_domainCreateLinux_ret_u.domain);
> 
> 	Well, you don't check the malloc() succeeded for a start. (I know, you
> don't agree it's needed ... but it's not something we should stop doing
> on an ad-hoc basis)

A check is needed here - this is a bug.

As an aside, I only ever claimed we shouldn't check for malloc when the 
only good thing to do is to fail.  In long running daemons or any other 
security-related context that's not the case because it's a DoS attack.

In any case I really need to find out what the right way to allocate 
these structures is (hence XXX comment, etc.)  I'll try and get this 
right in the next iter.

> 	But, I think you're wondering how to free() what you've just allocated.
> That's a very important point, we need to figure that out.

Supposedly clnt_freeres (called by the stub code in SunRPC) should free 
the allocation, but that's what I need to work out.

>> +            ret.remote_domainCreateLinux_ret_u.domain->name =
>> +                (char *) dom->name;
> 
> 	I'd use virDomainGetName()

Can you explain more?

>> +static void
>> +copy_error (remote_error *err, virErrorPtr verr)
>> +{
>> +    err->code = verr->code;
>> +    err->domain = verr->domain;
>> +    err->message = verr->message ? : null_string;
>> +    err->level = verr->level;
>> +    if (verr->dom) {
>> +        // XXX I have no idea if this is the right way to do this.
>> +        err->dom = malloc (sizeof *err->dom);
>> +        err->dom->name = verr->dom->name;
> 
> 	Same issue as above? Where can we free it?

Ditto.

>> +
>> +/*----------------------------------------------------------------------*/
>> +/* Map SunRPC connections to virConnectPtr. */
>> +
>> +/* SunRPC is "connectionless", but in fact when used over TCP it is
>> + * really connection-oriented.  If your TCP connection goes down,
>> + * the client needs to manually reconnect, which the remote client
>> + * never does.  So we associate virConnectPtr with an actual TCP
>> + * connection.
>> + *
>> + * The questions are: (1) How and where do we store this association?
>> + * (2) How can we clean up when the connection goes away?
>> + *
>> + * So for (1) we note that each server-side callback gets a pointer
>> + * to struct svc_req, which contains a pointer to the transport
>> + * (SVCXPRT *).  It turns out (you need to read the code closely)
>> + * that each transport pointer is unique to the connection, so we
>> + * use that.
> 
> 	Presumably this means that on the client side we don't try and share an
> RPC connection amongst multiple libvirt connections?

There's now a 1-1 relationship between virConnectPtr and the underlying 
TCP connection.  Is that what you meant?  This comes out of Dan's email 
about cookies (cookies have been removed completely from the code in the 
meantime).

> 	req->rq_xprt->xp_sock might do the trick too, no difference I guess.
> 
>> +
>> +static struct virconnmap {
>> +    struct virconnmap *next;
>> +
>> +    /* Key. */
>> +    SVCXPRT *xprt;
>> +
>> +    /* conn may be NULL in the case where a mapping exists, but the
>> +     * virConnectPtr has either not been created yet, or has been
>> +     * properly closed using the remote_close call.
>> +     */
>> +    virConnectPtr conn;
>> +} *virconnmap = 0;
> 
> 	(Btw, please use NULL instead of 0)

http://www.faqs.org/faqs/C-faq/faq/ (section 5.2)
'0' in a pointer context is converted to the representation of NULL 
(even if the representation of NULL on some wierd architecture is not 
zero bits).  Or do you mean stylistically you prefer NULL?

> 	Hmm, I think I'd do this with a realloc()able array rather than a
> linked list ... let's see how that looks:
[realloc]

I'll take a look at both and use the one which is easier to 
understand/shorter.  (God, I really hate reimplementing fundamentals :-)
See next iteration.

>> +    /* Do we need to clean up the connection?  If the client called
>> +     * remote_close then conn will be NULL.  Otherwise the connection
>> +     * has been dropped without a clean close, so we close it here.
>> +     */
>> +    if (p->conn)
>> +        (void) virConnectClose (p->conn);
> 
> 	Why the void cast, and why not log an error if it fails?

Yes, that's a bug.  I think my thinking was that there was no place to 
log the error, but in fact that was wrong.

>> +static void
>> +set_connection (struct svc_req *req, virConnectPtr conn)
>> +{
>> +    SVCXPRT *xprt = req->rq_xprt;
>> +
>> +    struct virconnmap *p;
>> +    for (p = virconnmap; p; p = p->next)
>> +        if (p->xprt == xprt) {
>> +            p->conn = conn;
>> +            return;
>> +        }
>> +
>> +    abort (); // Should never happen.
> 
> 	Ouch. I'd be happier if we were using assert() around libvirt for this
> kind of stuff. Other places we just log an error.

This really is a "should never happen" place.  If it does happen, I want 
to know about it.  What's the right way to report that?  I worry that 
logging an ordinary error will mean it's just ignored.

>> +static void
>> +log_peer (int sock)
>> +{
>> +    /* Log the connection to stderr.  It will end up in syslog if
>> +     * logger is running.
>> +     */
>> +    int pid = getpid (), r;
>> +    fprintf (stderr, "libvirtd (%d): new connection\n", pid);
> 
> 	pid should be unsigned long or pid_t and format should be %lu, AFAIR.

Yup, bug, will investigate and fix.

>> +    if (conf) {
>> +        virConfValuePtr p;
>> +
>> +#define CHECK_TYPE(name,typ) if (p && p->type != (typ)) { \
>> +            fprintf (stderr, "libvirtd: %s: %s: expected type " #typ "\n", \
>> +                     conffile, (name));                                  \
>> +            exit (1); \
>> +        }
> 
> 	Why not just log and error and continue?

This is during start-up.  One thing I _really_ dislike about bind9 is 
that errors in the zone files don't cause the thing to fail to start up. 
  Instead you have to manually check /var/log/messages to see if it 
found any problems.  If you don't check then your website disappears off 
the map 12 hours later.  Thus, I like daemons that fail when the 
configuration file is broken.

>> +        p = virConfGetValue (conf, "tls_port");
>> +        CHECK_TYPE ("tls_port", VIR_CONF_STRING);
>> +        tls_port = p ? my_strdup (p->str) : tls_port;
> 
> 	You're presumably going to free these strings at some point, so you
> need to initialize them with strdup()ed strings before parsing so that
> you can safely free them always.

I wasn't planning on it.  Note that svc_run never returns, so libvirtd 
can only ever exit on a signal.  If my OS is leaking memory allocations, 
then I've got a really big problem that free isn't going to solve.

> 	Also, you should free the previous value before assigning.

In this case, the previous value is always a static string.

>> +    // This may not be an error.  cf. /etc/exports
>> +    if (!listen_tls && !listen_tcp && !listen_unix) {
> 
> 	Hmm, what does this comment mean?

In nfsd, if /etc/exports is empty, then nfsd doesn't start up.  So the 
parallel is that if /etc/libvirtd.conf contains directives disabling 
every service, then it is not a bug to just exit.

>> +        err = gnutls_certificate_allocate_credentials (&x509_cred);
>> +        if (err) { gnutls_perror (err); exit (1); }
> 
> 	Don't squash things together like that.
> 
> 	And I really don't like this exit(1) business - it's just lazy. If we
> add a PID file, for example, all these will need to be fixed so that we
> exit gracefully and clean up everything.

I was imagining that the wrapper script would clean up pid files if the 
process doesn't start up correctly.  I wouldn't rely on the process to 
do this - what happens if it segfaults for instance?

>> +    if (listen_unix) {
>> +        /* XXX Not sure if this is the right thing to do. */
>> +        if (unlink (unix_socket) == -1 && errno != ENOENT) {
>> +            perror (unix_socket);
>> +            exit (1);
>> +        }
> 
> 	Are we actually creating this socket in the filesystem instead of the
> abstract namespace?

In the filesystem, yes.

> 	If the exec fails, won't we die with a SIGPIPE the first time we write
> to the pipe?

I'll take a look at this.  There are actually other places where the 
code could die on SIGPIPE -- possibly on client going away unexpectedly.

> 	Return the dh_params and let the caller assign it to the global
> variable. Not sure why you split this code out into a separate function
> and not the rest of it?

Cut'n'paste job from the gnutls example code.

There's a Steve McConnell study where he shows that function length 
doesn't affect code comprehension.  I'll dig it up.

>> +/* Check the IP address matches one on the list of wildcards
>> + * tls_allowed_clients.
>> + */
>> +static int
>> +check_allowed_client (void *vp __attribute__((unused)), const char *addr)
>> +{
>> +    const char **wildcards = tls_allowed_clients;
> 
> 	I'd pass the tls_allowed_clients pointer to svc_create() rather than
> using the global variable here.

As an opaque data pointer?  Yes, that's much better abstracted.  I'll 
fix that in the next release.

> 	Okay, I've run out of steam ... hopefully I'll read over the rest
> another time.

Thanks again for looking at this.

Rich.