[libvirt] [PATCH 11/14] rpc: virnetserver: Fix race on srv->nclients_unauth

John Ferlan jferlan at redhat.com
Thu Dec 21 12:18:48 UTC 2017



On 12/21/2017 06:29 AM, Marc Hartmayer wrote:
> On Fri, Dec 15, 2017 at 07:16 PM +0100, John Ferlan <jferlan at redhat.com> wrote:
>> On 12/12/2017 06:36 AM, Marc Hartmayer wrote:
>>> There is a race between virNetServerProcessClients (main thread) and
>>> remoteDispatchAuthList/remoteDispatchAuthPolkit/remoteSASLFinish (worker
>>> thread) that can lead to decrementing srv->nclients_unauth when it's
>>> zero. Since virNetServerCheckLimits relies on the value
>>> srv->nclients_unauth the underrun causes libvirtd to stop accepting
>>> new connections forever.
>>>
>>> Example race scenario (assuming libvirtd is using policykit and the
>>> client is privileged):
>>>   1. The client calls the RPC remoteDispatchAuthList =>
>>>      remoteDispatchAuthList is executed on a worker thread (Thread
>>>      T1). We're assuming now the execution stops for some time before
>>>      the line 'virNetServerClientSetAuth(client, 0)'
>>>   2. The client closes the connection irregularly. This causes the
>>>      event loop to wake up and virNetServerProcessClient to be
>>>      called (on the main thread T0). During the
>>>      virNetServerProcessClients the srv lock is hold. The condition
>>>      virNetServerClientNeedAuth(client) will be checked and as the
>>>      authentication is not finished right now
>>>      virNetServerTrackCompletedAuthLocked(srv) will be called =>
>>>      --srv->nclients_unauth => 0
>>>   3. The Thread T1 continues, marks the client as authenticated, and
>>>      calls virNetServerTrackCompletedAuthLocked(srv) =>
>>>      --srv->nclients_unauth => --0 => wrap around as nclient_unauth is
>>>      unsigned
>>>   4. virNetServerCheckLimits(srv) will disable the services forever
>>>
>>> To fix it, add an auth_pending field to the client struct so that it
>>> is now possible to determine if the authentication process has already
>>> been handled for this client.
>>>
>>> Setting the authentication method to none for the client in
>>> virNetServerProcessClients is not a proper way to indicate that the
>>> counter has been decremented, as this would imply that the client is
>>> authenticated.
>>>
>>> Additionally, adjust the existing test cases for this new field.
>>>
>>
>> In any case, I think perhaps this can be split into two patches - one
>> that just sets/clears the auth_pending and then what's leftover as the
>> fix.
> 
> I’m not sure it’s necessary as the correct usage of auth_pending is
> already the fix. But let’s see then.
> 

OK. I understand. I assume by this point I was juggling in my head a few
of the various adjustments so I figured that by adding the flag and
separating the logic between just the set/clear flag and the reason it
was added would be easier to review, but then again that would make the
set/clear patch essentially meaningless.

>>
>>> diff --git a/src/libvirt_remote.syms b/src/libvirt_remote.syms
>>> index 7fcae970484d..cef2794e1122 100644
>>> --- a/src/libvirt_remote.syms
>>> +++ b/src/libvirt_remote.syms
>>> @@ -138,6 +138,7 @@ virNetServerClientGetUNIXIdentity;
>>>  virNetServerClientImmediateClose;
>>>  virNetServerClientInit;
>>>  virNetServerClientInitKeepAlive;
>>> +virNetServerClientIsAuthPendingLocked;
>>>  virNetServerClientIsClosedLocked;
>>>  virNetServerClientIsLocal;
>>>  virNetServerClientIsSecure;
>>> @@ -152,6 +153,7 @@ virNetServerClientRemoteAddrStringURI;
>>>  virNetServerClientRemoveFilter;
>>>  virNetServerClientSendMessage;
>>>  virNetServerClientSetAuthLocked;
>>> +virNetServerClientSetAuthPendingLocked;
>>>  virNetServerClientSetCloseHook;
>>>  virNetServerClientSetDispatcher;
>>>  virNetServerClientStartKeepAlive;
>>> diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
>>> index 72105cd9318f..6dd673ff3b23 100644
>>> --- a/src/rpc/virnetserver.c
>>> +++ b/src/rpc/virnetserver.c
>>> @@ -286,7 +286,7 @@ int virNetServerAddClient(virNetServerPtr srv,
>>>      srv->clients[srv->nclients-1] = virObjectRef(client);
>>>
>>>      virObjectLock(client);
>>> -    if (virNetServerClientNeedAuthLocked(client))
>>> +    if (virNetServerClientIsAuthPendingLocked(client))
>>>          virNetServerTrackPendingAuthLocked(srv);
>>>      virObjectUnlock(client);
>>>
>>> @@ -738,6 +738,25 @@ int virNetServerSetTLSContext(virNetServerPtr srv,
>>>
>>>
>>>  /**
>>> + * virNetServerSetClientAuthCompletedLocked:
>>> + * @srv: server must be locked by the caller
>>> + * @client: client must be locked by the caller
>>> + *
>>> + * Sets on the client that the authentication is no longer pending and
>>> + * tracks on @srv that the authentication of this @client has been
>>> + * completed.
>>
>> If the client authentication was pending, clear that pending and update
>> the server tracking.
> 
> Changed it.
> 
>>
>>> + */
>>> +static void
>>> +virNetServerSetClientAuthCompletedLocked(virNetServerPtr srv, virNetServerClientPtr client)
>>
>> One line for each argument.
> 
> Changed.
> 
>>
>>> +{
>>> +    if (virNetServerClientIsAuthPendingLocked(client)) {
>>> +        virNetServerClientSetAuthPendingLocked(client, false);
>>> +        virNetServerTrackCompletedAuthLocked(srv);
>>> +    }
>>> +}
>>> +
>>> +
>>> +/**
>>>   * virNetServerSetClientAuthenticated:
>>>   * @srv: server must be unlocked
>>>   * @client: client must be unlocked
>>> @@ -752,7 +771,7 @@ virNetServerSetClientAuthenticated(virNetServerPtr srv, virNetServerClientPtr cl
>>>      virObjectLock(srv);
>>>      virObjectLock(client);
>>>      virNetServerClientSetAuthLocked(client, VIR_NET_SERVER_SERVICE_AUTH_NONE);
>>> -    virNetServerTrackCompletedAuthLocked(srv);
>>> +    virNetServerSetClientAuthCompletedLocked(srv, client);
>>>      virNetServerCheckLimits(srv);
>>>      virObjectUnlock(client);
>>>      virObjectUnlock(srv);
>>> @@ -870,8 +889,9 @@ virNetServerProcessClients(virNetServerPtr srv)
>>>          if (virNetServerClientIsClosedLocked(client)) {
>>>              VIR_DELETE_ELEMENT(srv->clients, i, srv->nclients);
>>>
>>> -            if (virNetServerClientNeedAuthLocked(client))
>>> -                virNetServerTrackCompletedAuthLocked(srv);
>>> +            /* Mark the authentication for this client as no longer
>>> +             * pending */
>>
>> Some may say a superfluous comment... I'd probably say make it one line
>>
>> /* Update server authentication tracking */
> 
> Changed.
> 
>>
>>> +            virNetServerSetClientAuthCompletedLocked(srv, client);
>>>              virObjectUnlock(client);
>>>
>>>              virNetServerCheckLimits(srv);
>>> diff --git a/src/rpc/virnetserverclient.c b/src/rpc/virnetserverclient.c
>>> index 6adc3ec3ec01..5224bc7de1d5 100644
>>> --- a/src/rpc/virnetserverclient.c
>>> +++ b/src/rpc/virnetserverclient.c
>>> @@ -71,6 +71,7 @@ struct _virNetServerClient
>>>      bool delayedClose;
>>>      virNetSocketPtr sock;
>>>      int auth;
>>> +    bool auth_pending;
>>>      bool readonly;
>>>  #if WITH_GNUTLS
>>>      virNetTLSContextPtr tlsCtxt;
>>> @@ -375,6 +376,7 @@ static virNetServerClientPtr
>>>  virNetServerClientNewInternal(unsigned long long id,
>>>                                virNetSocketPtr sock,
>>>                                int auth,
>>> +                              bool auth_pending,
>>>  #ifdef WITH_GNUTLS
>>>                                virNetTLSContextPtr tls,
>>>  #endif
>>> @@ -384,6 +386,12 @@ virNetServerClientNewInternal(unsigned long long id,
>>>  {
>>>      virNetServerClientPtr client;
>>>
>>> +    /* If the used authentication method implies that the new client
>>> +     * is automatically authenticated, the authentication cannot be
>>> +     * pending */
>>> +    if (auth_pending && virNetServerClientAuthMethodImpliesAuthenticated(auth))
>>                            ^^^^^^
>> Why check twice? I'm not clear on the need for this check.  The setting
>> of auth_pending is based on @auth in the caller or perhaps the value of
>> auth_pending already. If there's some sort of check to be made perhaps
>> it's in ExecRestart.
> 
> Normally I would prefer an assert here… The intent behind this guard was
> that it should prevent the wrong usage of this function.
> 
> But I’ll move the check.
> 

I think this is perhaps why I made the separate the patches comment. I
was starting to overflow my heap.

>>
>> Also, if we returned NULL here without some sort of error, then couldn't
>> we get the libvirt failed for some reason error message which is always
>> not fun to track down.
> 
> Does it makes sense to leave the check here (and additionally add it to
> ExecRestart) and throw an error message here?
> 

Adding a check, a message, and having a NULL return to guard against
some future change passing the wrong parameters perhaps means that
future patch submitter (and eventual reviewer) didn't clearly understand
the purpose of the function...

Still this is a *NewInternal function - I would hope passing correct
arguments wouldn't be problem for it.

>>
>>> +        return NULL;
>>> +
>>>      if (virNetServerClientInitialize() < 0)
>>>          return NULL;
>>>
>>> @@ -393,6 +401,7 @@ virNetServerClientNewInternal(unsigned long long id,
>>>      client->id = id;
>>>      client->sock = virObjectRef(sock);
>>>      client->auth = auth;
>>> +    client->auth_pending = auth_pending;
>>>      client->readonly = readonly;
>>>  #ifdef WITH_GNUTLS
>>>      client->tlsCtxt = virObjectRef(tls);
>>> @@ -440,6 +449,7 @@ virNetServerClientPtr virNetServerClientNew(unsigned long long id,
>>>  {
>>>      virNetServerClientPtr client;
>>>      time_t now;
>>> +    bool auth_pending = !virNetServerClientAuthMethodImpliesAuthenticated(auth);
>>
>> I think it's much easier for readability to have result be auth !=
>> VIR_NET_SERVER_SERVICE_AUTH_NONE instead of this call... Perhaps even
>> easier to find all instances of AUTH_NONE.
> 
> This function/condition is used in four places. But sure, if you’re
> still saying it’s easier to read and to understand that a user is
> authenticated (and therefore the authentication is no longer pending) if
> auth_method == VIR_NET_SERVER_SERVICE_AUTH_NONE then I’ll replace this
> function at all places :)
> 

By this time I wasn't sure what was going to happen from Dan's patch 8
comment and the virNetServerClientNeedAuth[Locked] functions vs. the
name chosen (ImpliesAuthenticated) that was AFAICT trying to convey what
you discovered.

Whether the comparison is hidden behind some call or a direct comparison
I guess just becomes an implementation detail. I use cscope a lot to
find things, so searching on NONE is perhaps easier that searching on
NONE, finding some API that returning true/false based off of NONE, and
then searching on callers for that too.  OTOH, it's cleaner to have a
helper <sigh>.

Shorter answer, your call. If you want the helper, that's fine.

John

>>
>>>
>>>      VIR_DEBUG("sock=%p auth=%d tls=%p", sock, auth,
>>>  #ifdef WITH_GNUTLS
>>> @@ -454,7 +464,7 @@ virNetServerClientPtr virNetServerClientNew(unsigned long long id,
> 
> […snip…]
> 
>>> @@ -1545,6 +1569,22 @@ virNetServerClientNeedAuth(virNetServerClientPtr client)
>>>  }
>>>
>>>
>>> +/* The caller must hold the lock for @client */
>>> +void
>>> +virNetServerClientSetAuthPendingLocked(virNetServerClientPtr client, bool auth_pending)
>>
>> One line per argument
> 
> Changed.
> 
>>
>> John
>>
>>> +{
>>> +    client->auth_pending = auth_pending;
> 
> […snip]
> 
> --
> Beste Grüße / Kind regards
>    Marc Hartmayer
> 
> IBM Deutschland Research & Development GmbH
> Vorsitzende des Aufsichtsrats: Martina Koederitz
> Geschäftsführung: Dirk Wittkopp
> Sitz der Gesellschaft: Böblingen
> Registergericht: Amtsgericht Stuttgart, HRB 243294
> 




More information about the libvir-list mailing list