[libvirt] [PATCH v3 2/2] virtlogd: add missing netserver refcount increment on reload
eskultet at redhat.com
Fri Oct 27 11:58:54 UTC 2017
On Thu, Oct 26, 2017 at 09:14:02AM -0400, John Ferlan wrote:
> On 10/25/2017 05:48 AM, John Ferlan wrote:
> > On 10/25/2017 05:15 AM, Nikolay Shirokovskiy wrote:
> >> On 25.10.2017 12:06, John Ferlan wrote:
> >>> On 10/24/2017 06:47 AM, Nikolay Shirokovskiy wrote:
> >>>> After virNetDaemonAddServerPostExec call in virtlogd we should have
> >>>> netserver refcount set to 2. One goes to netdaemon servers hashtable
> >>>> and one goes to virtlogd own reference to netserver. Let's add
> >>>> missing increment in virNetDaemonAddServerPostExec itself while holding
> >>>> daemon lock.
> >>>> We also have to unref new extra ref after virtlockd call to virNetDaemonAddServerPostExec.
> >>>> ---
> >>>> src/locking/lock_daemon.c | 1 +
> >>>> src/rpc/virnetdaemon.c | 1 +
> >>>> 2 files changed, 2 insertions(+)
> >>>> diff --git a/src/locking/lock_daemon.c b/src/locking/lock_daemon.c
> >>>> index fe3eaf9..41a06b2 100644
> >>>> --- a/src/locking/lock_daemon.c
> >>>> +++ b/src/locking/lock_daemon.c
> >>>> @@ -275,6 +275,7 @@ virLockDaemonNewPostExecRestart(virJSONValuePtr object, bool privileged)
> >>>> virLockDaemonClientFree,
> >>>> (void*)(intptr_t)(privileged ? 0x1 : 0x0))))
> >>>> goto error;
> >>>> + virObjectUnref(srv);
> >>> I need to think a bit more about this one... different model in lockd
> >>> vs. logd over @srv usage especially w/r/t this particular path.
> >>> I see that in this path eventually something calls virNetDaemonGetServer
> >>> instead of storing the @srv in order to add lockProgram... In any case,> I guess I'd be worried/concerned that something could remove @srv
> >>> causing the Hash table code to unref/delete the srv... Also, in the
> >>> cleanup: path of main() wouldn't the Unref(srv) cause problems?
> >> But this unref only balance newly added ref. If there are other
> >> problems with reference conting in virtlockd - its a different
> >> concern.
> > Ok - so what I've learned is that virLockDaemonPostExecRestart returns
> > rv == 1 which indicates a successful restore resulting in a call to
> > virNetDaemonGetServer which will do a virObjectRef anyway leaving us
> > with (theoretically) 2 refs prior to the code that adds programs to @srv
> > (and less concern from my part of the subsequent Unref in cleanup: here).
> > If we saved srv in lockd like logd does, then we wouldn't need to call
> > virNetDaemonGetServer resulting in the same eventual result except for a
> > window after the Unref here where the refcnt == 1 until
> > virNetDaemonGetServer is run. Since I believe nothing could happen in
> > between that would cause the Unref due to HashFree being called, then I
> > think we're OK.
> > The concern being is if virNetDaemonGetServer didn't find the server,
> > then @srv would be NULL and the subsequent call to
> > virNetServerAddProgram for lockProgram would fail miserably, but I don't
> > think that can happen theoretically at least!
> > John
> >>> Again- need to think a bit longer.
> I've been thinking more about this more...
> Looking through history - I see commit id f08b1c58 removed @srv from
> _virLockDaemon, but _virLogDaemon wasn't adjusted likewise.
> I also see commit id 252610f7 which modified the virnetdaemon code to
> use hash tables. What I'm missing is why that patch didn't add the
> virObjectRef as well since the @srv is placed into the hash table to
> match the similar operation done for the virNetDaemonAddServer.
> It seems when first written perhaps 252610f7 came first with f08b1c58
> afterwards, but when committed it was in reverse order.
> There's just something not quite right and I'm not quite sure what the
> exact right patch(es) should be.
> "Theoretically speaking"...
> If we get a REF for alloc, then we need an UNREF
> If we get a REF for hash, then we need an UNREF
> If we get a REF for virNetDaemonGetServer, then we need an UNREF
> Without the REF in PostExec, then LogD will have a problem if there's a
> usage of logd->srv after the UNREF for the hash table happens. In
> particular, the UNREF in virLogDaemonFree. However, for LockD there's
> not a problem because there's REF's for GetServer and no UNREF on the
> allocation. For the "non"-PostExec path - there's a leak of @srv
> (theoretical) and what seems to be normal processing for the PostExec path.
> With the REF in PostExec, LogD would be fixed; however, LockD then has a
> leak because of the extra REF.
> Pushing patch1 seems to solve at least the problem w/ the late/delayed
> client access (as shown by the sleep() comments), but I fear it opens
> the door for a LogD problem because hash table ref gets decremented sooner.
> It's all very tricky and timing based.
> Perhaps someone else wants to have a look too and provide some thoughts?
I'll definitely have a look at the series, but being at KVM Forum, my workflow
is broken so to speak, hopefully I'll manage to get through this thoroughly
before you push this.
More information about the libvir-list