[libvirt PATCH] qemu: fix potential resource leak

Fri Oct 23 15:13:14 UTC 2020

On 10/23/20 1:28 AM, Peter Krempa wrote:
> On Thu, Oct 22, 2020 at 23:20:20 -0400, Laine Stump wrote:
>> On 10/22/20 3:01 AM, Peter Krempa wrote:
>>> On Wed, Oct 21, 2020 at 22:31:09 -0400, Laine Stump wrote:
>>>> On 10/21/20 5:50 PM, Jonathon Jongsma wrote:
>>>>> Coverity reported a potential resource leak. While it's probably not
>>>>> a real-world scenario, the code could technically jump to cleanup
>>>>> between the time that vdpafd is opened and when it is used. Ensure that
>>>>> it gets cleaned up in that case.
>>>>>
>>>>> Signed-off-by: Jonathon Jongsma <jjongsma at redhat.com>
>>>>> ---
>>>>>     src/qemu/qemu_command.c | 3 +++
>>>>>     1 file changed, 3 insertions(+)
>>>>>
>>>>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>>>>> index 5c4e37bd9e..cbe7a6e331 100644
>>>>> --- a/src/qemu/qemu_command.c
>>>>> +++ b/src/qemu/qemu_command.c
>>>>> @@ -8135,6 +8135,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>>>>>             addfdarg = g_strdup_printf("%s,opaque=%s", fdset,
>>>>>                                        net->data.vdpa.devicepath);
>>>>>             virCommandAddArgList(cmd, "-add-fd", addfdarg, NULL);
>>>>> +        vdpafd = -1;
>>>>>         }
>>>>>         if (chardev)
>>>>> @@ -8204,6 +8205,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
>>>>>         VIR_FREE(tapfdName);
>>>>>         VIR_FREE(vhostfd);
>>>>>         VIR_FREE(tapfd);
>>>>> +    if (vdpafd >= 0)
>>>>> +        VIR_FORCE_CLOSE(vdpafd);
>>>> VIR_FORCE_CLOSE() ==>virForceCloseHelper() ==> virFileClose()
>>>>
>>>> and virFileClose() is a NOP if fd < 0, so this doesn't need the conditional.
>>>>
>>>>
>>>> I *was* going to say "With that change,
>>>>
>>>>
>>>> Reviewed-by: Laine Stump <laine at redhat.com>
>>>>
>>>> "
>>>>
>>>>
>>>> but this got me looking at the code of the entire function rather than just
>>>> the diffs in the patch, and I've got a question - is there any reason that
>>>> you only ope n the vdpa device inside the switch, and save everything else
>>>> related to it until later (at the "if (vdpafd)")? You could then device
>>> I'd like to point out that opening anything in the command line
>>> formatters is IMO bad practice.
>>
>> Well... I agree that it is an ugly design, but that's pretty much what's in
>> place for almost everything.
> Sure, but it shouldn't be used as an argument to not use a better
> approach.
>
>
>>>    The command line formatter should format
>>> the commandline and nothing more.
>>
>> It would be nice if that was the case, but it already isn't. :-/
> No. Please don't put it this way. My message is that while the old code
> isn't compliant, it shouldn't be an excuse to make it worse!

Yes, I can see how my statement might be misunderstood as "the battle is 
already lost! Give up!!", which wasn't my intent.

>
> In this instace the code is commited. We don't need to change it,

^^ that was more my intent, and it seems we agree.

>   but
> any further change should be encouraged to use the better approach.
>
>>>    This is visible by the necessity to
>>> have a lot of mock override functions which prevent the command line
>>> formatter from touching resources on the host in the first place.
>>>
>>> Better approach is to open resources in 'qemuProcessPrepareHost' and
>>> store them in private data for command line formatting. Fake data for
>>> tests are added in 'testCompareXMLToArgvCreateArgs'.
>>
>> That's nice in the fact that it eliminates the need for mock overrides
>> (would be even *nicer* if that function had even a single line of
>> documentation included that described its purpose, and what code in the qemu
>> driver it should be mimicking, amirite?).
> I agree, documentation is lacking in many parts. This is the
> not-so-obvious techincal debt.
>
>> But it's bad because the code in qemuProcessPrepareHost() won't be tested
>> (can't be tested if there are no mocks for the system functions it calls).
> IMO this statement is misrepresenting what's happening. Sure
> qemuProcessPrepareHost can't be tested by unit testing. It's replaced by
> another function which fakes inputs. But that is EXACTLY what the mock
> functions preloaded into our tests are doing!
>
> With qemuProcessPrepareHost() you at least have one place and one
> function where all the code is aggregated

Where all the code *to prepare/setup devices* is aggregated. But in the 
process of aggregating that code, you've *split* the code for each 
individual device into multiple places with possibly no overt connection 
between them. Which of those is better depends on what you're doing. If 
you're adding a new feature, it means there is one more place you have 
to seek out and add something to. If there is something in each of those 
places that causes a build error when its omitted from the additions 
(e.g. a switch statement that needs a new case) then everything is 
great, but if it's just something that people are expected to find 
themselves by following the bread crumbs of a similar existing feature, 
then it becomes more trouble.

Keep in mind, I'm not saying that this is worse than the old way, just 
that it brings in its own new set of issues.

>   rather than spreading it
> trhough the command line generator and it being very hard to audit
> afterwards.
>
>> Basically you're trading the extra work of mocking system-level functions
>> for the extra work of filling in stuff in the privateData (and the
>> maintenance of that code), and eliminating testing of the code that's been
>> moved (pretty *lame* testing, I'll admit, since it's getting back canned
>> results from the fake system calls).
> As noted before the extent of testing is exactly identical.

Not identical. With the old method of mocking system calls, the code 
that, e.g., opens a file in sysfs and processes its contents is still 
tested (although with canned data that may not be representative of 
reality). With the new method, that code is segregated off and never 
called during testing - the "mock" is still there, just at a higher 
level and done by explicitly calling a different function rather than by 
faking out the dynamic linking.

If we wanted to satisfy both the segregation of the "prepare" code and 
its testing, then we would need to still maintain the mocked system 
library functions, and initialize the privateData by calling the actual 
prepare code rather than the replacements in 
testCompareXMLToArgvCreateArgs(). (Not that I'm suggesting that's what 
we should do - there is a limit to everything!)

>   The
> difference is that all the code which is touching the host is aggregated
> in one place and replaced by another function vs scattered accross the
> whole file and LD_PRELOAD-ed.
>
>
>> So it's not really the panacea your advocacy implies :-)
>>
>>
>> (Don't get me wrong! I've always disliked the mixing of device/file/whatever
>> init with the commandline generating functions.)
>>
>>
>> (actually a couple months ago I looked into putting the network interface
>> "prepare" stuff into privateData similar to what's done with the slirp stuff
>> now. In the end I gave up because it didn't provide the result I wanted - I
>> was trying to keep track of what device prep actions had been done for which
>> devices during domain startup so that the shutdown in case of startup
>> failure would only shutdown those things that had actually been setup; it
>> ended up being too complicated to make it work correctly both in the case of
>> an aborted startup and a normal shutdown, once you took into account the
>> possibility of libvirtd being restarted as part of a libvirt package update.
>>
>>
>> I'll point out that during all my searches through the code during the
>> experiment referenced in the previous paragraph, I never ran across
>> testCompareXMLToArgvCreateArgs(), and didn't know of its existence (or at
> Yup, sorry I've extracted it recently. Previously we've just piled up
> all the "init" code into testCompareXMLToArgv without thinking twice.
>
>> least didn't remember it, if I had known about it before). Is this
>> documented somewhere? Or is it expected to be learned by reading every patch
>> coming across the mailing list (I unfortunately fail at that in a major
>> way)?
> You see, I've failed the same way pointing out that the approach used
> was outdated.
>
> Also given that the documentation of the new approach would be applied
> via a patch, you could have missed it the same way as the patch
> implementing the new approach to initialize host-specific data (or it
> would be added at the same time). I doubt that any of us is re-reading
> contribution guidelines just for fun.

yup. Information overload is a real thing. :-/

>
>>> I'm aware though that there's a lot of "prior art" in this area though.
>>
>> ... and nothing in the code or the coding practices to warn against it,
>> point people in the other direction.
> Patches are welcome! :P

Dang it!! I had that in the original draft of my message, but removed 
it! (Being the first person to call "Patches are welcome" in an OSS 
mailing list discussion is similar to being the first to call 
"Shotgun!!" when everyone is heading for the car to drive someplace (for 
non-English speakers, the "shotgun" position in a car is the front row 
passenger's seat, next to the window. I guess this comes from the days 
of stagecoaches, when the only people sitting outside were the driver, 
and a man sitting next to him with a shotgun (or some other type of 
gun), to defend the coach from roadside bandits.)

>
>> This sounds like another "saga" in the making - split all commandline
>> generating functions into separate "prepare device" and "generate
>> commandline" parts. I don't know that we should require Jonathon to change
>> his code that much just to fix a memory leak though ... (too bad I hadn't
>> kept up with the latest cool stuff so I would have pointed it out in review
>> of the original patch).
> As said, I'm fine with prior art. We should encourage any further
> contributions to avoid this pattern though. Obviously if Jonathon wants
> to fix it I'm all for it. The fd leak fix is fine though.
>

Okay, we're in agreement then. Move along! Nothing to see here! :-)