[libvirt] [PATCH V2] libxl: fix dom0 balloon logic

Jim Fehlig jfehlig at suse.com
Tue Mar 31 14:16:09 UTC 2015


Martin Kletzander wrote:
> On Tue, Mar 24, 2015 at 02:43:42PM -0600, Jim Fehlig wrote:
>> Recent testing on large memory systems revealed a bug in the Xen xl
>> tool's freemem() function.  When autoballooning is enabled, freemem()
>> is used to ensure enough memory is available to start a domain,
>> ballooning dom0 if necessary.  When ballooning large amounts of memory
>> from dom0, freemem() would exceed its self-imposed wait time and
>> return an error.  Meanwhile, dom0 continued to balloon.  Starting the
>> domain later, after sufficient memory was ballooned from dom0, would
>> succeed.  The libvirt implementation in libxlDomainFreeMem() suffers
>> the same bug since it is modeled after freemem().
>>
>> In the end, the best place to fix the bug on the Xen side was to
>> slightly change the behavior of libxl_wait_for_memory_target().
>> Instead of failing after caller-provided wait_sec, the function now
>> blocks as long as dom0 memory ballooning is progressing.  It will return
>> failure only when more memory is needed to reach the target and wait_sec
>> have expired with no progress being made.  See xen.git commit fd3aa246.
>> There was a dicussion on how this would affect other libxl apps like
>> libvirt
>>
>> http://lists.xen.org/archives/html/xen-devel/2015-03/msg00739.html
>>
>> If libvirt containing this patch was build against a Xen containing
>> the old libxl_wait_for_memory_target() behavior, libxlDomainFreeMem()
>> will fail after 30 sec and domain creation will be terminated.
>> Without this patch and with old libxl_wait_for_memory_target() behavior,
>> libxlDomainFreeMem() does not succeed after 30 sec, but returns success
>> anyway.  Domain creation continues resulting in all sorts of fun stuff
>> like cpu soft lockups in the guest OS.  It was decided to properly fix
>> libxl_wait_for_memory_target(), and if anything improve the default
>> behavior of apps using the freemem reference impl in xl.
>>
>> xl was patched to accommodate the change in
>> libxl_wait_for_memory_target()
>> with xen.git commit 883b30a0.  This patch does the same in the libxl
>> driver.  While at it, I changed the logic to essentially match
>> freemem() in $xensrc/tools/libxl/xl_cmdimpl.c.  It was a bit cleaner
>> IMO and will make it easier to spot future, potentially interesting
>> divergences.
>>
>> Signed-off-by: Jim Fehlig <jfehlig at suse.com>
>> ---
>>
>> V2: Actually use libxl_wait_for_memory_target(), instead of
>>    libxl_wait_for_free_memory()
>>
>> src/libxl/libxl_domain.c | 55
>> +++++++++++++++++++++++-------------------------
>> 1 file changed, 26 insertions(+), 29 deletions(-)
>>
>> diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
>> index 407a9bd..a1739aa 100644
>> --- a/src/libxl/libxl_domain.c
>> +++ b/src/libxl/libxl_domain.c
>> @@ -1121,38 +1121,39 @@ libxlDomainFreeMem(libxlDomainObjPrivatePtr
>> priv, libxl_domain_config *d_config)
>> {
>>     uint32_t needed_mem;
>>     uint32_t free_mem;
>> -    size_t i;
>> -    int ret = -1;
>> +    int ret;
>
> This variable is unnecessary and confusing since you don't return it.
> And without this, you can make ...
>
>>     int tries = 3;
>>     int wait_secs = 10;
>>
>> -    if ((ret = libxl_domain_need_memory(priv->ctx, &d_config->b_info,
>> -                                        &needed_mem)) >= 0) {
>> -        for (i = 0; i < tries; ++i) {
>> -            if ((ret = libxl_get_free_memory(priv->ctx, &free_mem))
>> < 0)
>> -                break;
>> +    ret = libxl_domain_need_memory(priv->ctx, &d_config->b_info,
>> &needed_mem);
>> +    if (ret < 0)
>> +        goto error;
>>
>
> ... these conditions smaller, e.g.:
>
> if (libxl_domain_need_memory(priv->ctx, &d_config->b_info,
> &needed_mem) < 0)
>     goto error;

Yeah, too much copy and paste.  I'll fix this up and send a V3.

> Do you need this for the release? (sorry for noticing that late)

No, this can wait.  From a Xen perspective, more useful work for the
release would have been this series

https://www.redhat.com/archives/libvir-list/2015-March/msg01337.html

But I think it is a bit too late now :-(.

Regards,
Jim




More information about the libvir-list mailing list