[libvirt] [PATCH] add flag to enforce hugepage backing of guest RAM

Marcelo Tosatti mtosatti at redhat.com
Tue Feb 4 16:58:58 UTC 2014


On Tue, Feb 04, 2014 at 04:42:02PM +0000, Daniel P. Berrange wrote:
> On Tue, Feb 04, 2014 at 11:32:47AM -0500, Marcelo Tosatti wrote:
> > 
> > Add an element named "strict-hugepages" to control whether to 
> > refuse guest initialization in case hugepage allocation cannot 
> > be performed.
> > 
> > Signed-off-by: Marcelo Tosatti <mtosatti at redhat.com>
> > 
> > diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> > index ff50214..e79f5e6 100644
> > --- a/docs/formatdomain.html.in
> > +++ b/docs/formatdomain.html.in
> > @@ -632,6 +632,9 @@
> >        <dt><code>hugepages</code></dt>
> >        <dd>This tells the hypervisor that the guest should have its memory
> >          allocated using hugepages instead of the normal native page size.</dd>
> > +      <dt><code>strict-hugepages</code></dt>
> > +      <dd>This tells the hypervisor that the guest should refuse to start
> > +          in case of failure to allocate guest memory with hugepages</dd>
> 
> Huh, we already supply the -mem-prealloc command line flag alongside
> the -mem-path flag which should cause QEMU to exit if it cannot allocate
> all memory upfront.
> 
> > +/*
> > + * Returns bool: whether to fail guest initialization.
> > + *
> > + */
> > +static bool qemuValidateStrictHugepage(virDomainObjPtr vm, virQEMUDriverConfigPtr cfg)
> > +{
> > +    bool ret = false;
> > +    char **maps = NULL;
> > +    int i;
> > +    char *buf;
> > +
> > +    if (!vm->def->mem.strict_hugepages)
> > +        return ret;
> > +
> > +    ret = true;
> > +
> > +    if (!vm->def->mem.hugepage_backed || !cfg->hugepagePath) {
> > +        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> > +                _("strict huge pages depends on huge pages"));
> > +        return ret;
> > +    }
> > +
> > +    buf = malloc(strlen(cfg->hugepagePath) + 50);
> > +
> > +    /* The parser requires /proc/pid, which only exists on platforms
> > +     * like Linux where pid_t fits in int.  */
> > +    if ((int) vm->pid != vm->pid ||
> > +        qemuParseProcFileStrings(vm->pid, "maps", &maps) < 0)
> > +        goto cleanup;
> > +
> > +    for (i = 0; maps && maps[i]; i++) {
> > +        char *endptr;
> > +        unsigned long start, end;
> > +        const char *map = maps[i];
> > +        bool found = false;
> > +
> > +        sprintf(buf, "%s/qemu_back_mem.pc.ram.", cfg->hugepagePath);
> > +        if (strstr(map,buf) != NULL)
> > +            found = true;
> > +
> > +        sprintf(buf, "%s/kvm.", cfg->hugepagePath);
> > +        if (strstr(map,buf) != NULL)
> > +            found = true;
> > +
> > +        if (!found)
> > +            continue;
> > +
> > +        errno = 0;
> > +        start = strtol(map, &endptr, 16);
> > +        if ((errno == ERANGE && (start == LONG_MAX || start == LONG_MIN))
> > +                || (errno != 0 && start == 0)) {
> > +                continue;
> > +        }
> > +
> > +        if (endptr && *endptr == '-')
> > +            endptr++;
> > +
> > +        if (!*endptr)
> > +            continue;
> > +
> > +        errno = 0;
> > +        end = strtol(endptr, NULL, 16);
> > +        if ((errno == ERANGE && (end == LONG_MAX || end == LONG_MIN))
> > +                || (errno != 0 && end == 0)) {
> > +                continue;
> > +        }
> > +
> > +        if (end-start >= vm->def->mem.max_balloon * 1024) {
> > +            ret = false;
> > +            break;
> > +        }
> > +    }
> 
> This code is really very nasty. It is both Linux specific and exposes
> libvirt to the private implementation detail of QEMU's backing file
> naming. On top of which I don't see why we need this when we're already
> setting -mem-prealloc to enforce exactly this.

BTW, you already do /proc parsing and already do hugetlbfs which is
Linux specific.

So if you can detail in what new way this code is nasty (accordingly to
your judgement above), i can look into improving it.




More information about the libvir-list mailing list