[libvirt] Re: [PATCH] Add huge page support to libvirt, v2..

Daniel P. Berrange berrange at redhat.com
Tue Jul 28 10:16:25 UTC 2009


On Tue, Jul 28, 2009 at 10:55:58AM +0100, Mark McLoughlin wrote:
> On Mon, 2009-07-27 at 22:55 +0100, Daniel P. Berrange wrote:
> > On Thu, Jul 23, 2009 at 09:00:18PM -0400, john cooper wrote:
> > > I've incorporated feedback received on the
> > > prior version into the patch below.
> > > 
> > > The host mount point for hugetlbfs is queried by
> > > default from /proc/mounts unless overridden in
> > > qemu.conf via:
> > > 
> > >     hugepage_mount = "<path_to_mount_point>"
> > > 
> > > This should make the concern of establishing
> > > a mount point path convention a non-issue for
> > > the general case while still allowing the same
> > > to be deterministically set if needed.
> > 
> > In light of what Chris said about extended attribute support
> > for SELinux I think we, sadly, have no choice but to mount
> > a new instance of hugetlbfs per VM, labelled with the context
> > of that VM.
> 
> I haven't played with hugetlbfs much, so let me make sure I understand
> the issue:
> 
>   -) With --mem-path, qemu creates a file under the supplied directory 
>      and mmap()s it, using the mapped region for guest memory
> 
>   -) Because the directory is under hugetlbfs, that memory is huge page 
>      backed
> 
>   -) Guest memory can be read by reading this file
> 
>   -) The file is owned by the uid of the qemu process, so any other 
>      processes with that uid can access guest memory
> 
>   -) If we could label the file, we could use policy to prevent qemu 
>      processes from accessing each other's files
> 
>   -) We can't, so the alternative is to use a separate hugetlbfs mount 
>      for each guest, each with a different label - each mount will only 
>      contain the file for that guest, and only be accessible by that 
>      guest

Yes, that's pretty much the size of it.

NB, the file QEMU creates in hugetlbfs is immediately deleted, but you
can still access it via /proc/$PID/fd 

eg

 $ mount -t hugetlbfs none /mnt/
 $ qemu-kvm -m 60 --mem-prealloc --mem-path /mnt/ -cdrom ~berrange/boot.iso 
 $ pgrep qemu-kvm
 12441

 $ lsof -p 12441 | grep /mnt
 qemu-kvm 12441 root  DEL    REG       0,19              1756707 /mnt/kvm.izG73m
 qemu-kvm 12441 root    8u   REG       0,19  85983232    1756707 /mnt/kvm.izG73m (deleted)
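
The create-then-unlink pattern visible in the lsof output can be sketched
roughly as follows. This is only an illustration, not QEMU's actual code:
the helper name map_unlinked_file() is hypothetical, and the "kvm.XXXXXX"
template is simply taken from the file name shown above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a temporary file under 'dir', unlink it
 * straight away, size it to 'len' bytes and map it shared.
 * Returns the mapping (NULL on failure) and stores the fd in *fd_out. */
void *map_unlinked_file(const char *dir, size_t len, int *fd_out)
{
    char path[4096];
    int n = snprintf(path, sizeof(path), "%s/kvm.XXXXXX", dir);
    if (n < 0 || (size_t)n >= sizeof(path))
        return NULL;

    int fd = mkstemp(path);
    if (fd < 0)
        return NULL;

    /* The name disappears from the directory, but the inode stays
     * alive for as long as the descriptor or the mapping exists. */
    unlink(path);

    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return p;
}
```

Calling map_unlinked_file("/mnt", len, &fd) on a hugetlbfs mount would need
len to be a multiple of the huge page size. The returned fd is what shows
up as /proc/$PID/fd/N, which is why the deleted file remains readable.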


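 $ cat > filecon.c <<EOF
 #include <selinux/selinux.h>
 #include <stdio.h>
 #include <unistd.h>
 #include <fcntl.h>

 int main(int argc, char **argv) {
   int fd;
   char *con;

   /* Open the file and query the SELinux context of the descriptor */
   fd = open(argv[1], O_RDONLY);
   if (fd < 0) {
     perror("open");
     return 1;
   }
   if (fgetfilecon(fd, &con) < 0) {
     perror("fgetfilecon");
     close(fd);
     return 1;
   }
   fprintf(stderr, "Context: %s\n", con);
   freecon(con);
   close(fd);
   return 0;
 }
 EOF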

 $ gcc -o filecon filecon.c -lselinux
 $ ./filecon /proc/12441/fd/8
 Context: system_u:object_r:hugetlbfs_t:s0

We need the trailing 's0' to be different for every VM. Normally this 
would 'just work' since, if qemu is running under ':s75', then any file
it creates would be labelled ':s75', but hugetlbfs doesn't support 
setting attributes:

 # touch /mnt/foo
 # ls -lZ /mnt/foo 
 -rw-r--r--. root root system_u:object_r:hugetlbfs_t:s0 /mnt/foo
 # chcon system_u:object_r:hugetlbfs_t:s75 /mnt/foo 
 chcon: failed to change context of `/mnt/foo' to `system_u:object_r:hugetlbfs_t:s75': Invalid argument

But we can change the context per mount

 # mount -o remount,context=system_u:object_r:hugetlbfs_t:s75 /mnt/
 # ls -lZ /mnt/foo
 -rw-r--r--. root root system_u:object_r:hugetlbfs_t:s75 /mnt/foo
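
Mounting a separate, labelled hugetlbfs instance per VM could be done
programmatically along these lines. This is a minimal sketch, not what the
patch does: the function names are hypothetical, the ':sNN' label form just
follows the example above, and the mount() call needs root on an SELinux
enabled host, so treat it as untested.

```c
#include <stdio.h>
#include <sys/mount.h>

/* Hypothetical helper: build the "context=..." mount option for a VM
 * whose label ends in ':s<level>'. Returns 0 on success, -1 if the
 * buffer is too small. */
int format_vm_context_opt(char *buf, size_t len, unsigned level)
{
    int n = snprintf(buf, len,
                     "context=system_u:object_r:hugetlbfs_t:s%u", level);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Hypothetical helper: mount a fresh hugetlbfs instance on 'target',
 * labelled for the given VM. The option string is passed as the
 * filesystem-specific data argument of mount(2). */
int mount_vm_hugetlbfs(const char *target, unsigned level)
{
    char opts[128];

    if (format_vm_context_opt(opts, sizeof(opts), level) < 0)
        return -1;

    return mount("none", target, "hugetlbfs", 0, opts);
}
```

For example, mount_vm_hugetlbfs("/mnt", 75) would be the equivalent of
"mount -t hugetlbfs -o context=system_u:object_r:hugetlbfs_t:s75 none /mnt".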

Regards,
Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|



