[libvirt] [PATCH] cgroup: be robust against cgroup movement races
Daniel P. Berrange
berrange at redhat.com
Tue May 21 09:43:01 UTC 2013
On Mon, May 20, 2013 at 09:07:17PM -0600, Eric Blake wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=965169 documents a
> problem starting domains when cgroups are enabled; I was able
> to reliably reproduce the race about 5% of the time when I added
> hooks to domain startup by 3 seconds (as that seemed to be about
> the length of time that qemu created and then closed a temporary
> thread, probably related to aio handling of initially opening
> a disk image).
>
> There are some inherent TOCTTOU races when moving tasks between
> kernel cgroups, precisely because threads can be created or
> completed in the window between when we read a thread id from the
> source and when we write to the destination. As the goal of
> virCgroupMoveTask is merely to move ALL tasks into the new
> cgroup, it is sufficient to iterate until no more threads are
> being created in the old group, and ignoring any threads that
> die before we can move them.
>
> It would be nicer to start the threads in the right cgroup to
> begin with, but by default, all child threads are created in
> the same cgroup as their parent, and we don't want vcpu child
> threads in the emulator cgroup, so I don't see any good way
> of avoiding the move. It would also be nice if the kernel were
> to implement something like rename() as a way to atomically move
> a group of threads from one cgroup to another, instead of forcing
> a window where we have to read and parse the source, then format
> and write back into the destination.
>
> * src/util/vircgroup.c (virCgroupAddTaskStrController): Ignore
> ESRCH, because a thread ended between read and write attempts.
> (virCgroupMoveTask): Loop until all threads have moved.
>
> Signed-off-by: Eric Blake <eblake at redhat.com>
> ---
> src/util/vircgroup.c | 28 ++++++++++++++++++++--------
> 1 file changed, 20 insertions(+), 8 deletions(-)
>
> diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
> index 07ea2c3..6d44694 100644
> --- a/src/util/vircgroup.c
> +++ b/src/util/vircgroup.c
> @@ -1037,7 +1037,11 @@ static int virCgroupAddTaskStrController(virCgroupPtr group,
> goto cleanup;
>
> rc = virCgroupAddTaskController(group, p, controller);
> - if (rc != 0)
> + /* A thread that exits between when we first read the source
> + * tasks and now is not fatal. */
> + if (rc == -ESRCH)
> + rc = 0;
> + else if (rc != 0)
> goto cleanup;
>
> next = strchr(cur, '\n');
> @@ -1074,15 +1078,23 @@ int virCgroupMoveTask(virCgroupPtr src_group, virCgroupPtr dest_group)
> !dest_group->controllers[i].mountPoint)
> continue;
>
> - rc = virCgroupGetValueStr(src_group, i, "tasks", &content);
> - if (rc != 0)
> - return rc;
> + /* New threads are created in the same group as their parent;
> + * but if a thread is created after we first read we aren't
> + * aware that it needs to move. Therefore, we must iterate
> + * until content is empty. */
> + while (1) {
> + rc = virCgroupGetValueStr(src_group, i, "tasks", &content);
> + if (rc != 0)
> + return rc;
> + if (!*content)
> + break;
>
> - rc = virCgroupAddTaskStrController(dest_group, content, i);
> - if (rc != 0)
> - goto cleanup;
> + rc = virCgroupAddTaskStrController(dest_group, content, i);
> + if (rc != 0)
> + goto cleanup;
>
> - VIR_FREE(content);
> + VIR_FREE(content);
> + }
> }
>
> cleanup:
ACK, this looks like the best we can do here.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
More information about the libvir-list
mailing list