[PATCH] network: Fix a race condition when shutdown & start vm at the same time

Daniel P. Berrangé berrange at redhat.com
Fri Jun 19 17:16:02 UTC 2020

On Fri, Jun 19, 2020 at 01:06:33PM -0400, Laine Stump wrote:
> On 6/19/20 12:26 PM, Daniel P. Berrangé wrote:
> > Maintain a global "int nextTapID" counter, and just iterate on this.
> > NIC names can be upto 16 bytes, so we'll create billions of devices
> > before we have any chance of wrapping around and risking a collision
> > with a concurrently shutting down guest.
> I remember I tried doing a monotonically increasing "next" counter for
> macvtap (never actually pushed, but shown to [some users, I forget who,
> maybe ovirt devs?], and they didn't like it because the devices ended up
> with long difficult-for-a-human-to-remember/pick-out/recite names like
> macvtap42301 and macvtap43021. So instead we keep track of the names that
> are in use, and always look for the lowest available number when creating a
> new one. (of course doing that would greatly increase the likelyhood of
> exposing any race conditions, so...) Definitely if we change the behavior in
> this way we're going to hear about it, though :-)

People might complain, but I'm not convinced it really matters. Pid
numbers used to be nice & short until someone raised pid max and now
my processes have 7-digit long pids. It is surprising at first, but
it didn't really cause me any functional problems.

And there's still the option of just providing a fixed name if you
really need something predictable for a guest.

> Another issue I remember from macvtap is the possibility of a different
> process (not libvirt) having already created a macvtap device with the name
> that we are wanting to use. That is easily solved in the case of macvtap by
> just iterating through until we find one that we can create without failure.
> So basically you have the list/bitmap/whatever of devicenames currently in
> use, use that as a starting point to pick a name for the new device, but
> then also allow for that name to already be used, and retry.

I wasn't thinking we need a bitmap, literally just a single counter.
We start where we left off, trying until we succeeed, which is what
the kernel does anyway IIRC. Most other competing processes will
rely on the kernel auto-allocation so won't clash with us very
frequently if at all.

> Also, in the name of backward compatibility, we will need to support tap
> device names from the config that have %d in the name, e.g. the default name
> we send is "vnet%d", but we allow for someone to specify "mytap%d".

We could keep a counter per prefix, or just have a single global

|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

More information about the libvir-list mailing list