[PATCH] network: Fix a race condition when shutdown & start vm at the same time

Fri Jun 19 17:06:33 UTC 2020

On 6/19/20 12:26 PM, Daniel P. Berrangé wrote:
> On Fri, Jun 19, 2020 at 12:19:05PM -0400, Laine Stump wrote:
>> This would all be much simpler if the kernel would put a "FIN WAIT" state on
>> all tap device names so that they couldn't be re-used for a few seconds
>> after deletion, but it doesn't, so we have to work with what we've got.
> The problem is that the kernel reuses tap device names.
>
> Can we just take the kernel out of the equation and do auto-assignment
> of names ourselves.

Yeah, that's what we do for macvtap. It is extra code, non-trivial, and 
therefore prone to errors (as in, "anything over 1 line long is prone to 
errors").

I *thought* I had remembered that couldn't work properly for standard 
tap devices couldn't work due to the way that tap devices are "created" 
(you first open /dev/net/tun, then issue a ioctl(TUNSETIFF, 
"desiredname")  on the handle returned by open. I had been concerned 
that if you did that with an existing name it would just connect you to 
the existing tap, but I tried it just now and got an error the 2nd time 
I tried to use the same tap device name, so perhaps/probably I'm again 
misremembering.

>
> Maintain a global "int nextTapID" counter, and just iterate on this.
> NIC names can be upto 16 bytes, so we'll create billions of devices
> before we have any chance of wrapping around and risking a collision
> with a concurrently shutting down guest.

I remember I tried doing a monotonically increasing "next" counter for 
macvtap (never actually pushed, but shown to [some users, I forget who, 
maybe ovirt devs?], and they didn't like it because the devices ended up 
with long difficult-for-a-human-to-remember/pick-out/recite names like 
macvtap42301 and macvtap43021. So instead we keep track of the names 
that are in use, and always look for the lowest available number when 
creating a new one. (of course doing that would greatly increase the 
likelyhood of exposing any race conditions, so...) Definitely if we 
change the behavior in this way we're going to hear about it, though :-)

Another issue I remember from macvtap is the possibility of a different 
process (not libvirt) having already created a macvtap device with the 
name that we are wanting to use. That is easily solved in the case of 
macvtap by just iterating through until we find one that we can create 
without failure. So basically you have the list/bitmap/whatever of 
devicenames currently in use, use that as a starting point to pick a 
name for the new device, but then also allow for that name to already be 
used, and retry.

Note that since macvtap and standard tap devices (and, actually *all* 
network devices) in the same namespace need to not have conflicting 
names, we could keep a single list of in-use names. Since we would never 
auto-generate a tap device name that conflicts with an auto-generated 
macvtap name, that is probably unnecessary, but just in case it makes 
things easier...

Also, in the name of backward compatibility, we will need to support tap 
device names from the config that have %d in the name, e.g. the default 
name we send is "vnet%d", but we allow for someone to specify "mytap%d".

In the end, if we can do something simple that preserves current 
behavior while covering the hole, that would probably be more expedient, 
but I have to say the thought of naming the devices ourselves has 
crossed my mind.