[libvirt] Need to re-work final "peer address" patches and re-push them

Thu May 12 19:31:16 UTC 2016

On 05/12/2016 05:12 AM, Daniel P. Berrange wrote:
> On Wed, May 11, 2016 at 11:57:36AM -0400, Laine Stump wrote:
>> I reverted these three patches that introduced and enabled a "peer"
>> attribute for type='ethernet' interface <ip> elements prior to the release
>> of 1.3.4 with the intent of fixing/re-posting them after release, but forgot
>> until today:
>>
>> https://www.redhat.com/archives/libvir-list/2016-April/msg01995.html
>>
>> I have patches for most of the bugs, but the one problem that still doesn't
>> have resolution is the naming of the "peer" attribute. In my opinion, having
>> the two address attributes named "address" and "peer" makes it ambiguous
>> which address is for the guest side and which for the host side (especially
>> since the attribute that has been named "peer" would be set to the "address"
>> in the netlink command, and the attribute named "address" would be set to
>> "peer" in the netlink command :-O).
>>
>> Since "address" is an existing attribute, and already used for the guest
>> side IP address in lxc type='bridge' interfaces, it must remain as-is. In
>> order to make it obvious that the new address is for the host side of the
>> tap (or veth pair in the case of lxc), I propose calling it either "host",
>> or "hostAddress", e.g.:
>>
>>       <ip address='192.168.123.43' host='192.168.123.1' prefix='25'/>
>>
>> or
>>
>>       <ip address='192.168.123.4' hostAddress='192.168.123.1' prefix='25'/>
>>
>> (Vasiliy had suggested "hostPeer", but I dislike that, since it sounds like
>> "the peer of the host", which is even more misleading).
>>
>> Can some of you normally-opinionated people weigh in on this? I don't like
>> the feeling of making a unilateral decision :-)
>>
>> Also, I'm realizing that, although there was a patch to support setting the
>> host-side address (hmm - "hostSide"? nah) for lxc type='bridge' interface,
>> this is not at all useful, because anything plugged into a bridge should not
>> have any IP on the side plugged into the bridge. The place where it would be
>> useful for lxc would be (just as it is for qemu) with a type='ethernet'
>> interface - the guest-side veth would have "address" and the host-side veth
>> would have "hostAddress", and it would then properly work without needing a
>> bridge (which I think is the entire point). Since lxc doesn't currently
>> support type='ethernet', I think that initial support should be made for
>> qemu only, and when type='ethernet' is added to lxc, it can be made to
>> support an IP address on both sides of the veth pair from the start.
>>
>> Lacking any useful responses, I'm thinking to update Vasiliy's patches to
>> use "hostAddress" (and fix the other bugs I had found) and re-post them.
> I'm not actually convinced your host/guest distinction actually matches
> what was being done with the peer attribute.
>
> First off, the virNetDevSetIPAddress change was doing the following:
>
>   - "address" attribute is mapped to IFA_LOCAL in netlink
>   - "peer" attribute is mapped to IFA_ADDRESS in netlink
>
> What is the difference between IFA_LOCAL and IFA_ADDRESS, you might
> ask? You can see that in /usr/include/linux/if_addr.h comments:
>
>   * IFA_ADDRESS is prefix address, rather than local interface address.
>   * It makes no difference for normally configured broadcast interfaces,
>   * but for point-to-point IFA_ADDRESS is DESTINATION address,
>   * local address is supplied in IFA_LOCAL attribute.
>
> So we're setting the peer / IFA_ADDRESS to make point-to-point
> routing work correctly.
>
>
> In LXC containers, we set an IP address on the *guest* side of the
> interface, based on the 'address' attribute. The patches extended
> that by also setting the 'peer' address on the *guest* side. This
> is true regardless of the type of <interface> backend configured.
>
> In QEMU machines, we set an IP address on the *host* side of the
> interface, based on the 'address' attribute. The patches extended
> that by also setting the 'peer' address on the *host* side. We only
> do this for type=ethernet backends.

Yes, thanks for making me realize that - I had been treating the veth 
device pair as a single interface, and naively assumed (because I'd been 
unable to test it due to the peer attribute missing from the formatter 
output) that the pair was altogether treated as a single device for 
purposes of IP configuration. (chalk it up to my pre-history working on 
PPP, where the two ends of the link always agreed on what each other's
addresses were).
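
For reference, here is a minimal sketch of the mapping Dan describes above ("address" goes to IFA_LOCAL, "peer" goes to IFA_ADDRESS). It only packs the two netlink attributes for an IPv4 address and does not talk to the kernel. The helper names are hypothetical and this is not the virNetDevSetIPAddress code; the fallback of using the local address when no peer is given is my assumption about the usual broadcast-interface behaviour.

```python
import socket
import struct

# Attribute type numbers from linux/if_addr.h
IFA_ADDRESS = 1
IFA_LOCAL = 2


def rtattr(rta_type, payload):
    """Pack one netlink attribute: 4-byte header plus payload, padded to 4 bytes."""
    length = 4 + len(payload)
    padding = (-length) % 4
    return struct.pack("=HH", length, rta_type) + payload + b"\x00" * padding


def p2p_address_attrs(address, peer=None):
    """Hypothetical helper: attributes for one device's IPv4 configuration.

    'address' is the device's own (local) address; 'peer' is the
    point-to-point destination. Without a peer, IFA_ADDRESS repeats
    the local address (assumption, see the text above).
    """
    local = socket.inet_aton(address)
    dest = socket.inet_aton(peer) if peer else local
    return rtattr(IFA_LOCAL, local) + rtattr(IFA_ADDRESS, dest)
```

Both attributes describe the same device, which is the point I had missed: the peer value is that device's idea of the far end, not a setting applied to the far end.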

Still, I think it's wrong that <ip address='1.2.3.4'/> on qemu should 
set the IP address on the host side, and the exact same element in the 
same place in an lxc config should set the IP address on the guest
side. Why be purposefully inconsistent between hypervisors when we don't
need to (and when doing so could be the cause of even further divergence 
in the future)?

> IOW, whether we set addresses on the host or guest side of the
> interface right now is being determined by whether we use QEMU
> or LXC.  You can't say the existing 'address' attribute is either
> host or guest - it could be for either.

And that's what I don't like.
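
To make the inconsistency concrete, here is a small sketch of the behaviour Dan describes for the patches as posted: the same <ip> attributes end up on different sides of the link depending on the driver. The table and function names are hypothetical, made up for illustration only.

```python
# Which end of the link receives the <ip> settings, per Dan's description
# of the posted patches (hypothetical table, not libvirt code).
SIDE_BY_DRIVER = {"lxc": "guest", "qemu": "host"}


def planned_settings(driver, address, peer=None):
    """Return (side, netlink attribute, value) tuples for one <ip> element."""
    side = SIDE_BY_DRIVER[driver]
    settings = [(side, "IFA_LOCAL", address)]
    if peer is not None:
        # 'peer' always lands on the same side as 'address'.
        settings.append((side, "IFA_ADDRESS", peer))
    return settings
```

Calling planned_settings("lxc", "1.2.3.4", "4.3.2.1") and planned_settings("qemu", "1.2.3.4", "4.3.2.1") gives the same attribute values, applied to opposite ends of the link.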

> Likewise the added 'peer'
> attribute is also either for host or guest address - and will
> *always* match the side used for the 'address' attribute. ie if
> 'address' was set on the host, then 'peer' would also be set
> on the host.
>
>
> So based on this understanding, I don't think your suggestion to
> try and distinguish 'address' as being a guest thing and 'peer'
> as being a host thing is actually correct. In fact I think that
> 'peer' name was in fact probably the correct choice of naming.

I agree, if you're trying to exactly describe what's happening at the 
lowest level of the configuration of each individual device.

But if you look at it from a functional point of view, you have a single 
link with an IP address at both ends, and want to configure both of 
those addresses. At the lower level, the link is implemented with two 
devices not one (one on host, one on guest), and you almost certainly
want to configure the two devices such that guestPeer == hostLocal and
guestLocal == hostPeer (since it otherwise won't work properly). (NB:
my understanding/opinion of this has changed slightly since considering 
Andrea's question about differing prefixes for host and guest sides - 
see my response in that sub-thread)
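
The mirroring condition above can be written down as a quick check. This is only a sketch: SideConfig and is_mirrored are hypothetical names, and it compares addresses only, ignoring prefixes (which, per Andrea's question, may legitimately differ between the two sides).

```python
import ipaddress
from typing import NamedTuple


class SideConfig(NamedTuple):
    """Local and peer address as configured on one end of the link."""
    local: str
    peer: str


def is_mirrored(guest, host):
    """True when guestPeer == hostLocal and guestLocal == hostPeer."""
    g_local = ipaddress.ip_address(guest.local)
    g_peer = ipaddress.ip_address(guest.peer)
    h_local = ipaddress.ip_address(host.local)
    h_peer = ipaddress.ip_address(host.peer)
    return g_peer == h_local and g_local == h_peer
```

For example, a guest with local 192.168.123.43 and peer 192.168.123.1 mirrors a host with local 192.168.123.1 and peer 192.168.123.43. Parsing through ipaddress means differently spelled IPv6 addresses still compare equal.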

In the case of LXC:

               (link handled by veth driver)
   guest veth <============================> host veth

We can configure both of these devices in libvirt's setup code because 
libvirt creates both devices prior to starting the container, although 
currently we only configure the guest-side veth because the host side is 
always attached to a bridge, and you don't need/want an IP address on a 
device connected to a bridge (it's assumed of course that devices 
*beyond* the bridge-attached veth will have an IP address). If we 
wanted, though, we could set the IP address of the host-side veth, which 
would be useful in case of <interface type='ethernet'> (if that were 
supported on lxc, which currently isn't the case).

qemu is more problematic though:

                      (link handled by qemu or vhost-net)
   guest emulated NIC <========================> host tap

Since the NIC device in the guest has no visibility to the host (it 
doesn't exist until qemu is started, and is only configurable by the 
guest OS), libvirt can only reasonably configure the host side tap 
device. (I suppose we could run a dnsmasq listening on the tap that 
could answer dhcp queries from the guest, or maybe add the capability to 
set guest-side IP addresses via a guest agent (or maybe even run PPPoE
on host and guest, thus creating another level of devices :-O), but at
present for all practical purposes the best we can do is simply hope
that the guest-side IP configuration we can't control matches/mirrors
the host side configuration that we can control.)

To come back to the point:

1) libvirt attempts to provide the same end-result (or as close as 
possible) for all the hypervisors for any given configuration; I think 
that having <ip address='blah'> set the guest-side local IP on one 
hypervisor, and the host side local IP on another doesn't live up to "as 
close as possible". Those are different entities, and should be 
configured separately.

2) It may be possible that there are valid configurations where 
guestPeer != hostLocal or guestLocal != hostPeer; such a thing doesn't
come to mind at the moment. If not, then having dual config so that all 
4 can be represented seems like overkill. (I'm going to try some 
experiments with this after I'm done typing.) If we do decide that it is 
overkill, then rather than changing the semantics of <ip> based on which 
hypervisor we're using, I think it would be better that the address 
attribute of <ip>, which currently has meaning only for lxc and means 
"the local-side IP of the guest interface" should continue to have that 
meaning when support for setting IP addresses for qemu is added.

3) If we do want to configure the host-side local and peer addresses 
separately from the guest-side, then we could consider this:

         <ip address='1.2.3.4' peer='4.3.2.1' prefix='8'/>
         <source ..... >
            <ip address='4.3.2.1' peer='1.2.3.4' prefix='32'/>
         </source>

(the ip element under "source" would be used to set the ip address on 
the host side. As an aside, it would have made more sense to have this 
IP address specified in the same place as the host-side device is named, 
but for some odd reason it is named in the <target> element, which has 
come to be the place where attributes pertaining to how the device 
appears on the *guest* side live :-/).
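
As a rough illustration of option 3, the sketch below reads a config in the proposed layout and separates the guest-side <ip> (directly under <interface>) from the host-side <ip> (under <source>). The layout is only the proposal from this mail, not an existing libvirt schema, and the surrounding <interface> wrapper, the empty <source> attributes and the function name are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical config following the layout proposed in option 3.
PROPOSED = """
<interface type='ethernet'>
  <ip address='1.2.3.4' peer='4.3.2.1' prefix='8'/>
  <source>
    <ip address='4.3.2.1' peer='1.2.3.4' prefix='32'/>
  </source>
</interface>
"""


def split_sides(interface_xml):
    """Return (guest_ips, host_ips) as lists of attribute dicts."""
    root = ET.fromstring(interface_xml)
    # Direct <ip> children describe the guest side of the link.
    guest = [dict(ip.attrib) for ip in root.findall("ip")]
    # <ip> children of <source> describe the host side.
    host = [dict(ip.attrib) for ip in root.findall("source/ip")]
    return guest, host
```

With the sample above, the guest entry's peer equals the host entry's address and vice versa, while the two prefixes remain independent.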