[libvirt] dhcp6, radvd, ip6tables, etc. (update)

Tue Oct 30 15:58:07 UTC 2012

On 10/29/2012 04:08 PM, Laine Stump wrote:
> On 10/29/2012 08:26 AM, Gene Czarcinski wrote:
>> On 10/27/2012 03:18 PM, Gene Czarcinski wrote:
>>> OK, I have the basic implementation for libvirt support of dhcp6. Let
>>> me say again that 98% of the work was already done.  There is still a
>>> bunch of work to do, which includes writing some tests, understanding
>>> how things such as bootp, dhcp-host, etc. should be supported with
>>> dhcp6, as well as the items I discuss below.
>>>
>>> 1.  Right now, the only way that dhcp6 is in effect is if there is no
>>> dhcp4 range definition.  This will be fixed/expanded so that, at a
>>> minimum, you can have both a dhcp4 and dhcp6 on the same interface.
>>> However, it appears to be easier to just pass to dnsmasq ANY/EVERY
>>> dhcp4 range or dhcp6 range defined in the xml.
>>>
>>> Comments?  Any input on which approach to use or avoid?
>> For the current situation, the implementation is for one (the first)
>> IPv4 dhcp and one (the first) IPv6 dhcp.  This introduces enough
>> little gotchas that need to be worked out.
> I think that is the proper thing to do for now. As discussed earlier,
> before supporting dhcp on multiple subnets of the same protocol (ipv6 vs
> ipv4) we would need to decide why and how we want to do that - IPs
> assigned from different subnets need to be matched with the IP address
> of that subnet, and it will take a more complicated dnsmasq commandline
> to do that, iirc.
I cannot think of a good reason to have multiple IPv4 or IPv6 dhcp-range
specifications.  Some day, someone will come up with a good reason but,
for right now, I believe that one IPv4 and one IPv6 dhcp-range
specification is one of those "good enough" answers.
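
To make the "first of each" rule concrete, here is a rough sketch in
Python (not the C in bridge_driver.c).  The function name and the
(start, end) input shape are made up for illustration, and a real
dhcp-range line may need more fields (prefix length, lease time) than
shown here.

```python
import ipaddress


def pick_dhcp_ranges(ranges):
    """Return dnsmasq dhcp-range lines for the first IPv4 and first IPv6 range.

    `ranges` is a hypothetical list of (start, end) address strings in the
    order they appear in the network xml; later ranges of a protocol that
    already has one are ignored.
    """
    chosen = {}
    for start, end in ranges:
        version = ipaddress.ip_address(start).version  # 4 or 6
        # setdefault keeps only the first range seen for each protocol.
        chosen.setdefault(version, (start, end))
    return ["dhcp-range=%s,%s" % chosen[v] for v in sorted(chosen)]
```

With one IPv4 range, one IPv6 range and then a second IPv4 range as
input, only the first two produce a line.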
>
>>> 2.  I have modified radvd so both stateful (dhcp6) and stateless
>>> (SLAAC) addressing is supported with radvd for the default route.
>>> This is done on an interface basis (that is the way it works).  So if
>>> any dhcp6 range is specified, then stateful is used.  The way this is
>>> implemented will make it easy to add some tests verifying that the
>>> configuration parameters are working.  I intend this to be an
>>> expansion to networkxml2argvtest since it has the xml specification
>>> files which determine both dnsmasq and radvd configuration parameters.
>>>
>> NC ... working fine.
>>> 3. After completing what I thought was code that should result in a
>>> guest getting dhcp6 addresses, it was not working.  Once more it took
>>> me a little time to realize that ip6tables rules were blocking it. [I
>>> have been down this path before, you would think I would realize the
>>> problem sooner.]
>>>
>>> 3a. In looking over the ip6tables rules, I saw a whole bunch of
>>> additions at the top of the INPUT chain which were accepts for
>>> udp/tcp port 53.  In looking at the code in bridge_driver.c, I found
>>> that, every time a network device was started, 3 FORWARD rules and 2
>>> INPUT rules were added, but, when the network device was destroyed,
>>> only the 3 FORWARD rules were removed.  I believe this is a bug (but
>>> not high priority) and I will be submitting a separate patch to fix
>>> this.
>>>
>>> 3b. There are two different approaches for the rule which allows the
>>> dhcp6 server to work.  I could add (actually insert) one rule to the
>>> INPUT chain which accepted the packet if it is "-d ff02::1:2 --dport
>>> 547".  Or, I could add (insert) a rule specifying "-i virbr__" for
>>> every IPv6 device which would be removed when the device was destroyed.
>> OBE - I chose the approach of adding (and removing) a rule per
>> interface.  The rule adds "--dport 547" but does NOT specify "-d
>> ff02::1:2".
> I haven't looked at how dhcp6 works, but if it's anything like dhcp4, the
> IP address is irrelevant and shouldn't be included in the rule. As long
> as your rule specifies both the interface and port, that should be fine
> (take a look at the rules already being added for dhcp4) (and no, I have
> absolutely no idea why we add a rule to allow *tcp* on the dhcp port.
> It's just been that way since the first day I set eyes on the code).
Well, ff02::1:2 does have some meaning in dhcp6.

From what I have seen of "well behaved" clients, the client always uses
port 546 and the server always uses port 547.  But dnsmasq has some
comments/code which indicate that not all clients are "well behaved."

In dhcp6, a little four-step dhcpv6 dance is performed to establish the
client's address:

1. dhcpv6 solicit:  from=fe80::client:546  to=ff02::1:2:547
2. dhcpv6 advertise:  from=fe80::server:547  to=fe80::client:546
3. dhcpv6 request:  from=fe80::client:546  to=ff02::1:2:547
4. dhcpv6 reply:  from=fe80::server:547  to=fe80::client:546

Or, in other words: (1) need dhcpv6, (2) I serve it, (3) OK, give me 
one, and (4) here it is.

Since dnsmasq does its own packet filtering and with bind-interfaces 
having a real meaning, it all works (assuming that radvd has the right 
configuration).
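
As a rough illustration of why a per-interface rule on port 547 is
enough, the exchange above can be written down as data and filtered for
the packets that arrive at the server.  This is a Python sketch only;
the names are made up, the fe80::client / fe80::server strings are the
same placeholders used in the list above, and the ip6tables argument
layout is my assumption of what such a rule looks like, not a copy of
what bridge_driver.c emits.

```python
SERVER_PORT = 547
CLIENT_PORT = 546
ALL_DHCP_AGENTS = "ff02::1:2"  # multicast address the client sends to

# (message, source address, source port, destination address, destination port)
EXCHANGE = [
    ("solicit", "fe80::client", CLIENT_PORT, ALL_DHCP_AGENTS, SERVER_PORT),
    ("advertise", "fe80::server", SERVER_PORT, "fe80::client", CLIENT_PORT),
    ("request", "fe80::client", CLIENT_PORT, ALL_DHCP_AGENTS, SERVER_PORT),
    ("reply", "fe80::server", SERVER_PORT, "fe80::client", CLIENT_PORT),
]


def inbound_to_server(exchange):
    """Names of the messages the host's INPUT chain has to accept."""
    return [name for name, _src, _sport, _dst, dport in exchange
            if dport == SERVER_PORT]


def input_rule_args(bridge):
    """Arguments for a per-interface accept rule (assumed layout).

    Matches on interface and destination port only; no "-d ff02::1:2".
    """
    return ["-I", "INPUT", "-i", bridge, "-p", "udp",
            "--dport", str(SERVER_PORT), "-j", "ACCEPT"]
```

Only the solicit and request messages come in to the server, and both
are addressed to port 547, so matching on interface plus port covers
them without naming ff02::1:2.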
>
>
>> This works.  With the radvd configuration and a dhcp-range specified for
>> an ipv6 subnet, a guest will get a dhcp6 address and RA default route.
> Interesting - so both radvd and dnsmasq are involved, correct?
Yes, why not?  Sometime in the future this should be reconsidered and 
either everything is done by dnsmasq or it is at least an option.  I 
must say that having dnsmasq do everything does have appeal ... one less 
dependency.
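
For reference, the radvd side of the stateful/stateless split comes down
to a couple of flags per interface.  The sketch below (Python, with a
made-up function name) shows the general shape I would expect, using
standard radvd.conf options; it is not the exact text libvirt writes,
and the bridge name and prefix are example values.

```python
def radvd_interface_conf(bridge, prefix, stateful):
    """Build a minimal radvd.conf interface stanza.

    stateful=True  -> clients are told to use dhcp6 for addresses
    stateful=False -> clients autoconfigure from the prefix (SLAAC)
    """
    managed = "on" if stateful else "off"
    autonomous = "off" if stateful else "on"
    return (
        "interface %s\n"
        "{\n"
        "  AdvSendAdvert on;\n"
        "  AdvManagedFlag %s;\n"
        "  prefix %s\n"
        "  {\n"
        "    AdvOnLink on;\n"
        "    AdvAutonomous %s;\n"
        "  };\n"
        "};\n"
    ) % (bridge, managed, prefix, autonomous)
```

Either way radvd still advertises the default route; only the source of
the address changes.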
>
>>> 4.  After getting all of this working to my satisfaction, my next
>>> mountain to climb is VM ... it really does not like network xml
>>> definitions which include a dhcp-range for an ipv6 definition.
>>>
>>> Comments?
>>>
>>> NOTE:  I am implementing all of this assuming that my previous
>>> patches have been accepted ... the ones for creating a dnsmasq
>>> conf-file for parameters rather than using the dnsmasq command-line.
> I have no problem with the "convert from long commandline to conf file"
> patch except for the bit that points to a "conf directory" where user
> supplied conf files can be added. Aside from that part needing to be in
> a separate patch, if we're going to add that kind of configurability, we
> need to do it in a way that will allow us to easily see that the user is
> playing outside the fence (otherwise we spend a lot of time chasing
> "bugs" that end up being caused by user-supplied options).
Originally, I wanted the conf-dir so that I could pop in/out some 
configuration changes that would happen when dnsmasq re-read the 
configuration.  Well, it does not work that way and dnsmasq has to be 
restarted for some of the more interesting changes.  Given this, I
believe that conf-dir serves no useful purpose, so I will remove it and
resubmit the patch.
>
> Because we're in freeze right now I haven't spent a lot of time
> discussing that, but planned to send a message about it when I get a minute.
Right now I am working on getting dhcpv6 functional and, while what I 
have works, there is still more to do.
>
>>> I am sure that someone could spend the time refitting the dhcp6
>>> patches to the old code but why get aggravated?  If you folks do not
>>> want to do things that way, fine, please say so.  But if it is going
>>> to be accepted, then I would like some indication of this.
>> 5. As far as I can tell (or at least this is for dnsmasq),
>> "dhcp-no-override", "enable-tftp", "tftp-root=", and "dhcp-boot=" are
>> all IPv4 only and thus ignored for IPv6 in bridge_driver.  I have not
>> looked to see what network_conf.c does.
> "what network_conf.c does"? Well, it of course doesn't deal directly
> with those options, but the config that feeds into some of those options
> is parsed in virNetworkIPParseXML(), and is only done if the <ip>
> element is ipv4. But then you've already seen that code if you have dhcp
> working for ipv6 - the <dhcp> element is also only parsed for ipv4. The
> format-side code doesn't have that extra check; I guess I figured that
> if there was no way to configure ipv6 with dhcp or tftp, it was safe to
> assume any ip element with dhcp or tftp info was ipv4 anyway.
I have now dived into network_conf.c and a little into dnsmasq.c. Yet 
again I was surprised because most of what was needed for dhcpv6 was a 
little tweaking here and there.

To support dhcp-host for IPv6, I assumed that for IPv6 no MAC
address would be specified since it does not have a defined meaning in
DHCPv6.  Therefore, in dnsmasq.c/hostsfileAdd(), if mac==NULL, I use
the IPv6 format of <hostname>,[<ipv6-addr>] whereas for IPv4 it is
either MAC,<ipv4-addr>,<hostname> or MAC,<ipv4-addr>.

This way most of the code works as is.  Dnsmasq has lots of options and 
different ways that dhcp-host= can be specified, but this is simple and 
I know it works.
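
Spelled out as a sketch (Python here rather than the C in dnsmasq.c; the
function name and the error for a missing hostname are mine, not part of
the patch), the format rule is:

```python
def dhcp_host_line(ip, mac=None, name=None):
    """Format one dhcp-hostsfile entry.

    No MAC means an IPv6 host: <hostname>,[<ipv6-addr>]
    With a MAC (IPv4):         MAC,<ipv4-addr>,<hostname> or MAC,<ipv4-addr>
    """
    if mac is None:
        if name is None:
            # Assumption: without a MAC, the hostname is the only key left.
            raise ValueError("an IPv6 entry needs a hostname")
        return "%s,[%s]" % (name, ip)
    if name is not None:
        return "%s,%s,%s" % (mac, ip, name)
    return "%s,%s" % (mac, ip)
```

So dhcp_host_line("2001:db8::5", name="guest1") gives
"guest1,[2001:db8::5]", while the IPv4 cases are unchanged from what the
existing code writes.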
>
>> 6.  Handling of the info for addn-hosts file and the dhcp-hostsfile.
>> This currently works because things are forced so that one and only
>> one IPv4 dhcp definition will be handled.  With the addition of IPv6
>> dhcp, things fall apart.
>>
>> 6a. addn-hosts:  The addn-hosts file is similar to the /etc/hosts file
>> in both form and function.  The <dns>-<host> specification is done on
>> an interface basis and, thus, the processing of the data and creation
>> of the file should only be done once.
>>
>> 6b. dhcp-hostsfile (dhcp-host=):  This needs to be done at least for
>> every ip definition that is processed for dhcp.  Initially, this will
>> be for dhcp4 only until I can figure out how to do it for dhcp6.
>>
>> 6c. Thus, networkBuildDnsmasqHostsfile() needs to be split into two
>> functions [one for addn-hosts and one for dhcp-hosts]. Additionally,
>> all the functions which call dnsmasqSave() need to be reworked
>> appropriately.
>
> I've actually never liked the "dnsmasqContext" concept, as it seems like
> overkill and has conceptual problems such as what you've described. I
> would be just as happy with replacements that were simpler and easier to
> deal with. I think part of what complicates dnsmasq.[hc] is that it's in
> the util directory, so it isn't allowed to understand the contents of
> virNetworkDef, and must instead be sent the list of hosts in a simpler
> format. If, instead, there was a file src/network/bridge_dnsmasq.[hc],
> that could have functions that took virNetworkDef as an arg, and just
> immediately return a string (or, in a separate function, write to a
> file). That should simplify calling, and writing tests. And existing
> dnsmasq-related functions in bridge_driver.c could be moved there as
> well, reducing clutter in bridge_driver.c. (Of course I'm saying all
> this without ever seriously considering it, just talking off the cuff,
> so I may be completely wrong :-)
Right now I have fixed things up so they work.  I would like to leave 
this as an exercise for someone else [or at least a later time].

Besides, isn't bridge_driver.c pretty much for dnsmasq only?
>
>
>> 7. So far, the only things I have done involving the xml specification
>> is to enable <dhcp> for IPv6.  However, the  xml to specify a dns
>> addn-hosts appears, IMHO, to be overly verbose and complicated.
> It is made that way because a single IP address may have multiple
> hostnames associated with it, and we want to avoid having multiple
> methods of describing the same thing. What you propose in the next
> couple sentences was already proposed, tried, and rejected when dns host
> support was originally added.
OK.
>
>>    So, while allowing the current xml to be valid, I suggest adding an
>> alternate form which is similar to that used for dhcp-host.  An
>> example is "<host ip='1.2.3.4' name='one' />"
>>
> There are a few places in libvirt's XML where the same thing can be
> expressed in two different ways, but that is only done when necessary
> because the existing XML is unable to completely describe the new
> functionality but backward compatibility is required. It creates all
> sorts of problems when formatting the config back into XML though (which
> of the two do you choose? Or do you do both? Either of these is a bad
> answer), therefore we definitely don't want to do that except in cases
> where it is absolutely necessary; this isn't one of those cases.
This is why I have tried to tweak and bend things so it works (and 
because it mostly worked before).

And now, as the saying goes, one more thing.

I now realize that I am going to need to get into virsh net-update since 
I am adding things to the xml specification and net-update will need to 
differentiate between dhcp4 and dhcp6 changes.

Another thought that occurs to me is whether any consideration has
been given to having a "virsh net-restart" which would just restart
dnsmasq and radvd.  Typing stuff in on the command line for net-update
is a little prone to typos.  Wouldn't having net-edit and net-restart do
what is intended for net-update?  Maybe there is a way to have
net-update do the equivalent of net-edit/net-restart.  For example, if
you only did "virsh net-update <network>" it would do the edit and then
the restart.

BTW, as I mentioned in another message, net-update for <dns> <host> does 
not work.

Gene