[libvirt] [PATCH] network: Add support for configuring dhcp lease time

Nehal J Wani nehaljw.kkd1 at gmail.com
Fri Sep 23 19:42:47 UTC 2016


On Fri, Sep 23, 2016 at 9:26 PM, Laine Stump <laine at laine.org> wrote:
> On 09/23/2016 08:24 AM, Nehal J Wani wrote:
>>
>> On Fri, Sep 23, 2016 at 4:51 PM, Peter Krempa <pkrempa at redhat.com> wrote:
>>>
>>> On Fri, Sep 23, 2016 at 15:33:46 +0530, Nehal J Wani wrote:
>>>>
>>>> On Fri, Sep 23, 2016 at 1:20 PM, Peter Krempa <pkrempa at redhat.com>
>>>> wrote:
>>>>>
>>>>> On Fri, Sep 23, 2016 at 12:19:49 +0530, Nehal J Wani wrote:
>>>>>>
>>>>>> The default dhcp lease time set by dnsmasq is only one hour, which can
>>>>>> be
>>>>>> pretty small for developers relying on ip address(es) to be consistent
>>>>>> across reboots.
>>>>>
>>>>> This should be handled through the leases file which is supposed to
>>>>> remember the IP addresses given to a host despite a short lease time.
>>>>> Is
>>>>> that broken for you?
>>>>
>>>> We can't ask developers to parse the .status file and the DHCP leases
>>>> API would never show expired leases. Or am I misinterpreting what you
>>>
>>> If a lease is expired then the VM is necessarily not using it for a
>>> while.
>>>
>>>> mean by 'remember'? Even dnsmasq depends on the output for the init
>>>
>>> I meant 'remember' as dnsmasq should assign the same address even if the
>>> lease was expired for a while.
>>
>> Yes, but then you are assuming that dnsmasq hasn't been restarted
>> after the lease has expired. User may switch distros in between or
>> shutdown is computer, etc.
>>>>
>>>> event sent to the leaseshelper program, which will not output lease
>>>> which have expired.
>>>
>>> So then maybe this is the actual bug. The expired leases still can be
>>> renewed and AFAIK dnsmasq reloads them even for the expired entries so
>>> we probably should remove that part from the leaseshelper.
>>
>> After addition of leaseshelper, because of the --leasefile-ro option
>> that we specify to dnsmasq, the leases database is entirely controlled
>> by it. If the developer shuts down his VM, the lease expires and then
>> he restarts the computer (which will involve restarting the dnsmasq
>> service), dnsmasq will rely on the leaseshelper program for the
>> initial input (aka init event). The first parameter from the 'init'
>> output is the expiry time of the lease. If the lease has already
>> expired, then we can't print it in virLeasePrintLeases(). Hence I fail
>> to understand why you would think that this may be a bug.
>>
>> Why would libvirt/leaseshelper renew a certain lease? Isn't that the
>> job of the client?
>
>
> I'm pretty sure that's not what Peter is suggesting. What he's saying is
> that dnsmasq normally keeps track of even leases that have expired and tries
> not to give them out to others who are simply requesting "give me an
> address", so that when a client returns at some later time they have a
> chance of getting back the same address they had before even if their lease
> had expired (as long as it wasn't needed for some other client due to all
> the other available addresses being used).
>
> I at least *thought* this worked with a standard dnsmasq setup.
>
>>>>>> This patch adds support for setting a lease time in the network
>>>>>> definition.
>>>>>> For now, all IP ranges under one <dhcp> will share a common leasetime.
>>>>>>
>>>>>> An example:
>>>>>> <dhcp>
>>>>>>    <leasetime units='hours'>12</leasetime>
>>>>>> </dhcp>
>>>>>
>>>>> This sounds like a reasonable feature, but I don't really want to sell
>>>>> it as a workaround for the problem you've described above. Setting the
>>>>> lease time is useful, but guests should get the same IP addresses
>>>>> thanks to the lease file.
>>>>
>>>> Actually, the major reason behind this feature was mentioned at
>>>> https://www.redhat.com/archives/libvir-list/2016-April/msg01071.html
>
>
>
> I'm a bit curious if you had any contact with the author of that patch other
> than CC'ing them to your patch mail. From the BZ
> (https://bugzilla.redhat.com/show_bug.cgi?id=913446 ), it sounds like he was
> still intending to update his patch but was bogged down implementing some
> extra functionality.
>
>>> As I've said, adding ability to specify longer lease time may be
>>> desired, but the fact that you don't get the address most probably is
>>> not related to this and should be addressed separately.
>>>
>>>> Let's say that the lease time is 2 hours and I shutdown my VM. Now I
>>>> reboot it after 2hours 5mins, the lease will be gone, since it will be
>>>
>>> The lease will expire, but dnsmasq still can reuse that address for the
>>> particular mac address until the pool is depleted.
>>>
>>>> expired. The .status file is read and the leases which have expired
>>>> are removed. So, there is no guarantee that my VM will get the same IP
>>>
>>> Again, this looks like the actual bug.
>
>
> Agreed. Here's another report of what I think is the same bug:
>
> https://www.redhat.com/archives/libvir-list/2016-September/msg00739.html
>
> and a patch that attempts to fix it in a different manner (by adding
> "dhcp-authoritative" to the dnsmasq.conf file):
>
> https://www.redhat.com/archives/libvir-list/2016-September/msg00909.html
>
> I actually don't think that either method is the correct solution to the
> problem.
>
> To back up for a minute - I think the source of everyone's problems is that
> dnsmasq as configured by current libvirt *completely* forgets leases as soon
> as they have expired, so that when a guest asks for the same address again,
> it is refused (because 1) although dnsmasq doesn't have any info that the
> address is currently assigned to anyone else, it is concerned that it may
> not be the only dhcp server on the network, and some other server may have
> given it out, and 2) what business does some lowly client have demanding
> which address it wants from dnsmasq anyway?!? </tongueInCheek>).

I was wrong. Peter and you both are right. There does seem to be a bug
in the leases helper program. All it has to be, is a program to
maintain a database and follow certain events. It was never supposed
to control the expiry of leases on its own. Here is what it does in
case of any event:

112         /* Check whether lease has expired or not */
113         if (expirytime < currtime) {
114             i++;
115             continue;
116         }

Which, IMO, is wrong. When init event takes place, the leases helper
is supposed to print all leases, irrespective of whether or not they
have expired. Since it doesn't do so, dnsmasq doesn't come to know
about the expired lease.
According to the discussion at
http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2014q2/008657.html
, the output from the init event has to be exactly as same as that of
the leases file which dnsmasq used to maintain previously for
libvirt-managed networks.

In dnsmasq's code, it almost treats both databases to be same:

Without --leasefile-ro:

leasestream = daemon->lease_stream = fopen(daemon->lease_file, "a+");

With --leasefile-ro and --dhcp-script:

          strcpy(daemon->dhcp_buff, daemon->lease_change_command);
          strcat(daemon->dhcp_buff, " init");
          leasestream = popen(daemon->dhcp_buff, "r");

For an expired lease, dnsmasq sends a 'del' event anyway, so leases
helper doesn't need to worry about expiry at all.

I tried it out with hard coded 2m dhcp lease time and confirmed that
we are not sending the expired leases info on the init event. I also
confirmed that if leases helper doesn't control the expiry time on its
own, and prints the lease info on the init event, dnsmasq properly
sends the del event to remove it from our database instantly.

With the above change, IMO, we can rely on the fact that dnsmasq will
always know about the expired leases too.


>
> Martin's patch tries to solve the problem with "dhcp-authoritative" which,
> as far as I understand, tells dnsmasq "you are the keeper of *all* lease
> information on this network, so if you think the address is unused, it
> really is unused"; dnsmasq uses this information to freely grant a guest any
> address it asks for, as long as there is no current lease for it. This
> sounds troublesome to me - If client A's lease has expired (because it's
> turned off for a bit), is it okay for client B to come barging in and insist
> on grabbing the address that client A has just lost even though there are
> many other addresses still available? Sure, technically it's legal, but it
> seems unnecessarily disruptive.
>
> On the other hand, this patch tries to solve the problem by providing a way
> to lengthen the lease. As Peter points out a few paragraphs below,
> unnecessarily lengthening the lease is a bad idea too, because it will lead
> to complete depletion of the address pool in a network that has a lot of
> clients who are only transiently online (think of a test setup where maybe
> 2000 new guests are created, used, and destroyed in a day).
>
>
> I think the *real* solution is to fix the lease handling so that dnsmasq
> remembers leases after they've expired (assuming that can be done once
> leasefile-ro is set). They would be marked as "expired", and probably not
> even reported externally, but all their info would still be there internally
> for dnsmasq's use when considering what to do with incoming requests for
> specific IP addresses.
>
>
>>>
>>>> address again. Hence the developer would want to increase the lease
>>>
>>> No it's not guaranteed, but very likely.
>>>
>>>> time to something big, like 1 week, and go to sleep happily when the
>>>> VM is turned off at night :)
>>>
>>> That's a rather weak point. Setting too long lease time may on the other
>>> hand exhaust the pool. Usually the configured DHCP lease times are
>>> rather short.
>
>
> Again agreed (see above).
>
>
>>>
>>>>>> The value for attribute 'units' can be
>>>>>> seconds/hours/days/weeks/s/h/d/w.
>>>>>> If the leasetime specified is -1, it is considered to be infinite.
>>>
>>> [...]
>>>
>>>>>> diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng
>>>>>> index 1b4f980..023edfb 100644
>>>>>> --- a/docs/schemas/basictypes.rng
>>>>>> +++ b/docs/schemas/basictypes.rng
>>>>>> @@ -13,6 +13,12 @@
>>>>>>         <param name='pattern'>[0-9]+</param>
>>>>>>       </data>
>>>>>>     </define>
>>>>>> +  <!-- Our long doesn"t allow a leading "+" in its lexical form -->
>>>>>> +  <define name='long'>
>>>>>> +    <data type='long'>
>>>>>
>>>>> This looks weird. Also why would you need a separate type for this?
>>>>>
>>>> I didn't find any available data type for signed integers. Which one
>>>> would you rather use? (Signed, because we want to support -1 too)
>>>
>>> A quick grep revealled that there's a builtin "long" type used in at
>>> least one case.
>>>
>>>>>> +      <param name='pattern'>-?[0-9]+</param>
>>>>>> +    </data>
>>>
>>> [...]
>>>
>>>>>> diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
>>>>>> index aa39776..d2372f2 100644
>>>>>> --- a/src/conf/network_conf.c
>>>>>> +++ b/src/conf/network_conf.c
>>>>>> @@ -1032,21 +1032,76 @@ virNetworkDHCPHostDefParseXML(const char
>>>>>> *networkName,
>>>
>>> [...]
>>>
>>>>>> +
>>>>>> +    save = ctxt->node;
>>>>>> +    ctxt->node = node;
>>>>>> +
>>>>>> +    leasetimeRV = virXPathLongLong("string(./text())", ctxt,
>>>>>> &leasetime);
>>>>>> +    if (leasetimeRV == -2) {
>>>>>> +        virReportError(VIR_ERR_XML_ERROR,
>>>>>> +                       _("Invalid long long value specified for
>>>>>> leasetime "
>>>>>> +                         "in definition of network '%s'"),
>>>>>> +                       networkName);
>>>>>> +        goto cleanup;
>>>>>> +    } else {
>>>>>> +        if (leasetime < 0) {
>>>>>> +            leasetime = -1;       /* infinite */
>>>>>> +        } else {
>>>>>> +            unit = virXPathString("string(./@unit)", ctxt);
>>>>>> +            if (virScaleTime(&leasetime, unit, scale,
>>>>>> +                             VIR_NETWORK_DHCP_LEASE_TIME_MAX) < 0) {
>>>>>> +                // let virScaleTime() report the appropriate error
>>>>>
>>>>> '//' comments should not be used in libvirt. Also the comment does not
>>>>> make sense since it's obvious what is happening.
>>>>
>>>> I have a query regarding this. If we rely on virScaleTime to report
>>>> the appropriate error, it wouldn't be including the network name. Is
>>>
>>> That happens in quite a few cases.
>>>
>>>> that bad? Should I be modifying virScaleTime to return different
>>>> return values for different errors and shape the error message
>>>> appropriately here? (That would sort be inconsistent with
>>>
>>> No, not really.
>>>
>>>> virScaleInteger which doesn't have different error codes for different
>>>> scenarios)
>>>>
>>>> Whoops on the comment. Should have read hacking.html more carefully!
>>>
>>> [...]
>>>
>>>>>> @@ -2675,6 +2731,13 @@ virNetworkIPDefFormat(virBufferPtr buf,
>>>>>>           virBufferAddLit(buf, "<dhcp>\n");
>>>>>>           virBufferAdjustIndent(buf, 2);
>>>>>>
>>>>>> +        if (def->leasetime) {
>>>>>> +            virBufferAddLit(buf, "<leasetime");
>>>>>> +            virBufferAsprintf(buf, " unit='seconds'>%lld",
>>>>>
>>>>> Why aren't we remembering the unit the user used originally? Other
>>>>> places can't do that since the unit was added later, but with this
>>>>> newly
>>>>> added code you can remember and format back the unit just fine.
>>>>>
>>>> Actually, I followed what was originally done for memory units. Even
>>>> if user specifies MiB, everything is stored in KiB and on dumpxml, the
>>>
>>> This was because libvirt added support for scaled integers later than
>>> the original code. Thus we needed to keep compatibility. This is not
>>> required for new elements.
>>>
>>>> user is shown the unit in KiB. In case we want to remember the unit
>>>> specified by the user, we'll have to add another member in the
>>>> _virNetworkIPDef struct. Since we haven't done this anywhere, I was
>>>> apprehensive about having inconsistent behaviour for 'unit'.
>>>>
>>>>>> +                              def->leasetime);
>>>>>> +            virBufferAddLit(buf, "</leasetime>\n");
>>>>>> +        }
>>>
>>> [...]
>>>
>>>>>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>>>>>> index b57a195..81a60e6 100644
>>>>>> --- a/src/util/virutil.c
>>>>>> +++ b/src/util/virutil.c
>>>>>> @@ -275,6 +275,76 @@ virHexToBin(unsigned char c)
>>>>>>       }
>>>>>>   }
>>>>>>
>>>>>> +/**
>>>>>> + * virScaleTime:
>>>>>
>>>>> This change should be done as a separate patch.
>>>>>
>>>>>> + * @value: pointer to the integer which is supposed to hold value
>>>>>> + * @unit: pointer to the string holding the unit
>>>>>> + * @scale: integer holding the value of scale
>>>>>
>>>>> ^^ at least this value is not documented enough.
>>>>>
>>>>>> + * @limit: upper limit on scaled value
>>>>>> + *
>>>>>> + * Scale an integer @value in-place by an optional case-insensitive
>>>>>> @unit,
>>>>>
>>>>> The statement about case insensitivity doesn't look to be true
>>>>> according
>>>>> to the code blow.
>>>>
>>>> With case insensitivity, 'Seconds' and 'SEconDs' would also be valid
>>>> inputs, and it would be difficult to accommodate it in the .rng file,
>>>> plus I was told that if attributes are words, then we shouldn't be
>>>> going for case-insensitivity anyway. I'll fix this in the next patch
>>>> set.
>
>
> I had already said in IRC when you asked about representing case
> insensitivity in the RNG that any attribute values should *not* be made
> case-insensitive. Case insensitivity is for the FAT32 filesystem, not for
> libvirt XML.
>
>>> Either case, the comment and code should not behave differently. The
>>> comment looks copied from the memory number scaler, but the code
>>> certainly doesn't behave that way.
>>>
>>>>>> + * defaulting to @scale if @unit is NULL or empty. Recognized units
>>>>>> include
>>>>>> + * (w)eeks, (d)ays, (h)ours, (m)inutes and (s)econds. Ensure that the
>>>>>> result
>
>
> We don't want to have "w" and "weeks" represent the same thing. A single
> string should represent a single value. This can be easily handled by
> setting up and enum, then adding VIR_ENUM_DECL() and VIR_ENUM_IMPL() for it.
> We're not reimplementing /usr/bin/date.
>
>
>>>>>> + * does not exceed @limit.  Return 0 on success, -1 with error
>>>>>> message raised
>>>>>> + * on failure.
>>>>>
>>>>> This does not document the scale of the returned value at all.
>>>>>
>>>>>> + */
>>>>>> +int
>>>>>> +virScaleTime(long long *value, const char *unit,
>>>>>> +             long long scale, long long limit)
>>>>>
>>>>> Scale can't really be negative. Also the result being negative does not
>>>>> make much sense to me.
>>>>
>>>> Should this be changed to unsigned long long or unsigned long? Since
>>>> leasetime has to be a signed integer, what would be the ideal way to
>>>> pass it to the modified function? Should I typecast it?
>
>
> From above, it seems you want it to accept a negative value so you can
> specify "-1" to be "unlimited"?' What about "0"? For that matter, do you
> *really* want to allow that?
>
>
>
>>> Negative scale doesn't make sense. For time values, negative time value
>>> might make sense for relative time values. You need to check then that
>>> the overflow check works for negative numbers as well, which it doesn't
>>> since it was just copied from the memory integer scaler.
>>>
>>>
>>
>>
>



-- 
Nehal J Wani




More information about the libvir-list mailing list