[libvirt] [PATCH v1 02/11] bandwidth: Create hierarchical shaping classes

Michal Privoznik mprivozn at redhat.com
Mon Dec 3 15:56:20 UTC 2012


On 30.11.2012 17:52, Laine Stump wrote:
> On 11/19/2012 11:51 AM, Michal Privoznik wrote:
>> These classes can borrow unused bandwidth. Basically,
>> only egress qdsics can have classes, therefore we can
>> do this kind of traffic shaping only on host's outgoing,
>> that is domain's incoming traffic.
>> ---
>>  src/lxc/lxc_process.c         |    3 ++-
>>  src/network/bridge_driver.c   |    3 ++-
>>  src/qemu/qemu_command.c       |    3 ++-
>>  src/qemu/qemu_driver.c        |    2 +-
>>  src/util/virnetdevbandwidth.c |   33 +++++++++++++++++++++++++++------
>>  src/util/virnetdevbandwidth.h |    4 +++-
>>  src/util/virnetdevmacvlan.c   |    2 +-
>>  7 files changed, 38 insertions(+), 12 deletions(-)
>>
>> diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c
>> index 079bc3a..91ce2d3 100644
>> --- a/src/lxc/lxc_process.c
>> +++ b/src/lxc/lxc_process.c
>> @@ -335,7 +335,8 @@ static int virLXCProcessSetupInterfaceBridged(virConnectPtr conn,
>>          goto cleanup;
>>  
>>      if (virNetDevBandwidthSet(net->ifname,
>> -                              virDomainNetGetActualBandwidth(net)) < 0) {
>> +                              virDomainNetGetActualBandwidth(net),
>> +                              false) < 0) {
>>          virReportError(VIR_ERR_INTERNAL_ERROR,
>>                         _("cannot set bandwidth limits on %s"),
>>                         net->ifname);
>> diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
>> index c153d36..256b601 100644
>> --- a/src/network/bridge_driver.c
>> +++ b/src/network/bridge_driver.c
>> @@ -2240,7 +2240,8 @@ networkStartNetworkVirtual(struct network_driver *driver,
>>          VIR_FORCE_CLOSE(tapfd);
>>      }
>>  
>> -    if (virNetDevBandwidthSet(network->def->bridge, network->def->bandwidth) < 0) {
>> +    if (virNetDevBandwidthSet(network->def->bridge,
>> +                              network->def->bandwidth, true) < 0) {
>>          virReportError(VIR_ERR_INTERNAL_ERROR,
>>                         _("cannot set bandwidth limits on %s"),
>>                         network->def->bridge);
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index 22bb209..c1f8680 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -292,7 +292,8 @@ qemuNetworkIfaceConnect(virDomainDefPtr def,
>>  
>>      if (tapfd >= 0 &&
>>          virNetDevBandwidthSet(net->ifname,
>> -                              virDomainNetGetActualBandwidth(net)) < 0) {
>> +                              virDomainNetGetActualBandwidth(net),
>> +                              false) < 0) {
>>          virReportError(VIR_ERR_INTERNAL_ERROR,
>>                         _("cannot set bandwidth limits on %s"),
>>                         net->ifname);
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index 595c452..9ae0c1b 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -8960,7 +8960,7 @@ qemuDomainSetInterfaceParameters(virDomainPtr dom,
>>                     sizeof(*newBandwidth->out));
>>          }
>>  
>> -        if (virNetDevBandwidthSet(net->ifname, newBandwidth) < 0) {
>> +        if (virNetDevBandwidthSet(net->ifname, newBandwidth, false) < 0) {
>>              virReportError(VIR_ERR_INTERNAL_ERROR,
>>                             _("cannot set bandwidth limits on %s"),
>>                             device);
>> diff --git a/src/util/virnetdevbandwidth.c b/src/util/virnetdevbandwidth.c
>> index bddb788..fcc6b56 100644
>> --- a/src/util/virnetdevbandwidth.c
>> +++ b/src/util/virnetdevbandwidth.c
>> @@ -45,17 +45,21 @@ virNetDevBandwidthFree(virNetDevBandwidthPtr def)
>>   * virNetDevBandwidthSet:
>>   * @ifname: on which interface
>>   * @bandwidth: rates to set (may be NULL)
>> + * @hierarchical_class: whether to create hierarchical class
>>   *
>>   * This function enables QoS on specified interface
>>   * and set given traffic limits for both, incoming
>>   * and outgoing traffic. Any previous setting get
>> - * overwritten.
>> + * overwritten. If @hierarchical_class is TRUE, create
>> + * hierarchical class. It is used to guarantee minimal
>> + * throughput ('floor' attribute in NIC).
>>   *
>>   * Return 0 on success, -1 otherwise.
>>   */
>>  int
>>  virNetDevBandwidthSet(const char *ifname,
>> -                      virNetDevBandwidthPtr bandwidth)
>> +                      virNetDevBandwidthPtr bandwidth,
>> +                      bool hierarchical_class)
>>  {
>>      int ret = -1;
>>      virCommandPtr cmd = NULL;
>> @@ -71,7 +75,7 @@ virNetDevBandwidthSet(const char *ifname,
>>  
>>      virNetDevBandwidthClear(ifname);
>>  
>> -    if (bandwidth->in) {
>> +    if (bandwidth->in && bandwidth->in->average) {
>>          if (virAsprintf(&average, "%llukbps", bandwidth->in->average) < 0)
>>              goto cleanup;
>>          if (bandwidth->in->peak &&
>> @@ -83,15 +87,32 @@ virNetDevBandwidthSet(const char *ifname,
>>  
>>          cmd = virCommandNew(TC);
>>          virCommandAddArgList(cmd, "qdisc", "add", "dev", ifname, "root",
>> -                             "handle", "1:", "htb", "default", "1", NULL);
>> +                             "handle", "1:", "htb", "default",
>> +                             hierarchical_class ? "2" : "1", NULL);
> 
> Don't you really only need to do this if floor has been set? If you
> create the hierarchical class and don't need it, does this result in any
> extra overhead? If so, I think you should do (hierarchical_class &&
> bandwidth->floor).

Yes I do need that unless I want to create hierarchical classes when the
first interface with bandwidth->floor is plugged in. Note that
hierarchical_class is true only on bridges and since only interfaces are
allowed to have a floor, only one of bandwith->floor and
hierarchical_class can be non-zero. However, dynamically creating
classes would complicate the code even more and doesn't really provide
noticeable trade off between overhead and code complexity.

>>          if (virCommandRun(cmd, NULL) < 0)
>>              goto cleanup;
>>  
>> +        if (hierarchical_class) {
> 
> similar
> 
>> +            /* If we are creating hierarchical class, all non guaranteed traffic
>> +             * goes to 1:2 class which will adjust 'rate' dynamically as NICs with
>> +             * guaranteed throughput are plugged and unplugged. Class 1:1 is there
>> +             * so we don't exceed the maximum limit for network. For each NIC with
>> +             * guaranteed throughput a separate classid will be created.
>> +             * NB '1:' is just a shorter notation of '1:0' */
> 
> I have to admit that I don't follow the full discussion of the details
> of this, but trust that you've done some testing to verify that the
> results are as desired when floor is specified, and identical to
> previous code if floor isn't specified.

You bet I did :)

> 
>> +            virCommandFree(cmd);
>> +            cmd = virCommandNew(TC);
>> +            virCommandAddArgList(cmd, "class", "add", "dev", ifname, "parent",
>> +                                 "1:", "classid", "1:1", "htb", "rate", average,
>> +                                 "ceil", peak ? peak : average, NULL);
>> +            if (virCommandRun(cmd, NULL) < 0)
>> +                goto cleanup;
>> +        }
>>          virCommandFree(cmd);
>>          cmd = virCommandNew(TC);
>>          virCommandAddArgList(cmd,"class", "add", "dev", ifname, "parent",
>> -                             "1:", "classid", "1:1", "htb", NULL);
>> -        virCommandAddArgList(cmd, "rate", average, NULL);
>> +                             hierarchical_class ? "1:1" : "1:", "classid",
>> +                             hierarchical_class ? "1:2" : "1:1", "htb",
>> +                             "rate", average, NULL);
>>  
>>          if (peak)
>>              virCommandAddArgList(cmd, "ceil", peak, NULL);
>> diff --git a/src/util/virnetdevbandwidth.h b/src/util/virnetdevbandwidth.h
>> index 35f8b89..d308ab2 100644
>> --- a/src/util/virnetdevbandwidth.h
>> +++ b/src/util/virnetdevbandwidth.h
>> @@ -42,7 +42,9 @@ struct _virNetDevBandwidth {
>>  
>>  void virNetDevBandwidthFree(virNetDevBandwidthPtr def);
>>  
>> -int virNetDevBandwidthSet(const char *ifname, virNetDevBandwidthPtr bandwidth)
>> +int virNetDevBandwidthSet(const char *ifname,
>> +                          virNetDevBandwidthPtr bandwidth,
>> +                          bool hierarchical_class)
>>      ATTRIBUTE_NONNULL(1) ATTRIBUTE_RETURN_CHECK;
>>  int virNetDevBandwidthClear(const char *ifname)
>>      ATTRIBUTE_NONNULL(1);
>> diff --git a/src/util/virnetdevmacvlan.c b/src/util/virnetdevmacvlan.c
>> index bd26ba9..cfe3f87 100644
>> --- a/src/util/virnetdevmacvlan.c
>> +++ b/src/util/virnetdevmacvlan.c
>> @@ -925,7 +925,7 @@ create_name:
>>          rc = 0;
>>      }
>>  
>> -    if (virNetDevBandwidthSet(cr_ifname, bandwidth) < 0) {
>> +    if (virNetDevBandwidthSet(cr_ifname, bandwidth, false) < 0) {
>>          virReportError(VIR_ERR_INTERNAL_ERROR,
>>                         _("cannot set bandwidth limits on %s"),
>>                         cr_ifname);
> 
> Huh. I thought I had remembered that bandwidth (like iptables rules)
> couldn't be applied to macvtap interfaces. Is that wrong? (unrelated to
> this patchset anyway, just curious).

Not on the level we are doing things. In linux, we have these traffic
shaping classes which actually limit the flow of packets which resides
within them. To send a packet into appropriate class is job for the
filters. These can look at the packet headers and decide to which class
is packet sent. The filter can see full range of headers in terms of
ISO/OSI model - that is, a decision can be made depending on TCP flag
set, src/dst IP address, src/dst MAC address, iptables marks (iptables
... -j MARK --set-mark <some-mark>). And since libvirt doesn't know
anything about IP addresses of guest we are left only with 2nd level of
the ISO/OSI model - MAC addresses.
Problem with iptables marks was - IIRC - they doesn't apply on macvtap
devices. For some reason the mark weren't applied in my testing so I've
brushed the idea aside. In addition, I think you have to enable
something in kernel for packets from a macvtap NIC to go through a bridge.

> 
> ACK with the above questions appropriately answered.
> 
> --
> libvir-list mailing list
> libvir-list at redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
> 




More information about the libvir-list mailing list