[libvirt] [PATCH] Add iptables rule to fixup DHCP response checksum.

Daniel Veillard veillard at redhat.com
Tue Jul 13 09:05:29 UTC 2010


On Tue, Jul 13, 2010 at 02:17:17AM -0400, Laine Stump wrote:
> From: Laine Stump <laine at redhat.com>
> 
> (I don't know whether or not we want to commit this upstream yet - the
> proposed iptables and kernel module backend for the changes have been
> posted but not yet committed upstream. On the other hand, the new
> libvirt code ends up simply printing a warning message if the
> necessary iptables support isn't yet in place.)

  Well in general we try to avoid this kind of preventive code change
as we're not 100% sure the support will ever make it unchanged in
upstream for dependancies. But that should not stop us from reviewing
the patches !

> This patch attempts to take advantage of a newly added netfilter
> module to correct for a problem with some guest DHCP client
> implementations when used in conjunction with a DHCP server run on the
> host systems with packet checksum offloading enabled.
> 
> The problem is that, when the guest uses a RAW socket to read the DHCP
> response packets, the checksum hasn't yet been fixed by the IP stack,
> so it is incorrect.
> 
> The fix implemented here is to add a rule to the POSTROUTING chain of
> the mangle table in iptables that fixes up the checksum for packets on
> the virtual network's bridge that are destined for the bootpc port (ie
> "dhcpc", ie port 68) port on the guest.
> 
> Only very new versions of iptables will have this support (it has been
> submitted upstream, but not yet committed), so a failure to add this
> rule only results in a warning message. The iptables patch is here:
> 
>   http://patchwork.ozlabs.org/patch/58525/
> 
> A corresponding kernel module patch is also required (the backend of
> the iptables patch) and has been submitted, but I don't have the
> details for that (I tested using a pre-built image I received from the
> developer, Michael Tsirkin).
> ---
>  src/libvirt_private.syms    |    2 +
>  src/network/bridge_driver.c |   17 ++++++++++
>  src/util/iptables.c         |   71 +++++++++++++++++++++++++++++++++++++++++++
>  src/util/iptables.h         |    6 ++++
>  4 files changed, 96 insertions(+), 0 deletions(-)
> 
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 778ceb1..d81d4cf 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -328,6 +328,7 @@ iptablesAddForwardAllowRelatedIn;
>  iptablesAddForwardMasquerade;
>  iptablesAddForwardRejectIn;
>  iptablesAddForwardRejectOut;
> +iptablesAddOutputFixUdpChecksum;
>  iptablesAddTcpInput;
>  iptablesAddUdpInput;
>  iptablesContextFree;
> @@ -339,6 +340,7 @@ iptablesRemoveForwardAllowRelatedIn;
>  iptablesRemoveForwardMasquerade;
>  iptablesRemoveForwardRejectIn;
>  iptablesRemoveForwardRejectOut;
> +iptablesRemoveOutputFixUdpChecksum;
>  iptablesRemoveTcpInput;
>  iptablesRemoveUdpInput;
>  
> diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
> index 72255c1..c4480ff 100644
> --- a/src/network/bridge_driver.c
> +++ b/src/network/bridge_driver.c
> @@ -781,6 +781,19 @@ networkAddIptablesRules(struct network_driver *driver,
>               !networkAddRoutingIptablesRules(driver, network))
>          goto err8;
>  
> +    /* If we are doing local DHCP service on this network, attempt to
> +     * add a rule that will fixup the checksum of DHCP response
> +     * packets back to the guests (but report failure without
> +     * aborting, since not all iptables implementations support it).
> +     */
> +
> +    if ((network->def->ipAddress || network->def->nranges) &&
> +        (iptablesAddOutputFixUdpChecksum(driver->iptables,
> +                                         network->def->bridge, 68) != 0)) {
> +        VIR_WARN("Could not add rule to fixup DHCP response checksums "
> +                 "on network '%s'", network->def->name);
> +    }
> +
>      return 1;
>  
>   err8:
> @@ -811,6 +824,10 @@ networkAddIptablesRules(struct network_driver *driver,
>  static void
>  networkRemoveIptablesRules(struct network_driver *driver,
>                           virNetworkObjPtr network) {
> +    if (network->def->ipAddress || network->def->nranges) {
> +        iptablesRemoveOutputFixUdpChecksum(driver->iptables,
> +                                           network->def->bridge, 68);
> +    }

  hum, there is no return code to check, if yes the block isn't
  necessary. Also a empty line should be added to separate from next if
  block, but it's purely cosmetic :-)

>      if (network->def->forwardType != VIR_NETWORK_FORWARD_NONE) {
>          if (network->def->forwardType == VIR_NETWORK_FORWARD_NAT) {
>              iptablesRemoveForwardMasquerade(driver->iptables,
> diff --git a/src/util/iptables.c b/src/util/iptables.c
> index d06b857..9b888e5 100644
> --- a/src/util/iptables.c
> +++ b/src/util/iptables.c
> @@ -60,6 +60,7 @@ struct _iptablesContext
>      iptRules *input_filter;
>      iptRules *forward_filter;
>      iptRules *nat_postrouting;
> +    iptRules *mangle_postrouting;
>  };
>  
>  static void
> @@ -188,6 +189,9 @@ iptablesContextNew(void)
>      if (!(ctx->nat_postrouting = iptRulesNew("nat", "POSTROUTING")))
>          goto error;
>  
> +    if (!(ctx->mangle_postrouting = iptRulesNew("mangle", "POSTROUTING")))
> +        goto error;
> +
>      return ctx;
>  
>   error:
> @@ -210,6 +214,8 @@ iptablesContextFree(iptablesContext *ctx)
>          iptRulesFree(ctx->forward_filter);
>      if (ctx->nat_postrouting)
>          iptRulesFree(ctx->nat_postrouting);
> +    if (ctx->mangle_postrouting)
> +        iptRulesFree(ctx->mangle_postrouting);
>      VIR_FREE(ctx);
>  }
>  
> @@ -753,3 +759,68 @@ iptablesRemoveForwardMasquerade(iptablesContext *ctx,
>  {
>      return iptablesForwardMasquerade(ctx, network, physdev, REMOVE);
>  }
> +
> +
> +static int
> +iptablesOutputFixUdpChecksum(iptablesContext *ctx,
> +                             const char *iface,
> +                             int port,
> +                             int action)
> +{
> +    char portstr[32];
> +
> +    snprintf(portstr, sizeof(portstr), "%d", port);
> +    portstr[sizeof(portstr) - 1] = '\0';
> +
> +    return iptablesAddRemoveRule(ctx->mangle_postrouting,
> +                                 action,
> +                                 "--out-interface", iface,
> +                                 "--protocol", "udp",
> +                                 "--destination-port", portstr,
> +                                 "--jump", "CHECKSUM", "--checksum-fill",
> +                                 NULL);
> +}
> +
> +/**
> + * iptablesAddOutputFixUdpChecksum:
> + * @ctx: pointer to the IP table context
> + * @iface: the interface name
> + * @port: the UDP port to match
> + *
> + * Add an rule to the mangle table's POSTROUTING chain that fixes up the
> + * checksum of packets with the given destination @port.
> + * the given @iface interface for TCP packets.
> + *
> + * Returns 0 in case of success or an error code in case of error.
> + * (NB: if the system's iptables does not support checksum mangling,
> + * this will return an error, which should be ignored.)
> + */
> +
> +int
> +iptablesAddOutputFixUdpChecksum(iptablesContext *ctx,
> +                                const char *iface,
> +                                int port)
> +{
> +    return iptablesOutputFixUdpChecksum(ctx, iface, port, ADD);
> +}
> +
> +/**
> + * iptablesRemoveOutputFixUdpChecksum:
> + * @ctx: pointer to the IP table context
> + * @iface: the interface name
> + * @port: the UDP port of the rule to remove
> + *
> + * Removes the checksum fixup rule that was previous added with
> + * iptablesAddOutputFixUdpChecksum.
> + *
> + * Returns 0 in case of success or an error code in case of error
> + * (again, if iptables doesn't support checksum fixup, this will
> + * return an error, which should be ignored)
> + */
> +int
> +iptablesRemoveOutputFixUdpChecksum(iptablesContext *ctx,
> +                                   const char *iface,
> +                                   int port)
> +{
> +    return iptablesOutputFixUdpChecksum(ctx, iface, port, REMOVE);
> +}
> diff --git a/src/util/iptables.h b/src/util/iptables.h
> index 7d55a6d..5c2e553 100644
> --- a/src/util/iptables.h
> +++ b/src/util/iptables.h
> @@ -89,5 +89,11 @@ int              iptablesAddForwardMasquerade    (iptablesContext *ctx,
>  int              iptablesRemoveForwardMasquerade (iptablesContext *ctx,
>                                                    const char *network,
>                                                    const char *physdev);
> +int              iptablesAddOutputFixUdpChecksum (iptablesContext *ctx,
> +                                                  const char *iface,
> +                                                  int port);
> +int              iptablesRemoveOutputFixUdpChecksum (iptablesContext *ctx,
> +                                                     const char *iface,
> +                                                     int port);
>  
>  #endif /* __QEMUD_IPTABLES_H__ */

  I appreciate the level of comments :-)
Patch looks fine but I would wait a bit until support is upstream, once
it is, ACK !

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel at veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/




More information about the libvir-list mailing list