[libvirt] [PATCH] Add iptables rule to fixup DHCP response checksum.
Daniel Veillard
veillard at redhat.com
Tue Jul 13 09:05:29 UTC 2010
On Tue, Jul 13, 2010 at 02:17:17AM -0400, Laine Stump wrote:
> From: Laine Stump <laine at redhat.com>
>
> (I don't know whether or not we want to commit this upstream yet - the
> proposed iptables and kernel module backend for the changes have been
> posted but not yet committed upstream. On the other hand, the new
> libvirt code ends up simply printing a warning message if the
> necessary iptables support isn't yet in place.)
Well in general we try to avoid this kind of preventive code change
as we're not 100% sure the support will ever make it unchanged in
upstream for dependancies. But that should not stop us from reviewing
the patches !
> This patch attempts to take advantage of a newly added netfilter
> module to correct for a problem with some guest DHCP client
> implementations when used in conjunction with a DHCP server run on the
> host systems with packet checksum offloading enabled.
>
> The problem is that, when the guest uses a RAW socket to read the DHCP
> response packets, the checksum hasn't yet been fixed by the IP stack,
> so it is incorrect.
>
> The fix implemented here is to add a rule to the POSTROUTING chain of
> the mangle table in iptables that fixes up the checksum for packets on
> the virtual network's bridge that are destined for the bootpc port (ie
> "dhcpc", ie port 68) port on the guest.
>
> Only very new versions of iptables will have this support (it has been
> submitted upstream, but not yet committed), so a failure to add this
> rule only results in a warning message. The iptables patch is here:
>
> http://patchwork.ozlabs.org/patch/58525/
>
> A corresponding kernel module patch is also required (the backend of
> the iptables patch) and has been submitted, but I don't have the
> details for that (I tested using a pre-built image I received from the
> developer, Michael Tsirkin).
> ---
> src/libvirt_private.syms | 2 +
> src/network/bridge_driver.c | 17 ++++++++++
> src/util/iptables.c | 71 +++++++++++++++++++++++++++++++++++++++++++
> src/util/iptables.h | 6 ++++
> 4 files changed, 96 insertions(+), 0 deletions(-)
>
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 778ceb1..d81d4cf 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -328,6 +328,7 @@ iptablesAddForwardAllowRelatedIn;
> iptablesAddForwardMasquerade;
> iptablesAddForwardRejectIn;
> iptablesAddForwardRejectOut;
> +iptablesAddOutputFixUdpChecksum;
> iptablesAddTcpInput;
> iptablesAddUdpInput;
> iptablesContextFree;
> @@ -339,6 +340,7 @@ iptablesRemoveForwardAllowRelatedIn;
> iptablesRemoveForwardMasquerade;
> iptablesRemoveForwardRejectIn;
> iptablesRemoveForwardRejectOut;
> +iptablesRemoveOutputFixUdpChecksum;
> iptablesRemoveTcpInput;
> iptablesRemoveUdpInput;
>
> diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
> index 72255c1..c4480ff 100644
> --- a/src/network/bridge_driver.c
> +++ b/src/network/bridge_driver.c
> @@ -781,6 +781,19 @@ networkAddIptablesRules(struct network_driver *driver,
> !networkAddRoutingIptablesRules(driver, network))
> goto err8;
>
> + /* If we are doing local DHCP service on this network, attempt to
> + * add a rule that will fixup the checksum of DHCP response
> + * packets back to the guests (but report failure without
> + * aborting, since not all iptables implementations support it).
> + */
> +
> + if ((network->def->ipAddress || network->def->nranges) &&
> + (iptablesAddOutputFixUdpChecksum(driver->iptables,
> + network->def->bridge, 68) != 0)) {
> + VIR_WARN("Could not add rule to fixup DHCP response checksums "
> + "on network '%s'", network->def->name);
> + }
> +
> return 1;
>
> err8:
> @@ -811,6 +824,10 @@ networkAddIptablesRules(struct network_driver *driver,
> static void
> networkRemoveIptablesRules(struct network_driver *driver,
> virNetworkObjPtr network) {
> + if (network->def->ipAddress || network->def->nranges) {
> + iptablesRemoveOutputFixUdpChecksum(driver->iptables,
> + network->def->bridge, 68);
> + }
hum, there is no return code to check, if yes the block isn't
necessary. Also a empty line should be added to separate from next if
block, but it's purely cosmetic :-)
> if (network->def->forwardType != VIR_NETWORK_FORWARD_NONE) {
> if (network->def->forwardType == VIR_NETWORK_FORWARD_NAT) {
> iptablesRemoveForwardMasquerade(driver->iptables,
> diff --git a/src/util/iptables.c b/src/util/iptables.c
> index d06b857..9b888e5 100644
> --- a/src/util/iptables.c
> +++ b/src/util/iptables.c
> @@ -60,6 +60,7 @@ struct _iptablesContext
> iptRules *input_filter;
> iptRules *forward_filter;
> iptRules *nat_postrouting;
> + iptRules *mangle_postrouting;
> };
>
> static void
> @@ -188,6 +189,9 @@ iptablesContextNew(void)
> if (!(ctx->nat_postrouting = iptRulesNew("nat", "POSTROUTING")))
> goto error;
>
> + if (!(ctx->mangle_postrouting = iptRulesNew("mangle", "POSTROUTING")))
> + goto error;
> +
> return ctx;
>
> error:
> @@ -210,6 +214,8 @@ iptablesContextFree(iptablesContext *ctx)
> iptRulesFree(ctx->forward_filter);
> if (ctx->nat_postrouting)
> iptRulesFree(ctx->nat_postrouting);
> + if (ctx->mangle_postrouting)
> + iptRulesFree(ctx->mangle_postrouting);
> VIR_FREE(ctx);
> }
>
> @@ -753,3 +759,68 @@ iptablesRemoveForwardMasquerade(iptablesContext *ctx,
> {
> return iptablesForwardMasquerade(ctx, network, physdev, REMOVE);
> }
> +
> +
> +static int
> +iptablesOutputFixUdpChecksum(iptablesContext *ctx,
> + const char *iface,
> + int port,
> + int action)
> +{
> + char portstr[32];
> +
> + snprintf(portstr, sizeof(portstr), "%d", port);
> + portstr[sizeof(portstr) - 1] = '\0';
> +
> + return iptablesAddRemoveRule(ctx->mangle_postrouting,
> + action,
> + "--out-interface", iface,
> + "--protocol", "udp",
> + "--destination-port", portstr,
> + "--jump", "CHECKSUM", "--checksum-fill",
> + NULL);
> +}
> +
> +/**
> + * iptablesAddOutputFixUdpChecksum:
> + * @ctx: pointer to the IP table context
> + * @iface: the interface name
> + * @port: the UDP port to match
> + *
> + * Add an rule to the mangle table's POSTROUTING chain that fixes up the
> + * checksum of packets with the given destination @port.
> + * the given @iface interface for TCP packets.
> + *
> + * Returns 0 in case of success or an error code in case of error.
> + * (NB: if the system's iptables does not support checksum mangling,
> + * this will return an error, which should be ignored.)
> + */
> +
> +int
> +iptablesAddOutputFixUdpChecksum(iptablesContext *ctx,
> + const char *iface,
> + int port)
> +{
> + return iptablesOutputFixUdpChecksum(ctx, iface, port, ADD);
> +}
> +
> +/**
> + * iptablesRemoveOutputFixUdpChecksum:
> + * @ctx: pointer to the IP table context
> + * @iface: the interface name
> + * @port: the UDP port of the rule to remove
> + *
> + * Removes the checksum fixup rule that was previous added with
> + * iptablesAddOutputFixUdpChecksum.
> + *
> + * Returns 0 in case of success or an error code in case of error
> + * (again, if iptables doesn't support checksum fixup, this will
> + * return an error, which should be ignored)
> + */
> +int
> +iptablesRemoveOutputFixUdpChecksum(iptablesContext *ctx,
> + const char *iface,
> + int port)
> +{
> + return iptablesOutputFixUdpChecksum(ctx, iface, port, REMOVE);
> +}
> diff --git a/src/util/iptables.h b/src/util/iptables.h
> index 7d55a6d..5c2e553 100644
> --- a/src/util/iptables.h
> +++ b/src/util/iptables.h
> @@ -89,5 +89,11 @@ int iptablesAddForwardMasquerade (iptablesContext *ctx,
> int iptablesRemoveForwardMasquerade (iptablesContext *ctx,
> const char *network,
> const char *physdev);
> +int iptablesAddOutputFixUdpChecksum (iptablesContext *ctx,
> + const char *iface,
> + int port);
> +int iptablesRemoveOutputFixUdpChecksum (iptablesContext *ctx,
> + const char *iface,
> + int port);
>
> #endif /* __QEMUD_IPTABLES_H__ */
I appreciate the level of comments :-)
Patch looks fine but I would wait a bit until support is upstream, once
it is, ACK !
Daniel
--
Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
daniel at veillard.com | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library http://libvirt.org/
More information about the libvir-list
mailing list