[Fedora-xen] TCP checksum corruption
Daniel P. Berrange
berrange at redhat.com
Tue May 8 17:54:58 UTC 2007
On Tue, May 08, 2007 at 12:41:02PM -0500, Mike McGrath wrote:
> Daniel P. Berrange wrote:
> >On Tue, May 08, 2007 at 11:39:14AM -0500, Mike McGrath wrote:
> >
> >>We're using xen heavily in Fedora's Infrastructure and presently a
> >>number of the xen domU hosts are experiencing terrible checksum issues.
> >>I've tried the ethtool -K eth0 tx off fix and it didn't work.
> >
> >What sort of network config have you got with these? Bridging straight
> >to physical device, or NAT'd?
> Bridge
That's good - should avoid the NAT related bugs there then.
> >There are a couple issues at play:
> >
> > - There is a general bug in 2.6.20 that breaks checksum offload
> > when used with NAT.
> > - In 2.6.19 or later Dom0 will transmit to guests using checksum
> > offload, so the DHCP client in the guest will mistakenly think it
> > has a corrupt checksum.
> >
> >Addressing the first bug requires disabling checksum offload on eth0 in
> >the guest. ethtool -K eth0 tx off in the guest should do it.
> >
> >Addressing the 2nd is really difficult since the FC6 install images
> >themselves have a broken DHCP client, for example, so we need to work
> >around it in the kernel. This can be done by disabling checksums on the
> >device in Dom0 - any of vifN.0, xenbr0, peth0 should have
> >ethtool -K <dev> tx off done.
> >
> >NB, ignore eth0 in Dom0, that's a fake device so turning off tx on that
> >does not fix things.
> >
> >So in summary, getting it working in the general case requires:
> >
> > ethtool -K eth0 tx off in guest
> >
> >And
> >
> > ethtool -K <dev> tx off on whatever bridge device the guest is
> > attached to
> >
> I've actually run that on every interface on every dom[0,U] on the box
> :). I've also tried it on two other hosts. One a RHEL5 dom0 and the
> other had different hardware but was also a FC6 dom0. I can arrange
> access to the box if you're interested.
OK, that makes absolutely no sense to me now :-) Every time I hit it I was
able to solve it eventually by setting 'tx off' on some combo of devices.
The RHEL-5 Dom0 kernel also already has the necessary fixes in, which makes
it even odder that it doesn't work for you.
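
For trying 'tx off' on different combinations of devices, a minimal shell
sketch may help. It assumes that ethtool -k <dev> (lowercase k) prints a
line of the form "tx-checksumming: on" or "tx-checksumming: off". The
function names tx_state and tx_off and the DRY_RUN variable are
hypothetical, made up for this illustration. By default it only prints
the commands it would run.

```shell
#!/bin/sh
# Sketch only: tx_state, tx_off and DRY_RUN are illustrative names,
# not part of ethtool or Xen.

# Read `ethtool -k <dev>` output on stdin and print the value of the
# tx-checksumming line (assumed format: "tx-checksumming: on|off").
tx_state() {
    awk -F': *' '/^[ \t]*tx-checksumming:/ { print $2; exit }'
}

# Disable TX checksum offload on one device.
# With DRY_RUN=1 (the default) the command is printed, not executed.
tx_off() {
    dev="$1"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "ethtool -K $dev tx off"
    else
        ethtool -K "$dev" tx off
    fi
}

# Apply to every device named on the command line.
for dev in "$@"; do
    tx_off "$dev"
done
```

Saved under a name of your choosing and run in Dom0 with the bridge-side
devices as arguments (for example xenbr0, peth0 and the guest's vifN.0), it
prints one ethtool command per device. Setting DRY_RUN=0 runs them instead.
To check the result, pipe ethtool -k <dev> into tx_state.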
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|