[Fedora-xen] dom0 F7 crashes tcp_tso_segment oops - new kernel lag time

Dale Bewley dlbewley at lib.ucdavis.edu
Thu Nov 15 18:55:18 UTC 2007


We are running 2.6.20-2936.fc7xen on an x86_64 quad-CPU machine with 16G of RAM.

We have a dom0 with bridges riding on top of a VLAN interface.

# brctl show
bridge name     bridge id               STP enabled     interfaces
br101           8000.00093d139ae9       no              eth0
br6             8000.00093d139ae9       no              vif2.0
                                                        vif1.0
                                                        eth0.6
...

Inside an F7 domU on br6 we are running tc (via shorewall) to limit the bandwidth of a mirror server. On Monday I throttled the bandwidth down to far below the demand. Since then, dom0 has started to crash and reboot with nothing in the log.

Crashes happened on Tuesday around 9am and on Thursday (today) around 4am and 9am. I caught the console during the most recent crash:

Unable to handle kernel NULL pointer dereference at 0000000000000030 RIP:
 [<ffffffff803fc79c>] tcp_tso_segment+0x1d8/0x285
PGD 3db8f1067 PUD 3db8f2067 PMD 0
Oops: 0000 [1] SMP
last sysfs file: /devices/xen-backend/vbd-2-51712/statistics/wr_sect
CPU 1
Modules linked in: loop netbk xenblktap blkbk autofs4 8021q bridge nf_conntrack_netbios_ns ipt_Ld
Pid: 0, comm: swapper Not tainted 2.6.20-2936.fc7xen #1
RIP: e030:[<ffffffff803fc79c>]  [<ffffffff803fc79c>] tcp_tso_segment+0x1d8/0x285
RSP: e02b:ffff880002f77950  EFLAGS: 00010216
RAX: 0000000000007a21 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000010130000 RSI: 0000000000000000 RDI: 0000000087e8b626
RBP: 000000000000fa87 R08: 0000000187e8b625 R09: 0000000000000000
R10: 000000009b9a7b25 R11: 0000000000000003 R12: ffff88034ecab034
R13: 0000000017acfe00 R14: 0000000000000020 R15: 00000000ffff0000
FS:  00002aaaab0ff230(0000) GS:ffffffff80580080(0000) knlGS:0000000000000000
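
For anyone reading the trace above: the RIP line uses the usual kernel symbol+offset/size notation, so the fault is 0x1d8 bytes into tcp_tso_segment, which is 0x285 bytes long. Below is a minimal sketch for pulling those numbers out of a captured oops line. The function name parse_rip and the regular expression are my own illustration, not part of any kernel tool, and the pattern assumes the line format shown in this capture.

```python
import re

# Matches "[<address>] symbol+0xOFFSET/0xSIZE" as printed in the oops above.
# (Hypothetical helper pattern; assumes this exact console format.)
RIP_RE = re.compile(r"\[<([0-9a-f]+)>\]\s+(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)")


def parse_rip(line):
    """Return symbol, address, offset, size and function start, or None."""
    m = RIP_RE.search(line)
    if m is None:
        return None
    addr_hex, symbol, offset_hex, size_hex = m.groups()
    address = int(addr_hex, 16)
    offset = int(offset_hex, 16)
    return {
        "symbol": symbol,
        "address": address,
        "offset": offset,
        "size": int(size_hex, 16),
        # Start of the function = faulting address minus offset into it.
        "function_start": address - offset,
    }


line = " [<ffffffff803fc79c>] tcp_tso_segment+0x1d8/0x285"
info = parse_rip(line)
print(info["symbol"], hex(info["function_start"]),
      hex(info["offset"]), hex(info["size"]))
```

The same function also works on the longer "RIP: e030:[<...>]  [<...>] tcp_tso_segment+..." line, since the pattern only matches the bracketed address that is directly followed by a symbol name.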

We are up to date with patches. After the first two crashes and before the third, I upgraded from xen-3.1.0-6.fc7 to xen-3.1.0-8.fc7 and rebooted dom0 for good measure.

I did some googling and found this:
http://lists.openwall.net/netdev/2007/02/09/16
which seems pretty similar. 

Assuming this is the problem and there is a fix, I'm left to ask: what is the ETA for a new xen kernel, and what is the typical lag time?

The current state is 2.6.23.1-21.fc7 for the regular kernel vs. 2.6.20-2936.fc7xen for the xen kernel.

--
Dale Bewley - Unix Administrator - Shields Library - UC Davis
GPG: 0xB098A0F3 0D5A 9AEB 43F4 F84C 7EFD  1753 064D 2583 B098 A0F3



