intermittent network problems

brian fedora at logi.ca
Wed Mar 11 00:26:24 UTC 2009


I have this one box that experiences connection losses from time to time, 
generally lasting from less than a minute to several minutes. I'm convinced 
it's not a software issue, as it's just happened with F10 for the first time since 
upgrading. I've swapped cables several times (each time, the cable works 
fine on other machines) and, though temporarily unplugging the cat5 
sometimes gets it to come back up, this doesn't always work (I'm 
guessing coincidence).

Anyway, it just happened again, though this time as it was booting up. 
It got as far as the white "Fedora" and hung. I hit the down arrow to 
see that it was hung at starting the firewall. I hit another key (don't 
know which) and saw what looked like a stack trace and a mention that 
the network had timed out (pasted below). After some minutes, it 
continued on its way.

I figure at this point that it's a bad network card, though I'm really 
unsure. I think it's obviously not a Fedora issue, at any rate. I don't 
know a whole lot about debugging this sort of thing so I'm hoping 
someone might have some pointers on how to do so. Is there any way to 
debug a flaky network card? Should I even bother (i.e. just replace it)?

This is what I found in the log:

  [<c042db04>] warn_slowpath+0x69/0x89
  [<c06aad4b>] ? _spin_unlock_irqrestore+0x22/0x38
  [<c0436adf>] ? __mod_timer+0x9d/0xa8
  [<c048ce19>] ? virt_to_head_page+0x22/0x2e
  [<c050e950>] ? queue_flag_clear+0x18/0x54
  [<c050e9f7>] ? __freed_request+0x6b/0x72
  [<c050f414>] ? blk_remove_plug+0x66/0x92
  [<c050ca1b>] ? elv_queue_empty+0x20/0x22
  [<c050fb16>] ? blk_run_queue+0x28/0x2c
  [<c05a2ab9>] ? scsi_run_queue+0x250/0x27c
  [<c051b82e>] ? kobject_put+0x37/0x3c
  [<c051e8ea>] ? strlcpy+0x17/0x49
  [<c0641a60>] dev_watchdog+0xda/0x12d
  [<c05a3c77>] ? scsi_io_completion+0x19c/0x366
  [<c04363fd>] run_timer_softirq+0x14b/0x1bb
  [<c0641986>] ? dev_watchdog+0x0/0x12d
  [<c0641986>] ? dev_watchdog+0x0/0x12d
  [<c0432757>] __do_softirq+0x84/0x109
  [<c04326d3>] ? __do_softirq+0x0/0x109
  [<c0406f1c>] do_softirq+0x77/0xdb
  [<c04323be>] irq_exit+0x44/0x83
  [<c04152b1>] smp_apic_timer_interrupt+0x6e/0x7c
  [<c040576d>] apic_timer_interrupt+0x2d/0x34
  [<c041b733>] ? native_safe_halt+0x5/0x7
  [<c040a15d>] default_idle+0x38/0x6a
  [<c0403c61>] cpu_idle+0x101/0x134
  [<c069a9b2>] rest_init+0x4e/0x50
  =======================
---[ end trace a25984b66b23281f ]---
eth0: Transmit timeout, status 00000004 00000040
eth0: Media Link On 100mbps half-duplex

Obviously, the network was out for a bit. But does the stuff above "end 
trace" have anything to do with it? It doesn't seem to (it refers to 
scsi, among other things), but I'm including it here anyway because, 
either way, I've no idea if it's good or bad ;-)
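
In case it helps anyone looking at similar logs, here is a rough sketch of how the driver messages could be tallied to see how often the timeouts occur. It is only an illustration: the function name `summarize` and the `SAMPLE` list are made up for this example, and the patterns are based solely on the two eth0 lines pasted above, so other drivers may word their messages differently.

```python
import re
from collections import Counter

# Patterns modelled on the two driver lines quoted in this post.
# Other network drivers may use different wording (assumption).
TIMEOUT_RE = re.compile(
    r"(\w+): Transmit timeout, status ([0-9a-fA-F]+) ([0-9a-fA-F]+)"
)
LINK_RE = re.compile(r"(\w+): Media Link On (\d+)mbps (half|full)-duplex")


def summarize(lines):
    """Count transmit timeouts per interface and record the last
    reported link speed/duplex for each interface."""
    timeouts = Counter()
    links = {}
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            timeouts[m.group(1)] += 1
            continue
        m = LINK_RE.search(line)
        if m:
            links[m.group(1)] = (int(m.group(2)), m.group(3))
    return timeouts, links


# Hypothetical sample input: the last three lines of the log above.
SAMPLE = [
    "---[ end trace a25984b66b23281f ]---",
    "eth0: Transmit timeout, status 00000004 00000040",
    "eth0: Media Link On 100mbps half-duplex",
]

if __name__ == "__main__":
    timeouts, links = summarize(SAMPLE)
    for iface, count in timeouts.items():
        print(f"{iface}: {count} transmit timeout(s)")
    for iface, (speed, duplex) in links.items():
        print(f"{iface}: link at {speed} Mbps, {duplex}-duplex")
```

To run it against a real log, the lines of the system log file could be passed to `summarize` in place of `SAMPLE` (assuming the kernel messages are written there and the file is readable).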







