OT: OEM problems with aic79xx drivers

Stephen Smoogen smoogen at lanl.gov
Mon Jan 12 21:49:33 UTC 2004


Well, after spending more time than I should have trying to shoehorn the
latest AIC79xx driver into the last 2.4.20 kernel, and remembering that
Doug Ledford had hair before he started working on the driver, I
decided it was time to look for help.

For now I have given up trying to get the 2.4.20- kernel to work with
the latest aic79xx drivers, so I went with a 2.4.24 kernel and made
just a few changes to it. The kernel compiled fine and seemed to run
OK, but 'crashed' horribly when I tried to do any large file transfers
across the bus (dd if=/dev/zero of=/var/foo bs=1024 count=1GB).
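
For anyone wanting to reproduce the write load, here is a minimal
sketch, assuming the intent of the dd line above was roughly 1 GB of
zeros. The write_zeros name and its arguments are hypothetical, not
something from the driver or the kernel tree.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): write SIZE_MIB mebibytes
# of zeros to TARGET, flush them to disk, and print the resulting file size.
write_zeros() {
    target=$1
    size_mib=$2
    # bs=1024k is 1 MiB per block, so count is the size in MiB.
    dd if=/dev/zero of="$target" bs=1024k count="$size_mib" 2>/dev/null || return 1
    # Push dirty pages out so the data actually crosses the SCSI bus.
    sync
    # Print the file size in bytes.
    wc -c < "$target" | tr -d ' '
}
```

Something like `write_zeros /var/foo 1024` would write 1 GiB; the
kernel log is the place to watch while it runs.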

The crash is a repeatable set of SCSI dumps like this:
Jan 12 08:57:29 builder kernel: TSCB 0x86
Jan 12 08:57:29 builder kernel: qinstart = 7301 qinfifonext = 7379
Jan 12 08:57:29 builder kernel: QINFIFO: 0x84 0x83 0x82 0x8b 0x8a 0x89
0x88 0x87 0x8f 0x8e 0x8d 0x8c 0x90 0x95 0x94 0x93 0x92 0x91 0x9a 0x99
0x98 0x97 0x96 0x9f 0x9e 0x9d 0x9c 0x9b 0xa4 0xa3 0xa2 0xa1 0xa0 0xa9
0xa8 0xa7 0xa6 0xa5 0xae 0xad 0xac 0xab 0xaa 0xaf 0xb3 0xb2 0xb1 0xb0
0xb8 0xb7 0xb6 0xb5 0xb4 0xbd 0xbc 0xbb 0xba 0xb9 0xbf 0xbe 0xc2 0xc1
0xc0 0xc7 0xc6 0xc5 0xc4 0xc3 0xcc 0xcb 0xca 0xc9 0xc8 0xcf 0xce 0xcd
0xd1 0xd0
Jan 12 08:57:29 builder kernel: WAITING_TID_QUEUES:
Jan 12 08:57:29 builder kernel:        1 ( 0x80 0x86 0x85 )
Jan 12 08:57:29 builder kernel: Pending list:

followed by a long list of FIFO items. I worked around that problem by
going into the SCSI BIOS and turning off packetization. That, of course,
turns the 320 MB/s drives into 160 MB/s drives.
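
A quick sanity check on the dump above, as a rough sketch: the QINFIFO
list holds 78 entries, which matches qinfifonext - qinstart
(7379 - 7301 = 78), assuming those two counters bracket the queued
SCBs. The count_scbs helper is hypothetical; it just counts 0xNN tokens
on stdin.

```shell
#!/bin/sh
# Hypothetical helper: count hex tokens (0xNN) read from stdin.
count_scbs() {
    grep -o '0x[0-9a-fA-F][0-9a-fA-F]*' | wc -l | tr -d ' '
}

# QINFIFO entries copied from the dump in this message.
qinfifo='0x84 0x83 0x82 0x8b 0x8a 0x89
0x88 0x87 0x8f 0x8e 0x8d 0x8c 0x90 0x95 0x94 0x93 0x92 0x91 0x9a 0x99
0x98 0x97 0x96 0x9f 0x9e 0x9d 0x9c 0x9b 0xa4 0xa3 0xa2 0xa1 0xa0 0xa9
0xa8 0xa7 0xa6 0xa5 0xae 0xad 0xac 0xab 0xaa 0xaf 0xb3 0xb2 0xb1 0xb0
0xb8 0xb7 0xb6 0xb5 0xb4 0xbd 0xbc 0xbb 0xba 0xb9 0xbf 0xbe 0xc2 0xc1
0xc0 0xc7 0xc6 0xc5 0xc4 0xc3 0xcc 0xcb 0xca 0xc9 0xc8 0xcf 0xce 0xcd
0xd1 0xd0'

entries=$(printf '%s\n' "$qinfifo" | count_scbs)
# Assumption: the queue depth is qinfifonext minus qinstart.
expected=$((7379 - 7301))
echo "entries=$entries expected=$expected"
```

If the two numbers ever disagree in a later dump, the log was probably
truncated or the counters wrapped.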

Part of the reason for this push on my part is that our 7.x machines
have SuperMicro motherboards with onboard aic7902 controllers, and we
have been seeing slow disk corruption over time. I have yet to find
anyone else seeing this problem, so I am having to figure out what makes
our systems unique. The problem doesn't show up in a bonnie++ check, but
our 30 IMAP servers have consistently killed a drive a week since we
brought them online last August.

So if anyone else runs into this problem supporting 7.x machines, here
is what I am trying in order to see if it fixes the problem:

1) Turning off packetization.
2) Compiling a standard kernel with an SCB length of 32 versus the
default of 253.
3) Waving a magic chicken foot at the machines.
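
For item 2, a hedged sketch of checking the setting before a rebuild.
This assumes the SCB length in question is the kernel option
CONFIG_AIC79XX_CMDS_PER_DEVICE (the per-device tagged command limit);
if the value lives somewhere else in your tree, adjust accordingly. The
tag_depth helper name is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print the per-device command limit from a kernel
# .config file. Assumes the setting is CONFIG_AIC79XX_CMDS_PER_DEVICE;
# prints nothing if the option is not set to a number in that file.
tag_depth() {
    config=$1
    sed -n 's/^CONFIG_AIC79XX_CMDS_PER_DEVICE=\([0-9][0-9]*\)$/\1/p' "$config"
}
```

Running `tag_depth /usr/src/linux/.config` (path is an assumption)
after reconfiguring should print 32 if the change took.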



-- 
Stephen John Smoogen		smoogen at lanl.gov
Los Alamos National Lab  CCN-5 Sched 5/40  PH: 4-0645
Ta-03 SM-1498 MailStop B255 DP 10S  Los Alamos, NM 87545
-- So shines a good deed in a weary world. = Willy Wonka --





More information about the fedora-legacy-list mailing list