Seagate disk problems (NCQ bug???)
Robin Laing
Robin.Laing at drdc-rddc.gc.ca
Wed Apr 29 15:48:15 UTC 2009
Wolfgang S. Rupprecht wrote:
> After running flawlessly for 6+ months I just had my Seagate
> ST31500343AS (w. SD35 firmware) flake out. Does this look like the NCQ
> bug or just a random event? The final error msg was around the time the
> machine hung hard.
>
> Apr 28 06:41:26 arbol kernel: ata1: exception Emask 0x10 SAct 0x0 SErr 0x90200 action 0xe frozen
> Apr 28 06:41:26 arbol kernel: ata1: irq_stat 0x00400000, PHY RDY changed
> Apr 28 06:41:26 arbol kernel: ata1: SError: { Persist PHYRdyChg 10B8B }
> Apr 28 06:41:26 arbol kernel: ata1: hard resetting link
> Apr 28 06:41:28 arbol kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Apr 28 06:41:33 arbol kernel: ata1.00: qc timeout (cmd 0xec)
> Apr 28 06:41:33 arbol kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> Apr 28 06:41:33 arbol kernel: ata1.00: revalidation failed (errno=-5)
> Apr 28 06:41:33 arbol kernel: ata1: hard resetting link
> Apr 28 06:41:34 arbol kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Apr 28 06:41:44 arbol kernel: ata1.00: qc timeout (cmd 0xec)
> Apr 28 06:41:44 arbol kernel: ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
> Apr 28 06:41:44 arbol kernel: ata1.00: revalidation failed (errno=-5)
> Apr 28 06:41:44 arbol kernel: ata1: hard resetting link
> Apr 28 06:41:46 arbol kernel: ata1: softreset failed (device not ready)
> Apr 28 06:41:46 arbol kernel: ata1: failed due to HW bug, retry pmp=0
> Apr 28 06:41:46 arbol kernel: ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
> Apr 28 06:41:46 arbol kernel: ata1.00: configured for UDMA/133
> Apr 28 06:41:46 arbol kernel: ata1: EH complete
> Apr 28 06:41:46 arbol kernel: sd 0:0:0:0: [sda] 2930277168 512-byte hardware sectors (1500302 MB)
> Apr 28 06:41:46 arbol kernel: sd 0:0:0:0: [sda] Write Protect is off
> Apr 28 06:41:46 arbol kernel: sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
>
>
> -wolfgang
I had errors like this when the load got too high for my system to
handle. I later found out that the motherboard controller was too
slow. It is an older system. I replaced the onboard controllers with
SATA cards and have had no errors since.
By watching the load reported by uptime, I could predict when the errors
were going to occur and almost predict when the system would lock up.
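
If you want to watch for the same pattern, here is a rough sketch,
assuming a Linux system. It parses the three load averages that uptime
prints, taken here from the text of /proc/loadavg, and flags when the
1-minute value crosses a limit. The function names and the 4.0 threshold
are placeholders of mine, not values measured on my system; pick a limit
that matches where your own errors start.

```python
def parse_loadavg(text):
    """Return the 1-, 5-, and 15-minute load averages.

    `text` is the content of /proc/loadavg, whose first three
    whitespace-separated fields are the load averages.
    """
    fields = text.split()
    return float(fields[0]), float(fields[1]), float(fields[2])


def load_too_high(text, threshold=4.0):
    """True when the 1-minute load average exceeds the threshold.

    The default threshold is an arbitrary placeholder.
    """
    one_minute, _, _ = parse_loadavg(text)
    return one_minute > threshold
```

To use it, read /proc/loadavg, pass the text to load_too_high(), and
compare the times it returns True against the ata error timestamps in
your log.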
What controller chip is used in your system?
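
If you are not sure, lspci will list it. A minimal sketch, assuming
lspci (from pciutils) is installed: it keeps only the lines whose device
class looks like a storage controller. The helper names are mine, and
the class strings are the ones lspci normally prints; a controller that
reports a different class would be missed.

```python
import subprocess

# Device class names lspci prints for common storage controllers.
STORAGE_CLASSES = ("SATA controller", "IDE interface", "RAID bus controller")


def storage_controllers(lspci_output):
    """Return the lspci lines that describe a storage controller."""
    return [
        line.strip()
        for line in lspci_output.splitlines()
        if any(cls in line for cls in STORAGE_CLASSES)
    ]


def run_lspci():
    """Run lspci and return its text output (requires pciutils)."""
    result = subprocess.run(
        ["lspci"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Calling storage_controllers(run_lspci()) should print the chip name to
post back to the list.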
--
Robin Laing