smartctl -l error /dev/... When to worry ?
Randy Kelsoe
randykel at swbell.net
Mon Jul 19 19:32:46 UTC 2004
Hannes Mayer wrote:
> Hi all!
>
> I just discovered smartctl and the interesting output with
> # smartctl -l error /dev/hda
>
> hda reports no errors (newest disk), but hdb and hdf do report some
> errors every few hours (see outputs below)
>
> When do I actually have to start to worry about errors ?
> I mean, is an error every few hours normal for older (1-2 years) disks ?
>
> ########################### hdb #################################
>
> smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF READ SMART DATA SECTION ===
> SMART Error Log Version: 1
> Warning: ATA error count 60 inconsistent with error log pointer 3
>
> Error 60 occurred at disk power-on lifetime: 5386 hours
> When the command that caused the error occurred, the device was
> active or idle.
> Error 59 occurred at disk power-on lifetime: 5386 hours
> When the command that caused the error occurred, the device was
> active or idle.
> Error 58 occurred at disk power-on lifetime: 5386 hours
> When the command that caused the error occurred, the device was
> active or idle.
> Error 57 occurred at disk power-on lifetime: 5385 hours
> When the command that caused the error occurred, the device was
> active or idle.
> Error 56 occurred at disk power-on lifetime: 5375 hours
> When the command that caused the error occurred, the device was
> active or idle.
>
I have trimmed the above to show that the five most recent errors on
hdb occurred within 11 power-on hours of each other. Run 'smartctl
-a /dev/hdb | grep -i power_on' to get the drive's current power-on
hours, then compare that with the power-on hours logged for each error.
The errors above occurred when your drive had about 224 days of power-on
time (5386 hours), so they might be old errors, might have occurred
during a power failure, etc.
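If you would rather not do that comparison by hand, here is a rough Python sketch of the arithmetic. It assumes the error lines look exactly like the ones in your 5.21 output (other smartctl versions may word them differently). The function names are my own invention, and the 8000 passed in at the bottom is a made-up current Power_On_Hours value; substitute whatever your drive actually reports.

```python
import re

# Matches lines such as:
#   "Error 60 occurred at disk power-on lifetime: 5386 hours"
# The wording is taken from the smartctl 5.21 output quoted in this thread.
ERROR_LINE = re.compile(
    r"Error (\d+) occurred at disk power-on lifetime: (\d+) hours"
)


def error_hours(log_text):
    """Return (error_number, power_on_hours) pairs found in the log text."""
    return [(int(num), int(hours)) for num, hours in ERROR_LINE.findall(log_text)]


def summarize(log_text, current_power_on_hours):
    """Report how tightly the errors cluster and how long ago the newest one was.

    All figures are power-on time, not calendar time.
    """
    hours = [h for _, h in error_hours(log_text)]
    newest, oldest = max(hours), min(hours)
    return {
        "span_hours": newest - oldest,              # spread of the logged errors
        "drive_age_days_at_newest": newest // 24,   # whole days of power-on time
        "hours_since_newest": current_power_on_hours - newest,
    }


# The five hdb error lines from the quoted output.
HDB_LOG = """\
Error 60 occurred at disk power-on lifetime: 5386 hours
Error 59 occurred at disk power-on lifetime: 5386 hours
Error 58 occurred at disk power-on lifetime: 5386 hours
Error 57 occurred at disk power-on lifetime: 5385 hours
Error 56 occurred at disk power-on lifetime: 5375 hours
"""

if __name__ == "__main__":
    # 8000 is a hypothetical current Power_On_Hours value, for illustration only.
    print(summarize(HDB_LOG, current_power_on_hours=8000))
```

A large "hours_since_newest" would suggest the errors are history rather than something still happening; a value near zero means the drive is logging errors right now.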
You might also want to get a newer version of smartmontools. The latest
is 5.32, and you are running 5.21.
> ######################## hdf #######################
>
> smartctl version 5.21 Copyright (C) 2002-3 Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF READ SMART DATA SECTION ===
> SMART Error Log Version: 1
> Warning: ATA error count 1486 inconsistent with error log pointer 5
> Error 1486 occurred at disk power-on lifetime: 5383 hours
> When the command that caused the error occurred, the device was in
> an unknown state.
> Error 1485 occurred at disk power-on lifetime: 5382 hours
> When the command that caused the error occurred, the device was in
> an unknown state.
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> 08 00 00 01 00 00 b0 00 243.792 DEVICE RESET
> ec 00 01 01 00 00 b0 00 237.072 IDENTIFY DEVICE
> 08 00 00 01 00 00 b0 00 220.016 DEVICE RESET
> ec 00 01 01 00 00 b0 00 213.280 IDENTIFY DEVICE
>
> Error 1484 occurred at disk power-on lifetime: 5382 hours
> When the command that caused the error occurred, the device was in
> an unknown state.
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> 08 00 00 01 00 00 b0 00 220.016 DEVICE RESET
> ec 00 01 01 00 00 b0 00 213.280 IDENTIFY DEVICE
>
> Error 1483 occurred at disk power-on lifetime: 5381 hours
> When the command that caused the error occurred, the device was in
> an unknown state.
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> 08 00 00 01 00 00 b0 00 220.000 DEVICE RESET
> ec 00 01 01 00 00 b0 00 213.264 IDENTIFY DEVICE
>
> Error 1482 occurred at disk power-on lifetime: 5368 hours
> When the command that caused the error occurred, the device was in
> an unknown state.
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> 08 00 00 01 00 00 b0 00 220.000 DEVICE RESET
> ec 00 01 01 00 00 b0 00 213.264 IDENTIFY DEVICE
>
> #################################################
See a theme here? Every error that lists its preceding commands shows
the same DEVICE RESET / IDENTIFY DEVICE sequence. What device do or did
you have as the master device on the IDE bus with this drive? It looks
as though a device on the bus had a problem, and the system tried to
recover by resetting the devices on the bus. If you have a CDROM drive
on the same bus, and it had problems, you might see something like this
in your error logs. Again, run 'smartctl -a /dev/hdf | grep -i power_on'
and compare the current age of the drive to when the errors occurred.
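With 1486 errors logged on hdf, one way to check whether the same pattern holds beyond the last few entries is to tally the command names in the log. The sketch below is only an illustration: it assumes the command rows have the layout shown in your output (eight hex register bytes, a timestamp, then the command name), and the names COMMAND_ROW and command_counts are mine, not anything from smartmontools.

```python
import re
from collections import Counter

# One command row looks like:
#   "08 00 00 01 00 00 b0 00 243.792 DEVICE RESET"
# optionally preceded by the "> " quoting used in this thread.
COMMAND_ROW = re.compile(
    r"^[> \t]*"                       # optional mail quoting / indentation
    r"(?:[0-9a-fA-F]{2}[ \t]+){8}"    # eight register bytes (CR FR SC SN CL CH DH DC)
    r"\d+\.\d+[ \t]+"                 # timestamp
    r"(.+?)[ \t]*$",                  # command / feature name
    re.MULTILINE,
)


def command_counts(log_text):
    """Count how often each command name appears in the error log text."""
    return Counter(COMMAND_ROW.findall(log_text))


# The command rows listed under error 1485 in the quoted hdf output.
HDF_EXCERPT = """\
> CR FR SC SN CL CH DH DC Timestamp Command/Feature_Name
> -- -- -- -- -- -- -- -- --------- --------------------
> 08 00 00 01 00 00 b0 00 243.792 DEVICE RESET
> ec 00 01 01 00 00 b0 00 237.072 IDENTIFY DEVICE
> 08 00 00 01 00 00 b0 00 220.016 DEVICE RESET
> ec 00 01 01 00 00 b0 00 213.280 IDENTIFY DEVICE
"""

if __name__ == "__main__":
    for name, count in command_counts(HDF_EXCERPT).most_common():
        print(count, name)
```

If nearly everything in the tally is DEVICE RESET and IDENTIFY DEVICE, with no read or write commands, that fits the bus-reset explanation better than a failing disk surface.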