"One or more disks are failing" ?

Scott Beamer geekboy at angrykeyboarder.com
Sat Jul 4 09:57:33 UTC 2009


On 07/04/2009 02:41 AM, Michael Schwendt wrote:
> On Sat, 4 Jul 2009 07:57:46 +0000 (UTC), Scott wrote:
> 
>> For a number of weeks now I've been getting this notification (only when 
>> in Fedora 11 & running GNOME) that "one or more disks are failing".
> 
> What do you get for "smartctl --all /dev/sda"? Perhaps a non-zero and
> growing number of reallocated sectors?

Uhhh....

Actually, I got (106 lines of) all this:

$ sudo smartctl --all /dev/sda

smartctl version 5.38 [x86_64-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Device Model:     SAMSUNG HD103UJ
Serial Number:    S13PJ1MQ606788
Firmware Version: 1AA01112
User Capacity:    1,000,204,886,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   8
ATA Standard is:  ATA-8-ACS revision 3b
Local Time is:    Sat Jul  4 02:46:25 2009 MST

==> WARNING: May need -F samsung or -F samsung2 enabled; see manual for details.

SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline
data collection: 		 (11658) seconds.
Offline data collection
capabilities: 			 (0x7b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 195) minutes.
Conveyance self-test routine
recommended polling time: 	 (  21) minutes.
SCT capabilities: 	       (0x003f)	SCT Status supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   064   051    Pre-fail  Always       -       8
  3 Spin_Up_Time            0x0007   077   077   011    Pre-fail  Always       -       7820
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       536
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
  7 Seek_Error_Rate         0x000f   100   100   051    Pre-fail  Always       -       0
  8 Seek_Time_Performance   0x0025   100   100   015    Pre-fail  Offline      -       10567
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202
 10 Spin_Retry_Count        0x0033   100   100   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   099   099   000    Old_age   Always       -       524
 13 Read_Soft_Error_Rate    0x000e   100   066   000    Old_age   Always       -       8
183 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
184 Unknown_Attribute       0x0033   100   100   099    Pre-fail  Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       2767
188 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   071   068   000    Old_age   Always       -       29 (Lifetime Min/Max 20/29)
194 Temperature_Celsius     0x0022   071   066   000    Old_age   Always       -       29 (Lifetime Min/Max 20/32)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       9633546
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   100   100   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000a   100   100   000    Old_age   Always       -       0
201 Soft_Read_Error_Rate    0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 0
Warning: ATA Specification requires self-test log structure revision number = 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      6201         1818207693
# 2  Short offline       Aborted by host               20%      6201         -
# 3  Conveyance offline  Aborted by host               90%      6201         -
# 4  Extended offline    Aborted by host               90%      4469         -
# 5  Extended offline    Aborted by host               90%      4469         -

SMART Selective Self-Test Log Data Structure Revision Number (0) should be 1
SMART Selective self-test log data structure revision number 0
Warning: ATA Specification requires selective self-test log data structure revision number = 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
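
For anyone who would rather not read a dump like this by eye, here is a minimal Python sketch that picks out the attribute rows usually associated with failing media. It assumes the rows arrive unwrapped, one attribute per line, as smartctl prints them. The choice of IDs 5, 197 and 198 as the "watched" set is my assumption, not an official smartmontools list, and suspicious_attributes is a made-up helper name. The sample rows are copied from the output above.

```python
import re

# Attribute IDs treated as early signs of media trouble. This set is an
# assumption for the sketch, not something smartmontools defines.
WATCHED_IDS = {5, 197, 198}

# One attribute row: ID, name, flag, value, worst, thresh, type, updated,
# when_failed, then the leading integer of the raw value.
ROW = re.compile(
    r"^\s*(?P<id>\d+)\s+(?P<name>\S+)\s+(?P<flag>0x[0-9a-fA-F]+)\s+"
    r"(?P<value>\d+)\s+(?P<worst>\d+)\s+(?P<thresh>\d+)\s+"
    r"(?P<type>\S+)\s+(?P<updated>\S+)\s+(?P<when_failed>\S+)\s+(?P<raw>\d+)"
)


def suspicious_attributes(smartctl_text):
    """Return (id, name, raw, reason) tuples for rows that look worrying."""
    findings = []
    for line in smartctl_text.splitlines():
        m = ROW.match(line)
        if not m:
            continue  # header, blank line, or anything that is not a row
        attr_id = int(m.group("id"))
        raw = int(m.group("raw"))
        value = int(m.group("value"))
        thresh = int(m.group("thresh"))
        if attr_id in WATCHED_IDS and raw > 0:
            findings.append((attr_id, m.group("name"), raw, "non-zero raw value"))
        elif thresh > 0 and value <= thresh:
            findings.append((attr_id, m.group("name"), raw, "value at or below threshold"))
    return findings


SAMPLE = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       4
  9 Power_On_Hours          0x0032   099   099   000    Old_age   Always       -       6202
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       2
198 Offline_Uncorrectable   0x0030   100   100   000    Old_age   Offline      -       1
"""

for attr_id, name, raw, reason in suspicious_attributes(SAMPLE):
    print(f"{attr_id:3d} {name}: raw={raw} ({reason})")
```

On the sample rows this reports attributes 5, 197 and 198, the reallocated, pending and offline-uncorrectable sector counts. Running it periodically and comparing the raw values over time would show whether the counts are growing.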

> 
>> But when I last went back to Fedora (then rawhide, now F11) it appeared 
>> again. This only happens when running Fedora 11 in GNOME.
> 
> I can confirm that Fedora 10 doesn't warn about the same disk.

What do you mean by that?

> 
>> My drive is exhibiting no odd behavior. It behaves as it should. I've had 
>> drives fail many times in the past and this one is nowhere near leading 
> me to believe that a failure is imminent.
> 
> Filesystems (see "man badblocks") and the hard-disk itself protect against
> a first bunch of errors that can only be worked around by reallocating/ignoring
> sectors. Until the hardware failures become fatal all of a sudden. Hence
> an early warning can be helpful.

Well, that makes sense. I'm questioning the accuracy of the warning, I
guess. :)

Thanks!

-- 
            Scott
http://angrykeyboarder.com
I've never used an OS I didn't (dis)like.
©2009 angrykeyboarder™ & Elmer Fudd. All Wites Wesewved



