F9: smartd errors, how to fix it?

Daniel B. Thurman dant at cdkkt.com
Sat Aug 1 19:45:38 UTC 2009


Is there any way to fix the following problem,
for example by forcing an fsck?

smartd keeps reporting these errors:
$ cat /var/log/messages
[...]
{repeated messages of the following}
Aug  1 12:33:26 gold smartd[2820]: Device: /dev/sda, 6 Currently unreadable (pending) sectors
Aug  1 12:33:26 gold smartd[2820]: Device: /dev/sda, 6 Offline uncorrectable sectors

$ smartctl -a /dev/sda
smartctl version 5.38 [i386-redhat-linux-gnu] Copyright (C) 2002-8 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Seagate Barracuda 7200.10 family
Device Model:     ST3750640AS
Serial Number:    5QD3HYLF
Firmware Version: 3.AAE
User Capacity:    750,156,374,016 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Sat Aug  1 12:38:15 2009 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)    Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 121)    The previous self-test completed having
                    the read element of the test failed.
Total time to complete Offline
data collection:          ( 430) seconds.
Offline data collection
capabilities:              (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   1) minutes.
Extended self-test routine
recommended polling time:      ( 202) minutes.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   120   094   006    Pre-fail  Always       -       236188522
  3 Spin_Up_Time            0x0003   095   093   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       164
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       606571888
  9 Power_On_Hours          0x0032   092   092   000    Old_age   Always       -       7730
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       169
187 Reported_Uncorrect      0x0032   071   071   000    Old_age   Always       -       29
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   060   052   045    Old_age   Always       -       40 (Lifetime Min/Max 33/48)
194 Temperature_Celsius     0x0022   040   048   000    Old_age   Always       -       40 (0 24 0 0)
195 Hardware_ECC_Recovered  0x001a   066   063   000    Old_age   Always       -       55842978
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       6
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       6
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0000   100   253   000    Old_age   Offline      -       0
202 TA_Increase_Count       0x0032   100   253   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 30 (device log contains only the most recent five errors)
    CR = Command Register [HEX]
    FR = Features Register [HEX]
    SC = Sector Count Register [HEX]
    SN = Sector Number Register [HEX]
    CL = Cylinder Low Register [HEX]
    CH = Cylinder High Register [HEX]
    DH = Device/Head Register [HEX]
    DC = Device Command Register [HEX]
    ER = Error register [HEX]
    ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 30 occurred at disk power-on lifetime: 7685 hours (320 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 00 5c 9d bf e0  Error: IDNF at LBA = 0x00bf9d5c = 12557660

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 30 4b 9d bf e0 00      00:08:21.882  READ DMA EXT
  25 00 08 43 9d bf e0 00      00:08:21.869  READ DMA EXT
  25 00 28 8b 95 c0 e0 00      00:08:21.862  READ DMA EXT
  25 00 08 83 95 c0 e0 00      00:08:21.856  READ DMA EXT
  25 00 08 c3 82 c0 e0 00      00:08:21.957  READ DMA EXT

Error 29 occurred at disk power-on lifetime: 7685 hours (320 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 7a dd bf e0  Error: UNC at LBA = 0x00bfdd7a = 12574074

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 20 63 dd bf e0 00      00:05:00.757  READ DMA EXT
  27 00 00 00 00 00 e0 00      00:05:00.685  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:05:00.683  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:05:07.849  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:05:07.791  READ NATIVE MAX ADDRESS EXT

Error 28 occurred at disk power-on lifetime: 7685 hours (320 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 7a dd bf e0  Error: UNC at LBA = 0x00bfdd7a = 12574074

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 20 63 dd bf e0 00      00:05:00.757  READ DMA EXT
  27 00 00 00 00 00 e0 00      00:05:00.685  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:05:00.683  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:05:00.665  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:05:00.609  READ NATIVE MAX ADDRESS EXT

Error 27 occurred at disk power-on lifetime: 7685 hours (320 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 7a dd bf e0  Error: UNC at LBA = 0x00bfdd7a = 12574074

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 20 63 dd bf e0 00      00:05:00.757  READ DMA EXT
  27 00 00 00 00 00 e0 00      00:05:00.685  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:05:00.683  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:05:00.665  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:05:00.609  READ NATIVE MAX ADDRESS EXT

Error 26 occurred at disk power-on lifetime: 7685 hours (320 days + 5 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 78 dd bf e0  Error: UNC at LBA = 0x00bfdd78 = 12574072

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  25 00 20 63 dd bf e0 00      00:04:56.427  READ DMA EXT
  27 00 00 00 00 00 e0 00      00:04:56.371  READ NATIVE MAX ADDRESS EXT
  ec 00 00 00 00 00 a0 00      00:04:56.368  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00      00:04:52.800  SET FEATURES [Set transfer mode]
  27 00 00 00 00 00 e0 00      00:04:52.790  READ NATIVE MAX ADDRESS EXT

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       90%      7689         1455414650

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
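
For anyone cross-checking the numbers in the output above, here is a minimal Python sketch of how the "LBA =" values in the error log appear to follow from the SN, CL, CH and DH register bytes, and where the self-test's LBA_of_first_error would sit on the disk in bytes. The function names are made up for illustration, and the 512-byte sector size is an assumption (it is not stated anywhere in the smartctl output). For 48-bit commands such as READ DMA EXT this only recovers the low-order bits of the address.

```python
# Assumption: 512-byte logical sectors (not reported in the output above).
SECTOR_SIZE = 512

# Taken from the "User Capacity" line of the smartctl output.
USER_CAPACITY_BYTES = 750_156_374_016


def lba_from_registers(sn: int, cl: int, ch: int, dh: int) -> int:
    """Combine the register bytes from the error log into an LBA.

    Hypothetical helper: SN holds the lowest byte, CL and CH the next
    two, and the low nibble of DH the top four bits of a 28-bit LBA.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


def byte_offset(lba: int, sector_size: int = SECTOR_SIZE) -> int:
    """Byte offset of a sector from the start of the disk."""
    return lba * sector_size


if __name__ == "__main__":
    # Error 30: registers "5c 9d bf e0" (SN CL CH DH)
    print(lba_from_registers(0x5C, 0x9D, 0xBF, 0xE0))
    # Errors 27-29: registers "7a dd bf e0"
    print(lba_from_registers(0x7A, 0xDD, 0xBF, 0xE0))
    # Self-test log: LBA_of_first_error
    print(byte_offset(1455414650))
```

The first two results should match the decimal LBAs that smartctl prints next to each error (12557660 and 12574074). The third gives a rough idea of how far into the disk the extended self-test failed, which may help in working out which partition holds the bad sectors.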

Thanks!
Dan



