SMART Errors: What does it mean?

Linux for blind general discussion blinux-list at redhat.com
Sat Apr 6 15:32:31 UTC 2019


Rob here.
I ran the test and got this.
SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed: read failure       20%      1863         -
That sure doesn't look good. I took it out of my BTRFS array because there were millions of write errors. Sounds like I acted none too soon.
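
For anyone who wants to script this check, here is a minimal sketch that scans saved `smartctl -l selftest` output for failed entries. The function name `selftest_failed` is made up for illustration, and matching on the word "failure" is an assumption based on the status text in the log above; other status strings may need their own patterns.

```shell
# Hypothetical helper: read a SMART self-test log on stdin and print every
# entry whose status mentions a failure. The exit status follows grep:
# 0 if at least one such entry was found, 1 otherwise.
selftest_failed() {
    grep -E '^# *[0-9]+ .*failure'
}

# Example using the log line from this message.
log='# 1  Extended offline    Completed: read failure       20%      1863         -'
if printf '%s\n' "$log" | selftest_failed >/dev/null; then
    echo "self-test reported a failure"
fi
```

On a live system the same function could presumably be fed directly, as in `sudo /usr/sbin/smartctl -l selftest /dev/sda | selftest_failed`.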
----- Original Message -----
From: Linux for blind general discussion <blinux-list at redhat.com>
To: blinux-list at redhat.com
Date: Fri, 5 Apr 2019 08:23:25 -0500
Subject: Re: SMART Errors: What does it mean?

> Tim here.  It sounds like your drive is starting to error out and
> possibly die.
> 
> The first thing I'd do is run a background test:
> 
> $ sudo /usr/sbin/smartctl --test=long /dev/sda
> 
> and then after it has finished running, issue
> 
> $ sudo /usr/sbin/smartctl -l selftest /dev/sda
> 
> to see the results of the test.  I have a pair of cron jobs set up to
> run these two commands weekly, running the first test at midnight on
> Sunday-into-Monday, and then running the report at midnight of
> Monday-into-Tuesday (the test usually takes ~2hr on my machine and I
> tend to forget).  My own drive started throwing errors a while ago and
> so I've bought a replacement and just need to do the
> backup/replace/reinstall/restore dance when I get the time next week.
> 
> -tim
> 
> On April  5, 2019, Linux for blind general discussion wrote:
> > This is an error report from
> > smartctl -a /dev/sda
> > Truncated, showing only intro and error section
> > smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.18.0-0.bpo.1-rt-amd64] (local build)
> > Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org
> > 
> > === START OF INFORMATION SECTION ===
> > Device Model:     HITACHI HUS724040ALE640
> > Serial Number:    PAGRGBRS
> > LU WWN Device Id: 5 000cca 22bca3623
> > Firmware Version: MJAONS04
> > User Capacity:    4,000,787,030,016 bytes [4.00 TB]
> > Sector Sizes:     512 bytes logical, 4096 bytes physical
> > Rotation Rate:    7200 rpm
> > Form Factor:      3.5 inches
> > Device is:        Not in smartctl database [for details use: -P showall]
> > ATA Version is:   ATA8-ACS T13/1699-D revision 4
> > SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
> > Local Time is:    Fri Apr  5 02:43:04 2019 CDT
> > SMART support is: Available - device has SMART capability.
> > SMART support is: Enabled
> > 
> > SMART Error Log Version: 1
> > ATA Error Count: 65535 (device log contains only the most recent five errors)
> > 	CR = Command Register [HEX]
> > 	FR = Features Register [HEX]
> > 	SC = Sector Count Register [HEX]
> > 	SN = Sector Number Register [HEX]
> > 	CL = Cylinder Low Register [HEX]
> > 	CH = Cylinder High Register [HEX]
> > 	DH = Device/Head Register [HEX]
> > 	DC = Device Command Register [HEX]
> > 	ER = Error register [HEX]
> > 	ST = Status register [HEX]
> > Powered_Up_Time is measured from power on, and printed as
> > DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
> > SS=sec, and sss=millisec. It "wraps" after 49.710 days.
> > 
> > Error 65535 occurred at disk power-on lifetime: 1335 hours (55 days + 15 hours)
> >   When the command that caused the error occurred, the device was active or idle.
> > 
> >   After command completion occurred, registers were:
> >   ER ST SC SN CL CH DH
> >   -- -- -- -- -- -- --
> >   84 51 90 d0 bb 03 0d  Error: ICRC, ABRT at LBA = 0x0d03bbd0 = 218348496
> > 
> >   Commands leading to the command that caused the error were:
> >   CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
> >   -- -- -- -- -- -- -- --  ----------------  --------------------
> >   61 00 c0 80 b6 03 40 00      10:34:04.997  WRITE FPDMA QUEUED
> >   61 00 b8 80 b1 03 40 00      10:34:04.997  WRITE FPDMA QUEUED
> >   61 e0 b0 80 bb 03 40 00      10:34:04.997  WRITE FPDMA QUEUED
> >   ef 10 02 00 00 00 a0 00      10:34:04.997  SET FEATURES [Enable SATA feature]
> >   ec 00 00 00 00 00 a0 00      10:34:04.995  IDENTIFY DEVICE
> > 
> > 
> > There are 5 more of these. What is this error telling me, exactly?
> > I don't quite get it.
> > 
> > _______________________________________________
> > Blinux-list mailing list
> > Blinux-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/blinux-list
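
On the question of what the register line is saying: as far as I can tell, the LBA that smartctl prints is assembled from the DH, CH, CL and SN bytes, and the ICRC and ABRT names come from bits in the ER byte. The sketch below redoes that arithmetic in the shell. It assumes the classic 28-bit register layout (low four bits of DH as the top of the address) and the usual ATA bit positions for ICRC (bit 7) and ABRT (bit 2); the variable names are simply the column headings from the report.

```shell
# Register values copied from the "After command completion" line.
ER=0x84 SN=0xd0 CL=0xbb CH=0x03 DH=0x0d

# 28-bit LBA: low nibble of DH holds bits 27-24, followed by CH, CL and SN.
lba=$(( ((DH & 0x0f) << 24) | (CH << 16) | (CL << 8) | SN ))
printf 'LBA = 0x%08x = %d\n' "$lba" "$lba"

# Error register flags (assumed positions: bit 7 = ICRC, bit 2 = ABRT).
flags=""
if [ $(( ER & 0x80 )) -ne 0 ]; then flags="$flags ICRC"; fi
if [ $(( ER & 0x04 )) -ne 0 ]; then flags="$flags ABRT"; fi
echo "Error:$flags"
```

This gives back the same LBA as the report, 0x0d03bbd0 = 218348496. ICRC stands for interface CRC error, which is commonly associated with the data cable or connector rather than the disk surface, so the cabling may be worth a look as well.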
> 
> 
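
The weekly schedule Tim describes (long test at midnight going into Monday, report at midnight going into Tuesday) might look something like the crontab lines printed by this sketch. His actual entries were not posted, so the exact lines and the helper name `smart_cron_entries` are assumptions for illustration.

```shell
# Hypothetical helper: print two crontab lines for the given device,
# one starting the long self-test at 00:00 on Monday (day-of-week 1)
# and one printing the self-test log at 00:00 on Tuesday (day-of-week 2).
smart_cron_entries() {
    dev="$1"
    printf '0 0 * * 1 /usr/sbin/smartctl --test=long %s\n' "$dev"
    printf '0 0 * * 2 /usr/sbin/smartctl -l selftest %s\n' "$dev"
}

smart_cron_entries /dev/sda
```

The printed lines could then be pasted into root's crontab with `sudo crontab -e`. Cron normally mails a job's output to the crontab owner when local mail is set up, which would be one way for the Tuesday report to arrive without having to remember it.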



