howto run badblocks using fedora rescue mode
Wolfgang S. Rupprecht
wolfgang.rupprecht+gnus200812 at gmail.com
Sat Dec 13 22:01:11 UTC 2008
> You might be better off using smartctl to have the disk scan itself.
> Modern disks remap bad blocks for you when possible. So it is possible
> for you to have a failing drive and not have badblocks spot any.
> You can usually have the disks do a self test while your system is running.
I have a bit of experience with this now, having been given two
laptops to fix with abused/damaged disks.
1) The short smartmon test is worthless. Out of 2 dozen disks I've
never seen a disk fail it. The closest I've seen is 3 disks that
didn't respond to any commands, and there the error came from smartctl
itself, not from the test.
2) The long smartmon test will flush out any pending bad-block
errors. Run this test and then look at the "Current_Pending_Sector"
count. This is a very important number.
3) Any non-zero value in either the remapped block count
(Reallocated_Event_Count) or the pending remapped block count
(Current_Pending_Sector) means the disk has a physical ding in it,
probably from being dropped or bumped severely while running.
4) Any developed bad spots (see #3) are the kiss of death. The disk
is on its way out. Back up and order a spare. (This is also
Google's finding, and they have a ton of disks to collect data
over.)
5) Sectors stuck in the Current_Pending_Sector state are a pain. They
cause the long test to return errors until you force them to be
remapped. The simplest way to remap them is to force a write. All
of my bad blocks were in the free list, so a simple shell script to
use up free space on the disk fixed them. Ideally ext3 would have
a security tool to forcefully clear the blocks on the free list,
which would also have written the bad blocks and caused the disk
blocks to be reallocated.
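
The "use up free space" trick from #5 can be sketched roughly as below. This is a minimal sketch, not the script I actually ran: the function name fill_free_space, the zero-fill file name and the optional size cap are all made up for illustration. It assumes GNU dd and a filesystem mounted read-write. Running it as root is probably what you want, since ext3 by default reserves some blocks that ordinary users cannot write to.

```shell
#!/bin/sh
# Sketch: write zeros into the free space of a mounted filesystem so the
# drive gets a chance to remap pending sectors that sit in free blocks.
# fill_free_space is a hypothetical helper name, not a standard tool.
#
# usage: fill_free_space DIR [MAX_MB]
#   DIR     a directory on the affected filesystem
#   MAX_MB  optional cap in MiB, handy for a harmless trial run

fill_free_space() {
    dir=$1
    max_mb=$2
    filler="$dir/zero-fill.$$"   # hypothetical temporary file name

    # Without MAX_MB, dd stops with "No space left on device" once the
    # free blocks are used up. That failure is expected, so its exit
    # status is ignored.
    dd if=/dev/zero of="$filler" bs=1M ${max_mb:+count=$max_mb} 2>/dev/null || true

    sync              # push the zeros out to the disk
    rm -f "$filler"   # give the space back
}

# Example (not executed here): fill_free_space /home
```

A filesystem that is briefly 100% full can upset running programs, so this is best done from rescue mode or on an otherwise idle box.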
Example of a failing disk with a "ding" in the platter.
[root@acidophilus ~]# smartctl -a /dev/sda
...
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       11
197 Current_Pending_Sector  0x0032   100   100   000    Old_age   Always       -       0
...
...
Before the remapping I had 7 bad blocks on one list and 6 on the
other. The long test would always fail citing the bad blocks.
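
For checking the two counters without eyeballing the whole table, something like the following might do. It is only a sketch: check_smart_counts is a made-up name, and it assumes the usual smartctl attribute table layout where the raw value is the 10th column, as in the output above.

```shell
#!/bin/sh
# Sketch: read smartctl attribute output on stdin and print the two
# counters discussed above whenever their raw value is non-zero.
# check_smart_counts is a hypothetical helper name.
# Exit status: 0 if both counters are zero (or absent), 1 otherwise.

check_smart_counts() {
    awk '
        BEGIN { bad = 0 }
        $2 == "Reallocated_Event_Count" || $2 == "Current_Pending_Sector" {
            # Column 10 is assumed to be RAW_VALUE.
            if ($10 + 0 != 0) {
                print $2 "=" $10
                bad = 1
            }
        }
        END { exit bad }
    '
}

# Typical sequence (not executed here, /dev/sda is just an example):
#   smartctl -t long /dev/sda        # start the long self-test
#   smartctl -l selftest /dev/sda    # later: look at the self-test log
#   smartctl -A /dev/sda | check_smart_counts
```

Fed the two attribute lines from the example above, it prints only the Reallocated_Event_Count line, since the pending count is already back to 0.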
-wolfgang
--
Wolfgang S. Rupprecht http://www.full-steam.org/ (ipv6-only)
You may need to config 6to4 to see the above pages.