Borked MD RAID...

Jeffrey Ross jeff at bubble.org
Mon Dec 1 01:02:00 UTC 2008



Lonni J Friedman wrote:
> Is it always the same disk that gets marked offline ?  Perhaps the
> disk is actually bad?
>
> On Sun, Nov 30, 2008 at 4:20 PM, Eitan Tsur <eitan.tsur at gmail.com> wrote:
>   
>> Even if I remove that it still happens.
>>
>> On Sun, Nov 30, 2008 at 3:53 PM, Lonni J Friedman <netllama at gmail.com>
>> wrote:
>>     
>>> If you only have 3 disks, then you can't have:
>>> spares=1
>>>
>>>
>>>
>>> On Sun, Nov 30, 2008 at 3:11 PM, Eitan Tsur <eitan.tsur at gmail.com> wrote:
>>>       
>>>>> # mdadm.conf written out by anaconda
>>>>> DEVICE partitions
>>>>> MAILADDR root at localhost
>>>>>
>>>>> ARRAY /dev/md0 level=raid5 num-devices=3 spares=1 UUID=0c21bf19:83747f05:70a4872d:90643876
>>>>>           
>>>> If I switch the "DEVICE partitions" with "DEVICE /dev/sdb1 /dev/sdc1
>>>> /dev/sdd1", drives no longer are allocated as spares, however the
>>>> array still seems to rebuild every boot.
>>>>
>>>> I don't remember the specifics of what was in /proc/mdstat at the
>>>> time, but currently the array is being rebuilt.  I'll reboot after it
>>>> is complete to give you a copy of it.  Basically it allocated the
>>>> dropped drive as a spare which I'd have to mdadm -stop and mdadm -add
>>>> to the original array after every boot, manually.  Give me an hour or
>>>> two and I'll get you the output of mdstat.
>>>>
>>>> On Sun, Nov 30, 2008 at 2:41 PM, Lonni J Friedman <netllama at gmail.com>
>>>> wrote:
>>>>         
>>>>> On Sun, Nov 30, 2008 at 2:38 PM, Eitan Tsur <eitan.tsur at gmail.com>
>>>>> wrote:
>>>>>           
>>>>>> I just recently installed a 3-disk RAID5 array in a server of mine,
>>>>>> running FC9. Upon reboot, one of the drives drops out, and is
>>>>>> allocated as a spare. I suspect there is some sort of issue where
>>>>>> DBUS re-arranges the drive-to-device maps between boots, but I am
>>>>>> not sure... Just kind of annoying to have to stop and re-add a
>>>>>> drive every boot, and wait the couple hours for the array to
>>>>>> rebuild the 3rd disk. Any thoughts? Anyone else encountered such an
>>>>>> issue before? What should I be looking for? I'm new to the world of
>>>>>> RAID, so any information you can give may be helpful.
>>>>>>             
>>>>> What's in /etc/mdadm.conf, /proc/mdstat and dmesg when this fails ?
>>>>>           
>
>   

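For reference, the stop-and-re-add cycle described above would look
roughly like this; the device and array names here are only examples,
not taken from Eitan's system:

  mdadm --stop /dev/md1             # stop the stray array that grabbed the dropped disk
  mdadm /dev/md0 --add /dev/sdd1    # hand the disk back to the real array
  cat /proc/mdstat                  # watch the rebuild progress
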
Check and make sure that the UUID you're specifying in your
/etc/mdadm.conf file is correct.  Keep in mind that mdadm.conf wants
the md array UUID rather than a filesystem UUID, so "ls -l
/dev/disk/by-uuid" won't necessarily show it; "mdadm --examine --scan"
prints the ARRAY lines, UUIDs included, straight from the on-disk
superblocks.

Next verify that the UUID in the copy of /etc/mdadm.conf stored in the
initrd image is also correct; you'll have to extract the initrd with
cpio.  I don't remember the full procedure, but you should be able to
find it pretty easily with your favorite search engine.
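
In rough outline it goes something like this (the image name depends
on your kernel version; on FC9 the initrd is a gzip-compressed cpio
archive):

  mkdir /tmp/initrd && cd /tmp/initrd
  zcat /boot/initrd-$(uname -r).img | cpio -idmv   # unpack the image
  cat etc/mdadm.conf                               # the copy used at boot

If the UUID in there is stale, fix /etc/mdadm.conf and regenerate the
image with mkinitrd so the two copies agree.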

Jeff



