[rhelv6-list] Nasty bug with writing to resyncing RAID-5 Array

Daryl Herzmann akrherz at iastate.edu
Sat Jun 23 17:04:07 UTC 2012


On Fri, Jun 22, 2012 at 4:03 PM, Stephen John Smoogen <smooge at gmail.com> wrote:
> On 22 June 2012 14:10, daryl herzmann <akrherz at iastate.edu> wrote:
>> Howdy,
>>
>> The RHEL6.3 release notes have a curious entry:
>>
>> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Technical_Notes/kernel_issues.html
>>
>>  kernel component
>>
>>  Due to a race condition, in certain cases, writes to RAID4/5/6 while the
>>  array is reconstructing could hang the system
>>
>> Wow, I am reproducing it frequently here.  Simply have a RAID-5 software
>> array and do some write IO to it, eventually things start hanging and the
>> power button needs to be pressed.
>>
>> Oh man.
>
> Well the race condition they are mentioning should only happen when
> the RAID array is reconstructing. This sounds like a different
> bug/problem. What kind of disks, type of RAID etc.

Thanks for the response.  I am not sure what the difference is between
'reconstructing', 'resyncing', and 'syncing'.  The problem was quite
easy for me to reproduce:

1. Create a software RAID-5 array
2. Immediately create a filesystem on it, while the initial sync is still underway
3. IO to the RAID device eventually stops, including the software RAID-5 sync itself
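
For anyone who wants to try the first reproducer on a scratch machine, here
is a rough sketch of the two commands involved.  The device names, the member
count and the ext4 filesystem are placeholders of mine, not details from my
actual setup, and the script only prints the commands rather than running
them, since they destroy data on the listed disks.

```python
import shlex


def reproducer_commands(md="/dev/md0",
                        members=("/dev/sdb", "/dev/sdc", "/dev/sdd"),
                        fs="ext4"):
    """Build the command lines for the first reproducer without running them.

    All defaults are hypothetical placeholders; substitute scratch devices.
    """
    # Step 1: create the RAID-5 array; the initial sync starts right away.
    create = ["mdadm", "--create", md, "--level=5",
              f"--raid-devices={len(members)}", *members]
    # Step 2: make a filesystem while that initial sync is still running.
    mkfs = [f"mkfs.{fs}", md]
    return [create, mkfs]


if __name__ == "__main__":
    # Dry run only: print what would be executed.
    for cmd in reproducer_commands():
        print(shlex.join(cmd))
```

Step 3 is then just watching /proc/mdstat to see whether the sync progress
keeps moving.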

Or another reproducer, which is more concerning:

1. Start a verify on a previously clean raid5
2. Do some write IO to the mounted device
3. Processes accessing that mount point lock up
4. Push the power button :(
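
The second reproducer can be sketched in the same spirit, assuming the array
is md0 (a placeholder name).  The md driver exposes the current activity in
/sys/block/<array>/md/sync_action, and writing "check" to that file starts a
verify.  The helper below only classifies the state; which of these states
the release note means by 'reconstructing' is not something this sketch
settles.

```python
from pathlib import Path

# States in which md is actively reading or rewriting the whole array.
BUSY_STATES = {"resync", "recover", "check", "repair", "reshape"}


def sync_action_path(md_name="md0"):
    """Path of the sysfs control file; 'md0' is a hypothetical array name."""
    return Path(f"/sys/block/{md_name}/md/sync_action")


def is_busy(sync_action_text):
    """True if the given sync_action contents indicate a running sync/check."""
    return sync_action_text.strip() in BUSY_STATES


def start_check(md_name="md0"):
    """Step 1 of the reproducer: request a verify (needs root)."""
    sync_action_path(md_name).write_text("check\n")


if __name__ == "__main__":
    path = sync_action_path()
    if path.exists():
        state = path.read_text()
        print(f"{path}: {state.strip()} (busy={is_busy(state)})")
    else:
        print(f"{path} not found; no such md array on this machine")
```

With the array in the "check" state, step 2 is any ordinary write load on the
mounted filesystem.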

I wonder how many people will hit this once the first Sunday of July
rolls around and software RAID-5 arrays are auto-verified.

daryl



