[rhelv6-list] Nasty bug with writing to resyncing RAID-5 Array

David C. Miller millerdc at fusion.gat.com
Thu Aug 16 18:08:20 UTC 2012



----- Original Message -----
> From: "Daryl Herzmann" <akrherz at iastate.edu>
> To: "Red Hat Enterprise Linux 6 (Santiago) discussion mailing-list" <rhelv6-list at redhat.com>
> Sent: Wednesday, August 15, 2012 7:32:15 AM
> Subject: Re: [rhelv6-list] Nasty bug with writing to resyncing RAID-5 Array
> 
> On Sun, Jun 24, 2012 at 12:48 PM, Stephen John Smoogen
> <smooge at gmail.com> wrote:
> > On 23 June 2012 11:04, Daryl Herzmann <akrherz at iastate.edu> wrote:
> >> On Fri, Jun 22, 2012 at 4:03 PM, Stephen John Smoogen
> >> <smooge at gmail.com> wrote:
> >>> On 22 June 2012 14:10, daryl herzmann <akrherz at iastate.edu>
> >>> wrote:
> >>>> Howdy,
> >>>>
> >>>> The RHEL6.3 release notes have a curious entry:
> >>>>
> >>>> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Technical_Notes/kernel_issues.html
> >>>>
> >>>>  kernel component
> >>>>
> >>>>  Due to a race condition, in certain cases, writes to RAID4/5/6
> >>>>  while the
> >>>>  array is reconstructing could hang the system
> >>>>
> >>>> Wow, I am reproducing it frequently here.  Simply have a RAID-5
> >>>> software
> >>>> array and do some write IO to it, eventually things start
> >>>> hanging and the
> >>>> power button needs to be pressed.
> >>>>
> >>>> Oh man.
> >>>
> >>> Well the race condition they are mentioning should only happen
> >>> when
> >>> the RAID array is reconstructing. This sounds like a different
> >>> bug/problem. What kind of disks, type of RAID etc.
> >>
> >> Thanks for the response.  I am not sure of the difference between
> >> 'reconstructing' and 'resyncing' and/or 'syncing'.  The
> >> reproducing
> >> case was quite easy for me.
> >>
> >> 1. Create a software raid5
> >> 2. Immediately then create a filesystem on this raid5, while init
> >> sync underway
> >> 3. IO to the RAID device eventually stops, even for the software
> >> raid5 sync
> >
> > Ok reconstructing is where the initial RAID drives pair up with
> > each
> > other. Resyncing I believe is where a RAID which has been created
> > is
> > putting the data across its raid. Basic check: cat /proc/mdstat; if
> > there is a line with ====> then you are reconstructing the disk
> > array. In the example you give above, the disks would be
> > reconstructing.
> >
> > So the next thing to do is figure out why you are able to trigger
> > it so consistently.
> > That may be due to:
> > CPU Type:
> > RAM Amount:
> > Disk controllers:
> > Disk types (SATA, SAS, SCSI, PATA):
> > RAID type:
> > RAID layout (same controller, different controller, etc):
> 
> I don't seem to have much issue reproducing, I just had another
> machine do it this morning.  Nehalem processor, 12 GB ram, Dell
> PowerEdge T400, Perc 6i controller, software raid 5, Seagate 2 TB
> Barracuda drives...
> 
> Does anybody have the bugzilla ticket associated with this or perhaps
> a knowledge base article on it?
> 
> daryl
> 

I would like to know too. I have not seen this issue yet but I do have some large RAID6 arrays.

David.
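
The /proc/mdstat check suggested earlier in the thread can be scripted. The following is only a minimal sketch, not something posted by anyone in this thread: it assumes the usual md progress-line layout (a bar such as [=>....] followed by "resync", "recovery", "reshape" or "check" and a percentage). The function name arrays_in_progress and the SAMPLE text are hypothetical and illustrative, not output captured from any of the machines discussed here.

```python
import re

# Progress lines in /proc/mdstat look roughly like:
#   [=>...................]  recovery =  8.5% (...) finish=... speed=...
PROGRESS = re.compile(
    r"\[[=>.]+\]\s+(resync|recovery|reshape|check)\s*=\s*([\d.]+)%"
)
# Array header lines start with the md device name, e.g. "md0 : active raid5 ..."
DEVICE = re.compile(r"^(md\S+)\s*:")

# Illustrative sample only (assumed layout, not taken from the thread).
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[3] sdc1[1] sdb1[0]
      3906764800 blocks super 1.2 level 5, 512k chunk, algorithm 2 [3/2] [UU_]
      [>....................]  recovery =  0.5% (10485760/1953382400) finish=300.1min speed=107000K/sec

unused devices: <none>
"""


def arrays_in_progress(mdstat_text):
    """Return {array_name: (operation, percent_done)} for every array
    that currently shows a progress line."""
    result = {}
    current = None
    for line in mdstat_text.splitlines():
        header = DEVICE.match(line)
        if header:
            # Remember which array the following indented lines belong to.
            current = header.group(1)
            continue
        progress = PROGRESS.search(line)
        if progress and current:
            result[current] = (progress.group(1), float(progress.group(2)))
    return result


for name, (operation, percent) in arrays_in_progress(SAMPLE).items():
    print(f"{name}: {operation} at {percent:.1f}%")
```

On a live system the same function could be given the contents of /proc/mdstat instead of SAMPLE. An empty result would suggest that no array is mid-sync, so heavy writes could be held off until then on an affected kernel.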