[rhelv6-list] Nasty bug with writing to resyncing RAID-5 Array

Grzegorz Witkowski geslinux at gmail.com
Wed Aug 15 20:01:00 UTC 2012


On Wed, Aug 15, 2012 at 3:32 PM, Daryl Herzmann <akrherz at iastate.edu> wrote:

> On Sun, Jun 24, 2012 at 12:48 PM, Stephen John Smoogen <smooge at gmail.com>
> wrote:
> > On 23 June 2012 11:04, Daryl Herzmann <akrherz at iastate.edu> wrote:
> >> On Fri, Jun 22, 2012 at 4:03 PM, Stephen John Smoogen <smooge at gmail.com> wrote:
> >>> On 22 June 2012 14:10, daryl herzmann <akrherz at iastate.edu> wrote:
> >>>> Howdy,
> >>>>
> >>>> The RHEL6.3 release notes have a curious entry:
> >>>>
> >>>>
> http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/6.3_Technical_Notes/kernel_issues.html
> >>>>
> >>>>  kernel component
> >>>>
> >>>>  Due to a race condition, in certain cases, writes to RAID4/5/6 while the
> >>>>  array is reconstructing could hang the system
> >>>>
> >>>> Wow, I am reproducing it frequently here.  Simply have a RAID-5 software
> >>>> array and do some write IO to it, eventually things start hanging and the
> >>>> power button needs to be pressed.
> >>>>
> >>>> Oh man.
> >>>
> >>> Well the race condition they are mentioning should only happen when
> >>> the RAID array is reconstructing. This sounds like a different
> >>> bug/problem. What kind of disks, type of RAID etc.
> >>
> >> Thanks for the response.  I am not sure of the difference between
> >> 'reconstructing' and 'resyncing' and/or 'syncing'.  The reproducing
> >> case was quite easy for me.
> >>
> >> 1. Create a software raid5
> >> 2. Immediately then create a filesystem on this raid5, while init sync underway
> >> 3. IO to the RAID device eventually stops, even for the software raid5 sync
> >
> > Ok reconstructing is where the initial RAID drives pair up with each
> > other. Resyncing I believe is where a RAID which has been created is
> > putting the data across its raid. Basic cat /proc/mdstat.. if there is
> > a line ====> then you are reconstructing the disk array. In the
> > example you give above, the disks would be reconstructing
> >
> > So the next thing to do is work out why you are able to trigger it constantly.
> > That may be due to
> > CPU Type:
> > RAM Amount:
> > Disk controllers:
> > Disk types (SATA, SAS, SCSI, PATA):
> > RAID type:
> > RAID layout (same controller, different controller, etc):
>
> I don't seem to have much issue reproducing, I just had another
> machine do it this morning.  Nehalem processor, 12 GB ram, Dell
> PowerEdge T400, Perc 6i controller, software raid 5, Seagate 2 TB
> Barracuda drives...
>
> Does anybody have the bugzilla ticket associated with this or perhaps
> a knowledge base article on it?
>
> daryl
>
> _______________________________________________
> rhelv6-list mailing list
> rhelv6-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rhelv6-list
>

Just out of curiosity: why use software RAID if you have a PERC 6? :O
Of course, a Bugzilla ticket should be opened for this.
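
For anyone who wants to see which state an array is in, the /proc/mdstat
check mentioned earlier in the thread can be sketched roughly as below. md
names the running operation (resync, recovery, reshape or check) on the
progress line, so the sketch just looks for that. The function name
sync_status and the sample text are made up for illustration; the sample is
not output from the machines discussed here. On a real system you would pass
in the contents of /proc/mdstat instead.

```python
import re

# Progress line md prints while an operation is running, e.g.
#   [==>......]  recovery = 12.6% (246930432/1953381888) finish=282.1min ...
PROGRESS = re.compile(r"\b(resync|recovery|reshape|check)\s*=\s*([\d.]+)%")
# Array header line, e.g. "md0 : active raid5 sdd1[4] ..."
ARRAY = re.compile(r"^(md\S+)\s*:")


def sync_status(mdstat_text):
    """Return {array: (operation, percent_done)} for arrays with an operation in progress."""
    status = {}
    current = None
    for line in mdstat_text.splitlines():
        header = ARRAY.match(line)
        if header:
            current = header.group(1)
            continue
        progress = PROGRESS.search(line)
        if progress and current is not None:
            status[current] = (progress.group(1), float(progress.group(2)))
    return status


# Hypothetical sample in the usual /proc/mdstat layout (assumption, not real output).
SAMPLE = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdd1[4] sdc1[2] sdb1[1] sda1[0]
      5860145664 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [==>..................]  recovery = 12.6% (246930432/1953381888) finish=282.1min speed=100800K/sec

md1 : active raid1 sdf1[1] sde1[0]
      524224 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(sync_status(SAMPLE))
```

Arrays with no progress line (md1 in the sample) are simply left out of the
result, so an empty dict means nothing is syncing or rebuilding.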

