isw device for volume broken after opensuse livecd boot
tiago.frt at gmail.com
Fri Sep 28 23:38:40 UTC 2007
Thanks, that's what I originally thought.
I have now confirmed the bug in the opensuse 10.3 RC2 livecd / dmraid
or device-mapper / OROM.
I deleted the RAID0 volume, disconnected the third drive that wasn't
part of the raid and chose one of the hitachis for RAID1 rebuild.
Then I created a new RAID0 volume, so the raid0 was online and raid1 degraded.
I then booted the opensuse livecd and just entered a root terminal and
typed "dmraid -ay". It said that the RAID0 was broken.
I rebooted and OROM said that the second raid member was offline, so
RAID0 failed and RAID1 degraded.
So just typing "dmraid -ay" makes OROM think one or more members are
offline!! I think only one went offline this time because the RAID1 was
already degraded. The first time, when both were online, both went
offline after "dmraid -ay".
Then I disconnected the sata port of one of the disks, and it said
raid0 and raid1 failed. Finally I reconnected all disks and both were
recognized, raid0 became online again.
So after all it was not bad luck or bad connectors! This is a big bug
in dmraid or device-mapper, and also in OROM.
First of all, the metadata should not mark a member as offline in this
case; it should be treated as a temporarily disconnected member.
You can easily reproduce this by taking the following steps:
Get a P35 motherboard with ICH9R.
Create a RAID0 volume with two disks (in this case two Hitachi 7K160).
Create a RAID1 volume.
Boot the openSUSE 10.3 RC2 livecd. Open a terminal, type "su" and then "dmraid -ay".
Reboot and see that at least one of the disks is an offline member,
and raid1 fails.
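
The livecd part of the steps above can be sketched as a small script that
records what dmraid reports before and after activation. This is only a
sketch: it assumes dmraid is installed and that it is run as root. The
helper names (run, snapshot, repro) and the DRY_RUN variable are made up
for this example. By default it only prints the commands. Setting
DRY_RUN=0 really runs them, which on an affected ICH9R setup may leave
members marked offline as described above.

```shell
#!/bin/sh
# DRY_RUN is a hypothetical switch for this sketch: 1 = print only (default).
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Show the command, and execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

snapshot() {
    # $1 is just a label ("before" / "after") for the output.
    echo "== $1 =="
    run dmraid -r    # list the disks that carry RAID metadata
    run dmraid -s    # show the discovered RAID sets and their status
}

repro() {
    snapshot before
    run dmraid -ay   # activate all sets; the step that seems to trigger the problem
    snapshot after
}

repro
```

Comparing the "before" and "after" output with what OROM shows on the next
reboot should tell whether the change already happens at activation time.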
On 9/28/07, Fang, Ying <ying.fang at intel.com> wrote:
> Sorry, Tiago. I misread your email regarding the two volumes: RAID0 and
> RAID1.
> In the following messages:
> "Port 1 .. Member disk(0,1)" means that the hard drive attached to port
> 0 (scsi address: 1:0:0:0) is a member of two RAID arrays (RAID0 and
> RAID1) which are defined as RAID id 0 and 1 respectively.
> But port 0(scsi address 0:0:0:0) has a hard drive that has a RAID
> configuration including the same names of the RAID arrays.
> Because the metadata got messed up, OROM couldn't determine that the
> above two hard drives belonged to the same group of disks. To
> differentiate the names of the RAID volumes on the two hard drives,
> ":1" was appended to one set of names.
> >>> >0 RAID0:1 80Gb Failed
> >>> >1 RAID1:1 109.0Gb Degraded
> >>> >2 RAID0 80Gb Failed
> >>> >3 RAID1 109.0Gb Degraded
> >>> >
> >>> >Port
> >>> >0 Hitachi 149.1GB Member Disk(0,1)
> >>> >1 Hitachi 149.1GB Member Disk(2,3)
> Thanks, Eric, for pointing out that the OROM display screen doesn't
> include the partition information.
> I hope that will help you understand those magic numbers. If you have
> any questions, let me know.
> >-----Original Message-----
> >From: Fang, Ying
> >Sent: Thursday, September 27, 2007 4:07 PM
> >To: Tiago Freitas
> >Cc: ATARAID (eg, Promise Fasttrak, Highpoint 370) related discussions
> >Subject: RE: isw device for volume broken after opensuse livecd boot
> >Are you talking about RAID1 and RAID1:1? The first is the RAID device;
> >the latter is the first partition in that RAID device. If you have more
> >than one partition there, you'll get RAID1:2 and so on.