Bug in dmraid? (segfaults when raid-set name set)

Bas Mevissen abuse at basmevissen.nl
Wed Mar 19 15:09:34 UTC 2008


I'm using dmraid on my server (Supermicro with Intel fakeraid to mirror
2 SATA disks, Adaptec ROM emulation, DDF1 format). The device mapper
is initialized by dmraid in the initrd. Unfortunately, I found that this
did not work properly and my system was running on one disk only...

Anyway, that is corrected now. I also found the reason for the failure.
It appears that dmraid crashes (segfaults) when it is invoked, both in
the running system and in the initrd. See GDB session below.

In initrd init script:
echo Scanning and configuring dmraid supported devices
dmraid -ay -i -p "ddf1_EMCODEV_BEAST"
kpartx -a -p p "/dev/mapper/ddf1_EMCODEV_BEAST"
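
A crash like this is easy to miss at boot, because the script simply
continues after dmraid dies. Below is a minimal sketch of a wrapper that
makes it visible. It assumes a POSIX-like shell with function support,
which may not hold for the interpreter used inside the initrd, so it is
probably more applicable to rc.sysinit. The name run_checked is made up
for illustration. Shells such as bash and dash report "128 + signal
number" for a command killed by a signal, so a SIGSEGV shows up as 139.

```shell
# Hypothetical helper: run a command and warn if it crashed or failed.
# An exit status above 128 means the command was killed by a signal
# (status - 128); SIGSEGV is signal 11, hence status 139.
run_checked() {
    "$@"
    status=$?
    if [ "$status" -gt 128 ]; then
        echo "WARNING: '$1' killed by signal $((status - 128))" >&2
    elif [ "$status" -ne 0 ]; then
        echo "WARNING: '$1' exited with status $status" >&2
    fi
    return "$status"
}
```

It would be used as: run_checked dmraid -ay -i -p "ddf1_EMCODEV_BEAST"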

[sgm at beast tools]$ pwd
[sgm at beast tools]$ sudo gdb
GNU gdb Red Hat Linux (6.5-25.el5_1.1rh)
This GDB was configured as "x86_64-redhat-linux-gnu".
(gdb) file dmraid
Reading symbols from /home/sgm/dmraid/1.0.0.rc14/tools/dmraid...done.
Using host libthread_db library "/lib64/libthread_db.so.1".
(gdb) set args -ay -t -p "some_raid_set_name"
(gdb) run
Starting program: /home/sgm/dmraid/1.0.0.rc14/tools/dmraid -ay -t -p

Program received signal SIGSEGV, Segmentation fault.
0x000000300074ba00 in main_arena () from /lib64/libc.so.6
(gdb) bt
#0  0x000000300074ba00 in main_arena () from /lib64/libc.so.6
#1  0x000000000040c480 in dmraid_group (lc=0x21ca5a0, rd=0x300074b9c0)
at metadata/metadata.c:657
#2  0x000000000040c861 in group_set (lc=0x21ca5a0, name=0x7fffbe5a6c4e
"some_long_raid_set_name") at metadata/metadata.c:873
#3  0x00000000004047e3 in build_sets (lc=0x21ca5a0, sets=0x7fffbe5a4ff8)
at toollib.c:69
#4  0x0000000000404425 in get_metadata (lc=0x21ca5a0, p=0x62fd80,
argv=0x7fffbe5a4ff8) at commands.c:640
#5  0x000000000040452a in _perform (lc=0x21ca5a0, p=0x62fd80,
argv=0x7fffbe5a4ff8) at commands.c:767
#6  0x00000000004045f6 in perform (lc=0x21ca5a0, argv=0x7fffbe5a4ff8) at
#7  0x0000000000403663 in main (argc=5, argv=0x7fffbe5a4ff8) at dmraid.c:34

It appears that dmraid segfaults whenever the name of a raid set
(existing or not) is given.

* Some extra info:
Supermicro system running centos^wRHEL 5.1 with 2.6.18.* xen kernel.
Contains dmraid rc13. Tested with rc14. Name of the raid set (set in the
$ sudo dmraid -t -ay -p
ddf1_EMCODEV_BEAST: 0 624737856 mirror core 2 131072 nosync 2 /dev/sda 0
/dev/sdb 0

* Side notes:
It appears that the debug version (configure --enable-debug) is not
correctly built if a release version was previously built. The reason is
that the make.tmpl file is re-created AFTER creating the Makefile from
Makefile.in. The attached patch corrects this behavior.

Also, there is always an -O2 optimization. This causes inlined functions
to be hidden from the source-level debugger. I manually removed the -O2
from configure, but there should be a better way. (See configure:
if test "$GCC" = yes; then <cr> CFLAGS="-g -O2".)
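
One possible "better way": in autoconf-generated configure scripts, the
default quoted above is normally applied only when the caller has not
set CFLAGS. Whether dmraid's configure follows that usual pattern is an
assumption I have not verified. The sketch below is a simplified
stand-in for that selection logic, not the real script, and the function
name choose_cflags is made up for illustration.

```shell
# Hypothetical, simplified stand-in for the CFLAGS selection usually
# found in an autoconf-generated configure script.
choose_cflags() {
    if [ "${CFLAGS+set}" = set ]; then
        # The caller supplied CFLAGS (even an empty value): keep it.
        printf '%s\n' "$CFLAGS"
    elif [ "$GCC" = yes ]; then
        # The default seen in configure when compiling with GCC.
        printf '%s\n' "-g -O2"
    else
        # Simplified fallback for other compilers.
        printf '%s\n' "-g"
    fi
}
```

If configure behaves this way, running it as
CFLAGS="-g -O0" ./configure --enable-debug
should give a build without -O2 and without editing configure by hand.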

Last note: when changing the configuration with ./configure, it seems
that not all files that need rebuilding are actually rebuilt. I just
called "make clean", but a better fix is welcome.

Anyway, I fixed it for myself by just removing the raid set name from
the call to dmraid.static in the initrd. But in rc.sysinit, it still

Can you tell whether there might be a bug in dmraid? You can point to
relevant sections of "SNIA-DDFv1.2_with_Errata_A_Applied.pdf" if you
like for info about the DDF1 labels. Of course, I would be happy to test
a possible fix.


Bas Mevissen
Antoon de Winterstraat 15
6001 SP Weert
T: +31 (0)495 450 421
F: +31 (0)495 450 422
E: bas.mevissen at emcodev.nl

