[Linux-cluster] Possible bug in rhel5 for nested HA-LVM resources?

Gianluca Cecchi gianluca.cecchi at gmail.com
Wed Mar 3 17:01:21 UTC 2010


Hello,
my problem stems from this need:
- I have a RHEL 5.4 cluster with 2 nodes where HA-LVM is in place and
several lvm/fs resource pairs compose one service.

I want to add a new lvm/fs pair to that service without disrupting it while
it is running.
My already configured and running LVs/mount points are:
/dev/mapper/VG_TEST_APPL-LV_TEST_APPL
                      5.0G  139M  4.6G   3% /appl_db1
/dev/mapper/VG_TEST_DATA-LV_TEST_DATA
                      5.0G  139M  4.6G   3% /oradata/TEST

The new desired mount point has to go under /oradata/TEST/newtemp, i.e.
inside the already cluster-managed /oradata/TEST filesystem.
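
For context, the new volume itself was created with the usual LVM commands,
along these lines (the physical device and the size here are just
placeholders, not taken from the real setup; ext3 matches the other
filesystems in the service):

  # create the new VG/LV and filesystem for the temp area
  pvcreate /dev/vde
  vgcreate VG_TEST_TEMP /dev/vde
  lvcreate -L 5G -n LV_TEST_TEMP VG_TEST_TEMP
  mkfs.ext3 /dev/VG_TEST_TEMP/LV_TEST_TEMP
  # mount point inside the already clustered /oradata/TEST
  mkdir -p /oradata/TEST/newtemp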

The current extract of cluster.conf is:
                <service domain="MAIN" autostart="1" name="TESTSRV">
                        <ip ref="10.4.5.157"/>
                        <lvm ref="TEST_APPL"/>
                        <fs ref="TEST_APPL"/>
                        <lvm ref="TEST_DATA"/>
                        <fs ref="TEST_DATA"/>
                        <script ref="clusterssh"/>
                </service>

Based on my assumptions about start ordering, child resources and so on, I
presumed one correct new configuration would be:
                <service domain="MAIN" autostart="1" name="TESTSRV">
                        <ip ref="10.4.5.157"/>
                        <lvm ref="TEST_APPL"/>
                        <fs ref="TEST_APPL"/>
                        <lvm ref="TEST_DATA"/>
                        <fs ref="TEST_DATA">
                                <lvm ref="TEST_TEMP"/>
                                <fs ref="TEST_TEMP"/>
                        </fs>
                        <script ref="clusterssh"/>
                </service>

And in fact I was able to (a command-line sketch of these steps follows the
list):
- temporarily verify the new lvm/fs outside of the cluster, by adding its
name to volume_list in lvm.conf (plus updating the initrd files in /boot...
--> this needs to be fixed some day ;-)
- vgchange -ay the new VG and mount the fs ---> OK
- umount the fs, remove the entry from volume_list and update the initrd
files again
- change the config, with an increased version number as well
- run ccs_tool update /etc/cluster/cluster.conf
- cman_tool version -r new_version
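
To make the sequence concrete, this is roughly what it translates to on the
command line (the explicit vgchange -an is the deactivation step I actually
skipped at first, as explained below):

  # temporary activation outside the cluster, after adding VG_TEST_TEMP
  # to volume_list in /etc/lvm/lvm.conf and rebuilding the initrd
  vgchange -ay VG_TEST_TEMP
  mount /dev/VG_TEST_TEMP/LV_TEST_TEMP /oradata/TEST/newtemp

  # hand the volume back to the cluster
  # (and remove VG_TEST_TEMP from volume_list / rebuild the initrd again)
  umount /oradata/TEST/newtemp
  vgchange -an VG_TEST_TEMP

  # propagate the edited cluster.conf (with config_version increased)
  ccs_tool update /etc/cluster/cluster.conf
  cman_tool version -r <new_version>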

OK.
The problem is that any subsequent relocate/restart is unable to start the
service.... and neither is a reboot of the test node.
This went unnoticed at first because during my preliminary steps I had
activated the VG and never deactivated it afterwards, so my first config
change was not exercising the real start sequence...
From the messages it seems that rgmanager tries to start the inner fs
before first activating its lvm device....

Mar  3 16:21:56 clutest1 clurgmgrd[2396]: <notice> Starting stopped service
service:TESTSRV
Mar  3 16:21:56 clutest1 clurgmgrd: [2396]: <notice> Activating
VG_TEST_APPL/LV_TEST_APPL
Mar  3 16:21:56 clutest1 clurgmgrd: [2396]: <notice> Making resilient :
lvchange -ay VG_TEST_APPL/LV_TEST_APPL
Mar  3 16:21:56 clutest1 clurgmgrd: [2396]: <notice> Resilient command:
lvchange -ay VG_TEST_APPL/LV_TEST_APPL --config
devices{filter=["a|/dev/vda4|","a|/dev/vdc|","a|/dev/vdd|","a|/dev/vde|","r|.*|"]}
Mar  3 16:21:56 clutest1 clurgmgrd: [2396]: <notice> Activating
VG_TEST_DATA/LV_TEST_DATA
Mar  3 16:21:57 clutest1 clurgmgrd: [2396]: <notice> Making resilient :
lvchange -ay VG_TEST_DATA/LV_TEST_DATA
Mar  3 16:21:57 clutest1 clurgmgrd: [2396]: <notice> Resilient command:
lvchange -ay VG_TEST_DATA/LV_TEST_DATA --config
devices{filter=["a|/dev/vda4|","a|/dev/vdc|","a|/dev/vdd|","a|/dev/vde|","r|.*|"]}
Mar  3 16:21:57 clutest1 kernel: kjournald starting.  Commit interval 5
seconds
Mar  3 16:21:57 clutest1 kernel: EXT3 FS on dm-0, internal journal
Mar  3 16:21:57 clutest1 kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Mar  3 16:21:57 clutest1 kernel: kjournald starting.  Commit interval 5
seconds
Mar  3 16:21:57 clutest1 kernel: EXT3 FS on dm-4, internal journal
Mar  3 16:21:57 clutest1 kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Mar  3 16:21:57 clutest1 clurgmgrd: [2396]: <err> startFilesystem: Could not
match /dev/VG_TEST_TEMP/LV_TEST_TEMP with a real device
Mar  3 16:21:57 clutest1 clurgmgrd[2396]: <notice> start on fs "TEST_TEMP"
returned 2 (invalid argument(s))
Mar  3 16:21:57 clutest1 clurgmgrd[2396]: <warning> #68: Failed to start
service:TESTSRV; return value: 1
Mar  3 16:21:57 clutest1 clurgmgrd[2396]: <notice> Stopping service
service:TESTSRV

This in my opinion is a bug, as the last fs resource (TEST_TEMP) should be
started after the lvm resource (TEST_TEMP) that sits at the same level.
Is my assumption shared by anyone else, before I open a bug?

A fix for this seems to be the following config:
                <service domain="MAIN" autostart="1" name="TESTSRV">
                        <ip ref="10.4.5.157"/>
                        <lvm ref="TEST_APPL"/>
                        <fs ref="TEST_APPL"/>
                        <lvm ref="TEST_DATA"/>
                        <fs ref="TEST_DATA">
                                <lvm ref="TEST_TEMP">
                                        <fs ref="TEST_TEMP"/>
                                </lvm>
                        </fs>
                        <script ref="clusterssh"/>
                </service>

But in my opinion it is a somewhat redundant one.....
With this config enabled, and the service still in stopped state because of
the problem above, if I now run
clusvcadm -R TESTSRV
I get success, with this in the log...

Mar  3 16:40:41 clutest1 clurgmgrd[2396]: <notice> Starting stopped service
service:TESTSRV
Mar  3 16:40:41 clutest1 clurgmgrd: [2396]: <notice> Activating
VG_TEST_APPL/LV_TEST_APPL
Mar  3 16:40:41 clutest1 clurgmgrd: [2396]: <notice> Making resilient :
lvchange -ay VG_TEST_APPL/LV_TEST_APPL
Mar  3 16:40:41 clutest1 clurgmgrd: [2396]: <notice> Resilient command:
lvchange -ay VG_TEST_APPL/LV_TEST_APPL --config
devices{filter=["a|/dev/vda4|","a|/dev/vdc|","a|/dev/vdd|","a|/dev/vde|","r|.*|"]}
Mar  3 16:40:41 clutest1 clurgmgrd: [2396]: <notice> Activating
VG_TEST_DATA/LV_TEST_DATA
Mar  3 16:40:41 clutest1 clurgmgrd: [2396]: <notice> Making resilient :
lvchange -ay VG_TEST_DATA/LV_TEST_DATA
Mar  3 16:40:41 clutest1 clurgmgrd: [2396]: <notice> Resilient command:
lvchange -ay VG_TEST_DATA/LV_TEST_DATA --config
devices{filter=["a|/dev/vda4|","a|/dev/vdc|","a|/dev/vdd|","a|/dev/vde|","r|.*|"]}
Mar  3 16:40:41 clutest1 kernel: kjournald starting.  Commit interval 5
seconds
Mar  3 16:40:41 clutest1 kernel: EXT3 FS on dm-0, internal journal
Mar  3 16:40:41 clutest1 kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Mar  3 16:40:42 clutest1 kernel: kjournald starting.  Commit interval 5
seconds
Mar  3 16:40:42 clutest1 kernel: EXT3 FS on dm-4, internal journal
Mar  3 16:40:42 clutest1 kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Mar  3 16:40:42 clutest1 clurgmgrd: [2396]: <notice> Activating
VG_TEST_TEMP/LV_TEST_TEMP
Mar  3 16:40:42 clutest1 clurgmgrd: [2396]: <notice> Making resilient :
lvchange -ay VG_TEST_TEMP/LV_TEST_TEMP
Mar  3 16:40:42 clutest1 clurgmgrd: [2396]: <notice> Resilient command:
lvchange -ay VG_TEST_TEMP/LV_TEST_TEMP --config
devices{filter=["a|/dev/vda4|","a|/dev/vdc|","a|/dev/vdd|","a|/dev/vde|","r|.*|"]}
Mar  3 16:40:42 clutest1 kernel: kjournald starting.  Commit interval 5
seconds
Mar  3 16:40:42 clutest1 kernel: EXT3 FS on dm-5, internal journal
Mar  3 16:40:42 clutest1 kernel: EXT3-fs: mounted filesystem with ordered
data mode.
Mar  3 16:40:43 clutest1 clurgmgrd[2396]: <notice> Service service:TESTSRV
started
Mar  3 16:40:51 clutest1 clurgmgrd: [2396]: <notice> Getting status
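
After that, the result can be verified on the node with the usual checks,
for example (clustat output format varies a bit between versions):

  clustat                          # service:TESTSRV reported as started
  df -h /oradata/TEST/newtemp      # new filesystem mounted
  lvs VG_TEST_TEMP                 # new LV active on this node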

Any thoughts are appreciated.
Gianluca