[Linux-cluster] Possible bug in rhel5 for nested HA-LVM resources?

Lon Hohberger lhh at redhat.com
Wed Mar 3 21:53:49 UTC 2010


On Wed, 2010-03-03 at 18:01 +0100, Gianluca Cecchi wrote:


> The new desired mount point is to be put under /oradata/TEST/newtemp
> 
> 
> Current extract of cluster.conf is 
>                 <service domain="MAIN" autostart="1" name="TESTSRV">
>                         <ip ref="10.4.5.157"/>
>                         <lvm ref="TEST_APPL"/>
>                         <fs ref="TEST_APPL"/>
>                         <lvm ref="TEST_DATA"/>
>                         <fs ref="TEST_DATA"/>
>                         <script ref="clusterssh"/>
>                 </service>
> 
> 
> Based on my assumption about precedences/child resources and so on, I
> presumed one correct new conf would be: 
>                 <service domain="MAIN" autostart="1" name="TESTSRV">
>                         <ip ref="10.4.5.157"/>
>                         <lvm ref="TEST_APPL"/>
>                         <fs ref="TEST_APPL"/>
>                         <lvm ref="TEST_DATA"/>
>                         <fs ref="TEST_DATA">
>                                 <lvm ref="TEST_TEMP"/>
>                                  <fs ref="TEST_TEMP"/>
>                         </fs>
>                         <script ref="clusterssh"/>
>                 </service>

So, here's the problem. The start ordering for that configuration puts
fs:TEST_TEMP ahead of lvm:TEST_TEMP:

Starting TESTSRV...
[start] service:TESTSRV
[start] lvm:TEST_APPL
[start] lvm:TEST_DATA
[start] fs:TEST_APPL
[start] fs:TEST_DATA
[start] fs:TEST_TEMP
[start] lvm:TEST_TEMP
[start] ip:10.4.5.157
[start] script:clusterssh
Start of TESTSRV complete

This is because[1]:

===
With all the type-specified children, it's also important to note that
all untyped children - children of a given resource node which do not
have a <child> definition in the resource agent metadata - are all
started according to their order in cluster.conf and stopped in reverse
order. They are started after all type-specified children and stopped
before any typed children.
===

As it happens, the 'fs' file system type looks for child 'fs' resources:

        <child type="fs" start="1" stop="3"/>

... but it does not have an entry for 'lvm', which would be required to
make it work in the order you specified.
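
To make the rule concrete, here is a rough Python model of the start
ordering. It is only a sketch, not how rgmanager is actually implemented.
The 'service' start levels in the table are assumptions inferred from the
ordering output in this thread; only the fs -> fs entry is taken from the
metadata line quoted above. The function and table names are made up for
illustration.

```python
# Rough model of the typed/untyped child start ordering (not rgmanager code).

# Start levels per parent type. The "service" values are ASSUMED from the
# ordering output in this thread; the "fs" entry mirrors the quoted
# <child type="fs" start="1" stop="3"/> line.
CHILD_START_LEVELS = {
    "service": {"lvm": 1, "fs": 2, "ip": 3, "script": 4},
    "fs": {"fs": 1},
}


def start_order(node):
    """Return resources in the order this model would start them."""
    rtype, name, children = node
    order = [f"{rtype}:{name}"]
    levels = CHILD_START_LEVELS.get(rtype, {})
    typed = [c for c in children if c[0] in levels]
    untyped = [c for c in children if c[0] not in levels]
    # Typed children go first, by start level. sorted() is stable, so
    # children of the same type keep their cluster.conf order.
    for child in sorted(typed, key=lambda c: levels[c[0]]):
        order.extend(start_order(child))
    # Untyped children follow, in cluster.conf order.
    for child in untyped:
        order.extend(start_order(child))
    return order


# The nested configuration from the original question.
nested = ("service", "TESTSRV", [
    ("ip", "10.4.5.157", []),
    ("lvm", "TEST_APPL", []),
    ("fs", "TEST_APPL", []),
    ("lvm", "TEST_DATA", []),
    ("fs", "TEST_DATA", [
        ("lvm", "TEST_TEMP", []),
        ("fs", "TEST_TEMP", []),
    ]),
    ("script", "clusterssh", []),
])

for line in start_order(nested):
    print("[start]", line)
```

Under fs:TEST_DATA, the 'fs' child is typed and the 'lvm' child is not, so
the model starts fs:TEST_TEMP before lvm:TEST_TEMP, which is the ordering
problem described above.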

The following would work:

  <service domain="MAIN" autostart="1" name="TESTSRV">
    <ip ref="10.4.5.157"/>
    <lvm ref="TEST_APPL"/>
    <fs ref="TEST_APPL"/>
    <lvm ref="TEST_DATA"/>
    <fs ref="TEST_DATA"/>
    <lvm ref="TEST_TEMP"/>
    <fs ref="TEST_TEMP"/>
    <script ref="clusterssh"/>
  </service>


rg_test ordering:

Starting TESTSRV...
[start] service:TESTSRV
[start] lvm:TEST_APPL
[start] lvm:TEST_DATA
[start] lvm:TEST_TEMP
[start] fs:TEST_APPL
[start] fs:TEST_DATA
[start] fs:TEST_TEMP
[start] ip:10.4.5.157
[start] script:clusterssh
Start of TESTSRV complete


So would this (if you needed to be absolutely sure that TEST_TEMP is
mounted as a subdirectory of the TEST_DATA mountpoint):

  <service domain="MAIN" autostart="1" name="TESTSRV">
    <ip ref="10.4.5.157"/>
    <lvm ref="TEST_APPL"/>
    <fs ref="TEST_APPL"/>
    <lvm ref="TEST_DATA"/>
    <lvm ref="TEST_TEMP"/>
    <fs ref="TEST_DATA">
      <fs ref="TEST_TEMP"/>
    </fs>
    <script ref="clusterssh"/>
  </service>


There is apparently a bug in the "rg_test delta" output.  This is because
it simply runs through the resource list, looking for newly added/changed
resources.  Because it doesn't run down the tree, the output of the
"Operations" section is wrong.

However, the tree outputs (old/new) are correct and consistent with
expected operation given the default fs.sh/lvm.sh/service.sh metadata.

-- Lon

[1] http://sources.redhat.com/cluster/wiki/ResourceTrees