[Linux-cluster] Two-node DRBD Fail-Over Active/Passive Cluster.

vincent.blondel at ing.be vincent.blondel at ing.be
Mon Feb 14 21:02:40 UTC 2011


Hello all,

Last week I installed two servers, each running Red Hat Enterprise Linux 6.0, to host Blue Coat Reporter in the near future. The installation is fine, but now I am trying to configure both servers as a cluster.

First of all, I have never configured a cluster on Linux ...

Both servers are HP DL380R06 machines with a directly attached disk cabinet (exactly the same hardware specs on both).

What I would like is simply an Active/Passive cluster with bidirectional disk synchronization. That is, both servers are running, but only the first one runs Reporter; meanwhile, the disk spaces are continuously synchronized. When the first server goes down, the second becomes active; when the first is running again, it resynchronizes the disks and becomes primary again.

server 1 is reporter1.lab.intranet with ip 10.30.30.90
server 2 is reporter2.lab.intranet with ip 10.30.30.91

the floating service IP should be 10.30.30.92 ..

After some days of research on the net, I came to the conclusion that I would be happy with a solution combining DRBD/GFS2 with the Red Hat Cluster Suite.
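For the disk-synchronization part, a minimal DRBD resource definition might look like the sketch below. The resource name r0, the backing device /dev/sdb1, and TCP port 7788 are assumptions to adapt to the real disk cabinet layout:

```
resource r0 {
  protocol C;                       # synchronous replication
  on reporter1.lab.intranet {
    device    /dev/drbd0;
    disk      /dev/sdb1;            # assumed backing device
    address   10.30.30.90:7788;
    meta-disk internal;
  }
  on reporter2.lab.intranet {
    device    /dev/drbd0;
    disk      /dev/sdb1;            # assumed backing device
    address   10.30.30.91:7788;
    meta-disk internal;
  }
}
```

Note that GFS2 on top of DRBD requires dual-primary mode; for a plain Active/Passive setup, a single-primary resource with an ordinary filesystem is usually simpler.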

I am first trying to get the complete picture running on two VMware Fusion virtual machines (Red Hat Enterprise Linux 6) on my Mac OS X machine before configuring the real servers.

So, after some hours of research on the net, I found some articles and links that seem to describe what I want to get ...

http://gcharriere.com/blog/?p=73
http://www.linuxtopia.org/online_books/rhel6/rhel_6_cluster_admin/rhel_6_cluster_ch-config-cli-CA.html
http://www.drbd.org/users-guide/users-guide.html

and the DRBD packages for RHEL 6, which I did not find anywhere else ..

http://elrepo.org/linux/elrepo/el6/i386/RPMS/

Up to now I have only configured the first part, meaning the cluster services, but the first issue has already occurred ..

Below is the cluster.conf file ...


<?xml version="1.0"?>
<cluster name="cluster" config_version="6">
  <!-- post_join_delay: number of seconds the daemon will wait before
                        fencing any victims after a node joins the domain
       post_fail_delay: number of seconds the daemon will wait before
            	        fencing any victims after a domain member fails
       clean_start    : prevent any startup fencing the daemon might do.
		        It indicates that the daemon should assume all nodes
		        are in a clean state to start. -->
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <clusternodes>
    <clusternode name="reporter1.lab.intranet" votes="1" nodeid="1">
      <fence>
        <!-- Handle fencing manually -->
        <method name="human">
          <device name="human" nodename="reporter1.lab.intranet"/>
        </method>
      </fence>
    </clusternode>
    <clusternode name="reporter2.lab.intranet" votes="1" nodeid="2"> 
      <fence>
        <!-- Handle fencing manually -->
        <method name="human">
          <device name="human" nodename="reporter2.lab.intranet"/>
        </method>
      </fence>
    </clusternode>
  </clusternodes>
  <!-- cman two nodes specification -->
  <cman expected_votes="1" two_node="1"/>
  <fencedevices>
    <!-- Define manual fencing -->
    <fencedevice name="human" agent="fence_manual"/>
  </fencedevices>
  <rm>
     <failoverdomains>
        <failoverdomain name="example_pri" nofailback="0" ordered="1" restricted="0">
           <failoverdomainnode name="reporter1.lab.intranet" priority="1"/>
           <failoverdomainnode name="reporter2.lab.intranet" priority="2"/>
        </failoverdomain>
     </failoverdomains>
     <resources>
        <ip address="10.30.30.92" monitor_link="on" sleeptime="10"/>
        <apache config_file="conf/httpd.conf" name="example_server" server_root="/etc/httpd" shutdown_wait="0"/>
     </resources>
     <service autostart="1" domain="example_pri" exclusive="0" name="example_apache" recovery="relocate">
        <ip ref="10.30.30.92"/>
        <apache ref="example_server"/>
     </service>
  </rm>
</cluster>
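As a first sanity check, on RHEL 6 the cluster tools provide ccs_config_validate for validating cluster.conf against the schema on the nodes themselves. Independently of that, the resource references can be checked standalone; the sketch below (not the official validator, just an illustration with a trimmed copy of the conf inlined) verifies that the file is well-formed XML and that every <ip ref> and <apache ref> in the service resolves to a defined resource:

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf above, reduced to the <rm> section.
conf = """<?xml version="1.0"?>
<cluster name="cluster" config_version="6">
  <rm>
    <resources>
      <ip address="10.30.30.92" monitor_link="on" sleeptime="10"/>
      <apache config_file="conf/httpd.conf" name="example_server"
              server_root="/etc/httpd" shutdown_wait="0"/>
    </resources>
    <service autostart="1" domain="example_pri" exclusive="0"
             name="example_apache" recovery="relocate">
      <ip ref="10.30.30.92"/>
      <apache ref="example_server"/>
    </service>
  </rm>
</cluster>"""

root = ET.fromstring(conf)  # raises ParseError if not well-formed
rm = root.find("rm")

# rgmanager resolves <ip ref="..."/> against the ip's "address"
# attribute and <apache ref="..."/> against the apache "name" attribute.
defined = {ip.get("address") for ip in rm.findall("resources/ip")}
defined |= {ap.get("name") for ap in rm.findall("resources/apache")}

for svc in rm.findall("service"):
    for child in svc:
        ref = child.get("ref")
        assert ref in defined, "unresolved ref: %s" % ref

print("all resource refs resolve")
```

If validation passes, a service stuck in the "stopped" state may simply need to be enabled once with clusvcadm -e example_apache.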

and this is the result I get on both servers ...

[root at reporter1 ~]# clustat
Cluster Status for cluster @ Mon Feb 14 22:22:53 2011
Member Status: Quorate

 Member Name                                      ID   Status
 ------ ----                                      ---- ------
 reporter1.lab.intranet                               1 Online, Local, rgmanager
 reporter2.lab.intranet                               2 Online, rgmanager

 Service Name                            Owner (Last)                            State         
 ------- ----                            ----- ------                            -----         
 service:example_apache                  (none)                                  stopped       

As you can see, everything is stopped; in other words, nothing runs. So my questions are:

did I forget something in my conf file?
did I make something wrong in my conf file?
do I have to manually configure the floating IP 10.30.30.92 as an alias IP on both sides, or does the Red Hat cluster handle it automatically?
I made a simple try with Apache, but nowhere in the examples do I find a reference to a start/stop script for Apache; is that normal?
do you have any best practices for this setup?

Many thanks for your help, because I certainly misunderstand some points.

Regards
Vincent




