[Linux-cluster] Postgresql service on RHCS

Tue Aug 14 20:41:37 UTC 2007

I am attempting to set up an active/passive failover environment for
postgresql-8.2. I created a failover domain with 2 nodes, one of them
preferred, and added a postgresql-8 resource and an IP address resource.
When I set up a service, I run into problems. The idea is to have both the
postgresql-8 resource and the IP address resource within the service, so
that on failover the cluster moves the floating IP to the other node in
the domain and starts postgres there. The IP portion seems to work fine,
but postgresql always fails to start. I get these error messages:

Aug 14 15:40:47 cdb6 clurgmgrd[4106]: <notice> Starting disabled service 10.10.1.221
Aug 14 15:40:47 cdb6 clurgmgrd: [4106]: <info> Adding IPv4 address 10.10.1.236 to bond0
Aug 14 15:40:48 cdb6 clurgmgrd: [4106]: <info> Starting Service postgres-8:cdb6
Aug 14 15:40:48 cdb6 clurgmgrd: [4106]: <err> Starting Service postgres-8:cdb6 > Failed
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <notice> start on postgres-8:cdb6 returned 1 (generic error)
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <warning> #68: Failed to start 10.10.1.221; return value: 1
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <notice> Stopping service 10.10.1.221
Aug 14 15:40:48 cdb6 clurgmgrd: [4106]: <info> Stopping Service postgres-8:cdb6
Aug 14 15:40:48 cdb6 clurgmgrd: [4106]: <err> Checking Existence Of File /var/run/cluster/postgres-8/postgres-8:cdb6.pid [postgres-8:cdb6] > Failed - File Doesn't Exist
Aug 14 15:40:48 cdb6 clurgmgrd: [4106]: <err> Stopping Service postgres-8:cdb6 > Failed
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <notice> stop on postgres-8:cdb6 returned 1 (generic error)
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <crit> #12: RG 10.10.1.221 failed to stop; intervention required
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <notice> Service 10.10.1.221 is failed
Aug 14 15:40:48 cdb6 clurgmgrd[4106]: <crit> #13: Service 10.10.1.221 failed to stop cleanly

Here are the relevant portions of cluster.conf:

    <failoverdomains>
        <failoverdomain name="cdb6_pgsql" ordered="1" restricted="1">
            <failoverdomainnode name="cdb6" priority="1"/>
            <failoverdomainnode name="cdb5" priority="2"/>
        </failoverdomain>
    </failoverdomains>
    <resources>
        <clusterfs device="/dev/mapper/gfsvol1" force_unmount="0" fsid="36155" fstype="gfs" mountpoint="/mnt/gfs" name="gfs1" options=""/>
        <postgres-8 config_file="/var/lib/pgsql/data/postgresql.conf" name="cdb6" postmaster_options="" postmaster_user="postgres" shutdown_wait=""/>
        <ip address="10.10.1.236" monitor_link="1"/>
        <script file="/etc/rc.d/init.d/postgresql" name="postgres"/>
    </resources>
    <service autostart="0" domain="cdb6_pgsql" name="10.10.1.221" recovery="relocate">
        <ip ref="10.10.1.236"/>
        <postgres-8 ref="cdb6"/>
    </service>

I looked into the issue of the init script failing to return zero, and I
don't think that is the cause: my postgresql startup script shows no sign
that a start was even attempted. When setting up a postgresql-8 resource,
I assume that 'config file' means the config file for postgres
(postgresql.conf). Or should it be the startup script
'/etc/rc.d/init.d/postgresql'? Neither works for me.
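
For comparison, the two approaches differ in how the service stanza
references its resources. The fragment below is only a sketch of the
script-based variant: it reuses the script resource named "postgres" and
the IP resource already defined in the resources block above, in place of
the postgresql-8 resource. It is not shown to work on this cluster, and it
assumes the init script returns correct exit codes for start, stop, and
status.

```xml
<!-- Sketch only: same service as above, but referencing the existing
     script resource (name="postgres") instead of the postgres-8 resource.
     Assumes /etc/rc.d/init.d/postgresql returns 0 on success for
     start, stop, and status. -->
<service autostart="0" domain="cdb6_pgsql" name="10.10.1.221" recovery="relocate">
    <ip ref="10.10.1.236"/>
    <script ref="postgres"/>
</service>
```

In this variant the config_file question does not arise, since the init
script locates the postgres configuration on its own.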

Any tips would be greatly appreciated.

Brad Crotchett
brad at bradandkim.net
http://www.bradandkim.net