[Linux-cluster] clurgmgrd[XXXX]: <err> Error storing ip: Duplicate

Budai Laszlo laszlo.budai at gmail.com
Wed Jul 13 16:07:22 UTC 2011


Hello everyone,


I was asked to investigate why rgmanager is not running on a Red Hat
cluster. The cluster runs RHEL 5.3.
# rpm -q cman rgmanager
cman-2.0.98-1.el5
rgmanager-2.0.46-1.el5

Currently cman is running, but rgmanager is not.

# clustat
Cluster Status for prod-clust1 @ Wed Jul 13 15:42:16 2011
Member Status: Quorate

 Member Name                                     ID   Status
 ------ ----                                     ---- ------
 pnl-p                                               1 Online
 psd-p                                               2 Online, Local


# cman_tool status
Version: 6.1.0
Config Version: 14
Cluster Name: prod-clust1
Cluster Id: 3382
Cluster Member: Yes
Cluster Generation: 1136
Membership state: Cluster-Member
Nodes: 2
Expected votes: 1
Total votes: 2
Quorum: 1
Active subsystems: 7
Flags: 2node Dirty
Ports Bound: 0
Node name: pnl-p
Node ID: 1
Multicast addresses: 224.0.0.1
Node addresses: 10.0.0.2


# cman_tool services
type             level name     id       state
fence            0     default  00010002 none


# service rgmanager status
clurgmgrd dead but pid file exists

This is the situation on both nodes.
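
The "dead but pid file exists" status means a pid file was left behind with no live process behind it. A minimal sketch of checking that by hand, assuming Linux (/proc) and assuming the pid file lives at /var/run/clurgmgrd.pid; pidfile_state is only an illustrative helper name:

```shell
# Report whether the PID recorded in a pid file still belongs to a live process.
# Linux-specific: relies on /proc/<pid> existing for every running process.
pidfile_state() {
    pidfile=$1
    [ -r "$pidfile" ] || { echo "no pid file"; return 0; }
    pid=$(cat "$pidfile")
    if [ -n "$pid" ] && [ -d "/proc/$pid" ]; then
        echo "running (pid $pid)"
    else
        echo "stale pid file (pid $pid not running)"
    fi
}

# Assumed location of the rgmanager pid file; adjust if it differs.
pidfile_state /var/run/clurgmgrd.pid
```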

On one of the nodes I cannot see any messages from rgmanager, and it was
confirmed that the error is older than the oldest log file.
On the other node I can see the messages from when rgmanager was started
(after a reboot), and here they are:

messages.3:Jun 26 02:00:13 node-pnl-01 clurgmgrd[8720]: <notice> Resource Group Manager Starting
messages.3:Jun 26 02:00:13 node-pnl-01 clurgmgrd[8720]: <err> Error storing ip: Duplicate



My question is what the second line means and what its consequences are.
Is it possible that a duplicate IP would shut down rgmanager?
I ask because about 22 seconds later (02:00:13 to 02:00:35) I see the
following:

messages.3:Jun 26 02:00:35 node-pnl-01 clurgmgrd[8720]: <notice> Shutting down

and later on:

messages.3:Jun 26 02:00:59 node-pnl-01 clurgmgrd[8720]: <notice> Shutdown complete, exiting
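
For reference, the gaps between these entries can be computed from the timestamps. A minimal sketch, assuming GNU date (for the -d option) and that all entries fall in the same year, since syslog lines carry no year; elapsed_seconds is only an illustrative helper name:

```shell
# Seconds between two syslog-style timestamps ("Mon DD HH:MM:SS").
# Both are interpreted in the current year and in UTC to avoid DST effects.
elapsed_seconds() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    echo $((end - start))
}

# Start -> "Shutting down"
elapsed_seconds "Jun 26 02:00:13" "Jun 26 02:00:35"
# "Shutting down" -> "Shutdown complete, exiting"
elapsed_seconds "Jun 26 02:00:35" "Jun 26 02:00:59"
```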

Starting rgmanager to test is not an option right now, so I have to
work this out from the log files alone.
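
The entries quoted above come from searching the rotated syslog files. A rough sketch of that search; rgmanager_log_entries is only an illustrative helper name, and the /var/log/messages* paths in the usage comment are an assumption about where syslog writes on these nodes:

```shell
# Print every clurgmgrd entry from the given syslog files, each line
# prefixed with the file it came from, in the order the files are listed.
rgmanager_log_entries() {
    grep -H 'clurgmgrd\[' "$@"
}

# Example usage (paths are assumed):
#   rgmanager_log_entries /var/log/messages.3 /var/log/messages
```

Listing the oldest file first keeps the output in rough chronological order.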

Thank you in advance for any ideas.

Laszlo



