[Linux-cluster] Fedora 19 cluster stack and Cluster registry components

Michael Richmond Michael.Richmond at sandisk.com
Tue Apr 23 18:07:13 UTC 2013


Andrew and Digimer,
Thank you for taking the time to respond. You have corroborated some of
what I've been putting together as the likely direction.

I am working on adapting some cluster-aware storage features for use in a
Linux cluster environment. With this kind of project it is useful to try
and predict where the Linux community is heading so that I can focus my
development work on what will be the "current" cluster stack around my
anticipated release dates. Any predictions are simply educated guesses
that may prove to be wrong, but are useful with regard to developing
plans. From my reading of various web pages and piecing things together I
found that RHEL 7 is intended to be based on Fedora 18, so I assume that
the new Pacemaker stack has a good chance of being rolled out in RHEL
7.1/7.2, or even possibly 7.0.

Hearing that there is official word that the intention is for Pacemaker to
be the official cluster stack helps me put my development plans together.


The project I am working on is focused on two-node clusters. But I also
need a persistent, cluster-wide data store to hold a small amount of state
(less than 1KB). This data store is what I refer to as a cluster-registry.
The state data records the last-known operational state for the storage
feature. This last-known state helps drive recovery operations for the
storage feature during node bring-up. This project is specifically aimed
at integrating generic functionality into the Linux cluster stack.

I have been thinking about using the cluster configuration file for this
storage, which I assume is the CIB described by Digimer. But I can imagine
cases where the CIB file may lose updates if it does not utilize shared
storage media. My understanding is that the CIB file is stored on each
node using local disk storage.

For example, consider a two-node cluster that is configured with a quorum
disk on shared storage media. Suppose that at a given point in time NodeA
is up and NodeB is down. NodeA can become quorate and start cluster
services (including HA applications). Assume that NodeA updates the CIB to
record some state update. Then NodeB starts booting, but before NodeB
joins the cluster, NodeA crashes. At this point, the updated CIB only
resides on NodeA and cannot be accessed by NodeB, even if NodeB can access
the quorum disk and become quorate. Effectively, NodeB cannot be aware of
the update from NodeA, which will result in an implicit roll-back of any
updates performed by NodeA.
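
To make the sequence concrete, here is a minimal Python sketch of the
scenario. It is a toy model only, not Pacemaker code: the Node class, the
update() helper, and the "storage_state" key are hypothetical names, and
the model simply assumes that each node keeps its own local copy and that
updates reach only the peers that are up at the time.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy node holding its own local copy of the registry (hypothetical)."""
    name: str
    version: int = 0
    data: dict = field(default_factory=dict)
    up: bool = False


def update(node, key, value, peers):
    """Apply an update on `node` and copy it only to peers that are up."""
    node.version += 1
    node.data[key] = value
    for peer in peers:
        if peer.up:
            peer.version = node.version
            peer.data = dict(node.data)


node_a = Node("NodeA", up=True)
node_b = Node("NodeB", up=False)

# NodeA is quorate (via the quorum disk) and records a state change.
update(node_a, "storage_state", "degraded", peers=[node_b])

# NodeA crashes before NodeB joins; NodeB then comes up on its own.
node_a.up = False
node_b.up = True

print(node_b.version, node_b.data)  # 0 {}
print(node_a.version, node_a.data)  # 1 {'storage_state': 'degraded'}
```

In this model NodeB comes up with version 0 and no data, so the update
recorded by NodeA is invisible to it until NodeA returns.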

With a two-node cluster, there are two options for resolving this:
* prevent any update to the cluster registry/CIB unless all nodes are part
of the cluster. (This is not practical since it undermines some of the
reasons for building clusters.)
* store the cluster registry on shared storage so that there is one source
of truth.

It is possible that the nature of the data stored in the CIB makes it
resilient to the example scenario that I describe. If so, the CIB may not
be an appropriate data store for my cluster registry data. In that case I
am either looking for an appropriate Linux component to use for my cluster
registry, or I will build a custom data store that provides atomic update
semantics on shared storage.
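
As a rough illustration of what "atomic update semantics" could look like
for such a small record, here is a minimal Python sketch. It assumes the
shared storage is mounted as a POSIX filesystem on which rename is atomic,
and that only one node writes at a time. The names write_registry() and
read_registry(), the JSON format, and the file name are hypothetical. On a
cluster filesystem or a raw shared disk, additional locking and fencing
would be needed and are not shown.

```python
import json
import os
import tempfile


def write_registry(path, state):
    """Atomically replace the registry file at `path` with `state` (a dict)."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".registry-")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the new contents to stable storage
        os.replace(tmp_path, path)  # readers see the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself survives a crash
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)


def read_registry(path):
    """Return the last committed state, or an empty dict if none exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


# Usage: a temporary directory stands in for the shared mount point.
with tempfile.TemporaryDirectory() as shared:
    registry = os.path.join(shared, "registry.json")
    write_registry(registry, {"storage_state": "degraded", "generation": 1})
    print(read_registry(registry))
```

The write goes to a temporary file in the same directory and is renamed
over the old file, so a crash in the middle of an update leaves the
previous state intact rather than a partially written record.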

Any thoughts and/or pointers would be appreciated.

Thanks,
Michael Richmond

--
michael richmond | principal software engineer | flashsoft, sandisk |
+1.408.425.6731




On 22/4/13 4:37 PM, "Andrew Beekhof" <andrew at beekhof.net> wrote:

>
>On 23/04/2013, at 4:59 AM, Digimer <lists at alteeve.ca> wrote:
>
>> On 04/22/2013 02:36 PM, Michael Richmond wrote:
>>> Hello,
>>> I am researching the new cluster stack that is scheduled to be
>>> delivered in Fedora 19. Does anyone on this list have a sense for the
>>> timeframe for this new stack to be rolled into a RHEL release? (I
>>> assume the earliest would be RHEL 7.)
>>>
>>> On the Windows platform, Microsoft Cluster Services provides a
>>> cluster-wide registry service that is basically a cluster-wide
>>> key:value store with atomic updates and support to store the registry
>>> on shared disk. The storage on shared disk allows access and use of
>>> the registry in cases where nodes are frequently joining and leaving
>>> the cluster.
>>>
>>> Are there any component(s) that can be used to provide a similar
>>> registry in the Linux cluster stack? (The current RHEL 6 stack, and/or
>>> the new Fedora 19 stack.)
>>>
>>> Thanks in advance for your information,
>>> Michael Richmond
>>
>> Hi Michael,
>>
>>  First up, Red Hat's policy of what is coming is "we'll announce on
>> release day". So anything else is a guess. As it is, Pacemaker is in
>> tech-preview in RHEL 6, and the best guess is that it will be the
>> official resource manager in RHEL 7, but it's just that, a guess.
>
>I believe we're officially allowed to say that it is our _intention_ that
>Pacemaker will be the one and only supported stack in RHEL7.
>
>>
>>  As for the registry question; I am not entirely sure what it is you
>> are asking here (sorry, not familiar with windows). I can say that
>> pacemaker uses something called the CIB (cluster information base) which
>> is an XML file containing the cluster's configuration and state. It can
>> be updated from any node and the changes will push to the other nodes
>> immediately.
>
>How many of these attributes are you planning to have?
>You can throw a few in there, but I'd not use it for 100's or 1000's of
>them - it's mainly designed to store the resource/service configuration.
>
>
>> Does this answer your question?
>>
>>  The current RHEL 6 cluster is corosync + cman + rgmanager. It also
>> uses an XML config and it can be updated from any node and push out to
>> the other nodes.
>>
>>  Perhaps a better way to help would be to ask what, exactly, you want
>> to build your cluster for?
>>
>> Cheers
>>
>> --
>> Digimer
>> Papers and Projects: https://alteeve.ca/w/
>> What if the cure for cancer is trapped in the mind of a person without
>>access to education?
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>






