[Cluster-devel] cluster4 gfs_controld

David Teigland teigland at redhat.com
Thu Oct 13 14:20:59 UTC 2011


Here's the outline of my plan to remove/replace the essential bits of
gfs_controld in cluster4.  I expect it'll go away entirely, but there
could be one or two minor things it would still handle on the side.

kernel dlm/gfs2 will continue to be operable with either
. cluster3 dlm_controld/gfs_controld combination, or
. cluster4 dlm_controld only

Two main things from gfs_controld need replacing:

1. jid allocation, first mounter

cluster3
. both from gfs_controld

cluster4
. jid from dlm-kernel "slots" which will be assigned similarly
. first mounter using a dlm lock in lock_dlm
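
The jid half of this could look roughly like the sketch below. It is not the dlm-kernel code: the table layout, MAX_SLOTS, and the slot_* helper names are all hypothetical, and the "lowest free slot" policy and the 1-based slot to 0-based jid mapping are assumptions made only for illustration. The first-mounter lock is not shown.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8   /* hypothetical limit for this sketch */

/* slot_owner[i] holds the nodeid using slot i+1, or 0 if the slot is free. */
struct slot_table {
    int slot_owner[MAX_SLOTS];
};

/* Give nodeid the lowest free slot. Returns the 1-based slot, or -1 if full. */
static int slot_assign(struct slot_table *t, int nodeid)
{
    assert(nodeid > 0);

    /* A node that already holds a slot keeps it. */
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (t->slot_owner[i] == nodeid)
            return i + 1;
    }
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (t->slot_owner[i] == 0) {
            t->slot_owner[i] = nodeid;
            return i + 1;
        }
    }
    return -1;
}

/* Free the slot held by nodeid, if any, so a later mounter can reuse it. */
static void slot_release(struct slot_table *t, int nodeid)
{
    for (int i = 0; i < MAX_SLOTS; i++) {
        if (t->slot_owner[i] == nodeid)
            t->slot_owner[i] = 0;
    }
}

/* Assumed mapping for this sketch: slots are 1-based, jids are 0-based. */
static int slot_to_jid(int slot)
{
    return slot - 1;
}
```

With a zeroed table, nodes 10 and 20 would get slots 1 and 2; if node 20 leaves, the next mounter picks up slot 2 and therefore the same jid.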

2. recovery coordination, failure notification

cluster3
. coordination of dlm-kernel/gfs-kernel recovery is done
  indirectly in userspace between dlm_controld/gfs_controld,
  which then toggle sysfs files.
. write("sysfs block", 1) -> block_store(1)
  write("sysfs recover", jid) -> recover_store(jid)
  write("sysfs block", 0) -> block_store(0)

cluster4
. coordination of dlm-kernel/gfs-kernel recovery is done
  directly in kernel using callbacks from dlm-kernel to gfs-kernel.
. gdlm_mount(struct gfs2_sbd *sdp, const char *table, int *first, int *jid)
  calls dlm_recover_register(dlm, &jid, &recover_callbacks)
. gdlm_recover_prep() -> block_store(1)
  gdlm_recover_slot(jid) -> recover_store(jid)
  gdlm_recover_done() -> block_store(0)
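
A userspace sketch of the callback arrangement above, to show the shape of it. Only the gdlm_recover_* names come from this outline; struct recover_callbacks, its field names, fake_sbd, fake_dlm and fake_recover_register() are hypothetical stand-ins, not the kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback table; the field names are illustrative only. */
struct recover_callbacks {
    void (*recover_prep)(void *arg);
    void (*recover_slot)(void *arg, int jid);
    void (*recover_done)(void *arg);
};

/* Stand-in for the gfs2 state that block_store()/recover_store() change. */
struct fake_sbd {
    int blocked;             /* 1 while gfs is stopped for recovery */
    int last_recovered_jid;  /* most recent jid handed to recovery */
    int recoveries;          /* number of journal recoveries requested */
};

/* Same effect as block_store(1): stop gfs before recovery. */
static void gdlm_recover_prep(void *arg)
{
    struct fake_sbd *sdp = arg;
    sdp->blocked = 1;
}

/* Same effect as recover_store(jid): request recovery of one journal. */
static void gdlm_recover_slot(void *arg, int jid)
{
    struct fake_sbd *sdp = arg;
    assert(sdp->blocked);    /* recovery is only expected while blocked */
    sdp->last_recovered_jid = jid;
    sdp->recoveries++;
}

/* Same effect as block_store(0): let gfs run again. */
static void gdlm_recover_done(void *arg)
{
    struct fake_sbd *sdp = arg;
    sdp->blocked = 0;
}

static const struct recover_callbacks gdlm_callbacks = {
    .recover_prep = gdlm_recover_prep,
    .recover_slot = gdlm_recover_slot,
    .recover_done = gdlm_recover_done,
};

/* Stand-in for the dlm side, which remembers who to call back. */
struct fake_dlm {
    const struct recover_callbacks *cb;
    void *cb_arg;
};

/* Hypothetical registration: rejects an incomplete table, else stores it. */
static int fake_recover_register(struct fake_dlm *dlm,
                                 const struct recover_callbacks *cb,
                                 void *arg)
{
    if (!cb || !cb->recover_prep || !cb->recover_slot || !cb->recover_done)
        return -1;
    dlm->cb = cb;
    dlm->cb_arg = arg;
    return 0;
}
```

After registering gdlm_callbacks with a fake_sbd as the argument, the dlm side would call prep, then slot for each failed jid, then done, which matches the three sysfs writes gfs_controld does in cluster3.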

cluster3 dlm/gfs recovery
. dlm_controld sees nodedown                      (libcpg)
. gfs_controld sees nodedown                      (libcpg)
. dlm_controld stops dlm-kernel                   (sysfs control 0)
. gfs_controld stops gfs-kernel                   (sysfs block 1)
. dlm_controld waits for gfs_controld kernel stop (libdlmcontrol)
. gfs_controld waits for dlm_controld kernel stop (libdlmcontrol)
. dlm_controld syncs state among all nodes        (libcpg)
. gfs_controld syncs state among all nodes        (libcpg)
. dlm_controld starts dlm-kernel recovery         (sysfs control 1)
. gfs_controld starts gfs-kernel recovery         (sysfs recover jid)
. gfs_controld starts gfs-kernel                  (sysfs block 0)

cluster4 dlm/gfs recovery
. dlm_controld sees nodedown                      (libcpg)
. dlm_controld stops dlm-kernel                   (sysfs control 0)
. dlm-kernel stops gfs-kernel                     (callback block 1)
. dlm_controld syncs state among all nodes        (libcpg)
. dlm_controld starts dlm-kernel recovery         (sysfs control 1)
. dlm-kernel starts gfs-kernel recovery           (callback recover jid)
. dlm-kernel starts gfs-kernel                    (callback block 0)
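
The cluster4 ordering above can be modelled as a small simulation, where the two dlm_controld writes to the dlm "control" file are the only inputs and the gfs steps follow as callbacks. Everything here (struct sim, the trace strings, control_write(), node_down()) is made up for the sketch and only mirrors the order of the steps listed, not real kernel or daemon code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TRACE_MAX 256

/* Hypothetical model state: one dlm lockspace and one gfs mount. */
struct sim {
    int dlm_running;        /* 0 after "control 0", 1 after "control 1" */
    int gfs_blocked;        /* 1 between callback block 1 and block 0 */
    char trace[TRACE_MAX];  /* comma-separated record of steps taken */
};

/* Append one step name to the trace. */
static void trace_add(struct sim *s, const char *step)
{
    size_t used = strlen(s->trace);
    snprintf(s->trace + used, TRACE_MAX - used, "%s%s",
             used ? "," : "", step);
}

/* dlm-kernel -> gfs-kernel callback: block or unblock gfs. */
static void cb_block(struct sim *s, int val)
{
    s->gfs_blocked = val;
    trace_add(s, val ? "block1" : "block0");
}

/* dlm-kernel -> gfs-kernel callback: recover one journal. */
static void cb_recover(struct sim *s, int jid)
{
    char buf[32];

    assert(s->gfs_blocked);  /* gfs must be stopped during recovery */
    snprintf(buf, sizeof(buf), "recover%d", jid);
    trace_add(s, buf);
}

/* Model of dlm_controld writing 0 or 1 to the dlm "control" file. */
static void control_write(struct sim *s, int val,
                          const int *failed_jids, int n)
{
    if (val == 0) {
        s->dlm_running = 0;
        trace_add(s, "control0");
        cb_block(s, 1);                  /* dlm-kernel stops gfs-kernel */
    } else {
        trace_add(s, "control1");        /* dlm-kernel recovery starts */
        for (int i = 0; i < n; i++)
            cb_recover(s, failed_jids[i]);
        cb_block(s, 0);                  /* dlm-kernel starts gfs-kernel */
        s->dlm_running = 1;
    }
}

/* Model of dlm_controld handling a nodedown event from libcpg. */
static void node_down(struct sim *s, const int *failed_jids, int n)
{
    trace_add(s, "nodedown");
    control_write(s, 0, NULL, 0);
    trace_add(s, "sync");                /* state sync among all nodes */
    control_write(s, 1, failed_jids, n);
}
```

The point of the model is that there is a single daemon driving things, so the two "waits for ... kernel stop" steps from the cluster3 list have nothing to wait on and drop out.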



