[Linux-cluster] gfs2, kvm setup
J. Bruce Fields
bfields at fieldses.org
Thu Jun 26 18:32:17 UTC 2008
On Thu, Jun 26, 2008 at 02:56:10PM +0100, Steven Whitehouse wrote:
> Hi,
>
> On Wed, 2008-06-25 at 18:45 -0400, J. Bruce Fields wrote:
> > I'm trying to get a gfs2 file system running on some kvm hosts, using an
> > ordinary qemu disk for the shared storage (is there any reason this
> > can't work?).
> >
> > I installed openais80.3 from source (after modifying Makefile so "make
> > install" would install to /), and installed gfs2 from the STABLE2 branch
> > of git://sources.redhat.com/git/cluster.git, plus this patch:
> >
> > https://www.redhat.com/archives/cluster-devel/2008-April/msg00143.html
> >
> > (with conflict in write_result() resolved in the obvious way). The
> > kernel is from recent git: 2.6.26-rc4-00103-g1beee8d. I created a
> > minimal cluster.conf and did a mkfs -tgfs2 following doc/usage.txt, then
> > did the startup steps from usage.txt by hand. Everything works up to
> > the mount, at which point the first host gets the following lock bug in
> > the logs. Other mounts fail or hang.
> >
> > Any hints?
> >
> So the first mount is ok, but further mounts fail? or is it all mounts
> that fail/hang?
The first mount appears to succeed, though any subsequent access to the
mounted filesystem hangs (I assume that's by design). Mounts from the
other nodes hang or fail.
--b.
(PS: Could you leave me cc'd?)
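
For anyone reading the quoted report below: "bad unlock balance" is what lockdep prints when a task releases a lock that lockdep has no record of that task holding. The toy Python model below only illustrates that bookkeeping; it is not kernel code. The class and function names are made up, and the idea that some other task took ls_in_recovery before dlm_recoverd released it is an assumption, not something the log confirms.

```python
class BalanceTracker:
    """Toy per-task held-lock bookkeeping (hypothetical, not lockdep itself)."""

    def __init__(self):
        # task name -> list of lock names that task currently holds
        self.held = {}

    def acquire(self, task, lock):
        self.held.setdefault(task, []).append(lock)

    def release(self, task, lock):
        locks = self.held.get(task, [])
        if lock not in locks:
            # The releasing task never acquired this lock itself.
            raise RuntimeError(f"bad unlock balance: {task} releasing {lock}")
        locks.remove(lock)


def demo():
    tracker = BalanceTracker()
    # Assumption: ls_in_recovery was taken by a different task.
    tracker.acquire("other_task", "ls_in_recovery")
    tracker.acquire("dlm_recoverd", "ls_recoverd_active")
    try:
        tracker.release("dlm_recoverd", "ls_in_recovery")
    except RuntimeError as err:
        return str(err)
    return None


if __name__ == "__main__":
    print(demo())
```

If that assumption holds, the report would come from how the lock is handed between tasks rather than from a real double unlock, but that would need checking against the dlm source.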
>
> Steve.
>
> > --b.
> >
> > Jun 25 18:30:11 piglet1 ccsd[3022]: Starting ccsd 1214172260:
> > Jun 25 18:30:11 piglet1 ccsd[3022]: Built: Jun 22 2008 18:04:35
> > Jun 25 18:30:11 piglet1 ccsd[3022]: Copyright (C) Red Hat, Inc. 2004-2008 All rights reserved.
> > Jun 25 18:30:11 piglet1 ccsd[3022]: /etc/cluster/cluster.conf (cluster name = piglet, version = 1) found.
> > Jun 25 18:30:15 piglet1 ccsd[3022]: Initial status:: Quorate
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=: Trying to join cluster "lock_dlm", "piglet:test"
> > Jun 25 18:31:01 piglet1 kernel: dlm: Using TCP for communications
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: Joined cluster. Now mounting FS...
> > Jun 25 18:31:01 piglet1 kernel:
> > Jun 25 18:31:01 piglet1 kernel: =====================================
> > Jun 25 18:31:01 piglet1 kernel: [ BUG: bad unlock balance detected! ]
> > Jun 25 18:31:01 piglet1 kernel: -------------------------------------
> > Jun 25 18:31:01 piglet1 kernel: dlm_recoverd/3061 is trying to release lock (&ls->ls_in_recovery) at:
> > Jun 25 18:31:01 piglet1 kernel: [<c01c3930>] dlm_recoverd+0x440/0x510
> > Jun 25 18:31:01 piglet1 kernel: but there are no more locks to release!
> > Jun 25 18:31:01 piglet1 kernel:
> > Jun 25 18:31:01 piglet1 kernel: other info that might help us debug this:
> > Jun 25 18:31:01 piglet1 kernel: 3 locks held by dlm_recoverd/3061:
> > Jun 25 18:31:01 piglet1 kernel: #0: (&ls->ls_recoverd_active){--..}, at: [<c01c35c5>] dlm_recoverd+0xd5/0x510
> > Jun 25 18:31:01 piglet1 kernel: #1: (&ls->ls_recv_active){--..}, at: [<c01c38c5>] dlm_recoverd+0x3d5/0x510
> > Jun 25 18:31:01 piglet1 kernel: #2: (&ls->ls_recover_lock){--..}, at: [<c01c38cd>] dlm_recoverd+0x3dd/0x510
> > Jun 25 18:31:01 piglet1 kernel:
> > Jun 25 18:31:01 piglet1 kernel: stack backtrace:
> > Jun 25 18:31:01 piglet1 kernel: Pid: 3061, comm: dlm_recoverd Not tainted 2.6.26-rc4-00103-g1beee8d #38
> > Jun 25 18:31:01 piglet1 kernel: [<c0137bb9>] print_unlock_inbalance_bug+0xc9/0xf0
> > Jun 25 18:31:01 piglet1 kernel: [<c0137449>] ? save_trace+0x39/0xa0
> > Jun 25 18:31:01 piglet1 kernel: [<c01374ea>] ? add_lock_to_list+0x3a/0xa0
> > Jun 25 18:31:01 piglet1 kernel: [<c0139cac>] ? __lock_acquire+0xb9c/0xfc0
> > Jun 25 18:31:01 piglet1 kernel: [<c0139f14>] ? __lock_acquire+0xe04/0xfc0
> > Jun 25 18:31:01 piglet1 kernel: [<c013a23f>] lock_release_non_nested+0xff/0x170
> > Jun 25 18:31:01 piglet1 kernel: [<c01c3930>] ? dlm_recoverd+0x440/0x510
> > Jun 25 18:31:01 piglet1 kernel: [<c01c3930>] ? dlm_recoverd+0x440/0x510
> > Jun 25 18:31:01 piglet1 kernel: [<c013a33d>] lock_release+0x8d/0x150
> > Jun 25 18:31:01 piglet1 kernel: [<c0131c96>] up_write+0x16/0x30
> > Jun 25 18:31:01 piglet1 kernel: [<c01c3930>] dlm_recoverd+0x440/0x510
> > Jun 25 18:31:01 piglet1 kernel: [<c01c34f0>] ? dlm_recoverd+0x0/0x510
> > Jun 25 18:31:01 piglet1 kernel: [<c012e546>] kthread+0x36/0x60
> > Jun 25 18:31:01 piglet1 kernel: [<c012e510>] ? kthread+0x0/0x60
> > Jun 25 18:31:01 piglet1 kernel: [<c0103587>] kernel_thread_helper+0x7/0x10
> > Jun 25 18:31:01 piglet1 kernel: =======================
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=0, already locked for use
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=0: Looking at journal...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=0: Done
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=1: Trying to acquire journal lock...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=1: Looking at journal...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=1: Done
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=2: Trying to acquire journal lock...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=2: Looking at journal...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=2: Done
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=3: Trying to acquire journal lock...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=3: Looking at journal...
> > Jun 25 18:31:01 piglet1 kernel: GFS2: fsid=piglet:test.0: jid=3: Done
> > Jun 25 18:33:46 piglet1 ntpd[2951]: synchronized to 76.189.12.0, stratum 1
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>