[Linux-cluster] stuck processes on GFS partition?
Matt Brookover
mbrookov at mines.edu
Mon Dec 12 18:02:23 UTC 2005
According to the logs, this problem started at 16:04 yesterday, and the
same set of messages has been logged every 10 minutes since then.
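
For anyone who wants to check the interval on their own node, here is a minimal sketch in Python. It assumes the syslog layout of the messages quoted below, with each message on a single line as it would appear in /var/log/messages rather than wrapped as in this mail. The function name, the sample lines and the default year are illustrative choices of mine, not part of GFS or any existing tool.

```python
import re
from datetime import datetime

# First line of each GFS report, e.g.
# "Dec 12 10:04:02 imagine kernel: GFS: fsid=...: stuck in gfs_releasepage()..."
STUCK = re.compile(
    r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ kernel: GFS: fsid=\S+ "
    r"stuck in gfs_releasepage\(\)"
)

def stuck_intervals(lines, year=2005):
    """Return the minutes between successive 'stuck in gfs_releasepage()' reports.

    syslog timestamps carry no year, so one is supplied (assumed: 2005).
    """
    stamps = []
    for line in lines:
        m = STUCK.match(line)
        if m:
            stamps.append(
                datetime.strptime(f"{year} {m.group(1)}", "%Y %b %d %H:%M:%S")
            )
    return [(b - a).total_seconds() / 60 for a, b in zip(stamps, stamps[1:])]

# Sample lines taken from the quoted log, unwrapped to one message per line.
sample = [
    "Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: stuck in gfs_releasepage()...",
    "Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: blkno = 12446334, bh->b_count = 9",
    "Dec 12 10:09:17 imagine su(pam_unix)[5104]: session closed for user root",
    "Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: stuck in gfs_releasepage()...",
]
print(stuck_intervals(sample))  # [10.0]
```

To run it against a real log, something like `stuck_intervals(open("/var/log/messages"))` should work, given read access to that file.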
Any ideas?
Matt
On Mon, 2005-12-12 at 10:35, Matt Brookover wrote:
> We are getting processes stuck in device waits on one file system.
> These errors are logged in /var/log/messages:
>
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: stuck in
> gfs_releasepage()...
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: blkno =
> 12446334, bh->b_count = 9
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> bh->b_journal_head = !NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: gl = (4,
> 12477424)
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> bd_new_le.le_trans = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> bd_incore_le.le_trans = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: bd_frozen
> = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: bd_pinned
> = 0
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: bd_ail_tr
> = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: ip =
> 12477424/12477424
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_count = 1, ip->i_vnode = !NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[0] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[1] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[2] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[3] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[4] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[5] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[6] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[7] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[8] = NULL
> Dec 12 10:04:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[9] = NULL
> Dec 12 10:09:17 imagine su(pam_unix)[5104]: session closed for user
> root
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: stuck in
> gfs_releasepage()...
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: blkno =
> 12446334, bh->b_count = 9
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> bh->b_journal_head = !NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: gl = (4,
> 12477424)
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> bd_new_le.le_trans = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> bd_incore_le.le_trans = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: bd_frozen
> = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: bd_pinned
> = 0
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: bd_ail_tr
> = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5: ip =
> 12477424/12477424
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_count = 1, ip->i_vnode = !NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[0] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[1] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[2] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[3] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[4] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[5] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[6] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[7] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[8] = NULL
> Dec 12 10:14:02 imagine kernel: GFS: fsid=CSM_ACN:admin01.5:
> ip->i_arch.i_cache[9] = NULL
>
> The file system in question appears to work fine on the other nodes; I
> unmounted it to be on the safe side.
>
> This is Red Hat Enterprise Linux 3.6, kernel 2.4.21-37.ELsmp, GFS
> 6.0.2.27-0. GFS was built from source.
> There are 2 partitions in the admin pool, the second was added a week
> or so ago.
>
> I tried to unmount it, but the umount failed because of the processes
> that are stuck in device waits.
>
> Any ideas?
>
> Thank you,
>
> Matt
> mbrookov at mines.edu