[Linux-cluster] Processes locked in "D" state

Brynnen R Owen owen at isrl.uiuc.edu
Fri Nov 19 18:12:41 UTC 2004


More info, hot off the presses.

  I just unmounted the GFS filesystem on another server, and the hung
processes on two other servers sprang back to life.  So it appears to
be some kind of locking issue, but I have no idea what.
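
For anyone who wants to watch the hung processes come and go on their own
nodes, here is a minimal sketch (not something I ran for this report) that
lists processes in the "D" state.  It assumes a Linux /proc layout with
/proc/<pid>/stat, and that /proc/<pid>/wchan is available to name the kernel
function the process is sleeping in; the function names parse_stat,
read_wchan and blocked_processes are made up for illustration.

```python
import os


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    left = stat_text.index("(")
    right = stat_text.rindex(")")
    pid = int(stat_text[:left])
    comm = stat_text[left + 1:right]
    state = stat_text[right + 1:].split()[0]
    return pid, comm, state


def read_wchan(pid, proc_root="/proc"):
    """Return the wait channel for pid, or '?' if it cannot be read."""
    try:
        with open(os.path.join(proc_root, str(pid), "wchan")) as f:
            return f.read().strip() or "-"
    except OSError:
        return "?"


def blocked_processes(proc_root="/proc"):
    """Return a sorted list of (pid, comm, wchan) for processes in state D."""
    result = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited meanwhile, or stat was unreadable
        if state == "D":
            result.append((pid, comm, read_wchan(pid, proc_root)))
    return sorted(result)


if __name__ == "__main__":
    for pid, comm, wchan in blocked_processes():
        print(f"{pid:>7}  {comm:<20}  {wchan}")
```

Running it before and after unmounting on the other server should show
whether the same processes leave the "D" state, and the wait channel may hint
at which lock they were sleeping on.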

On Fri, Nov 19, 2004 at 11:53:45AM -0600, Brynnen R Owen wrote:
> Hi all,
> 
>   While my initial problems with getting locking/fencing to work seem
> to be solved with the proper magma modules, my original problem is not
> solved.  I have been running some test backups to a GFS partition
> which somehow has a bad directory on it.  Here's what I mean: any
> process that tries to open this "bad" directory hangs forever in the
> "D" state.  There are no errors, warnings, or log entries anywhere.  I
> have tried 'ls <path>', 'find .' on a directory above this bad one in
> the path, '/gfs_tool stat <path>', and the original perl script which
> was descending into directories and copying stuff.  I now have 4 hung
> processes.  The machine still appears awake.  'df' still works (this
> is an improvement over the old failure mode).  Any suggestions?
> 
> I'm using lock_dlm.
> GFS from CVS on Nov 11, which I applied to a kernel.org 2.6.9 kernel.
> Using mptscsih fibre channel cards.
> Athlon processors with athlon extensions
> No extra high memory (1G limit)
> Non-SMP
> base system is RedHat 9.
> 
> copy of /proc/cluster/status (fifth node was never active):
> Version: 3.0.1
> Config version: 7
> Cluster name: gslis-san1
> Cluster ID: 43161
> Membership state: Cluster-Member
> Nodes: 4
> Expected_votes: 5
> Total_votes: 4
> Quorum: 3   
> Active subsystems: 8
> Node addresses: 192.168.1.240  
> 
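
As a sanity check on the numbers quoted above, here is a small sketch that
parses the status text and compares the votes against the quorum.  The
"simple majority" rule in majority_quorum is my assumption about how the
quorum is derived from Expected_votes, not something taken from the cman
source; it does happen to agree with the "Quorum: 3" reported here.  The
function names are made up for illustration.

```python
def parse_status(text):
    """Turn 'Key: value' lines from /proc/cluster/status into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            fields[key.strip()] = value.strip()
    return fields


def majority_quorum(expected_votes):
    """Assumed rule: quorum is a strict majority of the expected votes."""
    return expected_votes // 2 + 1


# Status text copied from the report above.
STATUS = """\
Version: 3.0.1
Config version: 7
Cluster name: gslis-san1
Cluster ID: 43161
Membership state: Cluster-Member
Nodes: 4
Expected_votes: 5
Total_votes: 4
Quorum: 3
Active subsystems: 8
Node addresses: 192.168.1.240
"""

status = parse_status(STATUS)
expected = int(status["Expected_votes"])
total = int(status["Total_votes"])
quorum = int(status["Quorum"])

print("reported quorum:", quorum)
print("majority of expected votes:", majority_quorum(expected))
print("quorate:", total >= quorum)
```

On these numbers the cluster still holds enough votes with the fifth node
absent, so a loss of quorum does not look like the explanation for the hang.
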
> copy of /proc/cluster/services:
> Service          Name                              GID LID State     Code
> Fence Domain:    "default"                           1   2 run       -
> [1 3 4 2]
> 
> DLM Lock Space:  "archive-content"                   2   3 run       -
> [1 3 4 2]
> 
> DLM Lock Space:  "archive-home"                      4   5 run       -
> [1 3 4 2]
> 
> GFS Mount Group: "archive-content"                   3   4 run       -
> [1 3 4 2]
> 
> GFS Mount Group: "archive-home"                      5   6 run       -
> [1 3 4 2]
> 
> 
> -- 
> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
> <>  Brynnen Owen            (     this space for rent                      )<>
> <>  owen at uiuc.edu           (                                              )<>
> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
