[Linux-cluster] gfs1 and 2.6.20

Thu Feb 22 15:54:14 UTC 2007

Asbjørn Sannes wrote:
> Robert Peterson wrote:
>   
>> Asbjørn Sannes wrote:
>>     
>>> Asbjørn Sannes wrote:
>>>  
>>>       
>>>> I have been trying to use the STABLE branch of the cluster suite with
>>>> a vanilla 2.6.20 kernel. Everything seemed to work at first, but my
>>>> problem can be reproduced like this:
>>>>
>>>> mount a gfs filesystem anywhere..
>>>> do a sync, this sync will now just hang there ..
>>>>
>>>> If I unmount the filesystem in another terminal, the sync command will
>>>> end..
>>>>
>>>> .. dumping the kernel stack of sync shows that it is in
>>>> __sync_inodes, blocked in __down_read; looking at the code, it seems
>>>> to be waiting for the s_umount semaphore (in the superblock)..
>>>>
>>>> Just tell me if you need any more information or if this is not the
>>>> correct place for this..
>>>>       
>>>>         
>>> Here is the trace for sync (while hanging) ..
>>>
>>> sync          D ffffffff8062eb80     0 17843  15013                    (NOTLB)
>>> ffff810071689e98 0000000000000082 ffff810071689eb8 ffffffff8024d210
>>> 0000000071689e18 0000000000000000 0000000100000000 ffff81007b670fe0
>>> ffff81007b6711b8 00000000000004c8 ffff810037c84770 0000000000000001
>>> Call Trace:
>>> [<ffffffff8024d210>] wait_on_page_writeback_range+0xed/0x140
>>> [<ffffffff8046046c>] __down_read+0x90/0xaa
>>> [<ffffffff802407d6>] down_read+0x16/0x1a
>>> [<ffffffff8028df35>] __sync_inodes+0x5f/0xbb
>>> [<ffffffff8028dfa7>] sync_inodes+0x16/0x2f
>>> [<ffffffff80290293>] do_sync+0x17/0x60
>>> [<ffffffff802902ea>] sys_sync+0xe/0x12
>>> [<ffffffff802098be>] system_call+0x7e/0x83
>>>
>>> Greetings,
>>> Asbjørn Sannes
>>>
>>>       
>> Hi Asbjørn,
>>
>> I'll look into this as soon as I can find the time...
>>
>>     
> Great! I tried to figure out why the s_umount semaphore was not upped by
> comparing with other filesystems, but the functions seem almost identical
> .. so I cheated and looked at what had changed lately (from your patch):
>
> diff -w -u -p -p -u -r1.1.2.1.4.1.2.1 diaper.c
> --- gfs-kernel/src/gfs/diaper.c	26 Jun 2006 21:53:51 -0000	1.1.2.1.4.1.2.1
> +++ gfs-kernel/src/gfs/diaper.c	2 Feb 2007 22:28:41 -0000
> @@ -50,7 +50,7 @@ static int diaper_major = 0;
>  static LIST_HEAD(diaper_list);
>  static spinlock_t diaper_lock;
>  static DEFINE_IDR(diaper_idr);
> -kmem_cache_t *diaper_slab;
> +struct kmem_cache *diaper_slab;
>  
>  /**
>   * diaper_open -
> @@ -232,9 +232,9 @@ get_dummy_sb(struct diaper_holder *dh)
>  	struct inode *inode;
>  	int error;
>  
> -	mutex_lock(&real->bd_mount_mutex);
> +	down(&real->bd_mount_sem);
>  	sb = sget(&gfs_fs_type, gfs_test_bdev_super, gfs_set_bdev_super, real);
> -	mutex_unlock(&real->bd_mount_mutex);
> +	up(&real->bd_mount_sem);
>  	if (IS_ERR(sb))
>  		return PTR_ERR(sb);
>  
> @@ -252,7 +252,6 @@ get_dummy_sb(struct diaper_holder *dh)
>  	sb->s_op = &gfs_dummy_sops;
>  	sb->s_fs_info = dh;
>  
> -	up_write(&sb->s_umount);
>  	module_put(gfs_fs_type.owner);
>  
>  	dh->dh_dummy_sb = sb;
> @@ -263,7 +262,6 @@ get_dummy_sb(struct diaper_holder *dh)
>  	iput(inode);
>  
>   fail:
> -	up_write(&sb->s_umount);
>  	deactivate_super(sb);
>  	return error;
>  }
>
>
>
> I then undid the up_write removals (added both calls back in), which
> helped. I don't know if it is safe though, and maybe you could shed some
> light on why they were removed? (I didn't find any changes that would do
> up_write on s_umount later..)
>   
Actually, it didn't enjoy unmount as much ..

Best regards,
Asbjørn Sannes