[Cluster-devel] GFS2: Use new workqueue scheme

Thu Sep 9 13:45:08 UTC 2010

Hi,

On Thu, 2010-09-09 at 15:18 +0200, Tejun Heo wrote:
> Hello, Steven.
> 
> Thanks for working on this.
> 
I think it will be a big win for GFS2, particularly as the number of CPU
cores increases.

> On 09/09/2010 02:36 PM, Steven Whitehouse wrote:
> > diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
> > index 8e478e2..fffc1bf 100644
> > --- a/fs/gfs2/glock.c
> > +++ b/fs/gfs2/glock.c
> > @@ -1783,10 +1783,14 @@ int __init gfs2_glock_init(void)
> >  	}
> >  #endif
> >  
> > -	glock_workqueue = create_workqueue("glock_workqueue");
> > +	glock_workqueue = alloc_workqueue("glock_workqueue", WQ_RESCUER |
> > +					  WQ_HIGHPRI | WQ_CPU_INTENSIVE |
> > +					  WQ_FREEZEABLE, 0);
> 
> Does this really need WQ_HIGHPRI and WQ_CPU_INTENSIVE?
> 
This would be a tasklet were it not for the fact that it needs to be
able to submit block I/O from time to time. It does need to be as fast
as possible since it directly affects the latency of operations using
large numbers of inodes.

I read your latest set of docs before assigning the flags, so I hope
I've understood it correctly.

The glock workqueue is involved in sending requests to the DLM and
processing the results of those requests, waking up waiting processes as
quickly as possible.

> >  	if (IS_ERR(glock_workqueue))
> >  		return PTR_ERR(glock_workqueue);
> > -	gfs2_delete_workqueue = create_workqueue("delete_workqueue");
> > +	gfs2_delete_workqueue = alloc_workqueue("delete_workqueue", WQ_RESCUER |
> > +						WQ_UNBOUND | WQ_NON_REENTRANT |
> > +						WQ_FREEZEABLE, 0);
> 
> >  	if (IS_ERR(gfs2_delete_workqueue)) {
> >  		destroy_workqueue(glock_workqueue);
> >  		return PTR_ERR(gfs2_delete_workqueue);
> > diff --git a/fs/gfs2/main.c b/fs/gfs2/main.c
> > index b1e9630..60f3465 100644
> > --- a/fs/gfs2/main.c
> > +++ b/fs/gfs2/main.c
> > @@ -140,7 +140,8 @@ static int __init init_gfs2_fs(void)
> >  
> >  	error = -ENOMEM;
> >  	gfs_recovery_wq = alloc_workqueue("gfs_recovery",
> > -					  WQ_NON_REENTRANT | WQ_RESCUER, 0);
> > +					  WQ_NON_REENTRANT | WQ_RESCUER |
> > +					  WQ_UNBOUND | WQ_FREEZEABLE, 0);
> 
> And do these need to be WQ_UNBOUND?  Unless the flags are specifically
> needed, I think it would be better to stick with the default.  I'm
> currently working on the documentation.  It's still not complete but
> please take a look for more information on the behavior of each flag.
> 
> Thanks.
> 
I wouldn't say that it is strictly a requirement, but the work items on
both queues are long running (potentially a few seconds, or even up to a
minute or two in extreme cases). The recovery workqueue seems to meet
this criterion:

> 	* Long running CPU intensive workloads which can be better
> 	  managed by the system scheduler.

and the delete_workqueue seems to meet this criterion:

> 	* Wide fluctuation in the concurrency level requirement is
> 	  expected and using bound wq may end up creating large number
> 	  of mostly unused workers across different CPUs as the issuer
> 	  hops through different CPUs.

It may be that I didn't understand the docs correctly, but I think I've
found the right flags. The delete_workqueue is usually unused during
normal fs operation, but occasionally it might have a lot to do. It was
made a separate workqueue because it needs to be able to manipulate
glocks and thus must never block the glock workqueue.
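
As a rough user-space sketch of how I read those two criteria (this is
not kernel code: the SKETCH_* flag values, struct wq_profile and
sketch_pick_flags() are made up for illustration and do not match the
kernel's definitions), the choice boils down to something like:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in flag bits for illustration only; NOT the kernel's values. */
enum sketch_wq_flags {
	SKETCH_WQ_DEFAULT = 0,      /* bound, per-CPU workers */
	SKETCH_WQ_UNBOUND = 1 << 0  /* workers not tied to a particular CPU */
};

/* Hypothetical description of how a workqueue is expected to be used. */
struct wq_profile {
	const char *name;
	bool long_running_cpu_work; /* first quoted criterion */
	bool bursty_concurrency;    /* second quoted criterion */
};

/*
 * Pick "unbound" only when one of the two quoted criteria applies;
 * otherwise stay with the default, as suggested in the review.
 */
static unsigned int sketch_pick_flags(const struct wq_profile *p)
{
	assert(p != NULL);

	if (p->long_running_cpu_work || p->bursty_concurrency)
		return SKETCH_WQ_UNBOUND;

	return SKETCH_WQ_DEFAULT;
}
```

With the usage described in this mail, a profile for the recovery queue
(long running) and one for the delete queue (mostly idle, occasionally
very busy) both come out unbound, while a queue with short work items
and steady concurrency keeps the default.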

Steve.