[Linux-cluster] Meaning of Cluster Cycle and timeout problems

Thu Apr 17 18:54:56 UTC 2008

On Thu, 2008-04-17 at 09:08 +0200, Peter wrote:
> Hi!
> 
> In our Cluster we have the following entry in the "messages" logfile:
> 
> "qdiskd[4314]: <warning> qdisk cycle took more than 3 seconds to  
> complete (3.890000)"

It means it took more than 3 seconds for one qdiskd cycle to complete.
That is a very long time, given how little work one cycle involves:

   8192 bytes in 16 block reads
   some internal calculations
   512  bytes in 1 block write

(that's it...)
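
To get a rough feel for whether the disk itself is slow, you can time a
read of the same size. This is only a sketch: it is not what qdiskd does
internally, and the function name probe_reads and the DD_FLAGS override
are made up for illustration. It only reads from the device.

```shell
# Read 16 blocks of 512 bytes (8192 bytes) from a device and discard them.
# Direct I/O is used by default so the page cache does not hide latency.
probe_reads() {
    dev=$1
    # DD_FLAGS is a hypothetical override; set it to an empty string for
    # targets that do not support O_DIRECT.
    flags=${DD_FLAGS-iflag=direct}
    dd if="$dev" of=/dev/null bs=512 count=16 $flags 2>/dev/null
}
```

Under bash, something like "time probe_reads /dev/sdh" (run as root)
shows the wall-clock time. If that is anywhere near a second on an idle
system, the storage path is the place to look.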

> These messages are very frequent. I cannot find anything except the  
> source code via Google, and I am sorry to say that I am not familiar  
> enough with C to get the point.
> 
> 
> We also have sometimes a quorum timeout:
> 
> "kernel: CMAN: Quorum device /dev/sdh timed out"
> 
> 
> Are these two messages independent, and what is the meaning of the  
> first message?

No, they're 100% related.  It sounds like qdiskd is getting starved for
I/O to /dev/sdh, or possibly it's getting CPU-starved for some reason.
Being that it's more or less a real-time program which helps keep the
cluster running, that's bad!  In your case, it's getting hung up for
longer than the cluster failover time, so CMAN thinks qdiskd has died.
Not good.

(1) Turn *off* status_file if you have it enabled!  It's for debugging,
and under certain load patterns, it can really slow down qdiskd.

(2) If you think it's I/O, what you should try is (assuming you're using
cluster2/rhel5/centos5/etc. here):

  echo deadline > /sys/block/sdh/queue/scheduler

If you had the default timeout of 10 seconds (interval 1, tko 10), you
should also do:

  echo 2500 > /sys/block/sdh/queue/iosched/write_expire

... you've got at least 3 for interval, so I'm not sure this would apply
to you.

[NOTE: On rhel4/centos4/stable, I think you have to set the I/O
scheduler globally in the kernel command line at system boot.]
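
To check which scheduler is currently active for the device, read the
same sysfs file; the active one is shown in square brackets. A small
sketch follows (the helper name active_scheduler is made up for
illustration):

```shell
# Print the active I/O scheduler from a line such as
# "noop anticipatory deadline [cfq]" (the bracketed entry is the active one).
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

# Usage, with the device name from this thread:
#   active_scheduler "$(cat /sys/block/sdh/queue/scheduler)"
```

Note that a value written with echo does not survive a reboot, so it
would need to be reapplied at boot if it turns out to help.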

(3) If you think qdiskd is getting CPU starved, you can adjust the
'scheduler' and 'priority' values in cluster.conf to something
different.  I think the man page might be wrong; I think the highest
'priority' value for the 'rr' scheduler is 99, not 100.  See the
qdisk(5) man page for more information on those.
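
If chrt is installed, you can check the priority range yourself rather
than trusting my memory. A sketch, assuming "chrt -m" prints lines of
the form "SCHED_RR min/max priority : 1/99" (the helper name rr_max is
made up for illustration):

```shell
# Extract the maximum SCHED_RR priority from the output of "chrt -m".
# Assumes a line like "SCHED_RR min/max priority : 1/99".
rr_max() {
    printf '%s\n' "$1" | sed -n 's|^SCHED_RR[^:]*: *[0-9]*/\([0-9]*\).*$|\1|p'
}

# Usage:
#   rr_max "$(chrt -m)"
# To see what a running qdiskd is actually using (assumes one process):
#   chrt -p "$(pidof qdiskd)"
```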

-- Lon