stalled 'sync' on ext3+quota over drbd
crosser at rol.ru
Wed Mar 31 13:05:46 UTC 2004
On Wed, 2004-03-31 at 16:46, Stephen C. Tweedie wrote:
> > Now, the setup mostly works fine. But if you actively use the
> > filesystem for some time (an hour of copying a large tree over NFS) and
> > then try the 'sync' command, the latter runs very long (10 minutes or more),
> > eating 99% CPU according to top, and the system becomes very sluggish
> > (leading to stalled replication, heartbeat misbehavior) and in fact
> > unusable.
> You'd need to try capturing a profile of the 99% cpu loop for us to be
> able to investigate this any further.
That'd be tricky: it is somewhere in the kernel (top shows 99% CPU used
by "system", and strace attached to sync does not show anything).
Another thing, possibly related: when I try `quotaoff', the machine hangs
for 10+ minutes, and does not respond to *anything* but ping. Then it
comes back to life.
I'd be happy to provide more information but so far I cannot decide
where to look... Should I learn to use "kernel profiling"?
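Short of a real kernel profile, one rough way to confirm what top reports is to sample the aggregate "cpu" line of /proc/stat before and during the sync and compare the counters. The sketch below only illustrates that idea; the function names are made up for this example, and it assumes the usual field order on that line (user, nice, system, idle, ...). It shows how much time goes to the kernel, not where in the kernel it is spent.

```python
import time


def read_cpu_times(stat_text):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of counters."""
    for line in stat_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "cpu":
            # Keep at most user..steal; later 'guest' fields are already
            # counted inside 'user' on kernels that report them.
            return [int(v) for v in fields[1:9]]
    raise ValueError("no aggregate 'cpu' line found")


def system_share(before, after):
    """Fraction of elapsed CPU time spent in kernel mode between two samples."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    # Assumed field order: user, nice, system, idle, ...; index 2 is 'system'.
    return deltas[2] / total if total else 0.0


def sample(interval=5.0, path="/proc/stat"):
    """Take two readings 'interval' seconds apart and return the system share."""
    with open(path) as f:
        before = read_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = read_cpu_times(f.read())
    return system_share(before, after)
```

Calling sample() in a second shell while 'sync' is running should return a value close to 1.0 if the time really is being spent in the kernel.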