stalled 'sync' on ext3+quota over drbd

Eugene Crosser crosser at rol.ru
Wed Mar 31 13:05:46 UTC 2004


On Wed, 2004-03-31 at 16:46, Stephen C. Tweedie wrote:

> > Now, the setup mostly works fine.  But if you actively use the
> > filesystem for some time (an hour of copying a large tree over NFS) and
> > then try the 'sync' command, the latter runs very long (10 minutes or
> > more), eating 99% CPU according to top, and the system becomes very
> > sluggish (leading to stalled replication, heartbeat misbehavior) and in
> > fact unusable.
> 
> You'd need to try capturing a profile of the 99% cpu loop for us to be
> able to investigate this any further.

That'd be tricky: the time is spent somewhere in the kernel (top shows 99%
CPU used by "system", and strace attached to sync does not show anything).
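
To double-check the "system" figure independently of top, something like
the following could be left running while sync is stuck.  It is only a
rough sketch: it assumes a Linux /proc/stat whose first line is the
aggregate "cpu" line with fields in the order user, nice, system, idle,
..., and the function names (parse_cpu_line, system_share, watch) are made
up for illustration.  It only shows how much time goes to the kernel, not
where in the kernel it goes.

```python
import time


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("not an aggregate cpu line: %r" % line)
    return [int(x) for x in fields[1:]]


def system_share(before, after):
    """Fraction of elapsed ticks spent in kernel ('system') mode between two samples."""
    deltas = [a - b for a, b in zip(after, before)]
    # Assumed field order: user, nice, system, idle, iowait, irq, softirq, steal.
    # Only the first eight fields are summed; older kernels expose fewer,
    # and later fields (guest time) are already counted in user/nice.
    total = sum(deltas[:8])
    return deltas[2] / total if total > 0 else 0.0


def read_cpu(path="/proc/stat"):
    """Read the current aggregate CPU counters."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def watch(samples=12, interval=5.0):
    """Print the system-time share once per interval, for a fixed number of samples."""
    prev = read_cpu()
    for _ in range(samples):
        time.sleep(interval)
        cur = read_cpu()
        print("system: %.1f%%" % (100.0 * system_share(prev, cur)))
        prev = cur
```

Calling watch() from a second terminal before starting sync would print
one line every five seconds for a minute.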

Another thing, possibly related: when I try `quotaoff', the machine hangs
for 10+ minutes and does not respond to *anything* but ping.  Then it
comes back to life.

I'd be happy to provide more information but so far I cannot decide
where to look...  Should I learn to use "kernel profiling"?

Eugene