stalled 'sync' on ext3+quota over drbd

Eugene Crosser crosser at rol.ru
Wed Mar 24 10:47:19 UTC 2004


I don't know yet whether this is an ext3, quota or drbd issue, but I'll
ask anyway.  I am building an HA NFS server using two Dell 1750s and
drbd.  I have an ext3 filesystem with quota, built on a drbd device
that runs over a 200GB disk partition (hardware RAID 0+1) and is
drbd-mirrored across the servers.  The kernel is 2.4.25, so the quota
deadlock should hopefully not be a problem (it was on 2.4.24).

Now, the setup mostly works fine.  But if you actively use the
filesystem for some time (an hour of copying a large tree over NFS) and
then run the 'sync' command, the latter runs for a very long time (10
minutes or more), eating 99% CPU according to top.  The system becomes
very sluggish (leading to stalled replication and heartbeat
misbehavior) and is in fact unusable.

Any ideas why this happens and/or suggestions for further investigation?

Eugene

