[patch] Re: stalled 'sync' on ext3+quota over drbd
Eugene Crosser
crosser at rol.ru
Sat Apr 17 10:40:52 UTC 2004
On Sat, 2004-04-17 at 01:19, Stephen C. Tweedie wrote:
> > after moving about 10,000 files and setting quota for a million
> > groupids, and then several hours of inactivity(!) I zeroed profile
> > counters (readprofile -r), ran `time sync' and then `readprofile'. Here
> > are the results. Yes, that's true, it took 3 (three) hours for `sync'
> > to complete!
>
> Turns out there's a nasty O(N^2) behaviour in vfs_quota_sync(). That
> function walks the dquot list looking for things to sync, and it drops
> the lock when doing the actual syncing --- so each item synced causes it
> to start again at the beginning of the list. If each item starts off
> dirty, then the list walk is N^2.
>
> An obvious cure is to shift the start of the list to point just after
> the item just synced. I've done only limited testing of this patch, but
> does it help for you?
Cool! I've already begun building a test environment with oprofile
enabled ;-) I am out of the office over the weekend, but I'll
certainly verify your fix on Monday.
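To make the cost difference concrete, here is a toy model of the two list
walks described above. It is only a sketch, not the kernel code: the function
name and the `move_start` flag are made up for illustration, a plain Python
list stands in for the dquot list, and "dropping the lock" is modelled simply
as abandoning the current walk after each item is synced. It assumes every
entry starts off dirty, which is the worst case Stephen describes.

```python
def count_list_visits(n, move_start):
    """Count entries examined while syncing n dirty items (toy model).

    move_start=False: after each sync the walk restarts at the list head.
    move_start=True:  after each sync the walk restarts just after the
                      item that was synced (the idea behind the patch).
    """
    dirty = [True] * n      # assumption: every entry starts off dirty
    start = 0               # index where the next walk begins
    visits = 0
    while True:
        for step in range(n):
            i = (start + step) % n
            visits += 1
            if dirty[i]:
                dirty[i] = False            # "sync" the item; walk is abandoned
                if move_start:
                    start = (i + 1) % n     # resume just after the synced item
                break
        else:
            # A full pass found nothing dirty: the sync is complete.
            return visits


if __name__ == "__main__":
    for n in (10, 1000):
        print(n, count_list_visits(n, False), count_list_visits(n, True))
```

In this model the restart-from-head walk examines n(n+1)/2 + n entries, while
the moving-start walk examines 2n. With a million dirty entries that is on the
order of 5 * 10^11 list steps versus 2 * 10^6, which at least fits the shape
of the multi-hour `sync' reported above.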
> 2.4 and 2.6 seem to share this problem.
Apparently things are worse in 2.6. I have an impression (I have not
checked it yet) that 2.6.5 still suffers from the same deadlock problem
that was fixed in the 2.4.24 -> 2.4.25 diff.
Unrelated question: is quotacheck necessary after mounting an ext3
filesystem to ensure consistent status? I am building a 200GB HA NFS
server hosting a million or two files that belong to 1/3 million
userids; failover without fsck and quotacheck takes about 30-40 seconds,
which is pretty good. fsck on this filesystem takes about 7 minutes,
quotacheck about 4 minutes. So having to run quotacheck has a
significant impact on availability...
Eugene