Questions regarding journal replay
Ralf Hildebrandt
Ralf.Hildebrandt at charite.de
Wed Feb 25 17:39:07 UTC 2009
* Theodore Tso <tytso at mit.edu>:
> On Wed, Feb 25, 2009 at 10:31:42AM -0600, Eric Sandeen wrote:
> >
> > It'd be better to get to the bottom of the problem ... maybe iostat
> > while it's happening to see if IO is actually happening; run blktrace to
> > see where IO is going, do a few sysrq-t's to see where threads are at, etc.
> >
> > Can you find a way to reproduce this at will?
> >
> > Journal replay should *never* take this long, AFAIK.
>
> Indeed. The journal is 128 megs, as I recall. So even if the journal
> was completely full, if it's taking 800 seconds, that's a write rate
> of 0.16 MB/s (164 KB/second). That is indeed way too slow.
The problem seems to be with the external journal that I recently
switched to. It's a 32GB partition. My timings seem to indicate that
ALL OF IT was being replayed.
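As a rough sanity check, here is a small sketch of the arithmetic. It assumes the 800 seconds quoted above and binary units (MiB/GiB); the function and variable names are made up for illustration. It only shows what throughput each journal size would imply if the whole journal were processed in that time, and does not prove what the replay actually did.

```python
# Implied replay throughput if an entire journal were processed in a given time.
# Assumptions: 800 s replay time (figure quoted in this thread), binary units.
MIB = 1024 * 1024

def replay_rate_mib_per_s(journal_bytes, seconds):
    """Return the average rate in MiB/s needed to process journal_bytes in seconds."""
    return journal_bytes / seconds / MIB

REPLAY_SECONDS = 800

# Internal journal size as recalled in the quoted mail (128 MB).
internal_rate = replay_rate_mib_per_s(128 * MIB, REPLAY_SECONDS)

# External journal partition size (32 GB).
external_rate = replay_rate_mib_per_s(32 * 1024 * MIB, REPLAY_SECONDS)

print(f"128 MiB journal in {REPLAY_SECONDS} s: {internal_rate:.2f} MiB/s")
print(f"32 GiB journal in {REPLAY_SECONDS} s: {external_rate:.2f} MiB/s")
```

Roughly 0.16 MiB/s would be implausibly slow, whereas about 41 MiB/s is at least a believable rate for walking through a whole 32GB partition.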
> I assume this wasn't your boot partition, so the journal replay was
> being done by e2fsck, right?
Yes
> Or are you guys skipping e2fsck and the journal replay was happening
> when you mounted the partition?
Both. We tried both ways :)
> If the journal replay is happening via e2fsck, is fsck running any
> other filesystem checks in parallel?
No, it's running alone.
> Also, what is the geometry of your raid? How many disks, what RAID
> level, and what is the chunk size? The journal replay is done a
> filesystem block at a time, so it could be that it's turning into a
> large number of read-modify-writes, which is trashing your performance
> if the chunk size is really large.
The storage is one logical volume made up of two devices, sda and
sdb, each of which is a hardware RAID5 array of 6 disks.
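To illustrate the read-modify-write concern for this geometry, here is a minimal sketch. The chunk size and the filesystem block size below are hypothetical placeholders (neither is stated in this thread), and the check ignores alignment; it only compares a single block-sized write against the full-stripe size of a 6-disk RAID5, which has 5 data chunks per stripe.

```python
# Sketch: would a filesystem-block-sized write be a partial-stripe write on RAID5?
# CHUNK_BYTES and FS_BLOCK_BYTES are hypothetical values, not taken from the thread.

def raid5_full_stripe_bytes(disks, chunk_bytes):
    """Data capacity of one stripe: one chunk per disk, minus one chunk for parity."""
    return (disks - 1) * chunk_bytes

def is_partial_stripe_write(write_bytes, disks, chunk_bytes):
    """True if a single write is smaller than a full stripe (alignment ignored)."""
    return write_bytes < raid5_full_stripe_bytes(disks, chunk_bytes)

DISKS = 6                 # disks per hardware RAID5 array, as described above
CHUNK_BYTES = 64 * 1024   # hypothetical chunk size
FS_BLOCK_BYTES = 4096     # hypothetical filesystem block size

stripe = raid5_full_stripe_bytes(DISKS, CHUNK_BYTES)
partial = is_partial_stripe_write(FS_BLOCK_BYTES, DISKS, CHUNK_BYTES)

print(f"full stripe: {stripe} bytes, block write is partial: {partial}")
```

With any realistic chunk size, a single block is far smaller than a full stripe, so whether this hurts depends on how well the controller coalesces the small writes.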
> The other thing that might explain the performance problem is if
> somehow the number of multiple outstanding requests allowed by the
> hard drive has been clamped down to a very small number, and so a
> large number of small read/write requests is really killing
> performance. The system dmesg log might have some hidden clues about
> that.
dmesg is silent
--
Ralf Hildebrandt Ralf.Hildebrandt at charite.de
Charite - Universitätsmedizin Berlin Tel. +49 (0)30-450 570-155
Geschäftsbereich IT | Abt. Netzwerk Fax. +49 (0)30-450 570-962
Hindenburgdamm 30 | 12200 Berlin