Questions regarding journal replay

Wed Feb 25 17:36:25 UTC 2009

Ralf Hildebrandt wrote:
> * Eric Sandeen <sandeen at redhat.com>:
>> Ralf Hildebrandt wrote:
>>> * Ralf Hildebrandt <Ralf.Hildebrandt at charite.de>:
>>>> * Curtis Doty <Curtis at GreenKey.net>:
>>>>> Yesterday Ralf Hildebrandt said:
>>>>>
>>>>>> The journal replay took quite a while. About 800 seconds.
>>>>>>
>>>>> Were there any other background iops on the underlying volume 
>>>>> devices? Like maybe raid reconstruction?
>>>> I don't think so. The machine never powered off...
>>> Again, 2.6.28.7 failed us and now we're encountering another journal
>>> replay. Taking ages. This sucks.
>>>
>>> Questions:
>>>
>>> How can I find out (during normal operation) HOW MUCH of the
>>> journal is actually in use?
>>>
>>> How can I resize the journal to be smaller, thus making a journal
>>> replay faster?
>>>
>> It'd be better to get to the bottom of the problem ... maybe iostat
>> while it's happening to see if IO is actually happening; run blktrace to
>> see where IO is going, do a few sysrq-t's to see where threads are at, etc.
> 
> We had 24GB of reading from the journal device (or 12GB if it's
> 512-byte blocks). I wonder why?
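
One way to take the block-size guesswork out of a figure like this is to
read the kernel's own counters: the "sectors read" column in
/proc/diskstats is always in 512-byte units, whatever the filesystem
block size. A minimal sketch, assuming the counters are sampled once
before and once after the mount; the function names and the device name
"sdb1" in the usage note are made up for illustration:

```python
SECTOR_BYTES = 512  # /proc/diskstats counts sectors in 512-byte units


def bytes_read(diskstats_text, device):
    """Return total bytes read so far for `device`, or None if not listed."""
    for line in diskstats_text.splitlines():
        fields = line.split()
        # Columns: major, minor, name, reads completed, reads merged,
        # sectors read, ... (the sixth column is what we want).
        if len(fields) >= 6 and fields[2] == device:
            return int(fields[5]) * SECTOR_BYTES
    return None


def read_delta(before_text, after_text, device):
    """Bytes read from `device` between two snapshots of /proc/diskstats."""
    before = bytes_read(before_text, device)
    after = bytes_read(after_text, device)
    if before is None or after is None:
        raise ValueError(f"device {device!r} not found in snapshot")
    return after - before
```

To use it, save the contents of /proc/diskstats just before and just
after the mount and call read_delta(before, after, "sdb1") with
whichever device actually holds the journal.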

24GB of reading from the journal device (during that 800s of replay
during mount?), and your journal is 128M ... well that's odd.
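
To put numbers on how odd: a rough back-of-the-envelope sketch, assuming
"GB" and "M" mean GiB and MiB and that all of the reading happened inside
the 800 s window (neither is confirmed in this thread). The function
name is made up for illustration:

```python
def replay_read_stats(read_gib, journal_mib, seconds):
    """Return (reads as a multiple of the journal size, average MiB/s)."""
    read_mib = read_gib * 1024
    return read_mib / journal_mib, read_mib / seconds


# Both readings of the reported figure: 24 GiB, or 12 GiB if the counter
# was really in 512-byte blocks.
for read_gib in (24, 12):
    passes, rate = replay_read_stats(read_gib, 128, 800)
    print(f"{read_gib} GiB read: {passes:.0f}x the journal size, "
          f"{rate:.2f} MiB/s on average")
```

Under those assumptions the reads come to 192 times the size of the
journal (96 times for the 12 GiB reading), at roughly 31 MiB/s (or
15 MiB/s) sustained for the whole replay.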

You say journal device; is this an external journal?  I didn't think so
from your first email, but is it?

>> Can you find a way to reproduce this at will?
> 
> Yes. My users will kill me, though.

No spare box, eh :(

>> Journal replay should *never* take this long, AFAIK.
> 
> Amen
> 

so let's figure it out :)

-Eric