Re: More on high i/o load wedging Fedora 10

Robin Laing wrote:

> I always will report bugs if I can get the details.  It is almost 
> useless to report bugs if you don't have any details to post with it as 
> there is a request for more details.

Thanks, I understand that.

> It takes time to learn what tools to use to find issues.  I just read an 
> IBM paper on tracing problems using iostat.  I also found dstat at the 
> same time.  It is IO related as the problems all come from using or 
> writing to a hard drive.  It has also gotten worse and may be related to 
> the latest kernel.

I understand, but in this thread I have repeatedly asked people hitting
this sort of hang to do "sysrq-w" (or, echo w > /proc/sysrq-trigger) -
nobody has ever shown me the results. [1]

I sympathize that it's hard to follow "this" thread, because it keeps
getting re-started under new subjects... :)

> My dumping EXT4 is more due to reports that I have read about data loss 
> due to the procedure for write delays.  I have run into the issue of 
> losing my kde config files as reported by others on the net already.
> http://www.advogato.org/person/mjg59/diary/195.html
> http://www.h-online.com/open/Possible-data-loss-in-Ext4--/news/112821
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/317781?comments=all

Patches to mitigate this are already in Fedora, so it should not be a big
problem for you at this point.  If it is, I need to know about it.

> I am also dumping EXT4 as I am trying to trace the issue with 
> locking/freezing computer and I don't need data losses.

I understand; however, they may be related...

> As it stands, I did get a kernel oops last night that didn't crash my 
> system and was logged in messages.  I was using a tty session so this 
> could be why the system didn't totally freeze.
> I have not had time to look through it and to see where it should be 
> posted.  It is related to USB as it occurred when I unplugged my USB 
> drive that I was restoring data from.  It was late and I was tired so I 
> want to check things on the system before going further.

This may be something of a known issue, depending on the details.  (a
drive disappearing should not actually *oops* the box, but it will
probably spew lots of warnings and errors at least.)

> There is an issue with filing kernel related bugs if the kernel is 
> tainted because of Nvidia drivers.  I have been told before that I need 
> to remove the driver before filing a bug.  Well that is hard to do when 
> 3D is needed on the computer with the problem.

That's often true.  Speaking for myself, if there is some weird,
never-before-reported behavior, and the kernel exhibiting that behavior
has binary modules loaded, I often won't dig into it much because TBH I
can't debug it 100%, and the binary module is always suspect.  But if
the report correlates with other similar reports, it is still useful to
me, even with the binary module loaded.

> I just tried the sysrq 'w' but I don't have that command on my machine 
> at work.

[1] I probably should have been more explicit when I asked for this.

# echo w > /proc/sysrq-trigger
# dmesg > dmesg_output.txt

should work on any Fedora machine out of the box.
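
If it helps, the same two steps can be wrapped in a small helper so the
dump is easy to grab the next time the box wedges.  This is only a
sketch: the function name capture_blocked_tasks and the SYSRQ_TRIGGER
override are made up for illustration, and it assumes you run it as
root.  sysrq-w asks the kernel to log the tasks that are blocked in
uninterruptible sleep, which is what I want to see for these hangs.

```shell
#!/bin/sh
# Hypothetical helper (not part of any package): trigger sysrq-w and
# save the kernel log so it can be attached to a bug report.
# SYSRQ_TRIGGER is an illustrative override; the default is the real path.
capture_blocked_tasks() {
    cbt_trigger="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
    cbt_out="${1:-dmesg_output.txt}"

    # Writing to the trigger file needs root; bail out early otherwise.
    if [ ! -w "$cbt_trigger" ]; then
        echo "cannot write to $cbt_trigger (are you root?)" >&2
        return 1
    fi

    # 'w' dumps tasks in uninterruptible (blocked) state to the kernel log.
    echo w > "$cbt_trigger" || return 1

    # Save the kernel ring buffer, which now includes the dump.
    dmesg > "$cbt_out"
}
```

Run it as root while the machine is in the middle of a stall, e.g.
"capture_blocked_tasks hang1.txt", and attach the resulting file.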


