Filesystem remounted read-only WAS: RE: Recent unexplained quota problems

Wed Sep 12 12:17:34 UTC 2007

-----Original Message-----
[quota issues due to read-only filesystem]
> > If I try to 'touch test' in /home I get:
> > [root@server home]# touch test
> > touch: creating `test': Read-only file system
> >
> > I rebooted the server and everything seems to be okay.  I'm a little
> > concerned about this though because I can't explain it.
> 
> You should be concerned about this.  The kernel will change a
> filesystem to read-only when it detects an IO error against that FS.
> This can happen for a number of reasons:
> 
>    - Your connection to your SAN dropped;
>    - Your hard drive(s) are dying;
>    - You have significant data corruption;
>    - and on and on...
> 
> Except for the first reason I listed, all of the other reasons I know
> of are Real Bad.
> 
> If you're lucky, you've got some minor data corruption that caused the
> kernel to try to write beyond the end of the drive or something like
> that; you should try running fsck on the filesystem first.  Be warned,
> though, that if you have significant data corruption, fsck may
> completely hose the filesystem, so get as good a backup as you can
> first.
> 
> You should check /var/log/messages for kernel messages about this.  If
> it happens again, dmesg will also have useful information (at least,
> it will until you reboot).
> 
> If the problem is transient, a simple userspace mount call will fix
> it:
> 
> mount -o remount,rw,usrquota /home
> 
> But that's a gamble.
> 
> Despite what other posters have said, when the kernel changes the
> status of the volume, it does so using kernel-level tools, _not_
> userspace mount calls, so the options shown by the mount(1) command
> will _not_ reflect the read-only status of the drive.  If mount(1)
> shows that the drive is 'ro', then a person or a program, not the
> kernel, has mounted it read-only.

I stand corrected.

As a matter of fact I just noticed very similar behavior on one of our
servers just now.  /var was remounted read-only while mount indicated
nothing special.
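
As far as I understand it, mount prints what is recorded in /etc/mtab,
while /proc/mounts holds the kernel's own view, so a kernel-initiated
remount should be visible there.  A minimal sketch for listing read-only
mounts follows; ro_mounts is a made-up helper name, and it assumes the
usual six-field layout of /proc/mounts.

```shell
# List mount points whose option field contains "ro".
# ro_mounts is a hypothetical helper, not an existing tool.
# Input defaults to /proc/mounts (the kernel's view of mounted filesystems).
ro_mounts() {
    # Fields: device mountpoint fstype options dump pass
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```

Calling ro_mounts with no argument reads /proc/mounts; passing a file
name lets you run it against a saved copy instead.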

However, dmesg showed the following errors:
audit(1189303330.843:3): avc:  denied  { search } for  pid=2189
comm="syslogd" name="httpd" dev=dm-5 ino=131096
scontext=user_u:system_r:syslogd_t
tcontext=system_u:object_r:httpd_log_t tclass=dir
SCSI error : <0 0 1 0> return code = 0x8000002
Info fld=0x0, Current sdb: sense key Aborted Command
ASC=47 ASCQ=7f
end_request: I/O error, dev sdb, sector 31183
Buffer I/O error on device dm-5, logical block 3842
lost page write due to I/O error on dm-5
Aborting journal on device dm-5.
journal commit I/O error
ext3_abort called.
EXT3-fs error (device dm-5): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
EXT3-fs error (device dm-5) in start_transaction: Journal has aborted
EXT3-fs error (device dm-5) in start_transaction: Journal has aborted
EXT3-fs error (device dm-5) in start_transaction: Journal has aborted
EXT3-fs error (device dm-5) in start_transaction: Journal has aborted
EXT3-fs error (device dm-5) in start_transaction: Journal has aborted
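
To pick the storage-related lines out of a long dmesg, a small filter
along these lines might help.  This is only a sketch: storage_errors is
a made-up name, and the patterns are simply taken from the messages
above, so they are certainly not exhaustive.

```shell
# Keep only kernel log lines that look like storage or journal trouble.
# storage_errors is a hypothetical helper; the pattern list is an
# assumption based on the messages seen in this one incident.
storage_errors() {
    grep -E 'I/O error|SCSI error|Aborting journal|Remounting filesystem read-only|EXT3-fs error'
}

# Usage (reads the log text from standard input):
#   dmesg | storage_errors
```

The same filter can be pointed at /var/log/messages by redirecting that
file into it.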

This is on a virtual server (VMware ESX) with the data files on a SAN, so
it's likely to have been an issue with the SAN connection.

I was unable to unmount /var even after going to runlevel 1.  After a
reboot everything appeared to be fine, just like in the OP's message.
Even so, I rebooted into runlevel 1 and unmounted /var to run a
filesystem check.

Both e2fsck -p /dev/logvg/var and e2fsck -b 32768 -p /dev/logvg/var
report the filesystem as clean.

Can we trust this filesystem?

Kind regards

Bram