[rhelv6-list] fsck -n always showing errors

Cale Fairchild cfairchild at brocku.ca
Thu Dec 21 13:09:03 UTC 2017


Have you checked the filesystem from a rescue disk, or does the fsck on reboot report that it is fixing errors each time? As far as I understand, running `fsck -n /` on the active root filesystem will almost always report some errors, because blocks in the filesystem are changing while fsck is running its passes. Hence the warning at the start of the run about the filesystem being mounted. Sorry if I am misunderstanding your process, but if you have not tried checking the filesystem after booting into rescue mode, that would be a good next step.
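For example, a minimal check from the rescue environment might look like this (the device names below are only placeholders for whatever your volumes actually are):

    # boot the RHEL 6 install media, choose "Rescue installed system",
    # and skip mounting the installed system so the filesystems stay unmounted
    e2fsck -f /dev/mapper/vg_root-lv_root    # forced full check of the root LV (placeholder name)
    e2fsck -f /dev/sda1                      # and the /boot partition, if it is separate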

From: rhelv6-list-bounces at redhat.com [mailto:rhelv6-list-bounces at redhat.com] On Behalf Of francis picabia
Sent: December 21, 2017 07:21
To: Red Hat Enterprise Linux 6 (Santiago) discussion mailing-list <rhelv6-list at redhat.com>
Subject: Re: [rhelv6-list] fsck -n always showing errors

fsck -n is used to verify only.
Touching /forcefsck forces a regular fsck of the not-yet-mounted
partitions at boot time.
So what I've done is:
fsck -n
touch /forcefsck
reboot
repeated three times.
It should actually be fixing the problems on each reboot.
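After one of those reboots you can at least confirm whether the boot-time fsck really ran; a rough sketch for an ext3/ext4 filesystem (the device name is a placeholder):
    tune2fs -l /dev/mapper/vg_root-lv_var | grep -E 'Filesystem state|Last checked|Mount count'
    # "Last checked" should match the reboot time and "Filesystem state" should read "clean"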
I can find at least some fsck errors on every Red Hat 6 machine,
whether virtual or physical.  I've tested the fsck -n status on about
twelve systems, and all of them show some errors.  Only two showed a history
of SCSI errors, both of which happen to be VMware guests.
Maybe some other people can test this on their Red Hat 6 systems
and see whether fsck -n /var or similar comes back clean.  You might
be surprised to see the same state I've noticed.  There is
no symptom like a read-only filesystem.  Everything is functional.
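For anyone trying this, one way to capture the result is to look at the exit status, assuming /var is a separate filesystem listed in /etc/fstab:
    fsck -n /var ; echo "fsck exit status: $?"
    # per fsck(8): 0 = no errors, 1 = errors corrected, 2 = reboot suggested,
    # 4 = errors left uncorrected (the value -n returns on a filesystem it considers dirty)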


On Wed, Dec 20, 2017 at 5:57 PM, Gianluca Cecchi <gianluca.cecchi at gmail.com> wrote:

On Wed, Dec 20, 2017 at 9:27 PM, francis picabia <fpicabia at gmail.com> wrote:

With one development box I did touch /forcefsck and rebooted.
Retested with fsck and there were still issues.  Repeated this cycle
three times with no improvement.

Hi,
not going into the reasons for the problem, but into your "cycle":
if I have understood your description correctly, you run fsck with the "-n" option, which automatically answers "no" to all the questions about problems and suggested fixes.
So, as you didn't fix anything, the next run of fsck exposes the same problems again....
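To actually repair anything, fsck has to run without -n on an unmounted (or boot-time, not-yet-mounted) filesystem; a sketch, with a placeholder device name:
    umount /var                             # only possible if nothing is holding it open
    fsck -y /dev/mapper/vg_root-lv_var      # -y answers "yes" to every repair prompt
    mount /var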

Sometimes in vSphere environments I have seen storage problems cause trouble for Linux VMs, with the kernel automatically putting one or more filesystems into read-only mode: typically the filesystems that had writes in progress when the problem occurred.
So in your case it could be something similar, affecting all the VMs that sit on the problematic storage / datastore.
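A quick way to spot that condition is to look for mounts carrying the ro flag; a sketch using /proc/mounts (field 4 is the option list, and this will also list anything deliberately mounted read-only, such as a CD-ROM):
    awk '$4 ~ /(^|,)ro(,|$)/ { print $1, $2 }' /proc/mounts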
If you have no monitoring in place, such as Nagios with a check like this:
https://exchange.nagios.org/directory/Plugins/Operating-Systems/Linux/check_ro_mounts/details
you can go on for days before realizing that you had a problem.
Analyzing /var/log/messages should show you when it happened.
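For example, something like this usually finds the relevant kernel messages (the exact wording varies by filesystem and driver, so treat the pattern as a starting point):
    grep -iE 'read-only|i/o error|ext[34]-fs error' /var/log/messages*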

Keep in mind that if the filesystem went read-only due to a SCSI error (an action the kernel takes to prevent further errors and data corruption), you will not be able to remount it read-write; you have to reboot the server.
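In other words, once the kernel has aborted the journal, an attempt like the following is typically refused (the mount point is only an example):
    mount -o remount,rw /var    # usually fails until the server has been rebooted and fsck has run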

Just a guess.
HIH,
Gianluca




