ext3 file system becoming read only

tweeks tweeks at rackspace.com
Tue Sep 25 18:27:04 UTC 2007


The EL4 kernel is wacky when it comes to the I/O scheduler locking up and
causing ext3 to remount RO.  Various hardware hiccups can cause it to go RO.
And when it does, you need to tread lightly or you could lose everything.
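
A quick way to check whether a box is already in this state is to look at
the mount table. The following is only a minimal sketch: the function name
list_ro_ext3 and its optional file argument are made up for illustration,
and it assumes the usual /proc/mounts layout (device, mount point, type,
options).

```shell
# List ext3 filesystems that are currently mounted read-only.
# Input is a mounts table in /proc/mounts format:
#   device mountpoint fstype options dump pass
# list_ro_ext3 and its optional file argument are illustrative names.
list_ro_ext3() {
    # Wrap the option list in commas so "ro" only matches as a whole option
    # (and not, for example, inside "errors=remount-ro").
    awk '$3 == "ext3" && index("," $4 ",", ",ro,") { print $1, $2 }' \
        "${1:-/proc/mounts}"
}

# Usage: list_ro_ext3            (reads /proc/mounts)
#        list_ro_ext3 some_file  (reads a saved copy of the table)
```

Any line it prints is an ext3 filesystem the kernel currently has mounted
read-only.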

If your ext3 filesystem had problems and remounted read-only, I would strongly
advise /against/ simply fscking it.  Oftentimes, when your filesystem has
gone RO, it may have been that way for 30 minutes or more.  Just rebooting or
fscking is a great way to lose everything (i.e. everything being dumped
into /lost+found/).

Instead, I would recommend:
1) Reboot into a rescue CD environment (without allowing the rescue
   environment to mount or fsck your filesystems).
2) Nuke the ext3 journal:
	tune2fs -O ^has_journal /dev/<rootfs>
   (possibly doing the same for other problem partitions)
3) Do a fake fsck to see the extent of the damage:
	fsck -fn /dev/<rootfs>
   (after checking things out, use "-fy" once you're sure that it's safe)
4) Rebuild the journal:
	tune2fs -j /dev/<rootfs>
   (rerun fsck at least once, until a "clean" result is repeatable)
5) Mount and check things out:
	mkdir /mnt/tmp && mount -t ext3 /dev/<rootfs> /mnt/tmp
6) Gracefully umount & reboot:
	umount /mnt/tmp && shutdown -rf now && exit

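Steps 2 through 4 above could be wrapped up roughly as follows. This is only
a sketch under stated assumptions, not a tested recovery tool: the names run,
ext3_inspect, ext3_repair and the DRY_RUN variable are hypothetical, the
device must be unmounted, and it is meant to be run from the rescue
environment. By default it only prints what it would do. The final read-only
fsck in ext3_repair reflects one reading of "rerun until clean".

```shell
# Hypothetical helpers; DRY_RUN=1 (the default) only prints the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Steps 2 and 3: drop the journal, then do a read-only check.
# Nothing is repaired here; review the fsck output before going further.
ext3_inspect() {
    dev=$1
    [ -n "$dev" ] || { echo "usage: ext3_inspect /dev/<rootfs>" >&2; return 2; }
    run tune2fs -O '^has_journal' "$dev" && run fsck -fn "$dev"
}

# Only after the damage looks repairable: fix, rebuild the journal (step 4),
# then check once more to see whether the "clean" result repeats.
ext3_repair() {
    dev=$1
    [ -n "$dev" ] || { echo "usage: ext3_repair /dev/<rootfs>" >&2; return 2; }
    run fsck -fy "$dev"
    rc=$?
    # fsck exit codes below 4 mean no errors were left uncorrected.
    [ "$rc" -lt 4 ] || return "$rc"
    run tune2fs -j "$dev" && run fsck -fn "$dev"
}
```

Calling "ext3_inspect /dev/<rootfs>" and reading the result by hand before
"DRY_RUN=0 ext3_repair /dev/<rootfs>" keeps the human review between the
fake fsck and the real one.
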
Tweeks

On Tuesday 25 September 2007 11:47, Swapana Ghosh wrote:
> Hi Jordi,
>
> Thanks for your reply.  I will test the way you suggested.
>
> Thanks
> -swapna
>
> --- Jordi Prats <jprats at cesca.es> wrote:
> > Hi,
> > It seems like what happened to me. I did this to solve the issue:
> >
> > Mark the filesystem as not having a journal (take it to ext2)
> >
> > tune2fs -O ^has_journal /dev/cciss/c0d0p2
> >
> > fsck it to delete the journal:
> >
> > e2fsck /dev/cciss/c0d0p2
> >
> > Create the journal (take it back to ext3)
> >
> > tune2fs -j /dev/cciss/c0d0p2
> >
> > and finally, remount it.
> >
> > In my case it was with a local disk, but with your SAN disk it should be
> > the same.
> >
> > Jordi
> >
> > Swapana Ghosh wrote:
> > > Hi
> > >
> > > In our office environment, a partition has been going "read only" on a
> > > few servers, mostly database servers, and yesterday it happened for the
> > > first time on an application server.
> > >
> > > I was checking the archives and found what may be similar issues in the
> > > 2007-July archives. If someone could describe how it was solved, that
> > > would be really helpful.
> >
> > > In our case, just as the problem started, we found the following line in
> > > the log file:
> > >
> > >      EXT3-fs error (device dm-12): ext3_find_entry: reading directory #2015496 offset 2
> > >
> > > Then one blank line, followed by:
> > >
> > >     Aborting journal on device dm-12.
> > >     ext3_abort called
> > >
> > >     Ext3-fs error (device dm-12): ext3_journal_start_sb: Detected aborted journal
> > >     Remounting filesystem read-only
> > >
> > > Then the following line repeats continuously:
> > >
> > >     EXT3-fs error (device dm-12) in start_transaction: Journal has aborted
> > >
> > >
> > >
> > > The above message repeats until we remount the filesystem and the
> > > partition becomes 'read-write' again.
> > >
> > > We could not figure out the root cause of the problem.
> > >
> > > We are using individual EMC LUNs, configured into LVM volume groups, with
> > > the filesystems mounted on logical volumes.
> > >
> > > Here is the server description:
> > >
> > > ____________________________________________________________
> > >
> > > [root@server ~]# lsmod |grep -i qla
> > > qla2300               130304  0
> > > qla2xxx_conf          305924  0
> > > qla2xxx               307448  21 qla2300
> > > scsi_mod              117709  5 sg,emcp,qla2xxx,cciss,sd_mod
> > >
> > > ____________________________________________________________
> > > [root@server ~]# cat /etc/modprobe.conf
> > > alias eth0 tg3
> > > alias eth1 tg3
> > > alias eth2 e1000
> > > alias eth3 e1000
> > > alias eth4 e1000
> > > alias eth5 e1000
> > > alias bond0 bonding
> > > alias scsi_hostadapter cciss
> > > options bond0 max_bonds=2 miimon=100 mode=1
> > > alias scsi_hostadapter1 qla2xxx
> > > alias scsi_hostadapter2 qla2xxx_conf
> > > #alias scsi_hostadapter3 qla6312
> > > options qla2xxx  ql2xmaxqdepth=16 qlport_down_retry=64 ql2xloginretrycount=30 ql2xfailover=0 ql2xlbType=0
> > > install qla2xxx /sbin/modprobe qla2xxx_conf; /sbin/modprobe --ignore-install qla2xxx
> > > remove qla2xxx /sbin/modprobe -r --first-time --ignore-remove qla2xxx && { /sbin/modprobe -r --ignore-remove qla2xxx_conf; }
> > > ###BEGINPP
> > > include /etc/modprobe.conf.pp
> > > ###ENDPP
> > > ###BEGINPP
> > > include /etc/modprobe.conf.pp
> > > ###ENDPP
> > > ###BEGINPP
> > > include /etc/modprobe.conf.pp
> > > ###ENDPP
> > >
> > > ________________________________________________
> > > [root@server ~]# rpm -qa |grep -i EMC
> > > EMCpower.LINUX-4.5.1-022
> > >
> > > ________________________________________________
> > > [root@server ~]# rpm -qa|grep -i scli
> > > scli-1.06.16-57
> > >
> > > ________________________________________________
> > > [root@server ~]# rpm -qa|grep -i nav
> > > naviagentcli-6.19.1.3.0-1
> > >
> > > ________________________________________________
> > >  product: QLA2312 Fibre Channel Adapter
> > >
> > > ________________________________________________
> > > [root@server ~]# rpm -qa|grep -i lvm
> > > lvm2-2.02.06-6.0.RHEL4
> > > system-config-lvm-1.0.19-1.0
> > >
> > > ________________________________________________
> > >
> > > If I missed any info, please let me know.
> > >
> > > It would be really appreciated if I get some hints to solve the issues
> > >
> > > Thanks in advance
> > > -swapana
>
> > > _______________________________________________
> > > Ext3-users mailing list
> > > Ext3-users at redhat.com
> > > https://www.redhat.com/mailman/listinfo/ext3-users
> >
> > --
> > ......................................................................
> >          __
> >         / /          Jordi Prats
> >   C E / S / C A      Dept. de Sistemes
> >       /_/            Centre de Supercomputació de Catalunya
> >
> >   Gran Capità, 2-4 (Edifici Nexus) · 08034 Barcelona
> >   T. 93 205 6464 · F.  93 205 6979 · jprats at cesca.es
> > ......................................................................
>



