Invalid superblock after e2fsck

Albert Sellarès whats at wekk.net
Sun Feb 21 18:57:14 UTC 2010


Hi,

The filesystem is on a SAN device that was connected to a 32-bit RHEL4
system.

To be able to check it with enough RAM and a newer kernel, I installed
Ubuntu Karmic on a new x86_64 machine and attached the SAN to it. The
new machine has 4 GB of RAM and 90 GB of swap.

Thinking about what you said, the only thing that can be "broken" in
the new system is the hard drive, because it is the only part that is
not completely new. However, I will also test the RAM.

By the way, today, before sending the first email, I downloaded the
latest version of e2fsprogs (only one version newer than the Ubuntu
package), compiled it, and started the filesystem check with the new
e2fsck. That e2fsck process is now at 60% and is using 28 GB of memory.
I don't understand why the first filesystem check (the one I described
in the first email) used almost no swap space. It is strange to see
such a big difference in memory usage between the two versions.

Is this "normal"? What could be happening?
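
A side note on the "Memory used" line printed by the first run (quoted
further down): the second figure, 18014398508105072k, is far larger than
any real machine could hold, so it is probably not a reliable guide to
how much memory that run needed. The short Python calculation below only
shows that the number sits just under 2^54 kilobytes, which is 2^64
bytes. Reading that as a byte counter that wrapped around is an
assumption, not something confirmed against the e2fsprogs source.

```python
# Figure copied from the "Pass 1: Memory used" line, in kilobytes.
reported_kib = 18014398508105072

# 2**64 bytes expressed in kilobytes (2**64 / 1024 == 2**54).
wrap_kib = 2**64 // 1024

# Distance between the reported figure and the wrap point, in kilobytes.
shortfall_kib = wrap_kib - reported_kib

print(wrap_kib)       # 18014398509481984
print(shortfall_kib)  # 1376912
```

If the wraparound reading is right, the real value cannot be recovered
from that line alone, so watching the process with top or
/proc/<pid>/status is probably a safer way to compare the two runs.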

Sorry to bother you with so many questions. :(

Thank you very much.

On Sun, 21 Feb 2010 at 09:38 -0800, Stephen Samuel (gmail) wrote:
> I'm worried that fsck is corrupting your filesystem. That implies
> that something else is seriously wrong -- either the controller, the
> kernel or the copy of e2fsck that you're using.
> 
> At this point, I'd suggest doing the FSCK from a CD.  If you have the
> time to work on this, you can try an 'fsck -n' to see if you get good
> results without writing to the drive.
> In the meantime, I would also verify the FSCK instance.   If you're
> running Red Hat, that would be an
>    rpm --verify {package-name}
> {package-name} is probably e2fsprogs, but I'm not currently on a
> redhat system to check that right now.   I'm not sure what the debian
> equivalent is.
> 
> Manually starting fsck also has the advantage that the system will be
> more resilient to errors. It asks questions before it does things, but
> it will ask to fix problems that would cause an abort during a
> pre-mount preen.
> 
> In any case, boot from a CD,  do a system memory check (to make sure
> you have no bad RAM), and then manually start the FSCK.   If all of
> that works, then I would say that your root filesystem is corrupt.
> Try using chkrootkit to see if there's a root kit installed, or just
> reload the OS.
> 
> 
> 2010/2/21 Albert Sellarès <whats at wekk.net>
>         Hi,
>         
>         e2fsck quits after the bad magic number message. I didn't paste the
>         entire e2fsck output (sorry!). The full output was this:
>         
>         root at pacs:~# e2fsck -C 0 -y -t /dev/storage/cabina_snapshot2
>         e2fsck 1.41.9 (22-Aug-2009)
>         /dev/storage/cabina_snapshot2 contains a file system with errors, check forced.
>         Pass 1: Checking inodes, blocks, and sizes
>         /dev/storage/cabina_snapshot2: |==================
>            / 60.0%
>         
>         [...]
>         
>         Up to 60% of the progress bar, there were no error messages. After
>         reaching 60%, e2fsck started fixing a lot of things. In the output I
>         could see which inode it was analyzing. Once e2fsck had analyzed the
>         last inode of the filesystem, it printed this message and exited:
>         
>         Pass 1: Memory used: 268k/18014398508105072k (59k/210k), time: 65673.66/1550.92/1371.06
>         Pass 1: I/O read: 253456MB, write: 23885MB, rate: 4.22MB/s
>         Restarting e2fsck from the beginning...
>         e2fsck: Superblock invalid, trying backup blocks...
>         e2fsck: Bad magic number in super-block while trying to open /dev/storage/cabina_snapshot
>         
>         
>         The superblock could not be read or does not describe a correct ext2
>         filesystem.  If the device is valid and it really contains an ext2
>         filesystem (and not swap or ufs or something else), then the superblock
>         is corrupt, and you might try running e2fsck with an alternate superblock:
>            e2fsck -b 8193 <device>
>         
>         About your questions:
>         
>         The content of the default superblock (I mean the superblock located
>         in block 0) was correct before I launched e2fsck, because I was able
>         to mount and use the filesystem. (I also dumped its content and
>         checked it.)
>         
>         After the filesystem check, the superblock was filled only with zeros
>         (no magic number, no inode counts, etc.): all 4096 bytes were zero.
>         
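
The check described above (dumping block 0 and looking at the
superblock) can be sketched in Python roughly as follows. This is an
illustration, not the commands that were actually used. It relies on the
standard ext2/ext3 layout, where the primary superblock starts at byte
offset 1024 of the device and the 16-bit little-endian magic number
0xEF53 sits at offset 56 inside it. The function names are hypothetical
and chosen only for this example.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # primary superblock starts 1024 bytes into the device
SUPERBLOCK_SIZE = 1024     # the superblock structure itself is 1024 bytes
MAGIC_OFFSET = 56          # offset of s_magic inside the superblock
EXT2_MAGIC = 0xEF53        # shared by ext2, ext3 and ext4


def inspect_block0(block0: bytes) -> dict:
    """Report whether block 0 is all zeros and whether the magic is present.

    `block0` is the first 4096 bytes of the device (hypothetical helper).
    """
    if len(block0) < SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE:
        raise ValueError("need at least the first 2048 bytes of the device")
    sb = block0[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + SUPERBLOCK_SIZE]
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    return {
        "all_zero": not any(block0),
        "magic": magic,
        "magic_ok": magic == EXT2_MAGIC,
    }


def read_block0(device_path: str, size: int = 4096) -> bytes:
    """Read the first `size` bytes of a device or image, read-only."""
    with open(device_path, "rb") as dev:
        return dev.read(size)
```

For a block like the one described above, inspect_block0() would report
all_zero as True and magic_ok as False. Opening the device in "rb" mode
keeps the check read-only, which matters when working on a snapshot that
should stay untouched.
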
>         
>         I have not tried running e2fsck against a backup superblock instead
>         of the default one, because I think it would produce the same result.
>         I'm sure that the original superblock was not corrupted, so I guess
>         that e2fsck would behave the same way with a backup superblock.
>         
>         What do you think? Should I try running it against a backup superblock
>         from the start?
>         
>         Any other ideas?
>         
>         Thank you very much!
>         
>         
>         On Sun, 21 Feb 2010 at 04:51 -0800, Stephen Samuel (gmail) wrote:
>         
>         > The system is using the backup superblock -- which sounds reasonable,
>         > under the circumstances, and should result in a half-decent recovery.
>         >
>         > A couple of questions:
>         > Was the superblock zero to begin with, or did it become zero during
>         > the FSCK?
>         > In either case, I'm worried about the zero data having been written.
>         > This is obviously worth further investigation.
>         > Did the system abort the FSCK after this error, or did you stop it?
>         > Did you try explicitly using one of the backup superblocks?
>         >   A list of backup superblocks can be found by using -n on mkfs.
>         >   Check the -b option on fsck.ext2 for more details on the backup
>         >   superblock defaults.
>         >
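
As a rough guide to where those backups live: with the sparse_super
feature, ext2/ext3 keeps backup superblocks in block group 1 and in
groups whose number is a power of 3, 5, or 7. The Python sketch below
lists the locations under assumed geometry (4096-byte blocks, 32768
blocks per group, first data block 0). That geometry is typical for a
large filesystem but has not been confirmed for this one, and the
function names and the example size are hypothetical. The output of
mkfs -n or dumpe2fs for the actual device remains the authoritative
source.

```python
from typing import List


def is_power_of(n: int, base: int) -> bool:
    """True if n == base**k for some k >= 0."""
    while n > 1 and n % base == 0:
        n //= base
    return n == 1


def backup_superblock_blocks(total_blocks: int,
                             blocks_per_group: int = 32768,
                             first_data_block: int = 0) -> List[int]:
    """Block numbers of backup superblocks, assuming sparse_super."""
    # Ceiling division: number of block groups in the filesystem.
    group_count = -(-(total_blocks - first_data_block) // blocks_per_group)
    backups = []
    for group in range(1, group_count):
        # Group 1 is base**0, so it is covered by the power test as well.
        if any(is_power_of(group, base) for base in (3, 5, 7)):
            backups.append(group * blocks_per_group + first_data_block)
    return backups


# Hypothetical example: a filesystem of 2,000,000 blocks of 4 KiB each.
print(backup_superblock_blocks(2_000_000))
# [32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632]
```

With 1024-byte blocks the geometry is 8192 blocks per group and first
data block 1, which gives 8193 as the first backup. That matches the
number in e2fsck's hint, so on a filesystem with 4096-byte blocks the
first backup to try would be 32768 rather than 8193.
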
>         > 2010/2/21 Albert Sellarès <whats at wekk.net>
>         >         Hi all,
>         >
>         >         I'm trying to fix a 7.5 TB ext3 filesystem using e2fsck on an
>         >         x86_64 machine with plenty of RAM. The filesystem is
>         >         corrupted, but I can mount it.
>         >
>         >         Before starting the filesystem check, I made an LVM snapshot
>         >         so that I can start again from the same point in case of error.
>         >
>         >         After 12 hours of checking the filesystem, I got this error
>         >         message:
>         >
>         >         Pass 1: Memory used: 268k/18014398508105072k (59k/210k), time: 65673.66/1550.92/1371.06
>         >         Pass 1: I/O read: 253456MB, write: 23885MB, rate: 4.22MB/s
>         >         Restarting e2fsck from the beginning...
>         >         e2fsck: Superblock invalid, trying backup blocks...
>         >         e2fsck: Bad magic number in super-block while trying to open /dev/storage/cabina_snapshot
>         >
>         >         When I saw this message, my first thought was that e2fsck had
>         >         not managed to fix the filesystem and had corrupted the
>         >         superblock. To be sure, I dumped the entire block and compared
>         >         it against the original superblock. Doing that, I realized
>         >         that the entire superblock contained only zeros.
>         >
>         >         Any ideas about what I can do?
>         >
>         >
>         > --
>         > Stephen Samuel http://www.bcgreen.com  Software, like love,
>         > 778-861-7641                              grows when you
>         give it away
>         
>         
>         --
>          Albert Sellarès        GPG id: 0x13053FFE
>          http://www.wekk.net    whats at jabber.org
>          Linux User: 324456
>         
> 
> 
> 
> -- 
> Stephen Samuel http://www.bcgreen.com  Software, like love, 
> 778-861-7641                              grows when you give it away
-- 
  Albert Sellarès        GPG id: 0x13053FFE
  http://www.wekk.net    whats at jabber.org 
  Linux User: 324456                