recovering corrupt file system

Fri Nov 20 21:08:40 UTC 2015

You can try using the secondary superblock:
fsck -b 32768 /dev/whatever

This presumes that you're using 4K blocks in the filesystem. You can get
a (more accurate) list of available secondary superblocks by running mkfs
with -n, which only reports what it would do without creating anything:

mkfs -n -{other options used to make the filesystem}  /dev/whatever
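
If you only want a rough idea of where the backups should be, the arithmetic can be sketched as below. This is a minimal sketch, not a substitute for the mkfs -n output: it assumes the mke2fs defaults (8 * block size blocks per group, sparse_super enabled, so backups live in group 1 and in groups that are powers of 3, 5 and 7). The function names are mine, and the block count is the one e2fsck reported further down in this thread.

```python
def _is_power_of(n, base):
    """True if n is base**k for some k >= 0 (n must be positive)."""
    if n < 1:
        return False
    while n % base == 0:
        n //= base
    return n == 1


def backup_superblock_blocks(total_blocks, block_size=4096, sparse_super=True):
    """Return the block numbers where backup superblocks would normally sit.

    Assumes default mke2fs geometry: one block bitmap block per group,
    hence 8 * block_size blocks per group.
    """
    blocks_per_group = 8 * block_size
    # With 1K blocks the primary superblock occupies block 1, so every
    # group starts one block later; with larger blocks it starts at 0.
    first_data_block = 1 if block_size == 1024 else 0
    data_blocks = total_blocks - first_data_block
    group_count = -(-data_blocks // blocks_per_group)  # ceiling division

    def has_backup(group):
        if not sparse_super:
            return True
        return any(_is_power_of(group, base) for base in (3, 5, 7))

    return [
        first_data_block + group * blocks_per_group
        for group in range(1, group_count)
        if has_backup(group)
    ]


if __name__ == "__main__":
    # 67446784 is the total block count shown in the e2fsck summary below.
    for block in backup_superblock_blocks(67446784):
        print(block)
```

Any of the printed numbers could be passed to fsck -b in place of 32768, provided the filesystem really was created with those defaults.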

On Thu, Nov 19, 2015 at 7:06 PM, Boylan, Ross <Ross.Boylan at ucsf.edu> wrote:

> I tried the "just run e2fsck" approach, but it reduced the filesystem to
> almost nothing.  Before, there were nearly 700G of files; afterwards there
> were 70G.*  Also, the overall filesystem size shrank to under 300G.  I ran
> resize2fs to get the space back, but of course that didn't get the files
> back.
>
> This seems like an awful lot of damage from losing a total of 8,192 bytes
> out of ~700G.  Maybe the first block of zeros caused the recovery to decide
> it had reached the end?  The logical volume got about 64GB from the first,
> presumably OK, virtual disk.  The holes in the file occur around 174GB into
> the 2nd virtual hard disk.
>
> I've still got copies from before e2fsck, and I'm still interested in
> recovering them (lots of recorded shows on them).
>
> Sorry about the top-posting; my mail client doesn't provide a good way to
> do otherwise.
> Ross
>
> *I had expected only to lose the newer files, but a lot of the program
> files seem to be gone too. startx doesn't exist, for example.
>
>
> From: Boylan, Ross
> Sent: Thursday, November 19, 2015 1:13 PM
> To: Stephen Samuel
> Cc: Ext3-users at redhat.com
> Subject: RE: recovering corrupt file system
>
> Thanks for the pointer.  Turning to my other bad file system, I could use
> some help interpreting e2fsck.  I have the source and have been looking at
> various web resources, so I suppose I could figure this out eventually.
>
>
>
> Actually, maybe I should ask a simpler question: should I just run e2fsck,
> accepting its recommendations, and live with the results?  No matter what I
> do I don't think I can recover any more information.
>
>
>
>
> Here's a little diagram:
>
> media01/root                        # LVM logical volume on which the ext4 filesystem resides
> VM's sda, sdb, various partitions   # physical volumes making up the media01VG
> ------ virtual machine above here ---
> --- physical machine/ host below here -------------------
> media01b.vdi                        # host file backing virtual disk sdb
> # note I have made a spare copy of media01b.vdi.
> # The file backing virtual sda had no hardware problems.
> ## various more layers here
> physical disk
>
>
>
> The physical disk at the bottom is failing. I used (g)ddrescue to copy as
> much of the media01b.vdi file as I could; the file is about 700G, and there
> were 2 chunks of 0x1000 bytes that could not be recovered and are now 0
> filled.
>
>
>
> The basic structure of the virtual disks appears intact: the partition
> tables are still there and the logical volumes can still be assembled.
>
>
>
> If it's worth getting into the details, here's what e2fsck says when run
> inside another VM that has the problem disks temporarily inserted.  What do
> the individual block bitmap differences mean?  I'm guessing + and - indicate
> whether the block was found only in the scan or only in the file system
> tables on disk, but I don't know which.  And what do the numbers mean?
> Offsets in bytes? sectors? relative to ??
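
As far as I can tell from pass 5 of e2fsck, the numbers are filesystem block numbers (not bytes or sectors), counted from the start of the filesystem. A "+" entry is a block that the scan found in use but that the on-disk bitmap shows as free, and a "-" entry is a block marked in use on disk that nothing references. Treat that reading as my interpretation rather than gospel. The sketch below groups such entries by block group so they can be compared with the group descriptor numbers e2fsck complains about. It assumes 4K blocks and 32768 blocks per group (consistent with 67446784 blocks and 16867328 inodes in 2059 groups); the function names are made up for illustration, and the sample string uses a few entries copied from the output below.

```python
import re

BLOCK_SIZE = 4096          # assumed: 4K filesystem blocks
BLOCKS_PER_GROUP = 32768   # assumed: default for 4K blocks

# Matches "+123", "-123", "+(123--456)" and "-(123--456)".
ENTRY = re.compile(r"([+-])\(?(\d+)(?:--(\d+))?\)?")


def parse_differences(text):
    """Yield (sign, first_block, last_block) for each bitmap entry."""
    for sign, first, last in ENTRY.findall(text):
        first_block = int(first)
        last_block = int(last) if last else first_block
        yield sign, first_block, last_block


def summarize_by_group(text, blocks_per_group=BLOCKS_PER_GROUP):
    """Count '+' and '-' blocks per block group.

    A range that crosses a group boundary is split at the boundary.
    """
    summary = {}
    for sign, first_block, last_block in parse_differences(text):
        block = first_block
        while block <= last_block:
            group = block // blocks_per_group
            group_end = (group + 1) * blocks_per_group - 1
            chunk_end = min(last_block, group_end)
            counts = summary.setdefault(group, {"+": 0, "-": 0})
            counts[sign] += chunk_end - block + 1
            block = chunk_end + 1
    return summary


def byte_offset(block, block_size=BLOCK_SIZE):
    """Byte offset of a block from the start of the logical volume."""
    return block * block_size


if __name__ == "__main__":
    sample = ("+15243264 +(15511418--15511423) "
              "-(15813349--15813503) -(67436544--67446783)")
    for group, counts in sorted(summarize_by_group(sample).items()):
        print(group, counts)
```

With those assumptions the first entry lands in group 465 and the third in group 482, which are two of the groups whose descriptor checksums are reported as invalid, and the last range covers the final, partial group of the filesystem.
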
>
>
>
> root at wheezy02:~# e2fsck -vn /dev/media01-vg/root
> e2fsck 1.42.12 (29-Aug-2014)
> One or more block group descriptor checksums are invalid.  Fix? no
>
> Group descriptor 465 checksum is 0x5e7a, should be 0xa22b.  IGNORED.
> Group descriptor 482 checksum is 0x69eb, should be 0x73a5.  IGNORED.
> Group descriptor 485 checksum is 0xbd9b, should be 0x21c9.  IGNORED.
> Group descriptor 496 checksum is 0xe550, should be 0x9a62.  IGNORED.
> Group descriptor 508 checksum is 0xf4d0, should be 0x2466.  IGNORED.
> /dev/media01-vg/root contains a file system with errors, check forced.
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
>
> Block bitmap differences:  +15243264 +(15511418--15511423)
> +(15511488--15511551) +(15523264--15523327) +(15812608--15813174)
> -(15813349--15813503) -(15813632--15814054) +(15814656--15815176)
> +(15815680--15816191) -(15816505--15816703) -(15823872--15824895)
> -(15850505--15851519) -(15852544--15853567) -(15896583--15898623)
> +(16029735--16029759) +(16029786--16029823) +(16029852--16029855)
> +(16029884--16031743) -(16261152--16261954) -(16263740--16263743)
> -16459791 +(16459795--16459799) -(16459808--16459814)
> -(16459825--16459839) -(16460000--16460026) -(16460288--16460799)
> +(16668689--16670719) +(17210175--17211391) +17498112
> -(17498624--17499135) -(17499262--17500159) +17534976
> -(17536000--17537023) -(17953505--17954815) +(18026714--18028543)
> +(18031327--18032639) -(18032653--18034687) +(18062252--18063359)
> +(18655232--18657173) -(19314688--19316735) -(19331072--19333119)
> -19406880 -25174048 -(41954376--41955015) -(42999849--42999850)
> -(42999852--42999859) -(42999861--42999862) -(43524128--43546654)
> -(45621280--45625870) -(45637632--45641520) -46669856
> -(48242720--48246756) -(67436544--67446783)
>
> Fix? no
>
>
>
> Free blocks count wrong for group #473 (0, counted=134).
> Fix? no
>
> ## quite a few more Free blocks wrong messages
>
> Free blocks count wrong (48434540, counted=48384585).
> Fix? no
>
> Inode bitmap differences:  -4849670
> Fix? no
>
> Free inodes count wrong for group #592 (8187, counted=8186).
> Fix? no
>
> Free inodes count wrong (16811016, counted=16811015).
> Fix? no
>
> Padding at end of block bitmap is not set. Fix? no
>
> /dev/media01-vg/root: ********** WARNING: Filesystem still has errors **********
>
>
>
>
>
>        56312 inodes used (0.33%, out of 16867328)
>           91 non-contiguous files (0.2%)
>           53 non-contiguous directories (0.1%)
>              # of inodes with ind/dind/tind blocks: 0/0/0
>              Extent depth histogram: 51385/43
>     19012244 blocks used (28.19%, out of 67446784)
>            0 bad blocks
>           14 large files
>
>        46167 regular files
>         5101 directories
>           12 character device files
>           25 block device files
>            0 fifos
>
>
>
>
>
> From: darkonc at gmail.com [darkonc at gmail.com] on behalf of Stephen Samuel [samuel at bcgreen.com]
> Sent: Thursday, November 19, 2015 8:00 AM
> To: Boylan, Ross
> Cc: Ext3-users at redhat.com
> Subject: Re: recovering corrupt file system
>
> Well, the next place to go, if fsck isn't enough, would be to try
> debugfs(8); see man debugfs.
>
>
>
> On Wed, Nov 18, 2015 at 8:39 PM, Boylan, Ross
> <Ross.Boylan at ucsf.edu> wrote:
>
>
> I guess some of the trouble was that the virtual disk was mounted
> read-only at the VM level.  When I mounted it read/write I was able to run
> fsck, which gave messages about replaying the logs and a couple of messages
> about changing the inode counts (sorry, I don't have the exact words).
> Then I ran fsck -f, which didn't report any problems.  Then I mounted it,
> and everything seems OK.
>
>
>
> I'm still interested in the general question about how to diagnose and
> recover from file system errors, since I have another virtual machine that
> was backed by a failing real disk.
>
> ________________________________________
>
> From: Boylan, Ross
> Sent: Wednesday, November 18, 2015 4:35 PM
> To: Ext3-users at redhat.com
> Subject: recovering corrupt file system
>
> Any recommendations for tools to diagnose and recover problems on an ext4
> file system?
>
>
>
> In particular:
>
> root at jessie01:~# mount -o ro /dev/markov02/root /mnt/markov02
> mount: wrong fs type, bad option, bad superblock on /dev/mapper/markov02-root,
>        missing codepage or helper program, or other error
>
>        In some cases useful info is found in syslog - try
>        dmesg | tail or so.
>
> and e2fsck says
>
> root at jessie01:~# e2fsck /dev/markov02/root
> e2fsck 1.42.12 (29-Aug-2014)
> /dev/markov02/root: recovering journal
> Superblock needs_recovery flag is clear, but journal has data.
>
>
>
> markov02/root is an LVM volume, built on partitions from 2 disks in a
> virtual machine.  The initial symptom was that the VM the disks were in
> originally would only get as far as busybox when it started.  However, I
> think the filesystem was OK even after that, since it was visible in
> busybox and in another VM.  I think virt-manager might have overwritten
> one of the disks because I left "allocate entire disk now" checked when I
> moved one of the disks between machines.
>
>
>
> I'm making copies of the virtual disks now.
>
> Ross Boylan
>
>
>
> --
>
> Stephen Samuel
> http://www.bcgreen.com  Software, like love,
>
> 778-861-7641                              grows when you give it away

-- 
Stephen Samuel http://www.bcgreen.com  Software, like love,
778-861-7641                              grows when you give it away