From lakshmipathi.g at gmail.com  Wed Nov 18 19:00:15 2015
From: lakshmipathi.g at gmail.com (Lakshmipathi.G)
Date: Thu, 19 Nov 2015 00:30:15 +0530
Subject: parse raw image to read block group desc table!
In-Reply-To: <8E3C2A0D-ABF0-48DC-94AE-80249007FFC0@dilger.ca>
References: <CAJ_ObUHOy0y69jLTZUVp+Bkoth2iaY0XGKPJJg9oNH7eb-oJ+A@mail.gmail.com>
	<8E3C2A0D-ABF0-48DC-94AE-80249007FFC0@dilger.ca>
Message-ID: <CAKuJGC9KQpa20VgjvudEgTokEkqGzTnhUHUd8xqa8oRA=JQc0A@mail.gmail.com>

As Andreas mentioned, using libext2fs (http://giis.co.in/libext2fs.pdf) is
100 times easier than writing your own code, and I learned this the hard
way :p The first two links are insane; I highly recommend not opening
them ;) The third link uses libext2fs.

https://github.com/Lakshmipathi/giis/blob/master/src/init.c#L195
http://giis.co.in/Kick_start.html

https://github.com/Lakshmipathi/giis-ext4/blob/master/src/giis-ext4.c
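
For comparison, this is roughly what the hand-rolled approach involves. It is a
minimal sketch, not code from either project linked above: it assumes a
little-endian ext2/3/4 image with classic 32-byte group descriptors (no 64bit
feature), and the function names are made up for illustration. Anything beyond
this (checksums, flex_bg, 64-bit fields) is where libext2fs starts to pay off.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # the primary superblock starts 1024 bytes into the volume
EXT_MAGIC = 0xEF53         # s_magic shared by ext2, ext3 and ext4
DESC_SIZE = 32             # assumption: 32-byte descriptors (64bit feature off)


def read_superblock(image):
    """Pull the few superblock fields needed to locate the descriptor table."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    (magic,) = struct.unpack_from("<H", sb, 56)
    if magic != EXT_MAGIC:
        raise ValueError("bad magic 0x%04x, not an ext2/3/4 superblock" % magic)
    (blocks_count,) = struct.unpack_from("<I", sb, 4)
    first_data_block, log_block_size = struct.unpack_from("<II", sb, 20)
    (blocks_per_group,) = struct.unpack_from("<I", sb, 32)
    return {
        "blocks_count": blocks_count,
        "first_data_block": first_data_block,
        "block_size": 1024 << log_block_size,
        "blocks_per_group": blocks_per_group,
    }


def read_group_descriptors(image):
    """Return one dict per block group, read from the primary descriptor table."""
    sb = read_superblock(image)
    usable = sb["blocks_count"] - sb["first_data_block"]
    group_count = -(-usable // sb["blocks_per_group"])   # ceiling division
    # The table starts in the block right after the one holding the superblock.
    table = (sb["first_data_block"] + 1) * sb["block_size"]
    names = ("block_bitmap", "inode_bitmap", "inode_table",
             "free_blocks", "free_inodes", "used_dirs")
    groups = []
    for index in range(group_count):
        fields = struct.unpack_from("<IIIHHH", image, table + index * DESC_SIZE)
        groups.append(dict(zip(names, fields)))
    return groups
```

To try it on a real image, pass in the first few blocks of the device or file,
for example `open(path, "rb").read(1 << 20)`; a large file system has a longer
descriptor table and needs a larger read.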


----
Cheers,
Lakshmipathi.G
FOSS Programmer.
www.giis.co.in

From Ross.Boylan at ucsf.edu  Thu Nov 19 00:35:25 2015
From: Ross.Boylan at ucsf.edu (Boylan, Ross)
Date: Thu, 19 Nov 2015 00:35:25 +0000
Subject: recovering corrupt file system
Message-ID: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>

Any recommendations for tools to diagnose and recover problems on an ext4 file system?

In particular:
root@jessie01:~# mount -o ro /dev/markov02/root /mnt/markov02
mount: wrong fs type, bad option, bad superblock on /dev/mapper/markov02-root,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
and e2fsck says
root@jessie01:~# e2fsck /dev/markov02/root
e2fsck 1.42.12 (29-Aug-2014)
/dev/markov02/root: recovering journal
Superblock needs_recovery flag is clear, but journal has data.

markov02/root is an LVM volume, built on partitions from 2 disks in a virtual machine.  The initial symptom was that the VM the disks were originally in would only get as far as busybox when it started.  However, I think the filesystem was OK even after that, since it was visible in busybox and in another VM.  I think virt-manager might have overwritten one of the disks because I left "allocate entire disk now" checked when I moved one of the disks between machines.

I'm making copies of the virtual disks now.
Ross Boylan


From Ross.Boylan at ucsf.edu  Thu Nov 19 04:39:14 2015
From: Ross.Boylan at ucsf.edu (Boylan, Ross)
Date: Thu, 19 Nov 2015 04:39:14 +0000
Subject: recovering corrupt file system
In-Reply-To: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
References: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
Message-ID: <F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>

I guess some of the trouble was that the virtual disk was mounted read-only at the VM level.  When I mounted read/write I was able to do fsck, which gave messages about replaying the logs and a couple messages about changing the inode counts (sorry, don't have the exact words).  Then I ran fsck -f, which didn't report any problems.  Then I mounted it, and everything seems OK.

I'm still interested in the general question about how to diagnose and recover from file system errors, since I have another virtual machine that was backed by a failing real disk.
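
When the fsck, fsck -f, mount sequence is scripted rather than typed, the exit
status of e2fsck says whether a run found or fixed anything. The sketch below
decodes that status using the bit values documented in the e2fsck(8) man page;
the function name is made up for illustration, and how the status is obtained
(for example from `subprocess.run(...).returncode`) is left to the caller.

```python
from typing import List

# Bit values from the "EXIT CODE" section of e2fsck(8); they can be OR-ed together.
E2FSCK_EXIT_BITS = {
    1: "file system errors corrected",
    2: "file system errors corrected, system should be rebooted",
    4: "file system errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "e2fsck canceled by user request",
    128: "shared library error",
}


def describe_e2fsck_status(status: int) -> List[str]:
    """Translate an e2fsck exit status into the conditions it encodes."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in E2FSCK_EXIT_BITS.items() if status & bit]
```

A status with bit 4 set after a repair run is the case worth stopping on,
since it means the file system still has errors.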


From samuel at bcgreen.com  Thu Nov 19 16:00:21 2015
From: samuel at bcgreen.com (Stephen Samuel)
Date: Thu, 19 Nov 2015 08:00:21 -0800
Subject: recovering corrupt file system
In-Reply-To: <F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>
References: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>
Message-ID: <CALp1NBjWDnYbM=JRCbODWZCM-n0xB44rHM=obCTkRcKuy+1_zg@mail.gmail.com>

Well, the next place to go, if fsck isn't enough, would be to try
debugfs(1):
man debugfs



-- 
Stephen Samuel http://www.bcgreen.com  Software, like love,
778-861-7641                              grows when you give it away

From Ross.Boylan at ucsf.edu  Thu Nov 19 21:13:09 2015
From: Ross.Boylan at ucsf.edu (Boylan, Ross)
Date: Thu, 19 Nov 2015 21:13:09 +0000
Subject: recovering corrupt file system
In-Reply-To: <CALp1NBjWDnYbM=JRCbODWZCM-n0xB44rHM=obCTkRcKuy+1_zg@mail.gmail.com>
References: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>,
	<CALp1NBjWDnYbM=JRCbODWZCM-n0xB44rHM=obCTkRcKuy+1_zg@mail.gmail.com>
Message-ID: <F1F13E14A610474196571953929C02090BDB7DFD@ex08.net.ucsf.edu>

Thanks for the pointer.  Turning to my other bad file system, I could use some help interpreting e2fsck.  I have the source and have been looking at various web resources, so I suppose I could figure this out eventually.

Actually, maybe I should ask a simpler question: should I just run e2fsck, accepting its recommendations, and live with the results?  No matter what I do I don't think I can recover any more information.

Here's a little diagram:
media01/root   # LVM logical volume on which the ext4 filesystem resides
VM's sda, sdb, various partitions   # physical volumes making up the media01VG
------ virtual machine above here ---
--- physical machine/ host below here -------------------
media01b.vdi                    # host file backing virtual disk sdb
# note I have made a spare copy of media01b.vdi.
# The file backing virtual sda had no hardware problems.
## various more layers here
physical disk

The physical disk at the bottom is failing. I used (g)ddrescue to copy as much of the media01b.vdi file as I could; the file is about 700G, and there were 2 chunks of 0x1000 bytes that could not be recovered and are now 0 filled.

The basic structure of the virtual disks appears intact: the partition tables are still there and the logical volumes can still be assembled.

If it's worth getting into the details, here's what e2fsck says when run inside another VM that has the problem disks temporarily attached.  What do the individual block bitmap differences mean?  I'm guessing + and - indicate whether the block was found only in the scan or only in the on-disk file system tables, but I don't know which.  And what do the numbers mean?  Offsets in bytes? sectors? relative to ??

root@wheezy02:~# e2fsck -vn /dev/media01-vg/root
e2fsck 1.42.12 (29-Aug-2014)
One or more block group descriptor checksums are invalid.  Fix? no

Group descriptor 465 checksum is 0x5e7a, should be 0xa22b.  IGNORED.
Group descriptor 482 checksum is 0x69eb, should be 0x73a5.  IGNORED.
Group descriptor 485 checksum is 0xbd9b, should be 0x21c9.  IGNORED.
Group descriptor 496 checksum is 0xe550, should be 0x9a62.  IGNORED.
Group descriptor 508 checksum is 0xf4d0, should be 0x2466.  IGNORED.
/dev/media01-vg/root contains a file system with errors, check forced.
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences:  +15243264 +(15511418--15511423) +(15511488--15511551)
 +(15523264--15523327) +(15812608--15813174) -(15813349--15813503)
 -(15813632--15814054) +(15814656--15815176) +(15815680--15816191)
 -(15816505--15816703) -(15823872--15824895) -(15850505--15851519)
 -(15852544--15853567) -(15896583--15898623) +(16029735--16029759)
 +(16029786--16029823) +(16029852--16029855) +(16029884--16031743)
 -(16261152--16261954) -(16263740--16263743) -16459791 +(16459795--16459799)
 -(16459808--16459814) -(16459825--16459839) -(16460000--16460026)
 -(16460288--16460799) +(16668689--16670719) +(17210175--17211391) +17498112
 -(17498624--17499135) -(17499262--17500159) +17534976 -(17536000--17537023)
 -(17953505--17954815) +(18026714--18028543) +(18031327--18032639)
 -(18032653--18034687) +(18062252--18063359) +(18655232--18657173)
 -(19314688--19316735) -(19331072--19333119) -19406880 -25174048
 -(41954376--41955015) -(42999849--42999850) -(42999852--42999859)
 -(42999861--42999862) -(43524128--43546654) -(45621280--45625870)
 -(45637632--45641520) -46669856 -(48242720--48246756) -(67436544--67446783)
Fix? no

Free blocks count wrong for group #473 (0, counted=134).
Fix? no

## quite a few more Free blocks wrong messages

Free blocks count wrong (48434540, counted=48384585).
Fix? no

Inode bitmap differences:  -4849670
Fix? no

Free inodes count wrong for group #592 (8187, counted=8186).
Fix? no

Free inodes count wrong (16811016, counted=16811015).
Fix? no

Padding at end of block bitmap is not set. Fix? no


/dev/media01-vg/root: ********** WARNING: Filesystem still has errors **********


       56312 inodes used (0.33%, out of 16867328)
          91 non-contiguous files (0.2%)
          53 non-contiguous directories (0.1%)
             # of inodes with ind/dind/tind blocks: 0/0/0
             Extent depth histogram: 51385/43
    19012244 blocks used (28.19%, out of 67446784)
           0 bad blocks
          14 large files

       46167 regular files
        5101 directories
          12 character device files
          25 block device files
           0 fifos
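
On the question of what the block bitmap differences mean: the numbers are
file system block numbers counted from the start of the file system (here the
logical volume), not bytes or sectors. Going by the pass 5 code in the e2fsck
source, `+N` is a block that the scan found in use but that is not marked in
the on-disk bitmap, and `-N` is a block that is marked in the bitmap but that
nothing references. The sketch below parses the list and maps a block to its
block group and byte offset. It assumes 4 KiB blocks and 32768 blocks per
group, which the output does not state; block 15511418 then lands in group
473, which agrees with the "Free blocks count wrong for group #473" line, so
the assumption looks plausible. The function names are illustrative.

```python
import re

BLOCK_SIZE = 4096          # assumption: 4 KiB blocks
BLOCKS_PER_GROUP = 32768   # assumption: mke2fs default for 4 KiB blocks

# Matches "+123", "-123", "+(100--200)" and "-(100--200)".
_TOKEN = re.compile(r"([+-])(?:\((\d+)--(\d+)\)|(\d+))")


def parse_bitmap_differences(text):
    """Return (sign, first_block, last_block) tuples from the differences list.

    Pass only the "Block bitmap differences:" text, not the whole e2fsck log.
    """
    joined = text.replace("\\\n", "")   # undo terminal wraps that split a token
    ranges = []
    for sign, first, last, single in _TOKEN.findall(joined):
        if single:
            first = last = single
        ranges.append((sign, int(first), int(last)))
    return ranges


def group_of(block, blocks_per_group=BLOCKS_PER_GROUP, first_data_block=0):
    """Block group that contains the given file system block."""
    return (block - first_data_block) // blocks_per_group


def byte_offset(block, block_size=BLOCK_SIZE):
    """Byte offset of a block from the start of the file system."""
    return block * block_size
```

For example, `group_of(15511418)` gives 473, and summing `last - first + 1`
over the `-` entries gives the number of blocks e2fsck would release if
allowed to fix the bitmap.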


From Ross.Boylan at ucsf.edu  Fri Nov 20 03:06:16 2015
From: Ross.Boylan at ucsf.edu (Boylan, Ross)
Date: Fri, 20 Nov 2015 03:06:16 +0000
Subject: recovering corrupt file system
In-Reply-To: <F1F13E14A610474196571953929C02090BDB7DFD@ex08.net.ucsf.edu>
References: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>,
	<CALp1NBjWDnYbM=JRCbODWZCM-n0xB44rHM=obCTkRcKuy+1_zg@mail.gmail.com>,
	<F1F13E14A610474196571953929C02090BDB7DFD@ex08.net.ucsf.edu>
Message-ID: <F1F13E14A610474196571953929C02090BDB8B90@ex08.net.ucsf.edu>

I tried the "just run e2fsck" approach, but it reduced the filesystem to almost nothing.  Before there were nearly 700G of files; after there were 70G.*  Also, the overall filesystem size shrank to under 300G.  I ran resize2fs to get the space back, but of course that didn't get the files back.

This seems like an awful lot of damage from losing a total of 8,192 bytes out of ~700G.  Maybe the first block of zeros caused the recovery to decide it had reached the end?  The logical volume got about 64GB from the first, presumably OK, virtual disk.  The holes in the file occur around 174GB into the 2nd virtual hard disk.

I've still got copies from before e2fsck, and I'm still interested in recovering them (lots of recorded shows on them).

Sorry about the top-posting; my mail client doesn't provide a good way to do otherwise.
Ross

*I had expected only to lose the newer files, but a lot of the program files seem to be gone too. startx doesn't exist, for example.
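
The sizes above can be cross-checked against the totals in the earlier
`e2fsck -vn` run, which reported 67446784 blocks in the file system and
19012244 blocks in use. The conversion below is a small sketch that assumes
4 KiB blocks; the thread does not confirm the block size, and the helper name
is illustrative.

```python
BLOCK_SIZE = 4096      # assumption: 4 KiB blocks
GIB = 1024 ** 3


def blocks_to_gib(blocks, block_size=BLOCK_SIZE):
    """Convert a block count into GiB for a given block size."""
    return blocks * block_size / GIB


total_gib = blocks_to_gib(67446784)   # "out of 67446784" in the e2fsck summary
used_gib = blocks_to_gib(19012244)    # "19012244 blocks used" in the same summary
print("total %.1f GiB, used %.1f GiB" % (total_gib, used_gib))
```

Under that assumption the pre-repair metadata already described a file system
of about 257 GiB with about 72.5 GiB in use. Whether that accounts for the
sizes seen after the repair is not settled here.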




From keld at keldix.com  Fri Nov 20 18:48:40 2015
From: keld at keldix.com (Keld Simonsen)
Date: Fri, 20 Nov 2015 20:48:40 +0200
Subject: recovering corrupt file system
In-Reply-To: <F1F13E14A610474196571953929C02090BDB8B90@ex08.net.ucsf.edu>
References: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>
	<CALp1NBjWDnYbM=JRCbODWZCM-n0xB44rHM=obCTkRcKuy+1_zg@mail.gmail.com>
	<F1F13E14A610474196571953929C02090BDB7DFD@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB8B90@ex08.net.ucsf.edu>
Message-ID: <20151120184840.GC1784@rap.rap.dk>

On Fri, Nov 20, 2015 at 03:06:16AM +0000, Boylan, Ross wrote:
> I tried the "just run e2fsck", but it reduced the filesystem to almost nothing.  Before there were nearly 700G of files; after there were 70G.*  Also, the overall filesystem size shrunk to under 300G.  I ran resize2fs to  get the  space back, but of course that didn't get the files back.
> 
> This seems like an awful lot of damage from losing a total of 8,192 bytes out of ~700G.  Maybe the first block of zeros caused the recovery to decide it had reached the end?  The logical volume got about 64GB from the first, presumably OK, virtual disk.  The holes in the file occur around 174GB into the 2nd virtual hard disk.
> 
> I've still got copies from before e2fsck, and I'm still interested in recovering them (lots of recorded shows on them).
> 
> Sorry about the top-posting; my mail client doesn't provide good way to do otherwise.
> Ross
> 
> *I had expected only to lose the newer files, but a lot of the program files seem to be gone too. startx doesn't exist, for example.

I once made some patches to recover files from a corrupted ext3 file system.
They never got into the main branch and they are not maintained,
but you could try them out. I am not sure if they are any good for an ext4 FS.

http://www.open-std.org/keld/readme-salvage.html

Best regards
keld


From samuel at bcgreen.com  Fri Nov 20 21:08:40 2015
From: samuel at bcgreen.com (Stephen Samuel)
Date: Fri, 20 Nov 2015 13:08:40 -0800
Subject: recovering corrupt file system
In-Reply-To: <F1F13E14A610474196571953929C02090BDB8B90@ex08.net.ucsf.edu>
References: <F1F13E14A610474196571953929C02090BDB3699@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB3DF3@ex08.net.ucsf.edu>
	<CALp1NBjWDnYbM=JRCbODWZCM-n0xB44rHM=obCTkRcKuy+1_zg@mail.gmail.com>
	<F1F13E14A610474196571953929C02090BDB7DFD@ex08.net.ucsf.edu>
	<F1F13E14A610474196571953929C02090BDB8B90@ex08.net.ucsf.edu>
Message-ID: <CALp1NBhiyxPLuObm6Y3D9zDAXeCWee72UWx20mOQ+4v12QQARA@mail.gmail.com>

You can try using a secondary (backup) superblock:
fsck -b 32768 /dev/whatever

This presumes that you're using 4K blocks in the filesystem.  You can get
a more accurate list of the available secondary superblocks with

mkfs -n -{other options used to make the filesystem}  /dev/whatever
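
If re-running mkfs -n with the original options is not practical, the usual
backup locations can also be computed. The sketch below assumes the
sparse_super feature (backups in group 1 and in groups that are powers of 3,
5 and 7) and the default geometry of 8 x block size blocks per group; a file
system made with other options will differ, so treat the result as candidates
to try with -b, not as a guarantee. The function names are illustrative.

```python
def backup_superblock_groups(group_count):
    """Block groups that hold a backup superblock under sparse_super."""
    groups = {1} if group_count > 1 else set()
    for base in (3, 5, 7):
        power = base
        while power < group_count:
            groups.add(power)
            power *= base
    return sorted(groups)


def backup_superblock_blocks(blocks_count, block_size=4096):
    """Candidate block numbers to pass to 'fsck -b'."""
    blocks_per_group = 8 * block_size                  # assumption: default geometry
    first_data_block = 1 if block_size == 1024 else 0  # 1K file systems start at block 1
    usable = blocks_count - first_data_block
    group_count = -(-usable // blocks_per_group)       # ceiling division
    return [first_data_block + group * blocks_per_group
            for group in backup_superblock_groups(group_count)]
```

With 4K blocks the first candidate is 32768, matching the command above; with
1K blocks it would be 8193.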


On Thu, Nov 19, 2015 at 7:06 PM, Boylan, Ross <Ross.Boylan at ucsf.edu> wrote:

> I tried the "just run e2fsck", but it reduced the filesystem to almost
> nothing.  Before there were nearly 700G of files; after there were 70G.*
> Also, the overall filesystem size shrunk to under 300G.  I ran resize2fs
> to  get the  space back, but of course that didn't get the files back.
>
> This seems like an awful lot of damage from losing a total of 8,192 bytes
> out of ~700G.  Maybe the first block of zeros caused the recovery to decide
> it had reached the end?  The logical volume got about 64GB from the first,
> presumably OK, virtual disk.  The holes in the file occur around 174GB into
> the 2nd virtual hard disk.
>
> I've still got copies from before e2fsck, and I'm still interested in
> recovering them (lots of recorded shows on them).
>
> Sorry about the top-posting; my mail client doesn't provide good way to do
> otherwise.
> Ross
>
> *I had expected only to lose the newer files, but a lot of the program
> files seem to be gone too. startx doesn't exist, for example.
>
>
> From: Boylan, Ross
>
> Sent: Thursday, November 19, 2015 1:13 PM
>
> To: Stephen Samuel
>
> Cc: Ext3-users at redhat.com
>
> Subject: RE: recovering corrupt file system
>
>
>
>
>
>
> Thanks for the pointer.  Turning to my other bad file system, I could use
> some help interpreting e2fsck.  I have the source and have been looking at
> various web resources, so  I suppose
>  I could figure this out eventually.
>
>
>
> Actually, maybe I should ask a simpler question: should I just run e2fck,
> accepting its recommendations, and live with the results?  No matter what I
> do I don't think I can recover any more information.
>
>
>
>
> Here's a little diagram:
>
> media01/root   # LVM logical volume on which the ext4 filesystem resides
>
> VM's sda, sdb, various partitions   # physical volumes making up the
> media01VG
>
> ------ virtual machine above here ---
>
> --- physical machine/ host below here -------------------
>
> media01b.vdi                    # host file backing virtual disk sdb
>
> # note I have made a spare copy of media01b.vdi.
>
> # The file backing virtual sda had no hardware problems.
>
> ## various more layers here
>
> physical disk
>
>
>
> The physical disk at the bottom is failing. I used (g)ddrescue to copy as
> much of the media01b.vdi file as I could; the file is about 700G, and there
> were 2 chunks of 0x1000 bytes that could not be recovered and are now 0
> filled.
>
>
>
> The basic structure of the virtual disks appears intact: the partition
> tables are still there and the logical volumes can still be assembled.
>
>
>
> If it's worth getting into the details, here's what e2fsck, run inside
> another VM that has the problems disks temporarily inserted says.  What do
> the individual block bitmap differences mean?  I'm guessing + and minus
> indicate whether the block was found in
>  the scan only or in the file system tables on disk one, but I don't know
> which.  And what do the numbers mean?  Offsets in bytes? sectors? relative
> to ??
>
>
>
> root at wheezy02:~# e2fsck -vn /dev/media01-vg/root
>
> e2fsck 1.42.12 (29-Aug-2014)
>
> One or more block group descriptor checksums are invalid.  Fix? no
>
>
>
> Group descriptor 465 checksum is 0x5e7a, should be 0xa22b.  IGNORED.
>
> Group descriptor 482 checksum is 0x69eb, should be 0x73a5.  IGNORED.
>
> Group descriptor 485 checksum is 0xbd9b, should be 0x21c9.  IGNORED.
>
> Group descriptor 496 checksum is 0xe550, should be 0x9a62.  IGNORED.
>
> Group descriptor 508 checksum is 0xf4d0, should be 0x2466.  IGNORED.
>
> /dev/media01-vg/root contains a file system with errors, check forced.
>
> Pass 1: Checking inodes, blocks, and sizes
>
> Pass 2: Checking directory structure
>
> Pass 3: Checking directory connectivity
>
> Pass 4: Checking reference counts
>
> Pass 5: Checking group summary information
>
> Block bitmap differences:  +15243264 +(15511418--15511423)
> +(15511488--15511551) +(15523264--15523327) +(15812608--15813174)
> -(15813349--15813503) -(15813632--15814054) +(15814656--15815176)
> +(15815680--15816191) -(15816505--15816703) -(15823872--15824895)
> -(15850505--15851519) -(15852544--15853567) -(15896583--15898623)
> +(16029735--16029759) +(16029786--16029823) +(16029852--16029855)
> +(16029884--16031743) -(16261152--16261954) -(16263740--16263743)
> -16459791 +(16459795--16459799) -(16459808--16459814)
> -(16459825--16459839) -(16460000--16460026) -(16460288--16460799)
> +(16668689--16670719) +(17210175--17211391) +17498112
> -(17498624--17499135) -(17499262--17500159) +17534976
> -(17536000--17537023) -(17953505--17954815) +(18026714--18028543)
> +(18031327--18032639) -(18032653--18034687) +(18062252--18063359)
> +(18655232--18657173) -(19314688--19316735) -(19331072--19333119)
> -19406880 -25174048 -(41954376--41955015) -(42999849--42999850)
> -(42999852--42999859) -(42999861--42999862) -(43524128--43546654)
> -(45621280--45625870) -(45637632--45641520) -46669856
> -(48242720--48246756) -(67436544--67446783)
>
> Fix? no
>
>
>
> Free blocks count wrong for group #473 (0, counted=134).
>
> Fix? no
>
>
>
> ## quite a few more Free blocks wrong messages
>
>
>
> Free blocks count wrong (48434540, counted=48384585).
>
> Fix? no
>
>
>
> Inode bitmap differences:  -4849670
>
> Fix? no
>
>
>
> Free inodes count wrong for group #592 (8187, counted=8186).
>
> Fix? no
>
>
>
> Free inodes count wrong (16811016, counted=16811015).
>
> Fix? no
>
>
>
> Padding at end of block bitmap is not set. Fix? no
>
>
>
>
>
> /dev/media01-vg/root: ********** WARNING: Filesystem still has errors **********
>
>
>
>
>
>        56312 inodes used (0.33%, out of 16867328)
>
>           91 non-contiguous files (0.2%)
>
>           53 non-contiguous directories (0.1%)
>
>              # of inodes with ind/dind/tind blocks: 0/0/0
>
>              Extent depth histogram: 51385/43
>
>     19012244 blocks used (28.19%, out of 67446784)
>
>            0 bad blocks
>
>           14 large files
>
>
>
>        46167 regular files
>
>         5101 directories
>
>           12 character device files
>
>           25 block device files
>
>            0 fifos
>
>
>
>
>
> From: darkonc at gmail.com [darkonc at gmail.com] on behalf of Stephen Samuel [samuel at bcgreen.com]
>
> Sent: Thursday, November 19, 2015 8:00 AM
>
> To: Boylan, Ross
>
> Cc: Ext3-users at redhat.com
>
> Subject: Re: recovering corrupt file system
>
>
>
>
>
>
> Well, the next place to go, if fsck isn't enough, would be to try
> debugfs(1); see man debugfs.
>
>
>
> On Wed, Nov 18, 2015 at 8:39 PM, Boylan, Ross
> <Ross.Boylan at ucsf.edu> wrote:
>
>
> I guess some of the trouble was that the virtual disk was mounted
> read-only at the VM level.  When I mounted it read/write I was able to run
> fsck, which gave messages about replaying the logs and a couple of messages
> about changing the inode counts (sorry, I don't have the exact words).
> Then I ran fsck -f, which didn't report any problems.  Then I mounted it,
> and everything seems OK.
>
>
>
> I'm still interested in the general question about how to diagnose and
> recover from file system errors, since I have another virtual machine that
> was backed by a failing real disk.
>
> ________________________________________
>
> From: Boylan, Ross
>
> Sent: Wednesday, November 18, 2015 4:35 PM
>
> To:
> Ext3-users at redhat.com
>
> Subject: recovering corrupt file system
>
>
>
>
> Any recommendations for tools to diagnose and recover problems on an ext4
> file system?
>
>
>
> In particular:
>
> root at jessie01:~# mount -o ro /dev/markov02/root /mnt/markov02
>
> mount: wrong fs type, bad option, bad superblock on
> /dev/mapper/markov02-root,
>
>        missing codepage or helper program, or other error
>
>
>
>        In some cases useful info is found in syslog - try
>
>        dmesg | tail or so.
>
> and e2fsck says
>
> root at jessie01:~# e2fsck /dev/markov02/root
>
> e2fsck 1.42.12 (29-Aug-2014)
>
> /dev/markov02/root: recovering journal
>
> Superblock needs_recovery flag is clear, but journal has data.
>
>
>
> markov02/root is an LVM volume, built on partitions from 2 disks in a
> virtual machine.  The initial symptom was that the VM the disks were in
> originally would only get as far as busybox when it started.  However, I
> think the filesystem was OK even after that, since it was visible in
> busybox and in another VM.  I think virt-manager might have overwritten
> one of the disks because I left "allocate entire disk now" checked when I
> moved one of the disks between machines.
>
>
>
> I'm making copies of the virtual disks now.
>
> Ross Boylan
>
>
>
>
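On the block bitmap question in the quoted message: as far as I know, the
numbers in e2fsck's pass 5 output are file system block numbers (not bytes
or sectors), counted from the start of the file system.  A "+" entry is a
block that e2fsck found in use but that is marked free in the on-disk
bitmap, and a "-" entry is a block marked in use on disk that nothing
references.  Here is a rough Python sketch, not anything from e2fsprogs,
that groups such a list by block group.  The names parse_differences and
blocks_by_group are made up for illustration, and the geometry (32768
blocks per group, first data block 0) is an assumption that matches the
usual 4 KiB-block defaults; check it with dumpe2fs -h before relying on
the result.

```python
import re
from collections import Counter

# ASSUMED geometry (typical for 4 KiB blocks); verify with "dumpe2fs -h".
BLOCKS_PER_GROUP = 32768
FIRST_DATA_BLOCK = 0

# Matches a single block "+15243264" or a range "-(15813349--15813503)".
ENTRY = re.compile(r"([+-])\(?(\d+)(?:--(\d+))?\)?")


def parse_differences(text):
    """Return (sign, first_block, last_block) tuples from pasted e2fsck output."""
    # Terminals wrap long lines with a trailing backslash; rejoin split numbers.
    joined = re.sub(r"\\\s*", "", text)
    entries = []
    for sign, first, last in ENTRY.findall(joined):
        first_block = int(first)
        last_block = int(last) if last else first_block
        entries.append((sign, first_block, last_block))
    return entries


def blocks_by_group(entries):
    """Count affected blocks per (group, sign), splitting ranges at group edges."""
    totals = Counter()
    for sign, first, last in entries:
        block = first
        while block <= last:
            group = (block - FIRST_DATA_BLOCK) // BLOCKS_PER_GROUP
            group_end = FIRST_DATA_BLOCK + (group + 1) * BLOCKS_PER_GROUP - 1
            chunk_end = min(last, group_end)
            totals[(group, sign)] += chunk_end - block + 1
            block = chunk_end + 1
    return totals


if __name__ == "__main__":
    # First few entries from the report above, including one wrapped number.
    sample = ("+15243264 +(15511418--15511423) "
              "-(15813349--15813503) -(15813632--158\\\n14054)")
    for (group, sign), count in sorted(blocks_by_group(parse_differences(sample)).items()):
        print(f"group {group}: {sign}{count} blocks")
```

With that assumed geometry, +15243264 falls in group 465 and
+(15511418--15511423) in group 473, the same groups named in the descriptor
checksum and free blocks count messages, so the damage looks concentrated
in a handful of neighbouring groups.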


-- 
Stephen Samuel http://www.bcgreen.com  Software, like love,
778-861-7641                              grows when you give it away