[long] major problems on fs; e2fsck running out of memory

Keith Keller kkeller at wombat.san-francisco.ca.us
Sat May 31 18:56:07 UTC 2014


Hello ext3 list,

I am having an odd issue with one of my filesystems, and I am hoping
someone here can help out.  Yes, I do have backups.  :)  But as is often
the case, it's nice to avoid restoring from backup if possible.  If
there is a more appropriate place for this question please let me know.

After quite a while between reboots, I saw a report on the console that
the filesystem was inconsistent and could not be automatically repaired.
After some aborted tests (which I did not log, unfortunately), I was able
to get this far:

# head fsck.out
fsck from util-linux-ng 2.17.2
/dev/mapper/vg1--sdb-lv_vz contains a file system with errors, check forced.

# time passes, progress bar gets to 51.8% with no problems, then

Pass 1: Checking inodes, blocks, and sizes
Inode 266338321 has imagic flag set.  Clear? yes

Inode 266338321 has a extra size (34120) which is invalid
Fix? yes

Inode 266338321 has compression flag set on filesystem without compression support.  Clear? yes

# some 150k messages later

Inode 266349409, i_blocks is 94855766560840, should be 0.  Fix? yes

Inode 266349363 has a bad extended attribute block 1262962006.  Clear? yes

Inode 266349363 has illegal block(s).  Clear? yes

Illegal block #6 (1447645766) in inode 266349363.  CLEARED.
Illegal indirect block (1447642454) in inode 266349363.  CLEARED.
Illegal block #270533644 (1702521203) in inode 266349363.  CLEARED.
Warning... fsck.ext4 for device /dev/mapper/vg1--sdb-lv_vz exited with signal 11.

I wasn't sure what that meant, and somewhat without thinking, I made
more attempts to repair the fs (including, IIRC, some attempts to mount
the filesystem ro).  Here's the next fsck attempt:

fsck from util-linux-ng 2.17.2
fsck.ext4: Group descriptors look bad... trying backup blocks...
One or more block group descriptor checksums are invalid.  Fix? yes

Group descriptor 0 checksum is invalid.  FIXED.
Group descriptor 1 checksum is invalid.  FIXED.

# many group descriptors fixed

Group descriptor 40834 checksum is invalid.  FIXED.
Group descriptor 40836 checksum is invalid.  FIXED.
/dev/mapper/vg1--sdb-lv_vz contains a file system with errors, check forced.
This doesn't bode well, but we'll try to go on...
Pass 1: Checking inodes, blocks, and sizes

# again gets to 51.8% with no problems
# again over 100k lines of errors

Inode 266356018 is too big.  Truncate? yes

Block #537922572 (62467) causes directory to be too big.  CLEARED.
Warning... fsck.ext4 for device /dev/mapper/vg1--sdb-lv_vz exited with signal 11.

I tried once more with e2fsck 1.41.12 (stock from CentOS 6), then, after
some searching, tried e2fsck 1.42.10 built from source.

ext2fs_check_desc: Corrupt group descriptor: bad block for block bitmap
./e2fsprogs-1.42.10/e2fsck/e2fsck: Group descriptors look bad... trying backup blocks...
/dev/mapper/vg1--sdb-lv_vz contains a file system with errors, check forced.
./e2fsprogs-1.42.10/e2fsck/e2fsck: A block group is missing an inode table while reading bad blocks inode
This doesn't bode well, but we'll try to go on...
Pass 1: Checking inodes, blocks, and sizes

# again gets to 51.8% with no problems
# again over 100k lines of errors

Illegal block #6 (1498565709) in inode 266374005.  CLEARED.
Inode 266374005 is too big.  Truncate? yes

Block #73401356 (909652270) causes directory to be too big.  CLEARED.
Error storing directory block information (inode=266374005, block=0, num=22224176): Memory allocation failed

/dev/mapper/vg1--sdb-lv_vz: ***** FILE SYSTEM WAS MODIFIED *****

/dev/mapper/vg1--sdb-lv_vz: ***** FILE SYSTEM WAS MODIFIED *****

Repeated attempts seem to get farther into repairs, but there's still a
large number of repairs reported, which seems scary, and it still runs
out of memory on a 128GB-memory server.  I don't currently have a
filesystem with more than 128GB of free space if I wanted to use the
scratch_files option (though if that's really the solution, I'll make a
way).
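
For reference, the scratch_files option mentioned above is configured in
e2fsck.conf (normally /etc/e2fsck.conf). The fragment below is a minimal
sketch, not a tested setup: the directory path is hypothetical and stands
in for any existing, writable directory on a filesystem with enough free
space. How much space is actually needed for this filesystem is not known.

```ini
[scratch_files]
    # Hypothetical path; it must already exist and be writable,
    # otherwise e2fsck keeps using in-memory data structures.
    directory = /mnt/scratch/e2fsck
```

With this in place, e2fsck tries to keep some of its bookkeeping in files
under that directory instead of in RAM, which trades speed for memory.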

The 51.8% seems very suspicious to me.  A few weeks ago, I did an online
resize2fs, and the original filesystem was about 52% the size of the new
one (from 2.7TB to 5.3TB).  The resize2fs didn't report any errors, and
I haven't seen any physical errors in the logs, so this is the first
indication I've had of a problem.
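
The numbers above can be cross-checked against the tune2fs output with a
little arithmetic. The sketch below uses only values quoted in this message,
plus one labeled assumption: that pass 1 accounts for the first 70% of
e2fsck's overall progress bar. That weight is not stated anywhere in the
output here, so treat the last line it prints as a guess to verify against
the e2fsprogs source, not as an established fact.

```python
# Values copied from the tune2fs output in this message.
BLOCK_COUNT = 1438622720
BLOCK_SIZE = 4096
BLOCKS_PER_GROUP = 32768
INODE_COUNT = 359661568
INODES_PER_GROUP = 8192

# First inode e2fsck complained about in the first logged run.
FIRST_BAD_INODE = 266338321

# ASSUMPTION (not taken from the output above): pass 1 covers the first
# 70% of e2fsck's overall progress bar.
ASSUMED_PASS1_WEIGHT = 0.70


def group_of_inode(ino, inodes_per_group=INODES_PER_GROUP):
    """Return the block group holding an inode (inode numbers start at 1)."""
    return (ino - 1) // inodes_per_group


fs_bytes = BLOCK_COUNT * BLOCK_SIZE
group_count = -(-BLOCK_COUNT // BLOCKS_PER_GROUP)  # ceiling division
bad_group = group_of_inode(FIRST_BAD_INODE)
group_fraction = bad_group / group_count

print(f"filesystem size : {fs_bytes / 2**40:.2f} TiB")
print(f"old/new size    : {2.7 / 5.3:.1%}")
print(f"block groups    : {group_count}")
print(f"first bad group : {bad_group} ({group_fraction:.1%} of groups)")
print(f"assumed progress: {group_fraction * ASSUMED_PASS1_WEIGHT:.1%}")
```

With these inputs the first reported inode falls in block group 32512 of
43904. If the assumed pass 1 weight is right, that position lands close to
the 51.8% shown on the progress bar, while the old-to-new size ratio from
the resize (2.7TB to 5.3TB) comes out slightly lower.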

My tune2fs output and other possible information is below.

Is there any hope for this filesystem?  The "doesn't bode well" message
doesn't give me hope, but perhaps there are some last-ditch efforts I can
make to try to recover.  If you need any other information about the
filesystem please let me know.

--keith

# uname -a
Linux XXX.XXX 2.6.32-042stab090.2 #1 SMP Wed May 21 19:25:03 MSK 2014 x86_64 x86_64 x86_64 GNU/Linux
# free -g
             total       used       free     shared    buffers     cached
Mem:           125          0        125          0          0          0
-/+ buffers/cache:          0        125
Swap:            0          0          0
# lvs
  LV       VG      Attr   LSize  Origin Snap%  Move Log Copy%  Convert
  lv_local vg0-sda -wi-ao 19.53g
  lv_swap  vg0-sda -wi-a-  2.00g
  lv_tmp   vg0-sda -wi-ao 19.53g
  lv_usr   vg0-sda -wi-ao 19.53g
  lv_var   vg0-sda -wi-ao 19.53g
  lv_vz    vg1-sdb -wi-a-  5.36t
# vgs
  VG      #PV #LV #SN Attr   VSize  VFree
  vg0-sda   1   5   0 wz--n- 96.09g 15.96g
  vg1-sdb   1   1   0 wz--n-  5.36t     0
# pvs
  PV         VG      Fmt  Attr PSize  PFree
  /dev/sda3  vg0-sda lvm2 a--  96.09g 15.96g
  /dev/sdb1  vg1-sdb lvm2 a--   5.36t     0


tune2fs 1.41.12 (17-May-2010)
Filesystem volume name:   <none>
Last mounted on:          /vz
Filesystem UUID:          74a4ea8b-03ed-4e9c-ab01-8574517cd5af
Filesystem magic number:  0xEF53
Filesystem revision #:    1 (dynamic)
Filesystem features:      ext_attr resize_inode dir_index filetype extent flex_bg sparse_super large_file huge_file uninit_bg dir_nlink extra_isize
Filesystem flags:         signed_directory_hash
Default mount options:    (none)
Filesystem state:         not clean with errors
Errors behavior:          Continue
Filesystem OS type:       Linux
Inode count:              359661568
Block count:              1438622720
Reserved block count:     60203550
Free blocks:              1030108897
Free inodes:              357427346
First block:              0
Block size:               4096
Fragment size:            4096
Reserved GDT blocks:      681
Blocks per group:         32768
Fragments per group:      32768
Inodes per group:         8192
Inode blocks per group:   512
Flex block group size:    16
Filesystem created:       Thu May 31 14:47:29 2012
Last mount time:          Sun Oct 27 21:48:21 2013
Last write time:          Fri May 30 23:22:31 2014
Mount count:              1
Maximum mount count:      21
Last checked:             Sun Oct 27 21:38:53 2013
Check interval:           15552000 (6 months)
Next check after:         Fri Apr 25 21:38:53 2014
Lifetime writes:          14 TB
Reserved blocks uid:      0 (user root)
Reserved blocks gid:      0 (group root)
First inode:              11
Inode size:               256
Required extra isize:     28
Desired extra isize:      28
Default directory hash:   half_md4
Directory Hash Seed:      37c55228-9d7b-4a34-bd88-8322f435b9cb
Journal backup:           inode blocks




-- 
kkeller at wombat.san-francisco.ca.us



