Re: FS corruption; HTREE-related?

On Oct 07, 2002  05:07 +0000, JP Howard wrote:
> ----
> # ls -laR > /dev/null
> ...
> ls: ./server2/b/user/bxyz/392.: Input/output error
> ----
> e2fsck shows "Inodes that were part of a corrupted orphan linked list
> found."
> We've been hitting this computer pretty hard, migrating data across to it
> from 4 servers simultaneously using rsync. Of around a million files or
> so, 250 developed this problem.

It appears that the inode was marked as deleted, but it was not unlinked
from the directory tree.  It would be important to know whether this
file should still exist at this time (i.e. was it ever deliberately
deleted?).

> debugfs:  stat 392.
> Inode: 14992585   Type: regular    Mode:  0600   Flags: 0x0   Generation:
> 2449155561
> User:   504   Group:   505   Size: 0
> File ACL: 0    Directory ACL: 0
> Links: 0   Blockcount: 0
> Fragment:  Address: 0    Number: 0    Size: 0
> ctime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
> atime: 0x3da05967 -- Sun Oct  6 10:40:23 2002
> mtime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002
> dtime: 0x3da0be92 -- Sun Oct  6 17:52:02 2002

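The dtime is consistent with a file that was deleted, and not one that
is on an orphan list: it is a plausible timestamp identical to ctime and
mtime, whereas an inode on the orphan list has its dtime field reused to
hold the number of the next orphan inode.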
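
As a quick cross-check of the timestamps, here is a small sketch that
decodes the raw hex values from the debugfs output above.  The helper
name is made up for illustration, and the only data used is what was
quoted:

```python
from datetime import timedelta

# Raw values copied from the quoted debugfs "stat" output.
STAMPS = {
    "ctime": 0x3da0be92,
    "atime": 0x3da05967,
    "mtime": 0x3da0be92,
    "dtime": 0x3da0be92,
}

def seconds_before_delete(stamps, field):
    """Seconds between the given timestamp field and dtime."""
    return stamps["dtime"] - stamps[field]

# ctime, mtime and dtime are identical, as expected for an inode that
# was last touched by the delete itself.
same_as_dtime = [f for f in ("ctime", "mtime")
                 if seconds_before_delete(STAMPS, f) == 0]

# atime is the earliest timestamp we have for the file.
gap = timedelta(seconds=seconds_before_delete(STAMPS, "atime"))
print(same_as_dtime, gap)  # ['ctime', 'mtime'] 7:11:39
```

That gap is where the "7 hours" figure comes from, on the assumption
that atime is close to the creation time.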

> debugfs:  ncheck 14992585
> Inode   Pathname
> 14992585        /var/data/server2/b/user/bxyz/392.

This shows the inode is still in the directory tree.  If it was supposed
to have been deleted (7 hours after it was created, it appears), then it
would definitely point to a hashing or other lookup problem in the htree
code (either it was inserted into the wrong hash block or subsequent
block splits placed it into the wrong block, or it could not be found
in the leaf block at delete time).  If it was never supposed to have
been deleted it would likely be a different sort of problem.
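
To illustrate the failure mode I have in mind, here is a toy model (not
the ext3 code; the class, the two-leaf layout and the use of crc32 as a
stand-in hash are all assumptions for the sketch).  An entry that ends
up in the wrong leaf block cannot be found by a hashed lookup at unlink
time, yet a linear scan of the directory still sees it:

```python
import bisect
import zlib

class ToyHashedDir:
    """Toy hashed directory: each leaf covers a contiguous hash range."""

    def __init__(self, boundaries):
        # boundaries[i] is the lowest hash that belongs in leaf i;
        # the list must be sorted and start at 0.
        self.boundaries = boundaries
        self.leaves = [dict() for _ in boundaries]

    @staticmethod
    def name_hash(name):
        # crc32 stands in for the real directory hash function.
        return zlib.crc32(name.encode())

    def leaf_for(self, name):
        return bisect.bisect_right(self.boundaries, self.name_hash(name)) - 1

    def insert(self, name, inode, leaf=None):
        # Passing "leaf" explicitly simulates a misplaced entry.
        idx = self.leaf_for(name) if leaf is None else leaf
        self.leaves[idx][name] = inode

    def unlink(self, name):
        # A hashed lookup only ever examines the leaf the hash points to.
        return self.leaves[self.leaf_for(name)].pop(name, None) is not None

    def linear_scan(self, name):
        # readdir-style scan: visits every leaf regardless of hash.
        return any(name in leaf for leaf in self.leaves)

d = ToyHashedDir([0, 2**31])
wrong_leaf = 1 - d.leaf_for("392.")
d.insert("392.", 14992585, leaf=wrong_leaf)   # entry lands in the wrong block
removed = d.unlink("392.")                    # hashed lookup misses it
still_listed = d.linear_scan("392.")          # but a scan still lists it
print(removed, still_listed)  # False True
```

In the toy model the name stays in the directory even though the unlink
path has finished, which is the same shape as what ncheck shows here.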

I can see from the size of the parent directory (not quoted, but 20kB)
that this would be an htree directory.

> debugfs:  testi 392.
> Inode 14992585 is not in use

So, the inode was also marked free in the inode bitmap, a further
indication that it was supposed to have been deleted.
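
For reference, a rough sketch of the arithmetic behind a check like
testi: finding which group's inode bitmap, and which bit in it, covers a
given inode number.  The inodes-per-group value below is hypothetical
(the real one is in the superblock and was not quoted), and the function
names are made up for the example:

```python
def locate_inode(ino, inodes_per_group):
    """Return (group, bit) for an inode number; inode numbers start at 1."""
    index = ino - 1
    return index // inodes_per_group, index % inodes_per_group

def bit_is_set(bitmap, bit):
    """Test a bit in an ext2-style bitmap (lowest bit of each byte first)."""
    return bool(bitmap[bit // 8] & (1 << (bit % 8)))

INODES_PER_GROUP = 16384  # hypothetical value, for illustration only

group, bit = locate_inode(14992585, INODES_PER_GROUP)
bitmap = bytearray(INODES_PER_GROUP // 8)   # all zero: every inode free
in_use_before = bit_is_set(bitmap, bit)

bitmap[bit // 8] |= 1 << (bit % 8)          # mark the inode allocated
in_use_after = bit_is_set(bitmap, bit)
```

A clear bit corresponds to the "is not in use" answer above.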

> [root server5 bxyz]# ls -l 392.
> ls: 392.: Input/output error

The -EIO error is most likely because the kernel treats this deleted
inode as a bad inode (the state that is_bad_inode() tests for), and the
bad-inode VFS ops return -EIO for every kind of operation.
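
Since every operation on such an inode fails the same way, one way to
list all of the affected names is to walk the tree and record the
entries whose lstat() fails with EIO.  This is only a sketch; the
function name is made up, and it reads metadata without changing
anything:

```python
import errno
import os

def find_eio_entries(root):
    """Return paths under root whose lstat() fails with EIO."""
    bad = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
            except OSError as exc:
                # Only collect the Input/output errors; other failures
                # (e.g. a file removed mid-walk) are ignored here.
                if exc.errno == errno.EIO:
                    bad.append(path)
    return bad
```

Something like find_eio_entries("/var/data") should report the same
names that "ls -laR" complained about, and the inode numbers can then
be looked up with debugfs as above.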

> BTW, now that I've disabled directory indexing, will folders with the
> relevant flag already set still use hashed indexes?

No, but they will still be accessible via the normal linear-search
methods in ext3.  Depending on the size of your directories this may or
may not be "acceptable performance", but it is usually better than not
working at all - another reason I'm glad we made the effort for the
htree code to be compatible with non-htree kernels.

Cheers, Andreas
Andreas Dilger

