[Linux-cluster] Files are there, but not.
rpeterso at redhat.com
Wed Sep 27 16:59:56 UTC 2006
Jaap Dijkshoorn wrote:
> It looks like it!
> I have asked the user who is having this problem what exactly is
> happening with those files during his job. I hope this will give us a
> clue in what ways those files are touched and/or deleted etc.
> All files are read/write by the users through NFS. But the strange
> thing is that on 4 of the 5 servers the files are still available, on
> GFS as well as on the clients through NFS.
> thanks already for the effort. I hope we can tackle this bug!
> Best Regards,
Soon after I sent the last email, I did recreate the problem here in our
though it was after several days of trying. That's good: It means the U4 is
very stable, and it means I can probably work on the problem without the
need for further information from people in the field. I did just
bugzilla, but here's what I know so far:
This is hard to explain, so let me simplify by calling "A" the cluster node
that shows the files correctly, and "B" the cluster node that says the files
are missing. Let's further say that an example "missing" file is:
/mnt/gfs/subdir/xyz. So "ls /mnt/gfs/subdir/xyz" from "A" shows the
file correctly, while the same command from "B" produces
"No such file or directory".
The biggest clue I've found today is this:
It looks as if "B" somehow has the wrong inode cached for
"subdir". In other words, a stat command run on the directory
shows the wrong directory inode (possibly a deleted subdirectory?) on
"B" whereas "A" has the correct inode for "subdir" with the same stat
command. I'm not sure yet if this incorrect cached inode is coming from
or whether it's in the Linux vfs. I'm still investigating.
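
For anyone who wants to check their own nodes for the same symptom, here is a
minimal sketch of the comparison described above. It assumes Python is
available on each node; the helper names describe and format_report are made
up for illustration, and the paths are just the example paths from this mail.
Run it on "A" and on "B" and compare the inode numbers by hand.

```python
import os


def describe(path):
    """Return (inode number, link count) for path, or None if it is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        # Same condition that makes "ls" print "No such file or directory".
        return None
    return st.st_ino, st.st_nlink


def format_report(paths):
    """Build one line per path so the output of two nodes can be diffed."""
    lines = []
    for p in paths:
        info = describe(p)
        if info is None:
            lines.append(f"{p}: missing")
        else:
            ino, nlink = info
            lines.append(f"{p}: inode={ino} links={nlink}")
    return "\n".join(lines)
```

Calling print(format_report(["/mnt/gfs/subdir", "/mnt/gfs/subdir/xyz"])) on
both nodes should give identical inode numbers on a healthy filesystem. If the
inode reported for "subdir" differs between "A" and "B", that matches what I am
seeing here.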
Please update the bugzilla if you get more information. In the meantime,
I'll continue working on the problem and I'll keep the bugzilla up to date
when I find out more.
Red Hat Cluster Suite