[Linux-cluster] Re: Hard lockups during file transfer to GNBD/GFS device
David Brieck Jr.
dbrieck at gmail.com
Thu Sep 28 19:08:58 UTC 2006
On 9/28/06, David Brieck Jr. <dbrieck at gmail.com> wrote:
> Here is our setup: 2 GNBD servers attached to a shared SCSI array. Each of the 9 nodes uses multipath to import the shared device from both servers. We are also using GFS on top of that for our shared storage.
> What is happening is that I need to transfer a large number of files (about 1.5 million) from a node's local storage to the network storage. I'm using rsync locally to move all the files. Originally my problem was that the OOM killer would start running partway through the transfer and the machine would then be unusable (however, it was still up enough that it wasn't fenced). Here is that log:
> I found a few postings saying that using the hugemem kernel would solve the problem (they claimed it was a known SMP bug acknowledged by Red Hat), so all my systems are now running that kernel. It did solve the out-of-memory problem, but it seems to have introduced some new ones. Here are the logs from the most recent crashes:
> The GNBD servers stay online and don't have any problems; all the trouble is coming from the client. Is this a bug, or is something not set up right?
> If you need more info I'll be happy to provide it.
I just tried to move the same data by tar-ing it up to the network,
same result. Again, this is about 94GB and 1.5 million files that I
seem to be unable to move from local storage to shared. Anyone have