[Linux-cluster] rhcs + gfs performance issues

Doug Tucker tuckerd at engr.smu.edu
Tue Oct 7 15:00:17 UTC 2008


> > I can't see a way around some significant downtime even with that, and
> > there is no way they will give me the option to be down from a planned
> > perspective.  
> 
> So, out of nowhere straight into production, without performance user 
> acceptance testing period? And they won't allow any planned downtime? My 
> mind boggles.

Yours too, huh?  This is the strangest place I have ever worked, quite
frankly.  I've never been anywhere I could not set aside a 2 hour window
at 3 am once a month for upgrades/maintenance.  They don't allow me that
here.  Migrating from the old file server to the new one was done with
zero downtime and no interruption to the user community.  Due to $$$,
there is very little redundancy.  Sure, downtime does happen, but only
when something breaks.  I could go on; I think you get the picture, and
whining about it doesn't help me here.

Straight into production...well, not exactly.  I set up a cluster and
moved one application over, and it ran for about 3 months before we
began the user moves.  Once the users and mail were moved, that's when
the load issue reared its ugly head.  Like I said, it was really bad at
first.  I had to bump the nfsd processes to 256; that helped some.
Setting the fs to fast = 1 had a much bigger impact.  The odd thing is,
it doesn't seem to take much to drive up the load.  Being an engineering
school, we have a lot of Cadence users, and Cadence writes 2-5k files on
a big job; it doesn't take more than 2 or 3 users doing this, along with
the normal stuff always touching the fileserver (such as mail, web,
etc.), to drive up load.  I can virus scan my mapped home directory and
watch the load jump by 2 or 3.  Mounting my old home directory on the
old file server and doing the same thing, you wouldn't even know I was
touching files out there.  It's like directory/file access is just very
expensive for some reason, and it goes against everything I know :P.
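
For the record, here's roughly what those two changes looked like.  The
tunable name is from memory and the mount point is just an example, so
double-check both before copying anything:

  # /etc/sysconfig/nfs -- raise the nfsd thread count, then restart nfs
  RPCNFSDCOUNT=256

  # GFS per-mount tunable (what I was calling "fast = 1" above); it
  # doesn't persist, so it has to be re-applied after every mount
  gfs_tool settune /export/home statfs_fast 1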

Let me run this by you.  I thought about another potential upgrade
path.  What if I remove one node from the cluster and run on the single
remaining node, take the 2nd down, install 5 on it, and get it prepped?
Is there any way in the world to then bring it up, have it mount the
volumes AS master, and take the current primary down to rebuild it?  I
think my answer is no, but it seemed worth asking.  This inability to
participate across versions really seems to be my Achilles heel in
getting this upgraded.
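
The single-node half of that I do know how to do; per the cman_tool man
page it would be something like this (verify the syntax on your
version):

  # on the node being taken out of service:
  cman_tool leave remove

  # or, on the surviving node, drop expected votes so it keeps
  # quorum by itself:
  cman_tool expected -e 1

It's the hand-off step after that I can't see, since a 5 node can't
join the 4 cluster.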

> 
> Good luck.
> 
> Gordan



