[Linux-cluster] mixing OS versions?

Steven Whitehouse swhiteho at redhat.com
Fri Apr 25 11:42:59 UTC 2014


Hi,

On 24/04/14 17:29, Alan Brown wrote:
> On 30/03/14 12:34, Steven Whitehouse wrote:
>
>> Well that is not entirely true. We have done a great deal of
>> investigation into this issue. We do test quotas (among many other
>> things) on each release to ensure that they are working. Our tests have
>> all passed correctly, and to date you have provided the only report of
>> this particular issue via our support team. So it is certainly not
>> something that lots of people are hitting.
>
> Someone else reported it on this list (on centos), so we're not an 
> isolated case.
>
>> We do now have a good idea of where the issue is. However it is clear
>> that simply exceeding quotas is not enough to trigger it. Instead quotas
>> need to be exceeded in a particular way.
>
> My suspicion is that it's some kind of interaction between quotas and 
> NFS, but it'd be good if you could provide a fuller explanation.
>
Yes, that's what we thought to start with... however that turned out to 
be a bit of a red herring. Or at least the issue has nothing 
specifically to do with NFS. The problem was related to when quota was 
exceeded, and specifically what operation was in progress. You could 
write to files as often as you wanted to, and exceeding quota would be 
handled correctly. The problem was confined to a specific code path 
within the inode creation code: if quota was not exceeded on that one 
code path, then everything would work as expected.
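
To make the create-versus-write distinction concrete, here is a rough 
sketch (not the actual test case, and not a reproducer for the bug). It 
only reports whether EDQUOT first shows up while creating a file or while 
writing to one. The function name probe_quota, the probe-N file names and 
the injectable create/write parameters are all made up for illustration.

```python
import errno
import os


def probe_quota(directory, create=os.open, write=os.write,
                count=3, payload=b"x" * 4096):
    """Create `count` files in `directory` and write `payload` to each.

    Returns None if quota was never exceeded, otherwise a tuple
    (step, index) where step is "create" or "write" and index is the
    number of the file being handled when EDQUOT was raised.

    `create` and `write` default to os.open / os.write; they are
    parameters only so the sketch can be exercised without a real
    quota-limited filesystem (an assumption of this example).
    """
    for i in range(count):
        path = os.path.join(directory, f"probe-{i}")
        try:
            # Inode creation step: quota can be exceeded here.
            fd = create(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o600)
        except OSError as e:
            if e.errno == errno.EDQUOT:
                return ("create", i)
            raise
        try:
            # Data write step: quota can also be exceeded here.
            write(fd, payload)
        except OSError as e:
            if e.errno == errno.EDQUOT:
                return ("write", i)
            raise
        finally:
            os.close(fd)
    return None
```

Run as the quota-limited user against a directory on the filesystem in 
question, e.g. probe_quota("/mnt/gfs2/testdir", count=100000), where the 
path is hypothetical. A ("create", n) result means the limit was hit on 
the creation side rather than on a plain write.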

Also, quite often when the problem did occur, it did not cause any 
visible failure until later, making it difficult to track down.

You are correct that someone else reported the issue on the list, 
however I'm not aware of any other reports beyond yours and theirs. 
Also, this was specific to certain versions of GFS2, and not something 
that relates to all versions.

The upstream patch is here:
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2?id=059788039f1e6343f34f46d202f8d9f2158c2783

It should be available in RHEL shortly - please ping support via the 
ticket for updates,

Steve.

>> Returning to the original point however, it is certainly not recommended
>> to have mixed RHEL or CentOS versions running in the same cluster. It is
>> much better to keep everything the same, even though the GFS2 on-disk
>> format has not changed between the versions.
>
> More specifically (for those who are curious): Whilst the on-disk 
> format has not changed between EL5 and EL6, the way that RH cluster 
> members communicate with each other has.
>
> I ran a quick test some time back and the 2 different OS cluster 
> versions didn't see each other for LAN heartbeating.
>