Latest FC kernel still have SMP bugs?

Nigel Wade nmw at ion.le.ac.uk
Wed Apr 7 10:04:35 UTC 2004


Norman Gaywood wrote:
> On Tue, Apr 06, 2004 at 03:38:21PM +0100, Nigel Wade wrote:
> 
>>Nigel Wade wrote:
>>
>>>Norman Gaywood wrote:
>>>
>>>>On Fri, Apr 02, 2004 at 11:31:06AM +0100, Nigel Wade wrote:
>>>>
>>>>>Unfortunately, RH9/RHEL3 don't have the version of LDAP I require, 
>>>>>and when I tried an upgrade to openldap I started getting FUTEX 
>>>>>locking problems.
>>>>
>>>>Install the RHEL3 kernel on FC1. I installed:
>>>>
>>>>kernel-smp-2.4.21-9.0.1.EL
>>>>kernel-smp-unsupported-2.4.21-9.0.1.EL
>>
>>Ok, the bottom line is that this doesn't work either.
>>
>>I've built the RHEL3 kernel on FC1 and run the system up using this. 
>>Ordinary ldap requests seem quite happy, but I'm still seeing the futex 
>>lock problem with nss_ldap. E.g. when I add 'hosts: files ldap dns' to 
>>/etc/nsswitch.conf I get:
>>
>># strace ping host
>>munmap(0xb75ff000, 4096)                = 0
>>uname({sys="Linux", node="hostname", ...}) = 0
>>futex(0x53e4ec, FUTEX_WAIT, 2, NULL
> 
> 
> Interesting. I thought the EL kernel had the fast futex code back-ported
> but I have not checked this. So I thought the EL kernel would work like
> RH9 and FC1 kernels.

I've since found a bug listed in bugzilla specifically related to ping and 
futex - https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=110563
The workaround there is to build nss_ldap from PADL rather than use the FC1 
version. That will be my next step.

> 
> Could I ask what steps you took to build the EL kernel on FC1? I
> didn't have much luck building with rpmbuild when I tried this.

I built the kernel RPMs from the src.rpm on an RH9 system, as I couldn't get 
them to build on FC1. I then installed the kernel-source RPM on the FC1 
system. After the usual 'make mrproper; cp configs/whatever .config; make 
oldconfig; make xconfig' I edited the Makefile and set it to use gcc32 
rather than gcc. Then 'make dep; make bzImage; make modules; make 
modules_install; make install'. Then reboot with the new kernel.
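
For anyone following along, the sequence above could be sketched as a 
script roughly like this. It is only an illustration: the KSRC path is a 
guess at where kernel-source unpacks, 'configs/whatever' is still a 
placeholder for the real config file, and the gcc32 change is left as a 
manual Makefile edit. By default it only prints the commands; set 
DRY_RUN=0 to actually run them.

```shell
#!/bin/sh
# Sketch of the build steps described above (dry run by default).
# Assumption: KSRC is a hypothetical location for the kernel-source tree.
KSRC="${KSRC:-/usr/src/linux-2.4}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

build_kernel() {
    # Only change directory when really building.
    if [ "${DRY_RUN:-1}" != "1" ]; then
        cd "$KSRC" || return 1
    fi
    run make mrproper &&
    run cp configs/whatever .config &&   # placeholder config name
    run make oldconfig &&
    run make xconfig &&                  # interactive, needs X
    echo "# manual step: edit Makefile to use gcc32 rather than gcc" &&
    run make dep &&
    run make bzImage &&
    run make modules &&
    run make modules_install &&
    run make install
}

build_kernel
```

The reboot is deliberately left out, so you can check the boot loader 
entry by hand first.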

> 
> 
>>Setting LD_ASSUME_KERNEL=2.4.1 cures this particular problem, but isn't a 
>>solution for a system relying on LDAP for all authentication and NSS 
>>functions.
> 
> 
> Another suggestion for you to try is on one of the bugzillas. That is,
> to rebuild the FC1 kernel with low latency scheduling turned off. Maybe
> that's your best option.
> 
> https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=109962
> 
> Comments #42 and #44
> 

Ok, that's new since I last looked. I'll give it a go and see what happens.


-- 
Nigel Wade, System Administrator, Space Plasma Physics Group,
             University of Leicester, Leicester, LE1 7RH, UK
E-mail :    nmw at ion.le.ac.uk
Phone :     +44 (0)116 2523548, Fax : +44 (0)116 2523555




