[rhelv6-list] Kernel memory leak?

Morten P.D. Stevens mstevens at imt-systems.com
Tue Feb 1 15:45:34 UTC 2011


Hi,

We are seeing exactly the same problem in our test environment (identical IBM xSeries hardware): some machines crash after a few days.

Note: we are testing with the Scientific Linux 6 kernel 2.6.32-71.14.1.el6.x86_64, but that should make no difference compared to RHEL6.

Particularly striking is the large size of the dentry cache in slabtop.
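
A quick way to look at this on an affected box (just a sketch, nothing here
is specific to our setup; /proc/slabinfo may require root on some systems):

  # size of the dentry slab (object count, object size, total)
  grep dentry /proc/slabinfo

  # the same via slabtop, printed once and sorted by cache size
  slabtop -o -s c | head -20

  # how much slab memory the kernel considers reclaimable
  grep -E 'Slab|SReclaimable|SUnreclaim' /proc/meminfo

Comparing those numbers over a day or two makes the growth easy to see.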

Any ideas?

Best regards,

Morten

> -----Original Message-----
> From: rhelv6-list-bounces at redhat.com [mailto:rhelv6-list-
> bounces at redhat.com] On Behalf Of Masopust, Christian
> Sent: Tuesday, February 01, 2011 12:25 PM
> To: 'Red Hat Enterprise Linux 6 (Santiago) discussion mailing-list'
> Subject: Re: [rhelv6-list] Kernel memory leak?
> 
> 
> any news on this topic?
> 
> Here I have some RHEL6 systems that crash now and then (approximately
> every 10 days), and currently I have absolutely no idea why (other
> RHEL6 systems run fine on the same hardware).
> 
> How exactly can/should I monitor the memory usage?
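
We just log the slab numbers periodically so the growth curve is easy to
see. A minimal sketch (the script name, cron placement and log path are
only examples, adjust to taste):

  #!/bin/sh
  # e.g. /etc/cron.hourly/slablog -- append current slab usage to a log
  {
    date
    grep -E 'MemFree|Slab|SReclaimable|SUnreclaim' /proc/meminfo
    grep dentry /proc/slabinfo
    echo
  } >> /var/log/slab.log

Eyeballing (or plotting) that log shows whether the dentry cache keeps
growing in a straight line.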
> 
> Thanks,
> christian
> 
> 
> -----Original Message-----
> From: rhelv6-list-bounces at redhat.com [mailto:rhelv6-list-
> bounces at redhat.com] On Behalf Of Chris Adams
> Sent: Saturday, January 22, 2011 05:07
> To: Stephen John Smoogen
> Cc: Red Hat Enterprise Linux 6 (Santiago) discussion mailing-list
> Subject: Re: [rhelv6-list] Kernel memory leak?
> 
> Once upon a time, Stephen John Smoogen <smooge at gmail.com> said:
> > > I looked at "slabtop", and the dentry cache is the culprit:
> > >
> > >  OBJS ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> > > 3078900 3078900 100%    0.19K 153945       20    615780K dentry
> > >
> > > Anybody else seeing this?
> >
> > What is the box doing? How is it set up (ext3, ext4, ?). I haven't
> > really looked but would want to check with a system that is set up
> > similarly.
> 
> It is running ext4 on LVM on md-raid1.  It is running Nagios, Apache,
> Quagga, Network UPS Tools (monitoring a couple of UPSes), and smstools.
> Quagga is running OSPF (to learn routes to some of the Nagios-monitored
> devices) and BGP (to advertise routes from a home-written "bad IP"
> monitor).
> 
> The bad-IP monitor uses several Perl scripts I wrote, one of which uses
> the Linux::Inotify2 module to watch a directory where a log file is
> added and removed for each bad IP.  The last few days have been rather
> busy for my bad-IP detector; right now there are 1292 files in that
> directory from the last 48 hours.
> 
> I wondered if the single inotify could be a trigger (as that's the only
> thing really unusual), but stopping that daemon doesn't free the RAM
> from the dentry cache.
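
One test worth doing at that point (a sketch only; it needs root, and
dropping caches just costs some re-reads afterwards): ask the kernel to
throw away reclaimable dentries and inodes and see whether the cache
actually shrinks.

  sync
  echo 2 > /proc/sys/vm/drop_caches   # 2 = free dentries and inodes
  grep dentry /proc/slabinfo
  grep -E 'SReclaimable|SUnreclaim' /proc/meminfo

If the dentry count stays high after that, the objects are pinned in
memory and it looks more like a genuine leak than an aggressive cache.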
> 
> This same set of software was running on the old server (running Fedora
> 7 i386 - yes, I was that far behind).  It was different hardware, but
> the same setup except for ext3 instead of ext4 (still LVM on md-raid1,
> Nagios, Apache, etc.).  The old server's RAM usage had been level for
> years at 256M RAM; the new server started at about 512M (not unexpected
> since I switched to x86_64) and has increased in an almost perfectly
> straight line to just under 1G in 8 days.
> 
> The only other difference from the old server is that it had SELinux
> disabled, while the new one has SELinux running in permissive mode (I'm
> still trying to work out a useful policy to allow Nagios to do
> everything I need).
> 
> --
> Chris Adams <cmadams at hiwaay.net>
> Systems and Network Administrator - HiWAAY Internet Services
> I don't speak for anybody but myself - that's enough trouble.
> 
> _______________________________________________
> rhelv6-list mailing list
> rhelv6-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rhelv6-list
> 



