[rhelv6-list] Kernel memory leak?

robinprice at gmail.com robinprice at gmail.com
Tue Aug 30 14:08:09 UTC 2011


I did some more searching this morning, as I said last night I would,
but I have not found anything specific to your situation.
The only suggestions I have are:

1) Capture a vmcore while the memory is being consumed, as I mentioned,
and do a root cause analysis (RCA) on it.
2) Write a stap script to trace d_alloc in the kernel (or one of the
d_cache functions) to see who is allocating dentries, and correlate
that to a process (a perf script would help you do that pretty
easily).
3) Use lsof to see who has tons of open files.  Presumably, if you're
swapping with 100% of RAM holding dentries, someone is using those
dentries, which means lots of open files.
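
For (3), here is a rough sketch of the tally I have in mind. It assumes
the default lsof column layout (COMMAND first, PID second), and
count_open_files is just a name I made up for the helper, so adjust to
taste.

```shell
# Summarize lsof-style output read from stdin as "count command pid",
# largest count first. Assumes the first two columns are COMMAND and PID
# (the default lsof layout) and skips the header line.
count_open_files() {
    awk 'NR > 1 { n[$1 " " $2]++ }
         END { for (k in n) print n[k], k }' | sort -k1,1nr -k2
}
```

Something like "lsof -n | count_open_files | head -n 20", run as root so
every process is visible, should show the heaviest users at the top.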

Good luck.  Sorry I couldn't find anything.

If anyone here has a valid RHEL subscription, I would encourage them to
try the latest RHEL6 kernel to see if the leak is still there and, if
it is, to let GSS help find the root cause.

~rp


On Mon, Aug 29, 2011 at 7:52 PM, Abdussamad Abdurrazzaq
<abdussamad at abdussamad.com> wrote:
> On 08/30/2011 04:39 AM, robinprice at gmail.com wrote:
>>
>> On Mon, Aug 29, 2011 at 6:18 PM, Abdussamad Abdurrazzaq
>> <abdussamad at abdussamad.com>  wrote:
>>>
>>> Hello
>>>
>>> Ok please ignore my previous email (if you've seen it). It's quite
>>> confused because I posted using gmane.org.
>>>
>>> I know about how Linux reports memory usage. My problem is very much
>>> real. Memory usage keeps increasing because of a memory leak in the
>>> kernel dentry cache. This is the same problem as outlined by others here:
>>>
>>> https://www.redhat.com/archives/rhelv6-list/2011-February/msg00001.html
>>>
>>> So I was wondering whether this problem was fixed? I am using CentOS 6
>>> with the following kernel:
>>>
>>> Linux serve3.websitetheme.com. 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> At one point dentry was using 3GB plus on  my 8GB system!
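
A figure like that can be cross-checked against /proc/slabinfo. The
following is only a sketch: it assumes the slabinfo 2.x column order
(name, active_objs, num_objs, objsize), estimates the size as
num_objs * objsize, and ignores per-slab overhead. slab_mib is a
made-up helper name.

```shell
# Estimate the memory held by one slab cache (default: dentry) from
# /proc/slabinfo-style input on stdin.
# Assumed columns: name active_objs num_objs objsize ...
slab_mib() {
    awk -v cache="${1:-dentry}" \
        '$1 == cache { printf "%s %.1f MiB\n", $1, $3 * $4 / 1048576 }'
}
```

Run it as "slab_mib dentry < /proc/slabinfo" (reading that file usually
needs root) before and after dropping caches to see how much is
actually reclaimed.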
>>>
>>> I am currently using a cron job to clear the cache every so often:
>>>
>>> sync && echo 2 > /proc/sys/vm/drop_caches
>>>
>>> The above works but I am looking for a more permanent solution. To that
>>> end I tried increasing vfs_cache_pressure:
>>>
>>> echo 10000 > /proc/sys/vm/vfs_cache_pressure
>>>
>>> I also set it in /etc/sysctl.conf, but to no effect.
>>>
>>> So any idea how to fix this?
>>>
>>> Regards,
>>> Abdussamad
>>>
>>> _______________________________________________
>>> rhelv6-list mailing list
>>> rhelv6-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/rhelv6-list
>>>
>>
>> There are some things you can try. You can collect a vmcore from
>> a time period during which the system has exhausted nearly all of its
>> memory due to this leak.  Then you could analyze kernel memory (kmem)
>> usage in it to see whether the problem is in the kernel.
>>
>> You could even go as simple as just looking at top, or the contents of
>> /proc/<pid>/status periodically for the set of apps you suspect.  If
>> you do have a single app that is leaking memory, you should be able to
>> record and graph a consistent increasing trend in the amount of memory
>> the faulty app is leaking.  That would at least give you a starting
>> point for where to use an app like valgrind.
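
A minimal sketch of that periodic check, in case it helps. It assumes
the usual "VmRSS:  <n> kB" line in /proc/<pid>/status; the function
names are made up for illustration.

```shell
# Print the VmRSS value (in kB) from a /proc/<pid>/status-style file.
vm_rss_kb() {
    awk '$1 == "VmRSS:" { print $2 }' "$1"
}

# Print "epoch pid rss_kb" for one PID. Prints nothing if the process is
# gone or has no VmRSS line (kernel threads do not have one).
log_rss() {
    rss=$(vm_rss_kb "/proc/$1/status" 2>/dev/null)
    [ -n "$rss" ] && echo "$(date +%s) $1 $rss"
}
```

Calling log_rss for each suspect PID once a minute and appending the
output to a file gives you a series you can graph directly.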
>>
>> Also, if you don't know which proc is at fault, I'd start with
>> /etc/crontab:
>>   */5 * * * * root ps axo comm,vsize,rss | tail -n +2 >> /tmp/rawdata
>>
>> This would collect /proc/pid/statm data for a while.
>>
>> This should work on an selinux-enforcing machine based on the output of:
>>  # sesearch -As crond_t | grep tmp
>>
>> Then use awk to find the low- and high-water marks for each comm.
>>
>> You could get fancy and add timestamps to the records; maybe track
>> current, low-water, and high-water marks, then gnuplot with error
>> bars.
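
The awk step could look roughly like this. It is a sketch that assumes
each record in /tmp/rawdata is "comm vsize rss" as written by the cron
job above, with rss in KiB as ps reports it. rss_water_marks is a
made-up name.

```shell
# Print "comm low_rss high_rss" per command from "comm vsize rss" records
# read from the named files (or stdin). RSS is taken from the last field,
# so command names that contain spaces still parse.
rss_water_marks() {
    awk 'NF >= 3 {
        rss = $NF + 0
        name = $1
        for (i = 2; i <= NF - 2; i++) name = name " " $i
        if (!(name in lo) || rss < lo[name]) lo[name] = rss
        if (!(name in hi) || rss > hi[name]) hi[name] = rss
    }
    END { for (n in lo) print n, lo[n], hi[n] }' "$@" | sort
}
```

Run as "rss_water_marks /tmp/rawdata". A command whose high-water mark
keeps moving away from its low-water mark over several days is the one
to look at first.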
>>
>> As for your drop_caches workaround, be aware that drop_caches may
>> degrade performance, because cached data is discarded and the
>> system has to load it from disk again if it is needed.
>>
>> Use the ps output collected by the cron job above to locate which
>> program has a growing RSS.
>> sysstat (/var/log/sa/sar*) may also provide some historical memory
>> information that you may be interested in.
>>
>> Hope this gives you somewhere to look.
>>
>> I will follow up on the thread you mentioned.
>>
>> ~rp
>>
> I don't understand. Isn't dentry cache managed by the kernel? So why would I
> look at applications for possible leaks when it's obviously the kernel that's
> at fault here? Please read my post again including the thread I linked to.
> It seems to me you've misunderstood my problem.



