[Spacewalk-list] 2.0 Upgrade - serious performances issues during daily "diff profiled config files and deployed config files"

Sebastien Leterrier sleterrier at blizzard.com
Sat Aug 3 06:55:02 UTC 2013


I actually take part of this back: the upgrade to 2.0 was completed 9 days ago (07/24), but we have only been observing this behavior for 3 days (since 07/31).

I am not aware of any changes that might have happened on the host on that date, but I am still investigating. Any suggestions are of course more than welcome.



Thanks



From: spacewalk-list-bounces at redhat.com [mailto:spacewalk-list-bounces at redhat.com] On Behalf Of Sebastien Leterrier
Sent: Friday, August 02, 2013 11:44 PM
To: spacewalk-list at redhat.com
Subject: [Spacewalk-list] 2.0 Upgrade - serious performances issues during daily "diff profiled config files and deployed config files"



Hello,



Since upgrading our master instance to Spacewalk 2.0, we have been experiencing extremely high CPU and memory usage as soon as the daily "Show differences between profiled config files and deployed config files" <https://itspacewalk/rhn/schedule/CompletedSystems.do?aid=20766> job kicks in at 23:00. The server is pretty much unresponsive for the next two hours (API calls fail, the GUI times out), busy as it is with the hundreds of postmaster processes that we see popping up during that time. A few additional facts:

- The number of pending actions processed during that time is close to zero. At the end of this frenzy, a quarter of our clients have been diff'ed and the rest end up marked as "failed action".

- Once the job has kicked in, cancelling the actions does not solve anything. The peak in resource usage continues until 01:00, whatever happens. I even tried a server reboot yesterday, but as soon as it came back up, the postmaster processes started to queue again.
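
For anyone who wants to compare numbers, here is a minimal sketch of how the postmaster process count could be sampled during the 23:00 to 01:00 window. It assumes a Linux host with the procps `ps` command and that the database backends appear under the command name "postmaster"; the function names are hypothetical and not part of Spacewalk.

```python
import subprocess
from collections import Counter


def count_by_command(ps_output: str) -> Counter:
    """Count processes per command name from `ps -eo comm=` output."""
    names = (line.strip() for line in ps_output.splitlines())
    # Skip empty lines so blank output does not produce a bogus entry.
    return Counter(name for name in names if name)


def sample_process_count(command: str = "postmaster") -> int:
    """Return how many running processes currently have the given command name.

    The default name "postmaster" is an assumption; adjust it if the
    database backends are listed under a different name on your host.
    """
    output = subprocess.run(
        ["ps", "-eo", "comm="],  # one command name per line, no header
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return count_by_command(output)[command]
```

Running sample_process_count() once a minute (from cron, for example) and logging the result with a timestamp would show whether the count starts climbing exactly at 23:00 and drops again at 01:00.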



The number of managed clients (~150) barely changed between 1.9 and 2.0 on our side: a few hosts might have been added, but fewer than 5 in any case. Looking at our monitoring graphs, we barely noticed this operation before the upgrade to 2.0 (CPU usage went from 20% to 25%). Since the upgrade, however, the CPU graphs peak at 100% between 23:00 and 01:00 (precisely!) every day.



I am guessing the way these diffs are being done has changed in 2.0? I would greatly appreciate some help in troubleshooting this further and, hopefully, getting back to a stable state.



Thanks.

Sebastien L.


