[Spacewalk-list] Massive problems with slow updates on rhnServerAction
Patrick Hurrelmann
patrick.hurrelmann at lobster.de
Wed Oct 31 09:01:57 UTC 2012
On 19.09.2012 10:13, Patrick Hurrelmann wrote:
> Hi List,
>
> For some weeks now my SW 1.7 on CentOS 6.3 has been grinding to a halt
> regularly, and it is getting worse from day to day. Right now I have to
> restart it several times a day. The db connections to PostgreSQL fail with
> "FATAL: sorry, too many clients already". The max connections setting has
> already been bumped several times and is currently set to 300. But that's
> not the real problem, it seems.
>
> I tried to track it down and stumbled upon frequent updates on the
> table rhnServerAction that take ages (several hours for a single update
> statement) to complete. The client seems to run into a timeout and
> reissues the statements (I have some update statements several times in
> the logs) while the old ones are still running, until all connections are
> in use and SW grinds to a halt.
> E.g.:
> 2012-09-18 11:24:31 CEST [3284]: [118-1] LOG: duration: 8250548.876 ms
> statement:
> update rhnServerAction
> set status = 1,
> pickup_time = current_timestamp,
> remaining_tries = 3 - 1
> where action_id = 6233
> and server_id = 1000010014
>
> 2012-09-18 11:24:31 CEST [3119]: [295-1] LOG: duration: 8248422.890 ms
> statement:
> update rhnServerAction
> set status = 1,
> pickup_time = current_timestamp,
> remaining_tries = 3 - 1
> where action_id = 6252
> and server_id = 1000010007
>
> For each update on rhnServerAction the trigger
> rhn_server_action_mod_trig_fun() is fired, but I still can't see why the
> update should take so long. Manually analyzing the updates does not show
> anything suspicious.
>
> My SW installation is not that big (35 clients, with osad and
> configuration management). Total database size is 2.3 GB. The table
> rhnServerAction itself only has 4600 rows.
>
>
> Can anybody please help in this regard or shed some light on this?
>
> Regards
> Patrick
>
Hi all,
just an update on the issue. I think I finally managed to fix this. After
reading the thread "rhn_check hangs"
(https://www.redhat.com/archives/spacewalk-list/2012-October/msg00024.html)
and the associated bugzilla entries, I tried the patch for
python-psycopg2 myself, as I found similar errors in my logs and
rhn_check hung several times. And it seems to be the cure. Since I
applied the patch and built a new rpm locally, I no longer have any
hanging update statements. All is running smoothly. I could even
re-enable osad on the clients and disable my nightly restart of SW. There
are still idle connections, but that's a different issue for sure.
The bugzilla entry and patch for this is
https://bugzilla.redhat.com/show_bug.cgi?id=843723. Maybe someone else
can verify this and test whether it fixes their problems, too?
Is there any progress in getting this pushed upstream? From my pov this
is becoming a showstopper. It seems that many problems are connected to this.
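For anyone who wants to check their own logs for the same symptom, here is a
minimal Python sketch that picks out slow statements from a PostgreSQL log.
It assumes the log line format quoted above (timestamp, timezone, [pid],
[line-part], then "LOG: duration: ... ms"); that format depends on your
log_line_prefix, so the regular expression may need adjusting. The function
name and the 60 second threshold are my own choices for illustration only.

```python
import re
import sys

# Assumed line format (taken from the log excerpt quoted above):
# 2012-09-18 11:24:31 CEST [3284]: [118-1] LOG: duration: 8250548.876 ms
DURATION_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \S+ "
    r"\[(?P<pid>\d+)\]: \[\d+-\d+\] LOG:\s+duration: (?P<ms>\d+(?:\.\d+)?) ms"
)


def slow_statements(lines, threshold_ms=60000.0):
    """Yield (timestamp, pid, hours) for logged durations above threshold_ms."""
    for line in lines:
        match = DURATION_RE.search(line)
        if match is None:
            continue  # not a duration line, or a different log_line_prefix
        ms = float(match.group("ms"))
        if ms > threshold_ms:
            # 1 hour = 3,600,000 ms
            yield match.group("ts"), int(match.group("pid")), ms / 3600000.0


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python slow_statements.py /path/to/postgresql.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for ts, pid, hours in slow_statements(log):
            print(f"{ts} pid={pid} took {hours:.2f} h")
```

Run against the two log lines from my first mail, this reports both updates
at roughly 2.3 hours each. If it reports nothing after the patched
python-psycopg2 is installed, the hanging updates are most likely gone.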
Regards
Patrick
--
Lobster LOGsuite GmbH, Münchner Straße 15a, D-82319 Starnberg
HRB 178831, Amtsgericht München
Geschäftsführer: Dr. Martin Fischer, Rolf Henrich