[Spacewalk-list] CPU bound postgresql process ? Why and how to stop it?

Paul Robert Marino prmarino1 at gmail.com
Mon Dec 9 21:24:53 UTC 2013


sorry for the delayed responce on this I thought I sent it but its
been sitting in my drafts folder

On Wed, Dec 4, 2013 at 11:21 AM, Boyd, Robert
<Robert.Boyd at peoplefluent.com> wrote:
> I have several hundred servers in spacewalk.   How much time should I allow for a window to run this db maintenance?
that depends on how fragmented the tables and free space in the tables
are which is a direct correlation to how many changes have been made
since autovacuume has cleaned it up.
my facilities are 24x7 365 mission critical but on the weekend no
changes are allowed to any systems so I have a cron job that does it
at midnight on the first Saturday of each month and I don't really
time it.

if its the first time you are doing it I would schedule 4 hours.
1 hour if you skip the reindex and second vacuum analyze. the reindex
is whats really time consuming and not really required.


>
> And how does this relate to the bugzilla that Michael mentioned?

Yes and no they are and are not related
Not doing the maintenance I mentioned would significantly exacerbate
the issue in the bugzilla ticket.
this is because it seems to be a planner issue and the planer utilizes
the statistics created by the analyze process to figure out how to
handle complex queries like this.

additionally you could more quickly help it if you tune the planner
for example in ~postgres/data/postgresql.conf adjusting the size of
effective_cache_size can be extremely helpful.

Also turning on constraint_exclusion in the same file is helpfull with
spacewalk! it means the planner takes longer but can automatically
figure out things to exclude in subqueries and joins based on
constraints in the other parts of the overall query so the resulting
query can be faster.
by default constraint_exclusion is turned off in PostgreSQL because if
you don't do a lot of joins and conditional sub queries it will hurt
you performance but in the case of spacewalk its a significant help.

the query planner in PostgreSQL is a complex subject
but the major things to look at for spacewalk tuning are

shared_buffers (increase a lot)
work_mem (increase a lot if possible)
effective_cache_size (definitely increase a lot)
default_statistics_target (increase a little or a lot but the down
size is it makes analyze take longer the more you increase it and
there is a point of diminishing returns)
constraint_exclusion (enable this)
from_collapse_limit (increase slightly)
join_collapse_limit (increase slightly)


Also you you want to speed up vacuum, autovacuum, and reindex, and
analyze operations increasing the maintenance_work_mem is helpful for
that but keep in mind that autovacuums may use that during normal
database operation so be careful not to make it too high so for
example on my production boxes I have it set to 1GB




>
> Michael, how long before the spacewalk-java update will move from nightly to production release?
>
> Thank you both for your assistance!
>
> Robert
>
> -----Original Message-----
> From: spacewalk-list-bounces at redhat.com [mailto:spacewalk-list-bounces at redhat.com] On Behalf Of Paul Robert Marino
> Sent: Tuesday, December 03, 2013 1:16 PM
> To: spacewalk-list at redhat.com
> Subject: Re: [Spacewalk-list] CPU bound postgresql process ? Why and how to stop it?
>
> I think I know whats going on here.
> The autovacuum process in PostgreSQL 8.x was new and wasn't quite perfected yet. you need to stop spacewalk and do a full vacuum analyze on the database.
> chances are your planner stats are out of date. It also may not hurt to do a reindexdb as well.
> I schedule a monthly window to do these procedures to absolutely ensure that every thing is working optimally.
>
> here is what I run
> "
> #!/bin/bash
> spacewalk-service stop
> export PGUSER=postgres
> export PGPASSWORD=password
> vacuumdb -azvf
> reindexdb -a
> vacuumdb -a
> spacewalk-service start
> "
>
>
> On Tue, Dec 3, 2013 at 11:52 AM, Boyd, Robert <Robert.Boyd at peoplefluent.com> wrote:
>> Charles,  Here's what I found associated with the CPU bound process:
>> -- so, what is this being triggered by and what is it doing? Why is it
>> taking so long?  And is it ever going to finish?
>>
>>
>>
>> spaceschema=# select * from pg_stat_activity;
>>
>> datid |   datname   | procpid | usesysid |  usename  |
>> current_query
>>
>>                         | waiting |          xact_start           |
>> query_start          |         backend_start         | client_addr | client
>>
>> _port
>>
>>
>>
>> 16384 | spaceschema |   15809 |    16388 | spaceuser | SELECT DISTINCT
>> snv.server_id AS server_id, S.name, S.release, SA.name as arch,
>>
>>                         | f       | 2013-12-02 13:30:00.7034-05   |
>> 2013-12-03 11:45:02.564862-05 | 2013-12-02 12:43:22.424659-05 |
>> 127.0.0.1
>> |
>>
>> 43711
>>
>>                                                       :         urn.user_id
>>
>>
>>
>>                                                       :       FROM (
>>
>>
>>
>>                                                       : --
>>
>>
>>
>>                                                       : select
>> rhnChannelErrata.errata_id, rhnChannelErrata.channel_id,
>> rhnServerChannel.server_id, rhnE
>>
>> rrataPackage.package_id
>>
>>                                                       : from
>> rhnChannelErrata, rhnErrataPackage, rhnChannelNewestPackage,
>> rhnPackageEVR,
>>
>>
>>
>>                                                       :
>> rhnServerChannel, rhnServerPackage, rhnPackageUpgradeArchCompat
>>
>>
>>
>>                                                       : where
>> rhnChannelErrata.errata_id = rhnErrataPackage.errata_id
>>
>>
>>
>>                                                       : --
>>
>>
>>
>>                                                       :         and
>> rhnChannelErrata.channel_id = rhnChannelNewestPackage.channel_id
>>
>>
>>
>>                                                       :         and
>> rhnErrataPackage.package_id = rhnChannelNewestPackage.package_id
>>
>>
>>
>>                                                       : --
>>
>>
>>
>>                                                       :         and
>> rhnChannelErrata.channel_id = rhnServerChannel.channel_id
>>
>>
>>
>>                                                       :         and
>> rhnChannelNewestPackage.name_id = rhnServerPackage.name_id
>>
>>
>>
>>                                                       :         and
>> rhnServerChannel.server_id = rhnServerPackage.server_id
>>
>>
>>
>>                                                       : --
>>
>>
>>
>>                                                       :         and
>> rhnChannelNewestPackage.evr_id = rhnPackageEVR.id
>>
>>
>>
>>                                                       : --
>>
>>
>>
>>                                                       :         and
>> rhnServerPackage.package_arch_id =
>> rhnPackageUpgradeArchCompat.package_arch_id
>>
>>
>>
>>                                                       :         and
>> rhnPackageUpgradeArchCompat.pack
>>
>>
>>
>> From: spacewalk-list-bounces at redhat.com
>> [mailto:spacewalk-list-bounces at redhat.com] On Behalf Of Charles
>> Richards
>> Sent: Tuesday, December 03, 2013 11:18 AM
>> To: spacewalk-list at redhat.com
>> Subject: Re: [Spacewalk-list] CPU bound postgresql process ? Why and
>> how to stop it?
>>
>>
>>
>> You can check what postgresql is doing by running the following query
>> as the postgres user:
>>
>>
>>
>> select * from pg_stat_activity;
>>
>>
>>
>> I've seen this behavior on my instance when the errata_cache task
>> runs, but have not found a resolution for it (running SW 1.9  / PG
>> 8.4.13)
>>
>>
>>
>>
>>
>> On Dec 2, 2013, at 1:18 PM, Boyd, Robert
>> <Robert.Boyd at peoplefluent.com>
>> wrote:
>>
>>
>>
>> I'm running spacewalk 2.0 on RHEL 6.4 with database local postgresql
>> 8.4 all patched current.  In recent days I'm seeing one or more very
>> busy processes like this one:
>>
>>
>>
>> 15809 postgres  20   0 1161m 256m 225m R 99.3  6.7  64:35.98 postgres:
>> spaceuser spaceschema 127.0.0.1(43711) SELECT
>>
>>
>>
>> The server has been CPU bound since about 11/21/2013.   I've tried
>> stopping/restarting spacewalk and postgresql, but as soon as both get
>> fully started one postgresql process goes CPU bound.
>>
>>
>>
>> Here's a CPU usage graph from a couple of days before through a couple
>> of days after the problem began.
>>
>>
>>
>> <image005.jpg>
>>
>>
>>
>> How can I begin to find out what is going on with spacewalk and the
>> database that is keeping it busy like this?
>>
>>
>>
>> I tried looking in the /var/log files for something at the system
>> level but there just isn't anything noteworthy in the time interval
>> where this problem began.
>>
>>
>>
>> In the pgsql logs (e.g. /var/lib/pgsql/data/pg_log/postgresql-Sun.log)
>> I see a lot of messages like these:
>>
>>
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  automatic analyze of table "spaceschema.public.qrtz_triggers"
>> system
>> usage: CPU 0.00s/0.00u sec elapsed 0.18 sec
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  automatic vacuum of table "spaceschema.public.qrtz_triggers":
>> index
>> scans: 0
>>
>>         pages: 0 removed, 1421 remain
>>
>>         tuples: 0 removed, 30797 remain
>>
>>         system usage: CPU 0.17s/0.03u sec elapsed 2.33 sec
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  automatic analyze of table "spaceschema.public.qrtz_cron_triggers"
>> system usage: CPU 0.00s/0.00u sec elapsed 0.08 sec
>>
>> LOG:  automatic vacuum of table "spaceschema.public.qrtz_cron_triggers":
>> index scans: 0
>>
>>         pages: 0 removed, 786 remain
>>
>>         tuples: 0 removed, 30856 remain
>>
>>         system usage: CPU 0.00s/0.00u sec elapsed 0.15 sec
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> ERROR:  -20243 : (package_arch_not_found) - Package architecture could
>> not be found
>>
>> CONTEXT:  SQL statement "SELECT
>> rhn_exception.raise_exception('package_arch_not_found')"
>>
>>         PL/pgSQL function "lookup_package_arch" line 14 at PERFORM
>>
>> STATEMENT:
>>
>>                     insert into rhnServerPackage
>>
>>                     (server_id, name_id, evr_id, package_arch_id,
>> installtime)
>>
>>                     values (1000010465,
>> LOOKUP_PACKAGE_NAME(E'keyutils-libs'), LOOKUP_EVR(NULL, E'1.0', E'2'),
>>
>>                         LOOKUP_PACKAGE_ARCH(E''), TO_TIMESTAMP(NULL,
>> 'YYYY-MM-DD HH24:MI:SS')
>>
>>                     )
>>
>>
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>> LOG:  unexpected EOF on client connection
>>
>>
>>
>> Robert Boyd
>>
>> Senior Systems Engineer | Peoplefluent
>>
>> p. 919-645-2972 | c. 919-306-4681
>>
>> e. Robert.Boyd at peoplefluent.com
>>
>> Visit: www.peoplefluent.com | Read: Peoplefluent Blog
>>
>> Follow: @peoplefluent | Download: iPad App
>>
>> <image001.png>
>>
>> <image002.png>
>>
>>
>>
>> _______________________________________________
>> Spacewalk-list mailing list
>> Spacewalk-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/spacewalk-list
>>
>>
>>
>>
>> _______________________________________________
>> Spacewalk-list mailing list
>> Spacewalk-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/spacewalk-list
>
> _______________________________________________
> Spacewalk-list mailing list
> Spacewalk-list at redhat.com
> https://www.redhat.com/mailman/listinfo/spacewalk-list
>
> _______________________________________________
> Spacewalk-list mailing list
> Spacewalk-list at redhat.com
> https://www.redhat.com/mailman/listinfo/spacewalk-list




More information about the Spacewalk-list mailing list