[Spacewalk-list] Run away DELETE query eating my disk space
Jon Miller
jonebird at gmail.com
Sun May 4 00:29:40 UTC 2014
That's another fine suggestion, though I still believe the issue is
related to the actual SQL queries being run. I completed my planned
routine last evening, but simply recreating the DB and reloading my backup
didn't change anything. I then changed the following PostgreSQL tunables
(sketched in postgresql.conf form below the list):
- shared_buffers from (default?) 32MB => 2048MB
- work_mem from 1MB => 20MB
- maintenance_work_mem from 16MB => 300MB
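In postgresql.conf form, the changes amount to this sketch (note that
shared_buffers takes effect only after a full server restart; the other two
need just a reload):

# postgresql.conf (sketch of the changes above)
shared_buffers       = 2048MB   # was 32MB, the 8.4 default; restart required
work_mem             = 20MB     # was 1MB; per sort/hash operation, per backend
maintenance_work_mem = 300MB    # was 16MB; used by VACUUM and CREATE INDEX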
After changing those, I'm able to run the SELECT version of that pesky query
that I was unable to complete before. So that's a victory, but unfortunately
I'm still staring at a fluctuating filesystem: it goes from 52G available
down to nothing in about 11 minutes, then presumably something is removed
and the cycle repeats from 52G free.
When I use "inotifywatch" to help characterize the activity on the filesystem
(limited to opens and creates), I get the following:
$ inotifywatch -r /local/mnt -e open -e create
Establishing watches...
Finished establishing watches, now collecting statistics.
total    open     create  filename
2990751  2986326  4425    /local/mnt/workspace/pgsql/data/base/pgsql_tmp/
780      780      0       /local/mnt/workspace/pgsql/data/base/314186/
710      682      28      /local/mnt/workspace/pgsql/data/pg_stat_tmp/
154      154      0       /local/mnt/workspace/pgsql/data/global/
77       77       0       /local/mnt/openv/var/host_cache/
49       25       24      /local/mnt/openv/var/host_cache/19a/
33       17       16      /local/mnt/openv/var/host_cache/021/
...
So that's why my attention stays on the DB. And when I look at what queries
are currently running (a pg_stat_activity sketch follows the query below),
I keep seeing only this one:
SELECT DISTINCT snv.server_id AS server_id, S.name, S.release, SA.name as arch, urn.user_id
FROM (select rhnChannelErrata.errata_id, rhnChannelErrata.channel_id, rhnServerChannel.server_id,rhnErrataPackage.package_id
from rhnChannelErrata, rhnErrataPackage, rhnChannelNewestPackage, rhnPackageEVR,
rhnServerChannel, rhnServerPackage, rhnPackageUpgradeArchCompat
where rhnChannelErrata.errata_id = rhnErrataPackage.errata_id
and rhnChannelErrata.channel_id = rhnChannelNewestPackage.channel_id
and rhnErrataPackage.package_id = rhnChannelNewestPackage.package_id
and rhnChannelErrata.channel_id = rhnServerChannel.channel_id
and rhnChannelNewestPackage.name_id = rhnServerPackage.name_id
and rhnServerChannel.server_id = rhnServerPackage.server_id
and rhnChannelNewestPackage.evr_id = rhnPackageEVR.id
and rhnServerPackage.package_arch_id = rhnPackageUpgradeArchCompat.package_arch_id
and rhnPackageUpgradeArchCompat.pack )
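(A sketch of the pg_stat_activity query one can use to watch this on 8.4;
these columns were renamed to pid and query in PostgreSQL 9.2:)

-- show non-idle backends, longest-running first (8.4 column names)
SELECT procpid,
       now() - query_start AS runtime,
       current_query
  FROM pg_stat_activity
 WHERE current_query <> '<IDLE>'
 ORDER BY query_start;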
Does PostgreSQL use pgsql_tmp space when a query cannot be completed within
working memory (work_mem)?
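(If so, the log_temp_files setting, available since 8.3, should make each
spill visible in the server log; a sketch of what I could try:)

-- in postgresql.conf:
--   log_temp_files = 0    # log every temporary file created, with its size
-- then reload the configuration as a superuser:
SELECT pg_reload_conf();
-- the server log should then show lines like:
--   LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmpNNNN.N", size NNNNNNN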
Thanks again,
Jon Miller
Paul Robert Marino <prmarino1 at gmail.com> writes:
> Try increasing taskomatic's working memory. I had a similar issue recently
> with taskomatic, where my repo syncs were using more RAM than allowed. The
> default is 1 GB; I increased it to two and my problem went away.
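(For the archive: as I understand it, taskomatic's heap is set in
/etc/rhn/rhn.conf and read when the service starts; the parameter name below
is my assumption from the Spacewalk docs:)

# /etc/rhn/rhn.conf (parameter name assumed from Spacewalk docs)
taskomatic.java.maxmemory = 2048   # taskomatic JVM heap in MB; default 1024
# then restart the service: service taskomatic restart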
>
> ---------------------------------------------------------------------------
> On May 2, 2014 19:01, Jon Miller <jonebird at gmail.com> wrote:
>
> My current plan is to stop Spacewalk this evening, dump the DB, recreate
> the DB, and load my data back. Why? I was troubleshooting another issue
> related to clients that use a Spacewalk proxy. While I was strace'ing
> Tomcat, I found that forked processes were segfaulting, and the thing they
> did just before was reading from a PostgreSQL socket. My evening procedure
> is going off the theory that my DB is somehow corrupted.
>
> Meanwhile, here are some specs on our Spacewalk server:
> - VMware instance running RHEL 6.4
> + 8GB of memory, 2 vCPUs
> - PostgreSQL version 8.4.13-1.el6_3.x86_64
> - Spacewalk version 2.0.1-1 (spacewalk-postgresql)
>
> Thanks for the interest,
> Jon Miller
>
> Paul Robert Marino <prmarino1 at gmail.com> writes:
>
>> Increase your working memory. Also increase your maintenance working
>> memory. That will help, but we need to know more details before we can
>> help:
>> - The exact specs of the box.
>> - The PostgreSQL configuration.
>> - The exact version of PostgreSQL you are running; there are huge
>>   differences between 8 and 9.
>> - What OS and version you are running on.
>> For example, I run PostgreSQL 9.x on RHEL 6, which causes huge problems
>> unless you change the default Java driver (at the time I did it, a manual
>> operation). This can also be a problem for external PostgreSQL databases.
>>
>> ---------------------------------------------------------------------------
>> On May 1, 2014 21:52, Jon Miller <jonebird at gmail.com> wrote:
>>
>> Chris <dmagick at gmail.com> writes:
>>
>>>> Thanks for the reply. I wondered if I could do that, but it would have
>>>> been a while before I came up with that query. Perhaps upstream folks
>>>> would like to test that version for Task_queries.xml within the git repo.
>>>>
>>>> Like before, I like to turn it into a "select count(*)" version first as
>>>> a test. I ran it and forgot about it until I realized it was still
>>>> running, and killed it after ~3.5 hours.
>>>>
>>>> spaceschema=# select count(*) from rhnPackageChangeLogData where id in
>>>> spaceschema-# (SELECT d.id from rhnPackageChangeLogData d
>>>> spaceschema(# left join rhnPackageChangeLogRec l on
>> (d.id=l.changelog_data_id)
>>>> spaceschema(# where l.changelog_data_id is null);
>>>> ^C
>>>> Session terminated, killing shell... ...killed.
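(In hindsight, the same orphan count can be phrased with NOT EXISTS, which
8.4 can plan as an anti-join; a sketch I have not timed here:)

-- count changelog-data rows with no referencing changelog record
SELECT count(*)
  FROM rhnPackageChangeLogData d
 WHERE NOT EXISTS (SELECT 1
                     FROM rhnPackageChangeLogRec l
                    WHERE l.changelog_data_id = d.id);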
>>>
>>> Yikes.
>>>
>>> Time to check indexes? Check there is one on
>>> rhnpackagechangelogrec(changelog_data_id):
>>>
>>> spacewalk=# \d rhnpackagechangelogrec
>>>            Table "public.rhnpackagechangelogrec"
>>>       Column       |           Type           |       Modifiers
>>> -------------------+--------------------------+------------------------
>>>  id                | numeric                  | not null
>>>  package_id        | numeric                  | not null
>>>  changelog_data_id | numeric                  | not null
>>>  created           | timestamp with time zone | not null default now()
>>>  modified          | timestamp with time zone | not null default now()
>>> Indexes:
>>>     "rhn_pkg_clr_id_pk" PRIMARY KEY, btree (id)
>>>     "rhn_pkg_clr_pid_cld_uq" UNIQUE, btree (package_id, changelog_data_id)
>>>     "rhn_pkg_clr_cld_uq" btree (changelog_data_id)
>>>
>>> <snip>
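(Had the index been missing, something like the following would create it;
the index name here is hypothetical:)

-- hypothetical name; builds the index Chris is checking for
CREATE INDEX rhn_pkg_clr_cld_idx
    ON rhnpackagechangelogrec (changelog_data_id);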
>>
>> Yes, I have the same indexes:
>> spaceschema=# \d rhnpackagechangelogrec
>>            Table "public.rhnpackagechangelogrec"
>>       Column       |           Type           |       Modifiers
>> -------------------+--------------------------+------------------------
>>  id                | numeric                  | not null
>>  package_id        | numeric                  | not null
>>  changelog_data_id | numeric                  | not null
>>  created           | timestamp with time zone | not null default now()
>>  modified          | timestamp with time zone | not null default now()
>> Indexes:
>>     "rhn_pkg_clr_id_pk" PRIMARY KEY, btree (id)
>>     "rhn_pkg_clr_pid_cld_uq" UNIQUE, btree (package_id, changelog_data_id)
>>     "rhn_pkg_clr_cld_uq" btree (changelog_data_id)
>>
>>> and the one on 'id' in changelogData should be there because it's a
>>> primary key:
>>>
>>> spacewalk=# \d rhnpackagechangelogdata
>>>          Table "public.rhnpackagechangelogdata"
>>>  Column  |           Type           |       Modifiers
>>> ---------+--------------------------+------------------------
>>>  id      | numeric                  | not null
>>>  name    | character varying(128)   | not null
>>>  text    | character varying(3000)  | not null
>>>  time    | timestamp with time zone | not null
>>>  created | timestamp with time zone | not null default now()
>>> Indexes:
>>>     "rhn_pkg_cld_id_pk" PRIMARY KEY, btree (id)
>>>
>>> <snip>
>>
>> Same here as well or perhaps an extra one? (maybe you snipped too much?)
>> spaceschema=# \d rhnpackagechangelogdata
>>          Table "public.rhnpackagechangelogdata"
>>  Column  |           Type           |       Modifiers
>> ---------+--------------------------+------------------------
>>  id      | numeric                  | not null
>>  name    | character varying(128)   | not null
>>  text    | character varying(3000)  | not null
>>  time    | timestamp with time zone | not null
>>  created | timestamp with time zone | not null default now()
>> Indexes:
>>     "rhn_pkg_cld_id_pk" PRIMARY KEY, btree (id)
>>     "rhn_pkg_cld_nt_idx" btree (name, "time")
>>
>> Two days ago I actually shut down Spacewalk so I could do a "VACUUM FULL
>> VERBOSE ANALYZE;", and then I rebuilt every index on the DB before
>> bringing Spacewalk back up again.
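(For the record, that maintenance pass amounted to roughly this, run from
psql while connected to the spaceschema database and with Spacewalk stopped:)

-- rewrite tables in place, reclaim space, and refresh planner statistics
VACUUM FULL VERBOSE ANALYZE;
-- rebuild every index in the connected database
REINDEX DATABASE spaceschema;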
>>
>> While the query was running yesterday, I was watching "vmstat" output,
>> and while the query was busy there didn't seem to be any blatant resource
>> constraint on the system as a whole: a lot of block I/O and occasional
>> swap activity, but definitely not thrashing.
>>
>> Have you made any tweaks to your postgresql.conf that you can share? That
>> is where my head is now, but I've got more reading and learning to do
>> before I start introducing tuning changes.
>>
>> Thanks,
>