[Spacewalk-list] Run away DELETE query eating my disk space
Jon Miller
jonebird at gmail.com
Fri May 2 01:46:50 UTC 2014
Chris <dmagick at gmail.com> writes:
>> Thanks for the reply. I wondered if I could do that but it would have been a
>> while before I would have came up with that query. Perhaps upstream folks
>> would like to test that version for Task_queries.xml within the git repo.
>>
>> Like before, I like to turn it into a "select count(*)" version first as a
>> test. I ran it and forgot about it until I realized it was still running and
>> killed it after ~3.5hours.
>>
>> spaceschema=# select count(*) from rhnPackageChangeLogData where id in
>> spaceschema-# (SELECT d.id from rhnPackageChangeLogData d
>> spaceschema(# left join rhnPackageChangeLogRec l on (d.id=l.changelog_data_id)
>> spaceschema(# where l.changelog_data_id is null);
>> ^C
>> Session terminated, killing shell... ...killed.
>
> Yikes.
>
> Time to check indexes? Check there is one on
> rhnpackagechangelogrec(changelog_data_id):
>
> spacewalk=# \d rhnpackagechangelogrec
> Table "public.rhnpackagechangelogrec"
> Column | Type | Modifiers
> -------------------+--------------------------+------------------------
> id | numeric | not null
> package_id | numeric | not null
> changelog_data_id | numeric | not null
> created | timestamp with time zone | not null default now()
> modified | timestamp with time zone | not null default now()
> Indexes:
> "rhn_pkg_clr_id_pk" PRIMARY KEY, btree (id)
> "rhn_pkg_clr_pid_cld_uq" UNIQUE, btree (package_id, changelog_data_id)
> "rhn_pkg_clr_cld_uq" btree (changelog_data_id)
>
> <snip>
Yes, I have the same indexes:
spaceschema=# \d rhnpackagechangelogrec
Table "public.rhnpackagechangelogrec"
Column | Type | Modifiers
-------------------+--------------------------+------------------------
id | numeric | not null
package_id | numeric | not null
changelog_data_id | numeric | not null
created | timestamp with time zone | not null default now()
modified | timestamp with time zone | not null default now()
Indexes:
"rhn_pkg_clr_id_pk" PRIMARY KEY, btree (id)
"rhn_pkg_clr_pid_cld_uq" UNIQUE, btree (package_id, changelog_data_id)
"rhn_pkg_clr_cld_uq" btree (changelog_data_id)
> and the one on 'id' in changelogData should be there because it's a
> primary key:
>
> spacewalk=# \d rhnpackagechangelogdata
> Table "public.rhnpackagechangelogdata"
> Column | Type | Modifiers
> ---------+--------------------------+------------------------
> id | numeric | not null
> name | character varying(128) | not null
> text | character varying(3000) | not null
> time | timestamp with time zone | not null
> created | timestamp with time zone | not null default now()
> Indexes:
> "rhn_pkg_cld_id_pk" PRIMARY KEY, btree (id)
>
> <snip>
Same here as well or perhaps an extra one? (maybe you snipped too much?)
spaceschema=# \d rhnpackagechangelogdata
Table "public.rhnpackagechangelogdata"
Column | Type | Modifiers
---------+--------------------------+------------------------
id | numeric | not null
name | character varying(128) | not null
text | character varying(3000) | not null
time | timestamp with time zone | not null
created | timestamp with time zone | not null default now()
Indexes:
"rhn_pkg_cld_id_pk" PRIMARY KEY, btree (id)
"rhn_pkg_cld_nt_idx" btree (name, "time")
Two days ago I actually shutdown Spacewalk so I could do a "VACUUM FULL
VERBOSE ANALYZE;" and then I rebuild every index on the DB before bringing
Spacewalk back up again.
While the query was running yesterday, I was watching a "vmstat" output and
while it was busy, there didn't seem to be any blatant resource constraint
with the system on the whole. A lot of block I/O and the occasional swap
activity but definitely not thrashing.
Have you made any tweaks to your postgresql.conf that you can share? That is
where my head is now but I've got more reading / learning to do before I
start introducing tuning changes.
Thanks,
--
Jon Miller
More information about the Spacewalk-list
mailing list