[Spacewalk-list] Spacewalk fails

Konstantin Raskoshnyi konrasko at gmail.com
Wed Mar 22 06:52:41 UTC 2017


We have ~1000 clients, and in the evening Spacewalk runs a lot of commands
(checking file revisions, for example).

I'm receiving ~1000 tracebacks, and clients can't connect to Spacewalk.

1. Usually Spacewalk has ~300 processes; during those tasks there are ~1000
2. I didn't change any tomcat/httpd settings
3. I only changed the postgres settings to be optimized for 64 GB of RAM

There are no errors on the backends, but top shows:

top - 06:48:54 up 1 day, 6 min,  2 users,  load average: 155.93, 133.08, 117.28
Tasks: 965 total, 119 running, 846 sleeping,   0 stopped,   0 zombie
%Cpu(s): 95.3 us,  1.2 sy,  0.0 ni,  3.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 65767568 total, 50071348 free,  7842848 used,  7853372 buff/cache
KiB Swap: 33008636 total, 33002300 free,     6336 used. 56303848 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
44415 postgres  20   0 15.577g 188576 185440 R  11.6  0.3   0:22.19 postgres
45154 postgres  20   0 15.586g  21764  15564 S  10.9  0.0   0:15.56 postgres
45271 postgres  20   0 15.586g  19588  13680 S  10.9  0.0   0:14.95 postgres
45136 postgres  20   0 15.586g  19944  14064 R  10.6  0.0   0:16.09 postgres
45161 postgres  20   0 15.586g  22348  16044 R  10.6  0.0   0:16.13 postgres
45172 postgres  20   0 15.586g  19512  13680 S  10.6  0.0   0:15.86 postgres
44792 postgres  20   0 15.586g  22292  16044 R  10.3  0.0   0:17.78 postgres
44885 postgres  20   0 15.584g  18824  13932 R  10.3  0.0   0:16.73 postgres
44998 postgres  20   0 15.586g  21296  15100 R  10.3  0.0   0:16.36 postgres
45011 postgres  20   0 15.586g  21200  15048 R  10.3  0.0   0:16.45 postgres
45034 postgres  20   0 15.586g  19348  13540 S  10.3  0.0   0:16.59 postgres
45120 postgres  20   0 15.586g  22060  15608 S  10.3  0.0   0:15.85 postgres
45131 postgres  20   0 15.586g  19352  13560 R  10.3  0.0   0:15.76 postgres
45167 postgres  20   0 15.586g  19416  13580 S  10.3  0.0   0:15.88 postgres
45254 postgres  20   0 15.586g  21096  15020 S  10.3  0.0   0:11.00 postgres
45261 postgres  20   0 15.586g  19328  13516 R  10.3  0.0   0:15.47 postgres
45267 postgres  20   0 15.586g  19372  13560 R  10.3  0.0   0:15.14 postgres
44492 postgres  20   0 15.586g  24872  18508 R  10.0  0.0   0:21.62 postgres
44791 postgres  20   0 15.586g  24396  17840 S  10.0  0.0   0:17.04 postgres
44944 postgres  20   0 15.586g  19324  13512 S  10.0  0.0   0:17.23 postgres
44946 postgres  20   0 15.586g  19388  13556 S  10.0  0.0   0:16.82 postgres
44957 postgres  20   0 15.586g  19356  13520 R  10.0  0.0   0:16.76 postgres
45045 postgres  20   0 15.586g  19372  13564 S  10.0  0.0   0:16.89 postgres
45099 postgres  20   0 15.586g  19448  13624 R  10.0  0.0   0:16.24 postgres
45116 postgres  20   0 15.586g  19444  13628 S  10.0  0.0   0:15.95 postgres
45142 postgres  20   0 15.586g  19412  13612 R  10.0  0.0   0:15.75 postgres
45153 postgres  20   0 15.586g  20932  14924 S  10.0  0.0   0:15.63 postgres
45169 postgres  20   0 15.586g  19900  14064 S  10.0  0.0   0:15.76 postgres
45197 postgres  20   0 15.586g  19368  13532 R  10.0  0.0   0:15.79 postgres
45218 postgres  20   0 15.586g  19824  13964 R  10.0  0.0   0:15.04 postgres
45259 postgres  20   0 15.586g  19364  13548 S  10.0  0.0   0:15.56 postgres
44447 postgres  20   0 15.586g  26928  20336 R   9.6  0.0   0:21.75 postgres
44763 postgres  20   0 15.586g  22256  16024 R   9.6  0.0   0:16.38 postgres
44799 postgres  20   0 15.586g  24700  18116 S   9.6  0.0   0:17.20 postgres
44836 postgres  20   0 15.586g  21084  14928 S   9.6  0.0   0:16.58 postgres
44895 postgres  20   0 15.586g  20784  14464 R   9.6  0.0   0:17.45 postgres
44950 postgres  20   0 15.586g  19272  13464 S   9.6  0.0   0:16.52 postgres
44954 postgres  20   0 15.586g  18128  12736 R   9.6  0.0   0:16.56 postgres
44955 postgres  20   0 15.586g  19412  13584 R   9.6  0.0   0:16.68 postgres
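
To put a number on a listing like the one above, the process rows can be tallied by state. The following is only a minimal sketch: the helper name summarize_top is hypothetical, and it assumes the default 12-column top layout shown here (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND), for example as produced by `top -b -n 1`.

```python
from collections import Counter


def summarize_top(lines, command="postgres"):
    """Count processes named `command` per state and sum their %CPU.

    Assumes the default top columns:
    PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
    """
    states = Counter()
    cpu_total = 0.0
    for line in lines:
        fields = line.split()
        # Skip header lines, blank lines, and anything that is not a process row.
        if len(fields) < 12 or not fields[0].isdigit():
            continue
        if fields[11] != command:
            continue
        states[fields[7]] += 1          # S column: R = running, S = sleeping
        cpu_total += float(fields[8])   # %CPU column
    return states, cpu_total


# Three rows copied from the listing above, used only as sample input.
sample = [
    "44415 postgres  20   0 15.577g 188576 185440 R  11.6  0.3   0:22.19 postgres",
    "45154 postgres  20   0 15.586g  21764  15564 S  10.9  0.0   0:15.56 postgres",
    "45136 postgres  20   0 15.586g  19944  14064 R  10.6  0.0   0:16.09 postgres",
]
states, cpu_total = summarize_top(sample)
print(dict(states), round(cpu_total, 1))
```

Run against the full output, this would show how many postgres backends are in the R state at once, which can be compared with the number of CPU cores on the server.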

#------------------------------------------------------------------------------
# pgtune run on 2017-03-22
# Based on 65767568 KB RAM, platform Linux
#------------------------------------------------------------------------------

maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
effective_cache_size = 44GB
work_mem = 52MB
wal_buffers = 16MB
shared_buffers = 15GB
max_connections = 600
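
As a rough sanity check of the settings above, the worst-case memory they allow can be estimated. This is only a sketch of the arithmetic, not a tuning recommendation: it assumes every one of the max_connections backends is active at once, and the ops_per_connection parameter is a hypothetical knob reflecting that work_mem applies per sort or hash operation, so a single query can use it more than once.

```python
KIB_PER_MB = 1024
KIB_PER_GB = 1024 * 1024

# Values taken from the pgtune output above.
total_ram_kib = 65767568
shared_buffers_kib = 15 * KIB_PER_GB
work_mem_kib = 52 * KIB_PER_MB
max_connections = 600


def worst_case_kib(ops_per_connection=1):
    """Upper bound: shared_buffers plus work_mem for every operation
    on every allowed connection (assumes all connections are busy)."""
    return shared_buffers_kib + max_connections * ops_per_connection * work_mem_kib


for ops in (1, 2):
    need = worst_case_kib(ops)
    print(f"{ops} op(s)/connection: {need / KIB_PER_GB:.1f} GB, "
          f"exceeds RAM: {need > total_ram_kib}")
```

With one work_mem-sized operation per connection the bound is about 45.5 GB, which fits in the roughly 62.7 GB of RAM; with two it is about 75.9 GB, which does not.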

Any thoughts on how to optimize this and bring Spacewalk back to life? Thanks