[Pulp-list] 2.8 Tasks Stuck at Waiting
Paul Jochum
paul.jochum at nokia.com
Tue Apr 12 20:14:53 UTC 2016
Slight correction (in the middle of my email):
I did grep for my task-id (f3a4a828-1a98-4d30-989f-a8f62acc29f9) in
/var/log/* and /var/log/*/*, but it only found it in
/var/log/httpd/ssl_access_log and /var/log/httpd/ssl_request_log, in
lines like:
/var/log/httpd/ssl_access_log:135.1.215.52 - - [12/Apr/2016:15:04:17
-0500] "GET /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/
HTTP/1.1" 200 579
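For reference, the search can be reproduced with something like the
following (log locations as on a stock EL6 install; adjust to taste):

```shell
# Search the logs one and two levels deep for the task id; the id below
# is the one from this report -- substitute your own stuck task's UUID.
TASK_ID="f3a4a828-1a98-4d30-989f-a8f62acc29f9"
grep "$TASK_ID" /var/log/* /var/log/*/* 2>/dev/null || true
```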
regards,
Paul
On 04/12/2016 03:09 PM, EXT Paul Jochum wrote:
> Hi Brian:
>
> I am running into the same problem of "waiting", so here are my
> outputs (let me know if you need anything else):
>
> -sh-4.1# pulp-admin -vv tasks details --task-id
> f3a4a828-1a98-4d30-989f-a8f62acc29f9
> +----------------------------------------------------------------------+
> Task Details
> +----------------------------------------------------------------------+
>
> 2016-04-12 14:59:46,539 - DEBUG - sending GET request to
> /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/
> 2016-04-12 14:59:46,602 - INFO - GET request to
> /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/ with
> parameters None
> 2016-04-12 14:59:46,602 - INFO - Response status : 200
>
> 2016-04-12 14:59:46,602 - INFO - Response body :
> {
>     "exception": null,
>     "task_type": "pulp.server.managers.repo.sync.sync",
>     "_href": "/pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/",
>     "task_id": "f3a4a828-1a98-4d30-989f-a8f62acc29f9",
>     "tags": [
>         "pulp:repository:rhel-i386-server-optional-6.5.z-debuginfo",
>         "pulp:action:sync"
>     ],
>     "finish_time": null,
>     "_ns": "task_status",
>     "start_time": null,
>     "traceback": null,
>     "spawned_tasks": [],
>     "progress_report": {},
>     "queue": "None.dq",
>     "state": "waiting",
>     "worker_name": null,
>     "result": null,
>     "error": null,
>     "_id": {
>         "$oid": "570d1727e75b4adec6bd0d58"
>     },
>     "id": "570d1727e75b4adec6bd0d58"
> }
>
> Operations: sync
> Resources: rhel-i386-server-optional-6.5.z-debuginfo (repository)
> State: Waiting
> Start Time: Unstarted
> Finish Time: Incomplete
> Result: Incomplete
> Task Id: f3a4a828-1a98-4d30-989f-a8f62acc29f9
> Progress Report:
>
>
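The two fields worth watching in a report like the one above are
"state" and "worker_name": a task that has never been dispatched sits
at "waiting" with a null worker_name. A minimal sketch of that check
(the file path here is just an example):

```shell
# Save the JSON body printed by `pulp-admin -vv` and test the two
# fields that distinguish "never dispatched" from "merely slow".
cat > /tmp/task.json <<'EOF'
{"state": "waiting", "worker_name": null, "start_time": null}
EOF
if grep -q '"state": "waiting"' /tmp/task.json && \
   grep -q '"worker_name": null' /tmp/task.json; then
    echo "task appears stuck: never assigned to a worker"
fi
```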
> -sh-4.1# ps -awfux | grep celery
> Warning: bad syntax, perhaps a bogus '-'? See
> /usr/share/doc/procps-3.2.8/FAQ
> root 16147 0.0 0.0 6444 704 pts/2 S+ 15:01 0:00 \_ grep
> celery
> apache 977 0.1 0.1 561608 25908 ? Sl Apr09 6:15
> /usr/bin/python /usr/bin/celery beat
> --app=pulp.server.async.celery_instance.celery
> --scheduler=pulp.server.async.scheduler.Scheduler
> --workdir=/var/run/pulp/ -f /var/log/pulp/celerybeat.log -l INFO
> --detach --pidfile=/var/run/pulp/celerybeat.pid
> apache 9558 0.0 0.4 553760 58972 ? Sl 10:40 0:05
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n resource_manager@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO -Q resource_manager
> --logfile=/var/log/pulp/resource_manager.log
> --pidfile=/var/run/pulp/resource_manager.pid
> apache 9602 0.4 0.3 472928 50512 ? Sl 10:40 1:16 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n resource_manager@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO -Q resource_manager
> --logfile=/var/log/pulp/resource_manager.log
> --pidfile=/var/run/pulp/resource_manager.pid
> apache 9792 0.0 0.4 553976 58924 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-0@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-0.log
> --pidfile=/var/run/pulp/reserved_resource_worker-0.pid
> apache 9865 0.0 0.3 472916 49692 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-0@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-0.log
> --pidfile=/var/run/pulp/reserved_resource_worker-0.pid
> apache 9807 0.0 0.4 553976 58920 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-1@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-1.log
> --pidfile=/var/run/pulp/reserved_resource_worker-1.pid
> apache 9887 0.0 0.3 472916 49688 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-1@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-1.log
> --pidfile=/var/run/pulp/reserved_resource_worker-1.pid
> apache 9826 0.0 0.4 553564 58896 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-2@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-2.log
> --pidfile=/var/run/pulp/reserved_resource_worker-2.pid
> apache 9893 0.0 0.3 472924 49684 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-2@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-2.log
> --pidfile=/var/run/pulp/reserved_resource_worker-2.pid
> apache 9847 0.0 0.4 553976 58912 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-3@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-3.log
> --pidfile=/var/run/pulp/reserved_resource_worker-3.pid
> apache 9917 0.0 0.3 472916 49692 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-3@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-3.log
> --pidfile=/var/run/pulp/reserved_resource_worker-3.pid
> apache 9871 0.0 0.4 553752 58896 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-4@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-4.log
> --pidfile=/var/run/pulp/reserved_resource_worker-4.pid
> apache 9945 0.0 0.3 472920 49684 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-4@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-4.log
> --pidfile=/var/run/pulp/reserved_resource_worker-4.pid
> apache 9896 0.0 0.4 553988 58932 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-5@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-5.log
> --pidfile=/var/run/pulp/reserved_resource_worker-5.pid
> apache 9982 0.0 0.3 472924 49700 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-5@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-5.log
> --pidfile=/var/run/pulp/reserved_resource_worker-5.pid
> apache 9918 0.0 0.4 553972 58916 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-6@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-6.log
> --pidfile=/var/run/pulp/reserved_resource_worker-6.pid
> apache 9990 0.0 0.3 472912 49680 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-6@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-6.log
> --pidfile=/var/run/pulp/reserved_resource_worker-6.pid
> apache 9940 0.0 0.4 553756 58892 ? Sl 10:41 0:06
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-7@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-7.log
> --pidfile=/var/run/pulp/reserved_resource_worker-7.pid
> apache 10013 0.0 0.3 472920 49684 ? S 10:41 0:00 \_
> /usr/bin/python -m celery.__main__ worker --heartbeat-interval=30 -c 1
> -n reserved_resource_worker-7@lss-pulp01.ih.lucent.com --events
> --app=pulp.server.async.app --loglevel=INFO
> --logfile=/var/log/pulp/reserved_resource_worker-7.log
> --pidfile=/var/run/pulp/reserved_resource_worker-7.pid
> -sh-4.1#
>
>
> I did grep /var/log/httpd/ssl_access_log in /var/log/* and
> /var/log/*/*, but it only found it in /var/log/httpd/ssl_access_log
> and /var/log/httpd/ssl_request_log, in lines like:
> /var/log/httpd/ssl_access_log:135.1.215.52 - - [12/Apr/2016:15:04:17
> -0500] "GET /pulp/api/v2/tasks/f3a4a828-1a98-4d30-989f-a8f62acc29f9/
> HTTP/1.1" 200 579
>
>
> -sh-4.1# rpm -q -a | grep -i pulp | sort
> mod_wsgi-3.4-2.pulp.el6.x86_64
> pulp-admin-client-2.8.1-1.el6.noarch
> pulp-puppet-admin-extensions-2.8.1-1.el6.noarch
> pulp-puppet-plugins-2.8.1-1.el6.noarch
> pulp-rpm-admin-extensions-2.8.1-1.el6.noarch
> pulp-rpm-plugins-2.8.1-1.el6.noarch
> pulp-selinux-2.8.1-1.el6.noarch
> pulp-server-2.8.1-1.el6.noarch
> python-isodate-0.5.0-4.pulp.el6.noarch
> python-kombu-3.0.33-4.pulp.el6.noarch
> python-pulp-bindings-2.8.1-1.el6.noarch
> python-pulp-client-lib-2.8.1-1.el6.noarch
> python-pulp-common-2.8.1-1.el6.noarch
> python-pulp-oid_validation-2.8.1-1.el6.noarch
> python-pulp-puppet-common-2.8.1-1.el6.noarch
> python-pulp-repoauth-2.8.1-1.el6.noarch
> python-pulp-rpm-common-2.8.1-1.el6.noarch
>
>
> -sh-4.1# uname -a
> Linux lss-pulp01.ih.lucent.com 2.6.32-042stab113.21 #1 SMP Wed Mar 23
> 11:05:25 MSK 2016 x86_64 x86_64 x86_64 GNU/Linux
>
>
> -sh-4.1# more /etc/*release*
> ::::::::::::::
> /etc/redhat-release
> ::::::::::::::
> Scientific Linux release 6.7 (Carbon)
> ::::::::::::::
> /etc/system-release
> ::::::::::::::
> Scientific Linux release 6.7 (Carbon)
> ::::::::::::::
> /etc/system-release-cpe
> ::::::::::::::
> cpe:/o:redhat:enterprise_linux:6.7:ga
> -sh-4.1#
>
> On 04/11/2016 03:32 PM, Brian Bouterse wrote:
>> To anyone who has a stuck task running Pulp 2.6 or newer: can you open a
>> bug or e-mail the list with the following? First, the full output from:
>>
>> pulp-admin -vv tasks details --task-id <the_stuck_task_UUID>
>>
>> Post the full output, but observe the value of "worker_name" similar to
>> what was suggested here[0].
>>
>> Can you also include the `sudo ps -awfux | grep celery` output to show
>> your running processes?
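A variant of that command avoids the procps "bad syntax" warning (the
leading dash mixes BSD and UNIX option styles) and keeps the grep
process itself out of the listing:

```shell
# BSD-style ps options take no dash; the [c] bracket means this grep's
# own command line no longer matches its own pattern.
ps awfux | grep '[c]elery' || echo "no celery processes found"
```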
>>
>> Also can you grep the logs for that UUID and include any lines that you
>> see as output too.
>>
>> [0]: https://www.redhat.com/archives/pulp-list/2016-April/msg00026.html
>>
>> Thanks,
>> Brian
>>
>>
>> On 04/11/2016 01:32 PM, Mihai Ibanescu wrote:
>>> I have a similar problem on 2.7.1 with rabbitmq.
>>>
>>> Occasionally a task will get "lost":
>>>
>>> pulp-admin tasks details --task-id 622041ac-e9e4-4a15-bd7c-7c98a17782e0
>>> +----------------------------------------------------------------------+
>>>
>>> Task Details
>>> +----------------------------------------------------------------------+
>>>
>>>
>>> Operations: publish
>>> Resources: thirdparty-snapshot-rpm-latest (repository)
>>> State: Waiting
>>> Start Time: Unstarted
>>> Finish Time: Incomplete
>>> Result: Incomplete
>>> Task Id: 622041ac-e9e4-4a15-bd7c-7c98a17782e0
>>> Progress Report:
>>>
>>>
>>> I have seen more than one getting in this state.
>>>
>>> I can cancel the task, but I'd like to find out if I can tickle
>>> something to re-process the queue. Other tasks do get correctly
>>> processed.
>>>
>>> If there's something you would like me to do in order to debug this
>>> further, please let me know.
>>>
>>> Thank you!
>>> Mihai
>>>
>>> On Fri, Apr 8, 2016 at 9:27 AM, Brian Bouterse <bbouters at redhat.com> wrote:
>>>
>>> There is a known issue [0], introduced in 2.8.0, which will cause your
>>> tasks to become stuck at waiting. This occurs after Pulp runs normally
>>> for hours or days.
>>>
>>> A fix is complete, but it won't ship until Pulp 2.8.3 is released. The
>>> good news is that you can resolve the issue now by upgrading
>>> python-kombu on your existing 2.8.0, 2.8.1, or 2.8.2 system. Manually
>>> install python-kombu-3.0.33-5.pulp for your distro, restart the Pulp
>>> services, and you'll be all set. All of the links below come out of
>>> the Pulp build system[1].
>>>
>>> EL6: http://tinyurl.com/j5sy5zh
>>> EL7: http://tinyurl.com/jppwcdv
>>> F22: http://tinyurl.com/hp7z4sq
>>> F23: http://tinyurl.com/zplzua3
>>> F24: update is in -testing repo [2]
>>> rawhide: already available in stable
>>>
>>> These are the same versions of python-kombu that you'll receive when
>>> 2.8.3 comes out, so upgrading this package now will keep you on the
>>> normal upgrade path. These newer python-kombu versions are compatible
>>> with the 2.8.0, 2.8.1, and 2.8.2 versions of Pulp.
>>>
>>> [0]: https://pulp.plan.io/issues/1801
>>> [1]: http://koji.katello.org/koji/packageinfo?packageID=665
>>> [2]:
>>> https://bodhi.fedoraproject.org/updates/FEDORA-2016-ec038bbf19
>>>
>>> -Brian
>>>
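After pulling in the new package, a quick way to confirm you are on the
fixed build (release strings as given in Brian's mail; `sort -V` does
the version comparison):

```shell
# Succeeds when the supplied release sorts at or after the fixed one.
# Feed it the output of: rpm -q --qf '%{VERSION}-%{RELEASE}' python-kombu
is_fixed() {
    printf '%s\n%s\n' "3.0.33-5.pulp" "$1" | sort -V | tail -n1 | grep -qxF "$1"
}
is_fixed "3.0.33-5.pulp" && echo "fixed build installed"
is_fixed "3.0.33-4.pulp" || echo "still on the broken kombu"
```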
>>> _______________________________________________
>>> Pulp-list mailing list
>>> Pulp-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-list
>>>
>>>
>