[Pulp-list] Missing celery workers

Joel Golden slalomnut at gmail.com
Wed Nov 18 17:17:41 UTC 2015


I reported the same problem a couple of days ago under the subject
"Workers disappearing / canceled tasks after upgrading to 2.7.0.1 -
CentOS 6.7".

Can the two reports be merged?
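
To help compare the reports, here is a rough sketch (not part of Pulp) that
measures how long each worker stays registered between a "discovered" line
and the next "has gone missing" line. It assumes the syslog layout shown in
the quoted /var/log/messages excerpt below, with each entry on one line. The
function name, the regexes and the default year are my own assumptions.

```python
import re
from datetime import datetime

# Patterns modelled on the quoted log excerpt; adjust them if your
# syslog format differs (this layout is an assumption).
STAMP = r"^(\w{3} +\d+ \d\d:\d\d:\d\d) "
DISCOVERED = re.compile(STAMP + r".*New worker '(.+?)' discovered")
MISSING = re.compile(STAMP + r".*Worker '(.+?)' has gone missing")


def parse_stamp(stamp, year):
    # Syslog timestamps carry no year, so the caller supplies one.
    # Collapse runs of spaces so "Nov  8" parses like "Nov 8".
    text = "%d %s" % (year, " ".join(stamp.split()))
    return datetime.strptime(text, "%Y %b %d %H:%M:%S")


def worker_lifetimes(lines, year=2015):
    """Return (worker, seconds) pairs from discovery to 'gone missing'."""
    discovered_at = {}
    lifetimes = []
    for line in lines:
        match = DISCOVERED.match(line)
        if match:
            discovered_at[match.group(2)] = parse_stamp(match.group(1), year)
            continue
        match = MISSING.match(line)
        if match and match.group(2) in discovered_at:
            start = discovered_at.pop(match.group(2))
            end = parse_stamp(match.group(1), year)
            lifetimes.append((match.group(2), (end - start).total_seconds()))
    return lifetimes


# Entries taken from the quoted excerpt, rejoined onto single lines.
SAMPLE = [
    "Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: "
    "New worker 'reserved_resource_worker-3@pulp01' discovered",
    "Nov 18 09:05:56 pulp01 pulp: pulp.server.async.worker_watcher:INFO: "
    "New worker 'resource_manager@pulp01' discovered",
    "Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: "
    "Worker 'reserved_resource_worker-3@pulp01' has gone missing, "
    "removing from list of workers",
    "Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker "
    "named reserved_resource_worker-3@pulp01 is missing. Canceling the "
    "tasks in its queue.",
    "Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: "
    "Worker 'resource_manager@pulp01' has gone missing, "
    "removing from list of workers",
]

if __name__ == "__main__":
    for worker, seconds in worker_lifetimes(SAMPLE):
        print("%s stayed registered for %.0f seconds" % (worker, seconds))
```

To run it against a real log, pass an open file instead of SAMPLE, for
example worker_lifetimes(open("/var/log/messages")). If the lifetimes come
out roughly constant, that would support the idea of a fixed cycle.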

On Wed, Nov 18, 2015 at 9:05 AM, Brian Bouterse <bbouters at redhat.com> wrote:

> Jason and Jeffrey,
>
> Thanks for reporting this. I've written up a bug [0] and I am
> investigating the root cause.
>
> Could you answer the following questions on the bug?
>
> - Can you confirm that it affects both RabbitMQ and Qpid usage?
> - Can you confirm that the workers "go missing" and then return, and
> then "go missing" in a continuous cycle? I expect it to happen every 90
> seconds.
>
> - Jeffrey specifically, what OS are you using?
>
> [0]: https://pulp.plan.io/issues/1380
>
> Thanks,
> Brian
>
> On 11/18/2015 09:33 AM, Miller, Jeffrey L wrote:
> > I am seeing this behavior as well after upgrading from 2.6 to 2.7.
> > However, I am using qpid, not rabbitmq.
> >
> >
> >
> > -Jeffrey
> >
> >
> >
> >
> >
> >
> >
> > *From:* pulp-list-bounces at redhat.com
> > [mailto:pulp-list-bounces at redhat.com] *On Behalf Of *Ashby, Jason (IMS)
> > *Sent:* Wednesday, November 18, 2015 8:29 AM
> > *To:* pulp-list at redhat.com
> > *Subject:* [Pulp-list] Missing celery workers
> >
> >
> >
> > Hi all,
> >
> > I’m hitting another issue with the upgrade to Pulp 2.7.0 + changing from
> > qpid to rabbitmq for messaging.  The workers are continuously going
> > missing, every minute or so.  The effect is that the tasks in the task
> > list stay in a Waiting state and are never completed.
> >
> >
> >
> > Rabbitmq looks healthy; I see successful accepted connections per the
> > logs and can see a bunch of connections in the rabbitmq management GUI.
> > I’m kind of stuck as far as troubleshooting goes.  Any tips on what else
> > to investigate?
> >
> >
> >
> > Pulp and rabbitmq servers are both CentOS 6.
> >
> >
> >
> > # /var/log/messages
> >
> > Nov 18 08:53:56 pulp01 pulp: celery.worker.consumer:INFO: missed
> > heartbeat from resource_manager at pulp01
> >
> > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New
> > worker 'reserved_resource_worker-3 at pulp01' discovered
> >
> > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New
> > worker 'reserved_resource_worker-1 at pulp01' discovered
> >
> > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New
> > worker 'reserved_resource_worker-2 at pulp01' discovered
> >
> > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New
> > worker 'reserved_resource_worker-0 at pulp01' discovered
> >
> > Nov 18 09:05:56 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New
> > worker 'resource_manager at pulp01' discovered
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker
> > 'reserved_resource_worker-3 at pulp01' has gone missing, removing from list
> > of workers
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker
> > named reserved_resource_worker-3 at pulp01 is missing. Canceling the tasks
> > in its queue.
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker
> > 'reserved_resource_worker-1 at pulp01' has gone missing, removing from list
> > of workers
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker
> > named reserved_resource_worker-1 at pulp01 is missing. Canceling the tasks
> > in its queue.
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker
> > 'reserved_resource_worker-2 at pulp01' has gone missing, removing from list
> > of workers
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker
> > named reserved_resource_worker-2 at pulp01 is missing. Canceling the tasks
> > in its queue.
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker
> > 'reserved_resource_worker-0 at pulp01' has gone missing, removing from list
> > of workers
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker
> > named reserved_resource_worker-0 at pulp01 is missing. Canceling the tasks
> > in its queue.
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker
> > 'resource_manager at pulp01' has gone missing, removing from list of
> > workers
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker
> > named resource_manager at pulp01 is missing. Canceling the tasks in its
> > queue.
> >
> > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: There
> > are 0 pulp_resource_manager processes running. Pulp will not operate
> > correctly without at least one pulp_resource_mananger process running.
> >
> >
> >
> > _______________________________________________
> > Pulp-list mailing list
> > Pulp-list at redhat.com
> > https://www.redhat.com/mailman/listinfo/pulp-list
> >
>