[Pulp-list] Missing celery workers

Brian Bouterse bbouters at redhat.com
Wed Nov 18 23:13:10 UTC 2015


tl;dr I bet anyone experiencing this issue is running CentOS. It's not
yet known how to resolve the issue.

To avoid emailing everyone on the list during further investigation,
let's use the bug tracker [0] to work on this together. On any Pulp
issue you can 'watch' the issue there and you'll receive e-mail updates
as new info becomes available.

If you're interested, go read more about the details there [0].

@chandlermelton thanks for your comment there. It provided a key piece
of information.

Can affected users comment on the bug report [0] to confirm whether or
not their installation is running CentOS, and which version?

[0]: https://pulp.plan.io/issues/1380
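
If you want to check whether your workers follow the same cycle before
commenting, here is a rough Python sketch. It is not part of Pulp: the
function names and regular expressions are assumptions based only on the
log lines quoted below, and the year is a parameter because syslog
timestamps do not include one. It pairs each "New worker ... discovered"
line with the next "Worker ... has gone missing" line for the same worker
and reports the seconds in between.

```python
import re
from datetime import datetime

# Patterns are assumptions derived from the syslog excerpt quoted in this
# thread; adjust them if your log format differs.
STAMP = r"^(\w{3}\s+\d+ \d\d:\d\d:\d\d) "
DISCOVERED = re.compile(STAMP + r".*New worker '([^']+)' discovered")
MISSING = re.compile(STAMP + r".*Worker '([^']+)' has gone missing")


def parse_stamp(stamp, year=2015):
    """Parse a syslog timestamp; syslog omits the year, so supply one."""
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def worker_lifetimes(lines, year=2015):
    """Return (worker, seconds) pairs from discovery to going missing."""
    discovered_at = {}
    lifetimes = []
    for line in lines:
        match = DISCOVERED.match(line)
        if match:
            discovered_at[match.group(2)] = parse_stamp(match.group(1), year)
            continue
        match = MISSING.match(line)
        if match and match.group(2) in discovered_at:
            start = discovered_at.pop(match.group(2))
            end = parse_stamp(match.group(1), year)
            lifetimes.append((match.group(2), int((end - start).total_seconds())))
    return lifetimes
```

To try it, pass an open log file, for example
worker_lifetimes(open("/var/log/messages")), and include the resulting
intervals in your comment on the bug. Each log entry has to be on a single
line for the patterns to match.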

-Brian


On 11/18/2015 12:17 PM, Joel Golden wrote:
> I submitted this a couple of days ago.  Subject: Workers disappearing /
> canceled tasks after upgrading to 2.7.0.1 - CentOS 6.7
> 
> Can these be merged?
> 
> On Wed, Nov 18, 2015 at 9:05 AM, Brian Bouterse <bbouters at redhat.com> wrote:
> 
>     Jason and Jeffrey,
> 
>     Thanks for reporting this. I've written up a bug [0] and I am
>     investigating the root cause.
> 
>     Are you able to answer these questions on the bug?
> 
>     - Can you confirm that it affects both RabbitMQ and Qpid usage?
>     - Can you confirm that the workers "go missing" and then return, and
>     then "go missing" in a continuous cycle? I expect it to happen every 90
>     seconds.
> 
>     - Jeffrey specifically, what OS are you using?
> 
>     [0]: https://pulp.plan.io/issues/1380
> 
>     Thanks,
>     Brian
> 
>     On 11/18/2015 09:33 AM, Miller, Jeffrey L wrote:
>     > I am seeing this behavior as well after upgrading from 2.6 to 2.7.
>     > However, I am using qpid not rabbitmq.
>     >
>     > -Jeffrey
>     >
>     > *From:* pulp-list-bounces at redhat.com
>     > [mailto:pulp-list-bounces at redhat.com] *On Behalf Of* Ashby, Jason (IMS)
>     > *Sent:* Wednesday, November 18, 2015 8:29 AM
>     > *To:* pulp-list at redhat.com
>     > *Subject:* [Pulp-list] Missing celery workers
>     >
>     > Hi all,
>     >
>     > I’m hitting another issue with the upgrade to Pulp 2.7.0 + changing from
>     > qpid to rabbitmq for messaging.  The workers are continuously going
>     > missing, every minute or so.  The effect is that the tasks in the task
>     > list stay in a Waiting state and are never completed.
>     >
>     > Rabbitmq looks healthy; I see successful accepted connections per the
>     > logs and can see a bunch of connections in the rabbitmq management GUI.
>     > I’m kind of stuck as far as troubleshooting goes.  Any tips on what else
>     > to investigate?
>     >
>     > Pulp and rabbitmq servers are both CentOS 6.
>     >
>     > # /var/log/messages
>     >
>     > Nov 18 08:53:56 pulp01 pulp: celery.worker.consumer:INFO: missed heartbeat from resource_manager at pulp01
>     > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-3 at pulp01' discovered
>     > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-1 at pulp01' discovered
>     > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-2 at pulp01' discovered
>     > Nov 18 09:05:46 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-0 at pulp01' discovered
>     > Nov 18 09:05:56 pulp01 pulp: pulp.server.async.worker_watcher:INFO: New worker 'resource_manager at pulp01' discovered
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-3 at pulp01' has gone missing, removing from list of workers
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-3 at pulp01 is missing. Canceling the tasks in its queue.
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-1 at pulp01' has gone missing, removing from list of workers
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-1 at pulp01 is missing. Canceling the tasks in its queue.
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-2 at pulp01' has gone missing, removing from list of workers
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-2 at pulp01 is missing. Canceling the tasks in its queue.
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'reserved_resource_worker-0 at pulp01' has gone missing, removing from list of workers
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-0 at pulp01 is missing. Canceling the tasks in its queue.
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: Worker 'resource_manager at pulp01' has gone missing, removing from list of workers
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.tasks:ERROR: The worker named resource_manager at pulp01 is missing. Canceling the tasks in its queue.
>     > Nov 18 09:06:46 pulp01 pulp: pulp.server.async.scheduler:ERROR: There are 0 pulp_resource_manager processes running. Pulp will not operate correctly without at least one pulp_resource_mananger process running.
>     >
>     > _______________________________________________
>     > Pulp-list mailing list
>     > Pulp-list at redhat.com <mailto:Pulp-list at redhat.com>
>     > https://www.redhat.com/mailman/listinfo/pulp-list
>     >



