[Pulp-list] workers keep disappearing

Cristian Falcas cristi.falcas at gmail.com
Mon Mar 16 21:59:05 UTC 2015


Hello,

I'm trying to install 2 pulp nodes in 2 different regions: one in the
US, the other in Romania. The US one is the "master": it runs qpid and
mongodb (both with SSL enabled). I also want to use it to sync the
server in Romania (which will be a child node).

Unfortunately, I can't get the workers in Romania to stay up, so any
sync request remains in "Waiting to begin...". I see these messages in
the logs of both pulp servers:

Mar 16 23:51:57 host_dc1 pulp[19906]: pulp.server.async.scheduler:ERROR: Workers 'reserved_resource_worker-1@host_dc1.company.net' has gone missing, removing from list of workers
Mar 16 23:51:57 host_dc1 pulp[19906]: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-1@host_dc1.company.net is missing. Canceling the tasks in its queue.
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.scheduler:ERROR: Workers 'reserved_resource_worker-3@host_dc1.company.net' has gone missing, removing from list of workers
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-3@host_dc1.company.net is missing. Canceling the tasks in its queue.
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-1@host_dc1.company.net' discovered
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.scheduler:ERROR: Workers 'reserved_resource_worker-2@host_dc1.company.net' has gone missing, removing from list of workers
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.tasks:ERROR: The worker named reserved_resource_worker-2@host_dc1.company.net is missing. Canceling the tasks in its queue.
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.scheduler:ERROR: Workers 'resource_manager@host_dc1.company.net' has gone missing, removing from list of workers
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.tasks:ERROR: The worker named resource_manager@host_dc1.company.net is missing. Canceling the tasks in its queue.
Mar 16 23:51:58 host_dc1 pulp[19906]: pulp.server.async.worker_watcher:INFO: New worker 'reserved_resource_worker-2@host_dc1.company.net' discovered
Mar 16 23:51:59 host_dc1 pulp[19906]: pulp.server.async.worker_watcher:INFO: New worker 'resource_manager@host_dc1.company.net' discovered
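
To make the pattern in the log excerpt above easier to see, here is a minimal sketch that tallies "gone missing" and "discovered" events per worker. It is only an illustration: the function name `summarize` and the `SAMPLE` text (two entries copied from the excerpt) are my own, and it assumes the message wording stays exactly as shown. It is not part of Pulp.

```python
import re
from collections import Counter

# Two entries copied from the log excerpt, still wrapped as in the mail.
SAMPLE = """\
Mar 16 23:51:57 host_dc1 pulp[19906]:
pulp.server.async.scheduler:ERROR: Workers
'reserved_resource_worker-1 at host_dc1.company.net' has gone missing,
removing from list of workers
Mar 16 23:51:58 host_dc1 pulp[19906]:
pulp.server.async.worker_watcher:INFO: New worker
'reserved_resource_worker-1 at host_dc1.company.net' discovered
"""

# A quoted worker name followed by one of the two event phrases.
EVENT = re.compile(r"'(?P<worker>[^']+)'\s+(?P<kind>has gone missing|discovered)")


def summarize(log_text):
    """Count 'missing' and 'discovered' events per worker name."""
    # Collapse line wrapping so each event phrase is contiguous.
    flat = " ".join(log_text.split())
    counts = {}
    for match in EVENT.finditer(flat):
        # The list archive shows '@' as ' at '; normalise both spellings.
        worker = match.group("worker").replace(" at ", "@")
        kind = "missing" if match.group("kind") == "has gone missing" else "discovered"
        counts.setdefault(worker, Counter())[kind] += 1
    return counts


if __name__ == "__main__":
    for worker, tally in sorted(summarize(SAMPLE).items()):
        print(worker, dict(tally))
```

A worker whose "missing" and "discovered" counts keep rising together is being dropped and re-registered repeatedly rather than staying down.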

I presume some kind of timeout is at play, which makes the second node
appear to disconnect. But where should I look, and what should I change?

Thank you,
Cristian Falcas



