[Pulp-list] Stuck repo sync job
Brian Bouterse
bbouters at redhat.com
Fri Apr 8 20:33:53 UTC 2016
After you reboot and restart all Pulp services I expect there to be 0
messages on the 'resource_manager' queue. Can you show the queue depths
of your broker in those situations for all queues? With Qpid you can do
this with `qpid-stat -q` if I remember correctly. RabbitMQ has a similar
command but I don't know it.
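
As a rough sketch of the check above, the helper below prints the queue
listing from whichever broker CLI is installed. The function name
show_queue_depths is made up for illustration. `qpid-stat -q` is the command
mentioned above; `rabbitmqctl list_queues name messages` is the usual RabbitMQ
way to list queue depths. It assumes the broker is local and reachable with
default settings, so extra connection options may be needed on your setup.

```shell
# Hypothetical helper: print queue depths using whichever broker CLI is on PATH.
# Assumes a local broker reachable with the tool's default connection settings.
show_queue_depths() {
    if command -v qpid-stat >/dev/null 2>&1; then
        # Qpid: one row per queue, including its message count
        qpid-stat -q
    elif command -v rabbitmqctl >/dev/null 2>&1; then
        # RabbitMQ: print each queue name with its message count
        rabbitmqctl list_queues name messages
    else
        echo "no broker CLI found (looked for qpid-stat and rabbitmqctl)" >&2
        return 1
    fi
}
```

Running show_queue_depths once right after the reboot and again after the
services are back up would give the two snapshots to compare.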
Also, for those "stuck" tasks that are not starting when you expect them
to, can you see if they are "assigned" to a worker? This doesn't show
up in the pulp-admin task details output[0], so instead use a command
like this to show the raw task details by UUID:
pulp-admin -vv tasks details --task-id a83b32c4-4cb1-439e-a373-91797d0b185a
The -vv part shows the actual webserver response, which will contain a
line like: "worker_name": "reserved_resource_worker-0@dev",
This info would be helpful in resolving your issue.
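
To pull just that field out of the verbose output, something like the
following could work. It is only a sketch: extract_worker_name is a made-up
helper name, and it assumes the response is printed as text containing a
"worker_name": "..." line like the one quoted above. It also assumes that a
task with no worker shows a non-string value there, in which case it prints
"unassigned".

```shell
# Hypothetical helper: read `pulp-admin -vv tasks details` output on stdin
# and print the value of the first quoted "worker_name" field.
extract_worker_name() {
    name=$(sed -n 's/.*"worker_name": *"\([^"]*\)".*/\1/p' | head -n 1)
    if [ -n "$name" ]; then
        printf '%s\n' "$name"
    else
        # Assumption: no quoted value means no worker has been assigned
        printf 'unassigned\n'
    fi
}

# Usage (2>&1 is there in case the verbose response goes to stderr):
#   pulp-admin -vv tasks details \
#       --task-id a83b32c4-4cb1-439e-a373-91797d0b185a 2>&1 | extract_worker_name
```

If it prints a worker name, check that that worker process is actually
running on the host named after the "@".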
[0]: https://pulp.plan.io/issues/1832
-Brian
On 04/08/2016 03:15 PM, Matthew Madey wrote:
> When I checked on the state, the status was "not started". I think
> that's why it remained stuck after reboots and bouncing the services.
>
> On Fri, Apr 8, 2016 at 1:27 PM, Brian Bouterse <bbouters at redhat.com> wrote:
>
> I'm not sure how you would get into this situation. When it occurs, can
> you check which worker is assigned the work, and verify that that worker
> is still running?
>
> Upon starting, a worker moves any tasks it was previously handling that
> are still in the running state to cancelled. pulp_celerybeat also monitors
> Pulp workers and, if one has died, moves its tasks to a cancelled
> state. Both of these mechanisms would have to fail in order to have a
> task stay in the running state when it's not running. Could it still be
> running?
>
> You indicate you rebooted the box and those tasks didn't go to
> cancelled. Is it possible they are running on another box connected to
> your broker, or that a process didn't respond to the signal you sent
> when you tried to kill it?
>
> Pulp 2.7.1 doesn't have any known defects like the ones you're describing,
> so sending more info to the list would be good.
>
> -Brian
>
>
> On 03/30/2016 05:58 PM, Matthew Madey wrote:
> > I have a job that mistakenly thinks it's still running..
> >
> > # pulp-admin -u admin -p ************ rpm repo sync run
> > --repo-id=rhel-x86_64-server-7-base-tools
> >
> > +----------------------------------------------------------------------+
> > Synchronizing Repository [rhel-x86_64-server-7-base-tools]
> > +----------------------------------------------------------------------+
> >
> > A sync task is already in progress for this repository. Its progress
> > will be tracked below.
> >
> > This command may be exited via ctrl+c without affecting the request.
> >
> > [/]
> > Waiting to begin...
> >
> >
> > I checked all running processes and there is no repo sync currently
> > running. I have even gone so far as to delete the repo and recreate it..
> > same issue. I have also tried running pulp-admin orphan remove --all,
> > which executes successfully, but does not fix the problem. I have also
> > tried rebooting the server, bouncing all pulp services.. still no joy.
> > I'm guessing there is a file somewhere that tracks pending tasks? How
> > can I clear this so I can successfully run the job again? I'm running
> > Pulp 2.7.1-1
> >
> >
> > _______________________________________________
> > Pulp-list mailing list
> > Pulp-list at redhat.com <mailto:Pulp-list at redhat.com>
> > https://www.redhat.com/mailman/listinfo/pulp-list
> >
>