[Pulp-list] Stuck repo sync job

Brian Bouterse bbouters at redhat.com
Fri Apr 8 20:33:53 UTC 2016


After you reboot and restart all Pulp services I expect there to be 0
messages on the 'resource_manager' queue. Can you show the queue depths
of your broker in those situations for all queues? With Qpid you can do
this with `qpid-stat -q` if I remember correctly. RabbitMQ has a similar
command but I don't know it.
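
A rough sketch of how that check could be wrapped in a shell function. The
name show_queue_depths is made up for illustration, and the RabbitMQ branch
assumes `rabbitmqctl list_queues` is the right counterpart; treat that part
as an untested assumption.

```shell
# Hypothetical helper: print per-queue message counts using whichever
# broker CLI is installed on this host.
show_queue_depths() {
    if command -v qpid-stat >/dev/null 2>&1; then
        # Qpid: -q lists the queues along with their message counts.
        qpid-stat -q
    elif command -v rabbitmqctl >/dev/null 2>&1; then
        # Assumed RabbitMQ counterpart: queue name and message count.
        rabbitmqctl list_queues name messages
    else
        echo "no broker CLI found (looked for qpid-stat, rabbitmqctl)" >&2
        return 1
    fi
}
```

Running `show_queue_depths | grep resource_manager` afterwards should narrow
the output down to the queue in question.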

Also, for those "stuck" tasks that are not starting when you expect them
to, can you see if they are "assigned" to a worker? This doesn't show
up in the pulp-admin task detail output[0], so instead use a command
like this to show the task details by UUID:

pulp-admin -vv tasks details --task-id a83b32c4-4cb1-439e-a373-91797d0b185a

The -vv part shows the actual webserver response which will contain a
line like:  "worker_name": "reserved_resource_worker-0@dev",
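
To pull just that field out of the verbose output, something like the
following sketch may help. The helper name extract_worker_name is
hypothetical, and it assumes the field appears on a single line in the form
shown above. A task with no worker assigned (for example a null worker_name)
prints nothing.

```shell
# Hypothetical helper: read verbose task output on stdin and print the
# value of the first "worker_name" field that holds a quoted string.
extract_worker_name() {
    sed -n 's/.*"worker_name": *"\([^"]*\)".*/\1/p' | head -n 1
}

# Example use (not run here). The 2>&1 is an assumption, in case the
# verbose response is written to stderr rather than stdout:
#   pulp-admin -vv tasks details \
#       --task-id a83b32c4-4cb1-439e-a373-91797d0b185a 2>&1 | extract_worker_name
```

Empty output would suggest the task has not been assigned to any worker yet.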

This info would be helpful in resolving your issue.

[0]: https://pulp.plan.io/issues/1832

-Brian

On 04/08/2016 03:15 PM, Matthew Madey wrote:
> When I checked on the state, the status was "not started". I think
> that's why it remained stuck after reboots and bouncing the services. 
> 
> On Fri, Apr 8, 2016 at 1:27 PM, Brian Bouterse <bbouters at redhat.com
> <mailto:bbouters at redhat.com>> wrote:
> 
>     I'm not sure how you would get into this situation. When it occurs, can
>     you check which worker is assigned the work, and verify that that worker
>     is still running?
> 
>     Upon starting, a worker will move previous tasks it was handling that are
>     still in the running state to cancelled. Also, pulp_celerybeat monitors
>     pulp workers to determine if one has died, and moves its tasks to a
>     cancelled state. Both of these mechanisms would have to fail in order to
>     have a task stay in the running state when it's not running. Could it
>     still be running?
> 
>     You indicate you rebooted the box and those tasks didn't go to
>     cancelled. Is it possible they are on another box connected to your
>     broker, or that they didn't respond to the signal you sent when killing
>     them?
> 
>     The 2.7.1 release doesn't have any known defects like the ones you're
>     describing, so sending more info to the list would be good.
> 
>     -Brian
> 
> 
>     On 03/30/2016 05:58 PM, Matthew Madey wrote:
>     > I have a job that mistakenly thinks it's still running..
>     >
>     > # pulp-admin -u admin -p ************ rpm repo sync run
>     > --repo-id=rhel-x86_64-server-7-base-tools
>     > +----------------------------------------------------------------------+
>     >        Synchronizing Repository [rhel-x86_64-server-7-base-tools]
>     > +----------------------------------------------------------------------+
>     >
>     > A sync task is already in progress for this repository. Its progress will be
>     > tracked below.
>     >
>     > This command may be exited via ctrl+c without affecting the request.
>     >
>     > [/]
>     > Waiting to begin...
>     >
>     >
>     > I checked all running processes and there is no repo sync currently
>     > running. I have even gone so far as to delete the repo and recreate it..
>     > same issue. I have also tried running pulp-admin orphan remove --all,
>     > which executes successfully, but does not fix the problem. I have also
>     > tried rebooting the server, bouncing all pulp services.. still no joy.
>     > I'm guessing there is a file somewhere that tracks pending tasks? How
>     > can I clear this so I can successfully run the job again? I'm running
>     > Pulp 2.7.1-1
>     >
>     >
>     > _______________________________________________
>     > Pulp-list mailing list
>     > Pulp-list at redhat.com <mailto:Pulp-list at redhat.com>
>     > https://www.redhat.com/mailman/listinfo/pulp-list
>     >