[Pulp-dev] #2186 (pulp-manage-db) options

Mon Dec 12 03:36:14 UTC 2016

I like option 2 a lot. I'm in favor of revising the story to have it
implement this. It should display to the user how long it's going to wait,
as in: "Waiting 17 more seconds to ensure no Pulp workers are running." I
don't think we need a fancy countdown of any kind. Maybe print a "." every
second if you want extra credit.

Option 1 could make it more difficult for someone to automate the upgrade.
The overall experience of option 2 seems more user-friendly.

I like that both of these can be implemented and make a difference for a
user's next upgrade, rather than needing to use another upgrade just to get
data quality straightened out.

Thanks for adding new ideas!

Michael

On Fri, Dec 9, 2016 at 4:07 PM, Brian Bouterse <bbouters at redhat.com> wrote:

> Looking at the process list is easy, but Pulp is a clustered software so
> it isn't effective for those users. In terms of the value in checking for
> running workers, a clustered installation is really where a user would
> especially want this feature. Let's think through some more ideas. Here are
> two options, I'm a big fan of option 2.
>
> Option 1:
> * bring the feature back that checks the database (this is easy because we
> already have the code)
> * have the check consider worker records dead if they are older than 1
> minute (not 5)
> * also update the worker failover time to be 1 minute instead of 5 (also
> easy, and not a requirement since the worker check and worker failover do
> not have to use the same timing)
> * do not prompt for y/N but instead gave the user output and accepted an
> --ignore-workers flag to override
> * Make the output from the check be more verbose and identify the problem,
> the argument they could pass, and explain that they can wait 60 seconds,
> and also list the worker entries that are still running.
>
> Option 2:
> As a variation on the above, instead of failing and having a flag, have
> pulp-manage-db wait until 1-minute after the latest timestamp and check the
> db again. This would potentially be < 20 seconds allowing 40 seconds for
> package upgrade. This would be a great user experience because there are no
> flags, it would just work and take a little longer.
>
> I think a user who sees an error they experience during upgrade and can
> wait 20ish seconds for it to be confirmed as a real issue is an ok user
> experience. I think an interactive user would even be hard pressed to
> experience this delay by upgrading the packages and running pulp-manage-db
> in under 60 seconds.
>
> What do you all think about ^?
>
>
>
>
>
> On Thu, Dec 8, 2016 at 5:25 PM, Michael Hrivnak <mhrivnak at redhat.com>
> wrote:
>
>>
>>
>> On Thu, Dec 8, 2016 at 1:49 PM, Bihan Zhang <bizhang at redhat.com> wrote:
>>
>>> The changes made for #2186 [0] was pulled from the 2.11.0 release
>>> yesterday, and we should talk about how to implement it for 2.12
>>>
>>> From what I can see there are 2 ways to move forward with #2186
>>>
>>> *1.* We can fix the pulp worker db record cleanup so that
>>> pulp_celerybeat exits cleanly (aka put this back: [1]) and make new changes
>>> to clean up pulp_workers with a SIGTERM handler in a 2.11.z release.
>>> We can then re-revert the commit and put the feature back in 2.12 with
>>> little effort.
>>>
>>
>> We would also need a way for pulp-manage-db to know if a record in the
>> Worker collection was created by a version of pulp that does not do full
>> cleanup. As described in a previous email, one way would be to add some
>> kind of new field to the Worker model, such as the version of pulp the
>> worker was running.
>>
>>
>>>
>>> The original reason #2186 was implemented using the db records was so we
>>> can support a clustered pulp installation.
>>> But this approach would make migration to 2.12 more difficult, since
>>> users now have to upgrade to the 2.11.z release first before going to 2.12
>>>
>>>
>> Perhaps the original intent was lost. This feature was motivated
>> primarily by katello deployments (all of which are single-system), where
>> users either forgot to stop pulp services or didn't know that was a
>> requirement before running pulp-manage-db. The normal upgrade process for
>> them handles it automatically, but when manual intervention is required,
>> that's where we've seen people running pulp-manage-db while pulp services
>> are running.
>>
>>
>>> *2. *We can rethinking our approach to #2186 and perform the check
>>> against the process list
>>> Upgrade-wise implementing it this way is a lot easier for users, since
>>> they can do a straight upgrade to 2.12 without going through an
>>> intermediary release.
>>> The downside is that in clustered environments this would not catch
>>> every potential error. But #2186 is a best effort story, and if this is the
>>> best effort I am ok with it.
>>>
>>
>> This is easy and effective. I think we should do it, and augment/replace
>> it in the future with the worker checking.
>>
>>
>>
>>> Regardless of which option we go with I think we should get the pulp
>>> worker db cleanup in.
>>> We should also have a --ignore-running-worker flag [2] to prevent
>>> automated upgrade problems.
>>>
>>>
>>> [0] https://pulp.plan.io/issues/2186
>>> [1] https://github.com/werwty/pulp/commit/4f43a85dd568f4a0b5
>>> 0ae9b07bbec7138861e92b#diff-80e8b96df1f5da9a551bb6ff18dea352
>>> [2] https://pulp.plan.io/issues/2469
>>>
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> Pulp-dev at redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>
>>>
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20161211/49ad082f/attachment.htm>