[Pulp-list] pulp3 High availability and disaster recovery

Brian Bouterse bmbouter at redhat.com
Tue Feb 18 14:45:41 UTC 2020


Hi Bin Li,

When you perform the failover to the passive standby Pulp, what does the
status API show for its workers before, during, and after failover? Note
that the workers and webservers all need to be using the same Redis,
because the task data flows through Redis.

I'm wondering whether your passive Pulp, after failover, is actually
registering the workers that belong to it, and whether all of the
"initially active" workers are being declared missing/dead. Then, if the
right workers are being shown, they all need to be using the same Redis
instance (the webservers and the workers).

Also, the "stalling" task should show the "worker" it was assigned to.
Sharing that info would also help show where the work was being routed.
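As a rough illustration of that check, here is a sketch of inspecting the
workers reported by the status API and flagging stale heartbeats. The payload
below is a hypothetical example shaped like a status response (the exact field
names and the 30-second cutoff are assumptions, not Pulp's actual values):

```python
import json
from datetime import datetime, timedelta

# Hypothetical example payload, shaped like a /pulp/api/v3/status/ response.
payload = json.loads("""
{
  "online_workers": [
    {"name": "worker1@active-host", "last_heartbeat": "2020-02-18T14:40:00Z"},
    {"name": "worker1@passive-host", "last_heartbeat": "2020-02-18T14:45:00Z"}
  ]
}
""")

now = datetime(2020, 2, 18, 14, 45, 30)
cutoff = timedelta(seconds=30)  # assumed heartbeat timeout, for illustration only

def stale_workers(workers, now, cutoff):
    """Return names of workers whose last heartbeat is older than the cutoff."""
    stale = []
    for w in workers:
        beat = datetime.strptime(w["last_heartbeat"], "%Y-%m-%dT%H:%M:%SZ")
        if now - beat > cutoff:
            stale.append(w["name"])
    return stale

print(stale_workers(payload["online_workers"], now, cutoff))
# -> ['worker1@active-host']
```

After failover you would want the passive host's workers to appear with fresh
heartbeats and the old active host's workers to show up stale (and eventually
be declared missing).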



On Tue, Feb 18, 2020 at 9:28 AM Dennis Kliban <dkliban at redhat.com> wrote:

> The redis cluster support needs to be added to rq actually[0,1]. Looks
> like there is an open PR but it hasn't moved forward in a long time[2].
>
> [0] https://github.com/rq/rq/issues/862
> [1] https://github.com/rq/rq/issues/1048
> [2] https://github.com/rq/rq/pull/942
>
> On Tue, Feb 18, 2020 at 9:07 AM Dennis Kliban <dkliban at redhat.com> wrote:
>
>> How many instances of Redis are involved? Is every pulpcore-api instance
>> and pulpcore-worker instance pointing to the same redis instance? This is
>> necessary for the work to be routed correctly.
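For reference, a minimal sketch of what that shared configuration could look
like in Pulp's settings file. The setting names follow Pulp 3's Redis options
and the hostname is a placeholder; the point is that every pulpcore-api and
pulpcore-worker node must carry identical values:

```python
# /etc/pulp/settings.py (fragment) -- identical on every API and worker node
REDIS_HOST = "redis.example.com"  # placeholder: the one shared Redis instance
REDIS_PORT = 6379
REDIS_PASSWORD = ""  # set this if your Redis requires AUTH
```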
>>
>> Pulpcore currently uses redis-py, which does not support connecting to a
>> Redis Cluster[0]. However, we should investigate if it's viable to switch
>> to using redis-py-cluster[1].
>>
>> [0] https://github.com/andymccurdy/redis-py/issues/931
>> [1] https://github.com/Grokzen/redis-py-cluster
>>
>> On Thu, Feb 13, 2020, 6:28 PM Bin Li (BLOOMBERG/ 120 PARK) <
>> bli111 at bloomberg.net> wrote:
>>
>>> Hi Brian,
>>> I did a quick test on an active-passive Pulp 3.1 setup. Two pulp servers
>>> point to the same external postgres database. Only one server is active
>>> at any time. The Redis queue resides on localhost, and /var/lib/pulp is
>>> synced from the primary to the contingency host.
>>> After I shut down the primary host, I was able to bring up the
>>> contingency pulp server and create a repo. Deleting any repo got stuck
>>> in a waiting state. Then I started the primary host and shut down the
>>> contingency host; I was able to delete the repos I had created on the
>>> contingency host, but all the earlier delete jobs stayed stuck in the
>>> waiting state.
>>> I wonder if there is anything I can do to make this work on the
>>> contingency host, or if this setup is just not going to work?
>>>
>>> Thanks
>>>
>>>
>>> From: pulp-list at redhat.com At: 01/03/20 12:01:44
>>> To: pulp-list at redhat.com
>>> Subject: Pulp-list Digest, Vol 122, Issue 1
>>>
>>>
>>> Today's Topics:
>>>
>>> 1. Re: pulp3 High availability and disaster recovery (Brian Bouterse)
>>>
>>>
>>> ----------------------------------------------------------------------
>>>
>>> Message: 1
>>> Date: Thu, 2 Jan 2020 16:10:29 -0500
>>> From: Brian Bouterse <bmbouter at redhat.com>
>>> To: JASON STELZER <jasonstelzer at boomi.com>
>>> Cc: pulp-list <pulp-list at redhat.com>
>>> Subject: Re: [Pulp-list] pulp3 High availability and disaster recovery
>>> Message-ID:
>>> <CAAcvrTGDYCJxcO3TR50Wub1j2Suc6g9Q1_yqjVdsYS_t44qDYw at mail.gmail.com>
>>> Content-Type: text/plain; charset="utf-8"
>>>
>>> Sorry for the late reply. Each component of Pulp itself can be deployed
>>> in HA configurations. Of the services Pulp's processes depend on, Redis
>>> is the one service that can't run as a full cluster, because RQ doesn't
>>> support that yet, so the best you can do is a hot-spare Redis that
>>> auto-fails over. That isn't graceful failover: when traffic routes to
>>> your hot-spare Redis, it doesn't have the tasking system's data. Those
>>> Pulp tasks would be cancelled, and Pulp would be immediately ready to
>>> accept new tasks so they could be resubmitted; e.g., Katello resubmits
>>> some job failures, I believe.
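That resubmit-on-failure pattern can be sketched in a few lines. This is a
hypothetical illustration, not Katello's or Pulp's actual code; the task shape
and the `submit` callable are assumptions:

```python
# Sketch: re-dispatch tasks that were canceled by a Redis failover.
# Task dicts and submit() are illustrative, not Pulp's real API.
def resubmit_canceled(tasks, submit):
    """Call submit() for every canceled task; return the new task records."""
    resubmitted = []
    for task in tasks:
        if task["state"] == "canceled":
            resubmitted.append(submit(task["name"]))
    return resubmitted

tasks = [
    {"name": "sync-repo-a", "state": "completed"},
    {"name": "delete-repo-b", "state": "canceled"},
]
new = resubmit_canceled(tasks, submit=lambda name: {"name": name, "state": "waiting"})
print(new)  # -> [{'name': 'delete-repo-b', 'state': 'waiting'}]
```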
>>>
>>> More docs about this are here:
>>> https://docs.pulpproject.org/components.html#architecture-and-deploying
>>> More questions are welcome; sorry for the slow response. If you can see
>>> any
>>> way to improve the docs and want to get involved, PRs are welcome!
>>>
>>> -Brian
>>>
>>>
>>> On Mon, Nov 18, 2019 at 7:37 AM JASON STELZER <jasonstelzer at boomi.com>
>>> wrote:
>>>
>>> > For what it is worth, at heart pulp3 is a django app. So, following the
>>> > advice for HA and django apps generally works. A lot of it is driven
>>> by the
>>> > particulars of your use case.
>>> >
>>> > My use case is a little different than yours I'm sure. But in terms of
>>> HA
>>> > for now I'm good with a balancer and nodes in multiple azs, an RDS db
>>> with
>>> > failover, and regular db backups.
>>> >
>>> > In my case, the pulp3 server is far enough behind the scenes that even
>>> if
>>> > there were to be a several hour outage, the impact would be minimal.
>>> YMMV.
>>> >
>>> > Others can chime in with pulp3 specifics.
>>> >
>>> > On Fri, Nov 15, 2019 at 11:41 AM Bin Li (BLOOMBERG/ 120 PARK) <
>>> > bli111 at bloomberg.net> wrote:
>>> >
>>> >> Does pulp3 support active/active or active/passive configuration?
>>> What is
>>> >> the strategy to restore the pulp3 service on a different server if the
>>> >> primary is down? Do we have any documentation on this topic?
>>> >>
>>> >> Thanks
>>> >> _______________________________________________
>>> >> Pulp-list mailing list
>>> >> Pulp-list at redhat.com
>>> >> https://www.redhat.com/mailman/listinfo/pulp-list
>>> >
>>> >
>>> >
>>> > --
>>> > J.
>>> ------------------------------
>>>
>>>
>>> End of Pulp-list Digest, Vol 122, Issue 1
>>> *****************************************
>>>
>>>
>>

