[Pulp-list] Messaging Questions

Jeff Ortel jortel at redhat.com
Wed Jul 7 21:37:46 UTC 2010



On 07/07/2010 08:57 AM, Jason Dobies wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Sorry I'm so late getting back to this.
>
>>> Synchronous messages will fail immediately if the agent is unavailable
>>> so let's assume for the moment, we're only talking about asynchronous
>>> messages (RMI).
>
> Makes sense.
>
>>> I believe that all asynchronous messages to the agent
>>> should be dispatched through the Tasking framework. That way *all*
>>> policy around asynchronous operations will be in one place.
>
> So in this rationale, all message bus invocations are synchronous, they
> just get their asynchronous-ness from our task framework?
>
> I like that, it limits the number of places we need to address these
> ugly cases.

Yes.

>
>>> The
>>> lifecycle of the asynchronous message should be tied to (and implemented
>>> by) the Task.  So long as the task lives, the message should also live.
>>> If the task times out, the message should be dequeued.
>
> Does our tasking framework support time outs yet?

Not yet.

>
>>>   So, the
>>> messaging framework needs to support message dequeuing.  It can do this
>>> by sending a cancellation message with higher priority if dequeuing is not
>>> directly supported by qpid.
>>>
>>> This leaves orphaned queues for consumer un-registration.  Seems like
>>> the ConsumerApi could be responsible for this by doing something like:
>>
>>
>> from pulp.agent import Agent
>> Agent.purge('foo')
>>
>>
>>> which would remove the associated queue.
>
> That covers the case where an agent knowingly is going away, but what
> about when the consumer just full on disappears? For instance, the box
> is reprovisioned, goes up in flames, or whatever other reason and the
> admin doesn't think to unregister it?

Unfortunately, AWOL consumers leave a lot of resources that need to be cleaned up.  I'll 
ping the qpid guys and see how earnest we need to be about cleaning up dead queues.

>
> I think we still need some sort of reaper/ping on agents to make sure
> they are still alive. That'll get us into the questions on what happens
> if an agent is temporarily down, but I think those are better than the
> alternative of dead queues floating around.

Agreed.

Thinking that agents would publish heartbeat events on the bus ...
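
As a rough sketch of the reaper side (not anything that exists in pulp today): 
something on the server records the time of each agent's last heartbeat and treats 
agents that have been silent for too long as dead.  The HeartbeatMonitor class, the 
timeout value and the purge callback are all made up for illustration; how heartbeats 
actually arrive off the bus is left out.

```python
import time


class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each agent (hypothetical helper)."""

    def __init__(self, timeout=60.0, clock=time.monotonic):
        self.timeout = timeout   # seconds of silence before an agent is presumed dead
        self.clock = clock       # injectable so the logic can be tested without sleeping
        self.last_seen = {}      # agent id -> time of the last heartbeat

    def heartbeat(self, agent_id):
        # Called by whatever consumes heartbeat events from the bus.
        self.last_seen[agent_id] = self.clock()

    def dead_agents(self):
        # Agents whose last heartbeat is older than the timeout.
        now = self.clock()
        return sorted(a for a, t in self.last_seen.items() if now - t > self.timeout)

    def reap(self, purge):
        # Call purge(agent_id) for each dead agent, then stop tracking it.
        reaped = self.dead_agents()
        for agent_id in reaped:
            purge(agent_id)
            del self.last_seen[agent_id]
        return reaped
```

The purge callback is where queue cleanup would hook in.  An agent that is only 
temporarily down would also get reaped once it exceeds the timeout, so the timeout 
needs to be generous.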

>
>>> The messaging framework (pmf) ensures that messages are processed
>>> (dispatched) before they are acknowledged (taken from the queue).  This
>>> protects against cases where the agent consumes a message then dies and
>>> thus never processes it.  Due to guaranteed message delivery, the agent
>>> will always reply unless it's dead.  In which case, see above.
>
> I see what you're saying, but I'm thinking of a different case. Maybe
> I'm viewing this wrong. I thought the flow looked like:
>
> - - Server sends message to agent
> - - Agent acknowledges and says it'll start processing the request
> - - Server makes note somewhere that the requested action is "in progress"
> - - Later, when it's finished, the agent sends a message to the server
> that the operation has completed and its status. Looking at the wiki,
> this looks like it's sent to the server queue.

The approach I'm thinking of is that async activities will be dispatched using async
tasks.  When a task runs, it does synchronous RMI on the agent.  If the agent is
unavailable, the messaging framework reports this immediately.  The task catches the
'Unavailable' exception and goes to the RETRY (later) state.  This way, all of this logic
is in the Task framework.

The task states would go something like this:

<agent is down>

NEW
RETRY
<try again later>
RETRY
<agent is alive now>
IN-PROGRESS
FINISHED
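
A minimal sketch of those transitions, under some assumptions: the AgentTask class 
and the shape of the send callable are hypothetical, not the current Task or pmf API. 
Here send() either raises Unavailable right away or returns a callable that blocks 
until the agent replies; rescheduling a task that is in RETRY is left to the Task 
framework.

```python
class Unavailable(Exception):
    """Raised by the (hypothetical) messaging layer when the agent is down."""


NEW, RETRY, IN_PROGRESS, FINISHED = 'NEW', 'RETRY', 'IN-PROGRESS', 'FINISHED'


class AgentTask:
    def __init__(self, send):
        # send() raises Unavailable immediately if the agent is down; otherwise
        # it returns a zero-argument callable that blocks for the agent's reply.
        self.send = send
        self.state = NEW
        self.history = [NEW]
        self.result = None

    def _enter(self, state):
        self.state = state
        self.history.append(state)

    def run(self):
        """Make one attempt; the scheduler calls this again while in RETRY."""
        if self.state == FINISHED:
            return
        try:
            wait_for_reply = self.send()
        except Unavailable:
            self._enter(RETRY)      # agent is down: try again later
            return
        self._enter(IN_PROGRESS)    # agent accepted the request
        self.result = wait_for_reply()
        self._enter(FINISHED)
```

With a send() that fails twice and then succeeds, calling run() three times walks 
the task through exactly the states listed above.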

>
> If that's the case, then my question is about what happens when that
> last bullet point doesn't happen (for instance, zombie attack caused the
> power to go out and the machine died). Won't there still be something in
> the server that says "I sent a message to the agent that was accepted,
> but he never sent me a message back. I'm sad."

This is a good reason to keep all the asynchronous behaviour in the Task framework.

>
> If that's not the case, can you clear up how that flow looks for me?

Yeah.  It would be a mess.

>
>>> Yes.  All requests (messages) have unique serial numbers which are
>>> placed in the reply and matched by the message framework.  Agent B will
>>> never see request 1234.  This behaviour is standardized and enforced by
>>> the messaging framework.
>
> I'm gonna punt on my follow up question until I'm clear on the above
> flow so I don't make us discuss something that's potentially not relevant.
>
>
>>> Assuming that it cannot re-register with the same ID, it would be
>>> considered a new consumer.   The previous registration will orphan many
>>> resources in pulp - including the queue.  Orphans need to be addressed
>>> across the board.  See comment above for queue clean up.
>>
>>> If not, when does that queue get deleted? What happens if that
>> re-registration happens while the agent is doing a task before it
>> replies, will it confuse the server that the reply came from a
>> "different" consumer?
>>
>> - Are replies back to the server guaranteed delivery as well?
>>
>>> Yes.
>>
>>> I'm
>> thinking of the situation where the server is offline when the agent
>> finishes doing its business.
>>
>>
>>
>>>
> _______________________________________________
> Pulp-list mailing list
> Pulp-list at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-list
>
>
> - --
> Jason Dobies
> RHCE# 805008743336126
> Freenode: jdob
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v2.0.14 (GNU/Linux)
> Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/
>
> iQEcBAEBAgAGBQJMNIeyAAoJEOMmcTqOSQHCR8sH/iTzlTtNyl2yL22H08fC2yUI
> IycGbHieGXdgG1/0+b8vu/tKxUVDbLO3jA7NGPljJyJH7Nc2BDzRwrDoV8lFogmP
> GVtbJ8lxwgq1w0ITg6AP4WDRu56dMOn12m0eSVn0TPiEbuw7Io4Vfaqbd1EgQNBi
> QsrUj1MIZJo4xuugbiBF8albqI+TXyafqmLs8sMKko00rT06hTZtlKLg9SKmfx3u
> V/D3nrftjYPHOQpdZIZ16xO/GqdZUQ9gGOS+Cz5f8+BQi7OBYMlogtncHzjffgVS
> 1vThgFo7XopWDZjL1IGwZsBAScG2w+pO36tCG40JZZwTrkC3qr7Ef/mFAqvJjbQ=
> =tIUF
> -----END PGP SIGNATURE-----
>
