[Pulp-list] Messaging Questions

Jeff Ortel jortel at redhat.com
Thu Jul 8 14:29:36 UTC 2010



On 07/07/2010 06:27 PM, Bryan Kearney wrote:
> Jortel:
>
> Two questions:
>
> 1) at what point does this become QMF?

This is still a long way from QMF, but it is a good question to ask periodically as we go along.

> 2) Does AMQP support the notion of temporary queues? That could/should
> solve the dead queue issue.

Yes, it does.  If we stick with only synchronous requests to the agent and leave
asynchronous stuff to the pulp Task engine, temporary (non-durable) queues will be a good
approach to pruning dead queues.
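
To illustrate why temporary queues prune themselves, here is a minimal in-memory sketch. ToyBroker, declare_queue and close_session are hypothetical names made up for this example; this is not the qpid API. It only models the idea that a queue tied to a session goes away when that session closes.

```python
class ToyBroker:
    """In-memory stand-in for a message broker (hypothetical, NOT the qpid API)."""

    def __init__(self):
        # queue name -> (owning session or None, list of pending messages)
        self.queues = {}

    def declare_queue(self, name, owner=None):
        # owner=None models a long-lived queue; an owner models a temporary
        # queue whose lifetime is tied to that session.
        self.queues[name] = (owner, [])

    def close_session(self, session):
        # Drop every queue owned by the closed session, so reply queues
        # cannot outlive the requester that created them.
        owned = [name for name, (owner, _) in self.queues.items() if owner == session]
        for name in owned:
            del self.queues[name]


broker = ToyBroker()
broker.declare_queue("agent-foo")                      # long-lived agent queue
broker.declare_queue("reply-1234", owner="session-A")  # temporary reply queue
broker.close_session("session-A")
remaining = sorted(broker.queues)
```

After the session closes, only the long-lived queue is left, which is the pruning behaviour we want for synchronous request/reply.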

> -- bk
>
>
>
> On 07/07/2010 05:37 PM, Jeff Ortel wrote:
>>
>>
>> On 07/07/2010 08:57 AM, Jason Dobies wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> Sorry I'm so late getting back to this.
>>>
>>>>> Synchronous messages will fail immediately if the agent is unavailable
>>>>> so let's assume for the moment, we're only talking about asynchronous
>>>>> messages (RMI).
>>>
>>> Makes sense.
>>>
>>>>> I believe that all asynchronous messages to the agent
>>>>> should be dispatched through the Tasking framework. That way *all*
>>>>> policy around asynchronous operations will be in one place.
>>>
>>> So in this rationale, all message bus invocations are synchronous, they
>>> just get their asynchronous-ness from our task framework?
>>>
>>> I like that, it limits the number of places we need to address these
>>> ugly cases.
>>
>> Yes.
>>
>>>
>>>>> The
>>>>> lifecycle of the asynchronous message should be tied to (and
>>>>> implemented
>>>>> by) the Task. So long as the task lives, the message should also live.
>>>>> If the task times out, the message should be dequeued.
>>>
>>> Does our tasking framework support time outs yet?
>>
>> Not yet.
>>
>>>
>>>>> So, the
>>>>> messaging framework needs to support message dequeuing. It can do this
>>>>> by sending a cancellation message with higher priority if dequeuing
>>>>> is not directly supported by qpid.
>>>>>
>>>>> This leaves orphaned queues for consumer un-registration. Seems like
>>>>> the ConsumerApi could be responsible for this by doing something like:
>>>>
>>>>
>>>> from pulp.agent import Agent
>>>> Agent.purge('foo')
>>>>
>>>>
>>>>> which would remove the associated queue.
>>>
>>> That covers the case where an agent knowingly is going away, but what
>>> about when the consumer just full on disappears? For instance, the box
>>> is reprovisioned, goes up in flames, or whatever other reason and the
>>> admin doesn't think to unregister it?
>>
>> Unfortunately, AWOL consumers leave a lot of resources that need to be
>> cleaned up. I'll ping the qpid guys and see how earnest we need to be
>> about cleaning up dead queues.
>>
>>>
>>> I think we still need some sort of reaper/ping on agents to make sure
>>> they are still alive. That'll get us into the questions on what happens
>>> if an agent is temporarily down, but I think those are better than the
>>> alternative of dead queues floating around.
>>
>> Agreed.
>>
>> Thinking that agents would publish heartbeat events on the bus ...
>>
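
A rough sketch of how a heartbeat-based reaper might decide which agents are dead. HeartbeatMonitor, its method names and the 30 second timeout are assumptions for illustration only; nothing like this exists in pulp yet, and timestamps are passed in explicitly to keep the example deterministic.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen per agent (hypothetical helper)."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before an agent counts as dead
        self.last_seen = {}      # agent id -> timestamp of last heartbeat

    def heartbeat(self, agent_id, now):
        # Called whenever a heartbeat event for agent_id arrives on the bus.
        self.last_seen[agent_id] = now

    def dead_agents(self, now):
        # Agents whose last heartbeat is older than the timeout.
        return sorted(
            agent for agent, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=30)
monitor.heartbeat("agent-a", now=0)
monitor.heartbeat("agent-b", now=0)
monitor.heartbeat("agent-a", now=40)   # agent-a is still alive
dead = monitor.dead_agents(now=50)     # agent-b has been silent for 50 seconds
```

A reaper could then clean up the queues of whatever dead_agents() returns. How long the timeout should be, so that a temporarily down agent is not reaped, is still an open question.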
>>>
>>>>> The messaging framework (pmf) ensures that messages are processed
>>>>> (dispatched) before they are acknowledged (taken from the queue). This
>>>>> protects against cases where the agent consumes a message then dies
>>>>> and thus never processes it. Due to guaranteed message delivery, the agent
>>>>> will always reply unless it's dead. In which case, see above.
>>>
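
The dispatch-before-acknowledge ordering described above can be sketched with a plain in-memory queue. consume_one and the handlers are hypothetical names, not pmf code; the point is only that the message leaves the queue after the handler has returned, never before.

```python
from collections import deque


def consume_one(queue, handler):
    """Dispatch the head message, then acknowledge (remove) it."""
    if not queue:
        return False
    message = queue[0]   # peek only; the message stays on the queue
    handler(message)     # if this raises, the popleft below never runs
    queue.popleft()      # acknowledge: the message is now gone for good
    return True


queue = deque(["install zsh"])
processed = []


def crashing_handler(message):
    # Simulates the agent dying in the middle of dispatch.
    raise RuntimeError("agent died mid-dispatch")


try:
    consume_one(queue, crashing_handler)
except RuntimeError:
    pass
still_queued = list(queue)            # the message survived the crash

consume_one(queue, processed.append)  # a healthy agent picks it up later
```

Because the message is only removed after a successful dispatch, a crash between consuming and processing leaves it available for redelivery.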
>>> I see what you're saying, but I'm thinking of a different case. Maybe
>>> I'm viewing this wrong. I thought the flow looked like:
>>>
>>> - - Server sends message to agent
>>> - - Agent acknowledges and says it'll start processing the request
>>> - - Server makes note somewhere that the requested action is "in
>>> progress"
>>> - - Later, when it's finished, the agent sends a message to the server
>>> that the operation has completed and its status. Looking at the wiki,
>>> this looks like it's sent to the server queue.
>>
>> The approach I'm thinking of is that async activities will be dispatched
>> using async tasks. When a task runs, it does synchronous RMI on the
>> agent. If the agent is unavailable, the messaging framework reports this
>> immediately. The task catches the 'Unavailable' exception and goes to the
>> RETRY (later) state. This way, all this logic is in the Tasks framework.
>>
>> The task states would go something like this:
>>
>> <agent is down>
>>
>> NEW
>> RETRY
>> <try again later>
>> RETRY
>> <agent is alive now>
>> IN-PROGRESS
>> FINISHED
>>
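
The state progression above could be sketched roughly as follows. run_task, make_flaky_agent and the Unavailable class defined here are stand-ins invented for this example, not the Task framework or pmf API, and a real task would wait between attempts instead of retrying immediately.

```python
class Unavailable(Exception):
    """Stand-in for the error the messaging framework raises when the agent is down."""


def run_task(rmi_call, max_attempts):
    """Run a synchronous RMI call inside a task and record the task's states."""
    history = ["NEW"]
    for _ in range(max_attempts):
        history.append("IN-PROGRESS")
        try:
            rmi_call()                 # synchronous: fails immediately if the agent is down
        except Unavailable:
            history[-1] = "RETRY"      # the attempt never started; try again later
        else:
            history.append("FINISHED")
            break
    return history


def make_flaky_agent(failures):
    """Return a call that raises Unavailable `failures` times, then succeeds."""
    remaining = [failures]

    def call():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise Unavailable("agent is down")
        return "ok"

    return call


history = run_task(make_flaky_agent(2), max_attempts=5)
```

With an agent that is down for the first two attempts, history ends up matching the listing above: NEW, RETRY, RETRY, IN-PROGRESS, FINISHED. All retry policy lives in run_task, so the messaging layer only has to report availability.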
>>>
>>> If that's the case, then my question is about what happens when that
>>> last bullet point doesn't happen (for instance, zombie attack caused the
>>> power to go out and the machine died). Won't there still be something in
>>> the server that says "I sent a message to the agent that was accepted,
>>> but he never sent me a message back. I'm sad."
>>
>> This is a good reason to keep all the asynchronous behaviour in the Task
>> framework.
>>
>>>
>>> If that's not the case, can you clear up how that flow looks for me?
>>
>> Yeah. It would be a mess.
>>
>>>
>>>>> Yes. All requests (messages) have unique serial numbers which are
>>>>> placed in the reply and matched by the message framework. Agent B,
>>>>> will
>>>>> never see request 1234. This behaviour is standardized and enforced by
>>>>> the messaging framework.
>>>
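
A small sketch of the serial-number matching described above. Correlator and the envelope fields ("sn", "to", "request") are assumptions chosen for the example, not the actual pmf envelope format.

```python
import itertools


class Correlator:
    """Matches replies to outstanding requests by serial number (hypothetical)."""

    def __init__(self):
        self._serial = itertools.count(1)
        self._pending = {}   # serial number -> agent the request was sent to

    def send(self, agent_id, request):
        # Stamp each outgoing request with a unique serial number.
        sn = next(self._serial)
        self._pending[sn] = agent_id
        return {"sn": sn, "to": agent_id, "request": request}

    def match(self, reply):
        # Return the agent the reply belongs to, or None when the serial
        # number is unknown or was already matched.
        return self._pending.pop(reply["sn"], None)


correlator = Correlator()
envelope_a = correlator.send("agent-a", "status")
envelope_b = correlator.send("agent-b", "status")

reply = {"sn": envelope_a["sn"], "result": "ok"}
owner = correlator.match(reply)       # matched to the request sent to agent-a
duplicate = correlator.match(reply)   # a second reply with the same sn is ignored
```

Since every request carries its own serial number, a reply can only ever be matched to the request that produced it.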
>>> I'm gonna punt on my follow up question until I'm clear on the above
>>> flow so I don't make us discuss something that's potentially not
>>> relevant.
>>>
>>>
>>>>> Assuming that it cannot re-register with the same ID, it would be
>>>>> considered a new consumer. The previous registration, will orphan many
>>>>> resources in pulp - including the queue. Orphans need to be addressed
>>>>> across the board. See comment above for queue clean up.
>>>>
>>>>> If not, when does that queue get deleted? What happens if that
>>>> re-registration happens while the agent is doing a task before it
>>>> replies, will it confuse the server that the reply came from a
>>>> "different" consumer?
>>>>
>>>> - Are replies back to the server guaranteed delivery as well?
>>>>
>>>>> Yes.
>>>>
>>>>> I'm
>>>> thinking of the situation where the server is offline when the agent
>>>> finishes doing its business.
>>>>
>>>>
>>>>
>>>>>
>>> _______________________________________________
>>> Pulp-list mailing list
>>> Pulp-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-list
>>>
>>>
>>>
>>> - --
>>> Jason Dobies
>>> RHCE# 805008743336126
>>> Freenode: jdob
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v2.0.14 (GNU/Linux)
>>> Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/
>>>
>>> iQEcBAEBAgAGBQJMNIeyAAoJEOMmcTqOSQHCR8sH/iTzlTtNyl2yL22H08fC2yUI
>>> IycGbHieGXdgG1/0+b8vu/tKxUVDbLO3jA7NGPljJyJH7Nc2BDzRwrDoV8lFogmP
>>> GVtbJ8lxwgq1w0ITg6AP4WDRu56dMOn12m0eSVn0TPiEbuw7Io4Vfaqbd1EgQNBi
>>> QsrUj1MIZJo4xuugbiBF8albqI+TXyafqmLs8sMKko00rT06hTZtlKLg9SKmfx3u
>>> V/D3nrftjYPHOQpdZIZ16xO/GqdZUQ9gGOS+Cz5f8+BQi7OBYMlogtncHzjffgVS
>>> 1vThgFo7XopWDZjL1IGwZsBAScG2w+pO36tCG40JZZwTrkC3qr7Ef/mFAqvJjbQ=
>>> =tIUF
>>> -----END PGP SIGNATURE-----
>>>
>>
>>
>>
>
