[Pulp-list] Messaging Questions

Fri Jul 9 11:33:52 UTC 2010

On 07/08/2010 05:44 PM, Jeff Ortel wrote:
>
>
> On 07/08/2010 04:00 PM, Bryan Kearney wrote:
>> On 07/08/2010 02:40 PM, Jason Dobies wrote:
>>> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>>>> If the thinking is that:
>>>>
>>>>> asynchronous = pub/sub = efficiency
>>>>
>>>>> Then, an important consideration is: How does qpid implement
>>>>> *durable*
>>>>> subscription to topics (pub/sub)? The durable nature of the
>>>>> subscription usually requires that brokers implement using queues
>>>>> where
>>>>> messages are routed to subscriber queues based on topic/subject and
>>>>> selectors. Asynchronous request/response assumes guaranteed delivery.
>>>>> This means that the message must be queued so it can be delivered to
>>>>> consumers that are not connected when the message is published.
>>>>> This is
>>>>> the definition of durable subscription. So, publishing a message to
>>>>> 10k
>>>>> agents probably still requires 10k queues.
>>>>
>>>>> But, if we anticipate this kind of mass operation, then asynchronous
>>>>> will be much more efficient and worth the extra complexity because we
>>>>> can have all the agents performing the operation in parallel. If we
>>>>> did
>>>>> this synchronously, we'd be limited to the Task thread limit.
>>>
>>> +1 to this whole train of thought.
>>>
>>> That brings me to the question of throttling (yes, I'm playing devil's
>>> advocate against myself).
>>>
>>> If we use sync RMI over the async task queue, then we can only send out
>>> X many package install invocations at a time, where X is the number of
>>> threads in the queue. If we have 10K consumers and 50 threads, it's
>>> gonna be rough.
>>>
>>> So if we use an async model to send the package install requests to the
>>> consumers, then we don't block on the size of the task queue. However,
>>> then we have a situation where 10K consumers suddenly smash our repos at
>>> the same time asking for package bits.
>>>
>>> I like how Jeff put it. The sync RMI approach gives us a form of
>>> throttling. It's just in the wrong place. What we'll need is something
>>> in place to throttle and/or load balance the requests on the repos
>>> themselves. I'm sure that's on a backlog somewhere, just throwing it out
>>> there now since it's relevant to the discussion.
>>
>>
>> Open questions which I see:
>>
>> 1) Does pulp require durable queues? Do I want to ensure that a package
>> is updated the next time it wakes up? If so, we need to handle queue
>> purging. Perhaps this is tied to consumer deletion.
>
> Agreed.
>
>>
>> 2) Can the same message (install package) be sent P2P Sync, Fire and
>> Forget, and broadcast? I think the answer to this should be yes. If so,
>> in broadcast, is there a notion of the success/failure?
>
> Can you define how you mean fire-and-forget and broadcast?

Fire and forget is P2P async, with no correlation: "Do this," and I do
not care whether you actually do it. It could be that I look for a
status report on it later. It is a slippery slope once we start
tracking request statuses asynchronously.
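
To make the distinction concrete, here is a rough sketch of the two
send styles, using plain in-memory Python queues rather than any real
broker API. The names (Agent, fire_and_forget, send_tracked, and the
correlation_id field) are all made up for illustration; none of this
is existing pulp or qpid code.

```python
import queue
import uuid


class Agent:
    """Hypothetical consumer-side agent with an in-memory inbox."""

    def __init__(self):
        self.inbox = queue.Queue()

    def process(self, status_queue):
        """Handle every queued message; reply only when a correlation id is present."""
        while not self.inbox.empty():
            msg = self.inbox.get()
            result = "installed " + msg["package"]
            cid = msg.get("correlation_id")
            if cid is not None:
                status_queue.put({"correlation_id": cid, "status": result})


def fire_and_forget(agent, package):
    """P2P async with no correlation: the sender keeps no record of the request."""
    agent.inbox.put({"package": package})


def send_tracked(agent, package, pending):
    """P2P async with correlation: the sender must remember every open request."""
    cid = str(uuid.uuid4())
    pending[cid] = package
    agent.inbox.put({"package": package, "correlation_id": cid})
    return cid


agent = Agent()
statuses = queue.Queue()
pending = {}

fire_and_forget(agent, "zsh")
cid = send_tracked(agent, "vim", pending)

agent.process(statuses)
reply = statuses.get_nowait()
```

The pending dict is the slippery slope: as soon as requests are
tracked, something has to expire, retry, or purge the entries that
never get a reply.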

-- bk