[Freeipa-devel] [PATCH] 0064 Rework task naming in LDAP updates to avoid conflicts

Petr Viktorin pviktori at redhat.com
Tue Jul 24 12:23:54 UTC 2012


On 07/24/2012 02:06 PM, Alexander Bokovoy wrote:
> On Tue, 24 Jul 2012, Petr Viktorin wrote:
>> On 07/24/2012 12:01 PM, Alexander Bokovoy wrote:
>>> Hi,
>>>
>>> There are two problems in task naming in LDAP updates:
>>>
>>> 1. Randomness may be scarce in virtual machines
>>> 2. Random number is added to the time value rounded to a second
>>>
>>> The second issue leads to values that may repeat: time only grows and
>>> the random number is non-negative, so a later sum t2+r2 can equal an
>>> earlier t1+r1.
>>>
>>> Since task name is a DN, there is no strict requirement to use an
>>> integer value.  Instead, we can take time and attribute name. To get
>>> reasonable 'randomness' these values are then hashed with sha1 and the
>>> resulting string is used as task name.
>>>
>>> SHA1 may technically be an overkill here as we could simply use
>>>
>>>   indextask_$date_$attribute
>>>
>>> where $date is a value of time.time() but SHA1 gives a reasonable
>>> 'randomness' to the string.
>>
>> What kind of randomness do you mean? SHA1 is deterministic, it doesn't
>> add any randomness at all. It just obscures what's really happening.
> Hence using quotes to describe it. We don't need randomness in the task
> names, we need something that avoids collisions.
>
> An issue here is in time.time() -- it may or may not give us sub-second
> resolution, depending on the underlying OS. Second-level resolution is
> not enough, especially on fast machines, so we can't simply use
> int(time.time()) as in the original version.
>
> indextask_$date_$attribute has the issue that we have no guarantee
> that $date (time.time()) is unique under sufficiently tight timing,
> hence the use of SHA-1 to generate something that has a better chance
> of avoiding collisions than $date_$attribute.

My point is that if "indextask_$date_$attribute" is not unique, neither 
is SHA1("indextask_$date_$attribute"). Hashing has no effect on the 
chance of collisions.
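
A minimal sketch of that point. The helper name and the sample timestamp 
are made up for illustration; this is not the code from the patch:

```python
import hashlib

def hashed_task_name(tasktime, attribute):
    # Hypothetical helper: build the plain name, then hash it with SHA-1.
    plain = "indextask_%s_%s" % (tasktime, attribute)
    return hashlib.sha1(plain.encode("utf-8")).hexdigest()

# Two tasks for the same attribute created within the same clock tick
# (the timestamp is an arbitrary example value).
a = hashed_task_name(1343132634, "uid")
b = hashed_task_name(1343132634, "uid")

# Identical inputs give identical digests, so the hashed names collide
# exactly when the plain names do.
print(a == b)
```

The digest only changes when the input does, so uniqueness has to come 
from the input itself.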

You could use Python's pseudorandom number generator (random.randint) 
instead of random.SystemRandom. It's not cryptographically secure but 
it's enough to avoid collisions, and it doesn't use up system entropy 
(except for initial seeding, through `import random`).
"indextask_$date_$attribute_$pseudorandomvalue" should be unique enough.
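
Roughly something like the following. This is only a sketch: the function 
name, the `now` parameter and the 32-bit range for the suffix are my 
assumptions, not anything taken from the patch:

```python
import random
import time

def index_task_name(attribute, now=None):
    # Hypothetical helper, not the code from the patch.
    if now is None:
        now = time.time()
    # The module-level PRNG is seeded once when `random` is imported;
    # later calls do not draw on system entropy.
    suffix = random.randint(0, 2**32 - 1)  # range chosen arbitrarily
    return "indextask_%d_%s_%d" % (int(now), attribute, suffix)

print(index_task_name("uid"))
```

Two tasks for the same attribute in the same second then collide only if 
the generator also returns the same suffix, which is unlikely but not 
impossible.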

>> Same with repeating [tasktime, attribute] two times.
> This can be reduced as SHA-1 output does not depend on size of the
> input message.


-- 
Petr³




