[Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

Fri Jul 24 15:03:06 UTC 2009

John Dennis wrote:
>>> 10)  Such code will not produce right results since there are multiple
>>> ways why function can return NULL and not in all cases
>>> the errno is set.
>>>
>>>          if ((pw = new_path_watch(existing_directory_ancestor)) ==
>>> NULL) {
>>> +            error = errno;
>>> +            log_msg(LOG_ERROR, _("%s: could not allocate new watch for
>>> \"%s\" (%s)\n"),
>>> +                    __FUNCTION__, existing_directory_ancestor,
>>> error_string(error));
>>> +            return error;
>>>
>>> Better approach to such cases would be either
>>>
>>> pw = new_path_watch(existing_directory_ancestor,&error);
>>> if (!pw) {
>>> ...
>>>
>>> or which I prefer more:
>>>
>>> error = new_path_watch(&pw, existing_directory_ancestor);
>>> if(error) {
>>> ...
>>
>> This second approach is more in line with the rest of the code in the
>> SSSD. We do this all over the code. Also, our preference is for this
>> error code to remain in line with the errno set, so things like ENOENT
>> and ENOMEM are standard. (We also defined EOK which is identical to
>> EXIT_SUCCESS for convenience)
>
> Hopefully you'll have noticed the 2nd approach, the preferred one used
> in SSSD, is the one I used for virtually all function definitions. The
> function cited above is an anomaly. It's good your review caught it
> (BTW, the reason it was coded that way originally was it was
> originally just an "allocator" function whose only failure would have
> been no memory, but it grew new functionality and it should have been
> modified at that point to return an error status just like everything
> else does.
>
>>> 12) In search for some functions I looked at path_utils.
>>> First in general I am a bit scared about using static buffers to
>>> hold paths.
>>> Is it still an Ok practice? In the SSSD all code is allocated in
>>> collections
>>> and ini the only place  where the static buffer was used was some
>>> parsing of the
>>> file where the reasonable length of the line in the ini file is less
>>> than 100 bytes
>>> while buffer is about 64K. Should we use dynamic memory for paths?
>>> Start with malloc(MAX_PATH) and then realloc if needed?
>>>
>>
>> The idea behind MAX_PATH is that it's defined on every system to be the
>> largest legitimate value of a path that the system can handle
>> internally. malloc(MAX_PATH) doesn't gain anything over path[MAX_PATH]
>> except a need to manage the memory. I generally prefer, however, only to
>> allocate memory as-needed. This will come in handy in the future if
>> anyone should want to include this daemon on a limited-memory device
>> (such as a Linux-based router). That said, for speed and ease of use,
>> you can't beat this static buffer. I just balk at always using a full
>> memory page for every path, when in most cases you're going to need<
>> 100 bytes. Considering that the whole purpose of this tool is to easily
>> manage large numbers of paths, that 4kiB (on Linux; 8kiB or more on some
>> other systems) can really add up.
>
> I think one can make two equally valid arguments with respect to
> storing path names. Pre-declaring a variable or structure member of
> sufficient size to hold any pathname has the advantage of simplicity,
> simplicity tends to lead toward more robust and efficient code (less
> mistakes to make and fewer cycles spent trying to optimize memory use).
>
> However, Stephen is right on the money when he observes in most cases
> the bulk of the memory allocated won't be used and this *may* be a
> liability.
>
> First of all one needs to be aware that much of the use of the
> construct "path[MAX_PATH]" occurs in automatic variables on the stack.
> In this context it's hard to argue the unused memory in the string is
> a liability, the memory comes into existence and is freed by a simple
> adjustment of the stack pointer and has a very short lifetime (yes, it
> does cost extra stack space). The alternative is more complex memory
> management, either alloca on the stack or from the heap, both require
> more complex logic with a greater opportunity for program error and
> many more processor cycles (don't forget these path variables only
> come into existence because we need to manipulate them which means one
> must always be doing bounds checking and reallocations on virtually
> all operations).
>
> There are places where buffers of size MAX_PATH are members of heap
> allocated objects. These are better candidates for memory
> optimization, however it does incur much more complex program logic.
> In fact the problem really reduces to:
>
> Dynamic String Objects
>
> E.g. in C++ std::string. It's not just a matter of dynamic buffer
> management, it also requires functions to be bound to the underlying
> buffer which can perform various string operations.
>
> To the best of my knowledge we don't have a library available to us in
> C which has been blessed for our use which implements this. Just like
> we were required to write our own hash table implementation we would
> be required to write our own dynamic string library.
>
> So when the choice came as to whether to write a new string library or
> declare paths using MAX_PATH and utilize the existing clib string
> functions (plus some customs functions) it seemed to me to be clearly
> a win to opt for simplicity, processor efficiency, shorter development
> time and proven library correctness.
>
> That's not to say there isn't a good argument on the other side too.
> Perhaps it would be possible to use macros in the current implemention
> which would allow switching to dynamic operations later down the road.
>
> Of course we could also just wait and see if it's an issue in
> practice. The golden rule of optimization is *not* to optimize in the
> beginning, rather keep things simple, concentrate on program
> correctness, and only later come back and tweak specific problem areas.
>
Agree.

-- 
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.

-------------------------------
Looking to carve out IT costs?
www.redhat.com/carveoutcosts/