[Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)
John Dennis
jdennis at redhat.com
Fri Jul 24 14:55:31 UTC 2009
>> 10) Such code will not produce right results since there are multiple
>> ways why function can return NULL and not in all cases
>> the errno is set.
>>
>> if ((pw = new_path_watch(existing_directory_ancestor)) == NULL) {
>> + error = errno;
>> + log_msg(LOG_ERROR, _("%s: could not allocate new watch for
>> \"%s\" (%s)\n"),
>> + __FUNCTION__, existing_directory_ancestor,
>> error_string(error));
>> + return error;
>>
>> Better approach to such cases would be either
>>
>> pw = new_path_watch(existing_directory_ancestor,&error);
>> if (!pw) {
>> ...
>>
>> or which I prefer more:
>>
>> error = new_path_watch(&pw, existing_directory_ancestor);
>> if(error) {
>> ...
>
> This second approach is more in line with the rest of the code in the
> SSSD. We do this all over the code. Also, our preference is for this
> error code to remain in line with the errno set, so things like ENOENT
> and ENOMEM are standard. (We also defined EOK which is identical to
> EXIT_SUCCESS for convenience)
Hopefully you'll have noticed the 2nd approach, the preferred one used
in SSSD, is the one I used for virtually all function definitions. The
function cited above is an anomaly. It's good your review caught it
(BTW, the reason it was coded that way originally was it was originally
just an "allocator" function whose only failure would have been no
memory, but it grew new functionality and it should have been modified
at that point to return an error status just like everything else does.
>> 12) In search for some functions I looked at path_utils.
>> First in general I am a bit scared about using static buffers to hold paths.
>> Is it still an Ok practice? In the SSSD all code is allocated in
>> collections
>> and ini the only place where the static buffer was used was some
>> parsing of the
>> file where the reasonable length of the line in the ini file is less
>> than 100 bytes
>> while buffer is about 64K. Should we use dynamic memory for paths?
>> Start with malloc(MAX_PATH) and then realloc if needed?
>>
>
> The idea behind MAX_PATH is that it's defined on every system to be the
> largest legitimate value of a path that the system can handle
> internally. malloc(MAX_PATH) doesn't gain anything over path[MAX_PATH]
> except a need to manage the memory. I generally prefer, however, only to
> allocate memory as-needed. This will come in handy in the future if
> anyone should want to include this daemon on a limited-memory device
> (such as a Linux-based router). That said, for speed and ease of use,
> you can't beat this static buffer. I just balk at always using a full
> memory page for every path, when in most cases you're going to need<
> 100 bytes. Considering that the whole purpose of this tool is to easily
> manage large numbers of paths, that 4kiB (on Linux; 8kiB or more on some
> other systems) can really add up.
I think one can make two equally valid arguments with respect to storing
path names. Pre-declaring a variable or structure member of sufficient
size to hold any pathname has the advantage of simplicity, simplicity
tends to lead toward more robust and efficient code (less mistakes to
make and fewer cycles spent trying to optimize memory use).
However, Stephen is right on the money when he observes in most cases
the bulk of the memory allocated won't be used and this *may* be a
liability.
First of all one needs to be aware that much of the use of the construct
"path[MAX_PATH]" occurs in automatic variables on the stack. In this
context it's hard to argue the unused memory in the string is a
liability, the memory comes into existence and is freed by a simple
adjustment of the stack pointer and has a very short lifetime (yes, it
does cost extra stack space). The alternative is more complex memory
management, either alloca on the stack or from the heap, both require
more complex logic with a greater opportunity for program error and many
more processor cycles (don't forget these path variables only come into
existence because we need to manipulate them which means one must always
be doing bounds checking and reallocations on virtually all operations).
There are places where buffers of size MAX_PATH are members of heap
allocated objects. These are better candidates for memory optimization,
however it does incur much more complex program logic. In fact the
problem really reduces to:
Dynamic String Objects
E.g. in C++ std::string. It's not just a matter of dynamic buffer
management, it also requires functions to be bound to the underlying
buffer which can perform various string operations.
To the best of my knowledge we don't have a library available to us in C
which has been blessed for our use which implements this. Just like we
were required to write our own hash table implementation we would be
required to write our own dynamic string library.
So when the choice came as to whether to write a new string library or
declare paths using MAX_PATH and utilize the existing clib string
functions (plus some customs functions) it seemed to me to be clearly a
win to opt for simplicity, processor efficiency, shorter development
time and proven library correctness.
That's not to say there isn't a good argument on the other side too.
Perhaps it would be possible to use macros in the current implemention
which would allow switching to dynamic operations later down the road.
Of course we could also just wait and see if it's an issue in practice.
The golden rule of optimization is *not* to optimize in the beginning,
rather keep things simple, concentrate on program correctness, and only
later come back and tweak specific problem areas.
--
John Dennis <jdennis at redhat.com>
Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
More information about the Freeipa-devel
mailing list