[Freeipa-devel] [PATCH] initial commit of log watcher (lwatch)

Fri Jul 24 01:01:51 UTC 2009

John Dennis wrote:
> This is a big patch, sorry, but there just isn't a realistic way to
> develop a major piece of code into a working state given our checkin
> policy which requires peer review for every change, it's just easier
> to develop elsewhere and submit the working result.
>
> This is the basic framework for the client audit code, it produces an
> executable called lwatch. In it's default mode lwatch watches a set of
> (log) files. It understands how log files are rotated. When a log file
> reaches a modification threshold it's newly appended data is prepared
> for transport to the audit server. When the log file is rotated,
> created, renamed, etc. the appropriate actions take place.
>
> lwatch maintains it's persistent state in a sqlite database. You can
> start and stop lwatch and it self synchronizes based on what is in the
> database and what it finds in the file system.
>
> lwatch is also capable of listing every log under a directory root,
> this can be handy for initializing the set of log files to watch.
>
> lwatch can also dump in a pretty format the current state of the SQL
> database.
>
> lwatch is not completely finished, but it's been in reasonable working
> shape for a while now, but sitting in my own git repository, it's time
> to get into the project's repository. What it needs next is better
> configuration options and to link in the code for audit server
> transmission (coming soon).
>
> Just one note about the patch, it includes a trivial one line fix to
> dhash.h which was missing a const declaration. I realize that should
> have been in a separate patch, but it got included here.
>
> ------------------------------------------------------------------------
>
> _______________________________________________
> Freeipa-devel mailing list
> Freeipa-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/freeipa-devel
John,

Steve wanted me to take a look at this code too.
Just to have a second pair of eyes.
I am doing it and will write notes as I go.
I do not know if any of those are blockers and would trigger nacks and
which can be
treated as "work" in progress. Let us use bast judgment.

I have not read Steven's note on purpose so that I do not narrow my vision.

Here we go:
1) There is generally a lack of comments in the code. it is very hard to
read.
IMO the lack of comments is a style issue rather then maturity of the
code issue. 
It seems funny that function that trims spaces is commented. It would
have been very easy to understand what the function does by its name.
And the code is pretty simple.   While more complex functions around do
not have any line on comments.
It would think that the following  practice should be employed:
Each function is commented on in the header or in the module (better in
both).
It least in one of these two places it is necessary to explain what the
function does, why it is needed and where it used.
For example:
/* Allocates and builds a memory buffer out the hash table entries. Used
when ... */
char *pathname_table_string(hash_table_t *table)

The more complex the function is the better it should be commented.

2) In many places the spaces are missing around commas, "-", "+".
It is hard to read. Again would be nice for it to follow standard.
I find these things in my code to and try to correct them as I see them.
Just a  suggestion to try a bit harder make things  more readable.
There are many places that can be improved.

3) Heavy use of the complex constructs:
            if ((tbl_string = realloc(tbl_string, alloc_len =
alloc_len+MAX(needed_len,alloc_increment))) == NULL) {
Would be much more readable if things like that are split into two lines.
             alloc_len += MAX(needed_len, alloc_increment);
            if ((tbl_string = realloc(tbl_string, alloc_len)) == NULL) {

 4) In multiple places there is just one line of code inside the  {}
after if for example:
    if ((tbl_string = malloc(alloc_len = alloc_increment)) == NULL) {
        return NULL;
    }

    should be:

    if ((tbl_string = malloc(alloc_len = alloc_increment)) == NULL) 
return NULL;

    but I would even prefer something like:

    alloc_len = alloc_increment;
    tbl_string = malloc(alloc_len);
    if (!tbl_string) {
          <some tracing or debugging here >
         return NULL;
    }

5) Variables are not initialized at the top of the function.
It is generally a good practice to initialize things at the top of the
function.

6) In several places I noticed you miss va_end macro where you use
va_list and va_start().
>From vsnprintf man page:
       These functions do not call the va_end
       macro. Consequently, the value of ap is undefined after the 
call.  The
       application should call va_end(ap) itself afterwards.

7) It should be a practice of using thread safe versions of system
functions like localtime (localtime_r) , getgrgid (getgrgid_r) etc.
We are considering combining the log monitor with audit collection.
These are two separate functions it would be logical
to have them in separate threads. This is undecided but at least I think
it would be better to be on the safe side and not use
unsafe calls. Just good defensive programming practice.

8) A lot of static global data. It would be beneficial to combine these
things together in some kind of context structure and pass it around.
Even if you keep it static for now it would be easier to refine code
later. Currently too many low level functions rely the global data
so it would require a significant refactoring effort to get rid of those
down the road.

9) SSSD and Common and Audit all three use tree different ways of
tracing/debugging things.
Shouldn't we get to some consistency there?
It would be better if the log_msg be a macro from the very beginning.
This way it would be easier
to make it conditionally compiled debugging. Currently is is function
and would require significant
effort to go through the code and make it consistent. I would argue that
a) Tracing/debugging should be conditionally compiled day 1
b) Debugging strings do not require translation
c) Tracing pattern should be consistent - every error should be traced
otherwise it is hard to find problems.

Just something to think about...

10)  Such code will not produce right results since there are multiple
ways why function can return NULL and not in all cases
the errno is set.

        if ((pw = new_path_watch(existing_directory_ancestor)) == NULL) {
+            error = errno;
+            log_msg(LOG_ERROR, _("%s: could not allocate new watch for
\"%s\" (%s)\n"),
+                    __FUNCTION__, existing_directory_ancestor,
error_string(error));
+            return error;

Better approach to such cases would be either

pw = new_path_watch(existing_directory_ancestor, &error);
if (!pw) {
...

or which I prefer more:

error = new_path_watch(&pw, existing_directory_ancestor);
if(error) {
...

11) From our design discussions I remember that  you mentioned that
there was an issue with recursive directories and monitoring nested
paths or something like this.
It seems that find_existing_directory_ancestor() has to do with this issue.
I was expecting to see some extended comment about this somewhere around
create_watch() function.

12) In search for some functions I looked at path_utils.
First in general I am a bit scared about using static buffers to hold paths.
Is it still an Ok practice? In the SSSD all code is allocated in
collections
and ini the only place  where the static buffer was used was some
parsing of the
file where the reasonable length of the line in the ini file is less
than 100 bytes
while buffer is about 64K. Should we use dynamic memory for paths?
Start with malloc(MAX_PATH) and then realloc if needed?

13) I was amazed how well the path_util interface is defined and
documented .
I will consider using it whenever I need to deal with paths.

This is all for now.
A general comment is that without running the code and playing with it
for some time
it is hard to find any specific issues (if any) in logic. On the surface
things
seem well thought through and the functions are not big and modular but
lack of comments makes it really hard to digest.

-- 
Thank you,
Dmitri Pal

Engineering Manager IPA project,
Red Hat Inc.

-------------------------------
Looking to carve out IT costs?
www.redhat.com/carveoutcosts/