[Freeipa-devel] Using JSON for tlog config files

Nikolai Kondrashov Nikolai.Kondrashov at redhat.com
Tue Jun 14 13:40:03 UTC 2016


Hi everyone,

Although this was mentioned several times before, I'd like to bring additional
attention to the idea of using config files written in JSON for tlog, because
there were some concerns about whether that is appropriate.

Tlog is a terminal I/O recording package [1] whose primary purpose is to send
the recordings as JSON-formatted log messages to ElasticSearch. To read back
what it wrote, it links with json-c.

At the moment both tlog programs (tlog-rec and tlog-play) expect their
global configuration to be in JSON, with comments allowed. See the default
configurations attached. In addition, tlog-rec accepts an environment variable
containing the whole configuration, or a part of it, to (partially) override
the global one. That is also in JSON. Tlog uses the same json-c to parse all
of these. Internally, tlog uses json-c structures to pass around and merge the
configurations.
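
To illustrate the override behaviour described above, here is a minimal
Python sketch of merging a partial configuration over a global one. This is
not tlog's actual code (tlog does this in C with json-c structures); the
merge() helper and the sample values are hypothetical and only show the idea.

```python
import json

def merge(base, override):
    """Return a copy of base with override applied recursively."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are objects: merge them key by key.
            result[key] = merge(result[key], value)
        else:
            # Scalars and mismatched types: the override wins.
            result[key] = value
    return result

# Hypothetical global configuration, as it might look after parsing the file.
global_conf = json.loads(
    '{"writer": "syslog", "log": {"input": true, "output": true}}'
)
# Hypothetical partial override, as it might arrive in an environment variable.
env_conf = json.loads('{"log": {"input": false}}')

effective = merge(global_conf, env_conf)
```

Only the "input" setting changes in the result; everything else is inherited
from the global configuration, which itself is left unmodified.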

The question is, should the global tlog configuration, located in
/etc/tlog/tlog-rec.conf and /etc/tlog/tlog-play.conf, be in JSON, or should it
be something else?

The cons I've heard so far:

     * Administrators don't expect to find JSON in /etc.
     * JSON is a fragile format.

The pros I'd like to present:

     * Administrators setting up tlog are expected to also be familiar with
       ElasticSearch, which tlog targets as the storage. ElasticSearch speaks
       JSON exclusively and is configured using either YAML or JSON. So, the
       administrators should already be largely familiar with JSON.

     * Although JSON uses explicit and rigid syntax, such as quoting and
       prohibited trailing commas, it is still easy to read, and its
       specification is succinct and easy to learn: http://json.org/

     * Tlog is already linked with json-c, to read what it wrote, and
       reusing it for configuration reading avoids adding another dependency.

Overall, I consider the present situation a good compromise between
smaller/simpler code and reduced dependencies vs. familiarity and ease of
editing and reading for administrators.

The alternatives presented so far are YAML and INI. I'll list each of their
pros and cons, as I see them.

YAML

Pros

     * Has a subset of syntax (sufficient for our purposes) which is easy to
       read and write, doesn't require quoting, and is not sensitive to commas
       and other separators.
     * Has an official specification.

Cons

     * Requires an additional dependency to be used in tlog.
     * Only one implementation in C.
     * Uses significant whitespace, which is easier to overlook than explicit
       syntax.
     * *Sometimes* requires quoting to enforce value type, which is easy to
       overlook. E.g. an all-digits string requires quoting, otherwise it is
       considered a number.
     * Although well-defined, the specification is long and complicated:
       http://www.yaml.org/spec/1.2/spec.html. This makes it hard to fully
       understand the language and be proficient at it.
     * In an attempt to make the language as human-readable as possible, its
       designers actually made it harder for humans to write in some cases.
     * Has too many features, which complicates parsers and leads to
       harder-to-use APIs and more bugs.

INI

Pros

     * Familiar, already used by sister projects: SSSD, Kerberos, Samba, etc.
     * Light, simple syntax

Cons

     * Requires an additional dependency to be used in tlog.
     * No official specification, lots of variance in the field:
       https://en.wikipedia.org/wiki/INI_file#Varying_features
       This requires an explicit description of the syntax actually used in
       the program manuals, i.e. they cannot simply link to a specification.
       Administrators have to discover which flavor to use. This will become
       worse if we implement storing (a subset of) tlog-rec configuration
       in LDAP verbatim, as suggested so far, because the documentation for the
       format will be less discoverable for the person editing the directory.
     * Cannot be written on a single line, without newlines. This will make
       overriding configuration with an environment variable in tlog-rec harder
       to use. I.e. the environment variable value will have to contain
       newlines, or instead refer to a file containing the configuration.
     * No escaping for special characters, and multiline value support is
       patchy (not present at all in ding-libs). This will limit the ways to
       specify the recording notice presented to the users at the start of
       tlog-rec.
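
For comparison, a JSON override fits on a single line and can still carry a
multi-line notice thanks to string escapes. A small Python sketch of that
point follows; the notice text is the one from the attached default
tlog-rec.conf, while the variable names are hypothetical and Python's json
module merely stands in for json-c.

```python
import json

# A single-line value, as it could be placed in an environment variable.
# It contains no literal newline characters, only "\n" escape sequences.
single_line = '{"notice": "\\nATTENTION! Your session is being recorded!\\n\\n"}'

conf = json.loads(single_line)

# After parsing, the escapes have become real newlines in the notice.
notice = conf["notice"]
```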


Your own pros/cons, and suggestions for other formats to use are welcome!
Thank you for your attention.

Nick

[1]: https://github.com/Scribery/tlog
-------------- next part --------------
//
// Tlog-play system-wide configuration. See tlog-play.conf(5) for details.
// This file uses JSON format with both C and C++ comments allowed.
//
{
    // The type of "log reader" to use for retrieving log messages. The chosen
    // reader needs to be configured using its own dedicated parameters.
    // "reader" : "file",

    // File reader parameters
    "file": {
        // The "file" reader log file path.
        // "path" : ""
    },

    // ElasticSearch reader parameters
    "es": {
        // The base URL to request ElasticSearch through. Should not
        // contain the query (?...) or fragment (#...) parts.
        // "baseurl" : "",

        // The query string to send to ElasticSearch
        // "query" : ""
    }
}
-------------- next part --------------
//
// Tlog-rec system-wide configuration. See tlog-rec.conf(5) for details.
// This file uses JSON format with both C and C++ comments allowed.
//
{
    // The path to the shell executable that should be spawned.
    // "shell" : "/bin/bash",

    // A message which will be printed before starting
    // recording and the user shell. Can be used to warn
    // the user that the session is recorded.
    // "notice" : "\nATTENTION! Your session is being recorded!\n\n",

    // The data which does not exceed maximum payload
    // stays in memory and is not logged until this number of
    // seconds elapses.
    // "latency" : 10,

    // Maximum encoded data (payload) size per message, bytes.
    // As soon as payload exceeds this number of bytes,
    // it is formatted into a message and logged.
    // "payload" : 2048,

    // Logged data set parameters
    "log": {
        // If specified as true, user input is logged.
        // "input" : true,

        // If specified as true, terminal output is logged.
        // "output" : true,

        // If specified as true, terminal window size changes are logged.
        // "window" : true
    },

    // The type of "log writer" to use for logging. The writer needs
    // to be configured using its dedicated parameters.
    // "writer" : "syslog",

    // File writer parameters
    "file": {
        // The "file" writer log file path.
        // "path" : ""
    },

    // Syslog writer parameters
    "syslog": {
        // Syslog facility the "syslog" writer should use for the messages.
        // "facility" : "authpriv",

        // Syslog priority the "syslog" writer should use for the messages.
        // "priority" : "info"
    }
}

