[collectd] tail plugin (was: Plugins for postfix, amavisd-new and powerdns)

Florian Forster octo at verplant.org
Sun Feb 24 17:31:05 CET 2008


Hi Luke,

I looked at your patches yesterday and today and made some changes to
them. I feel a bit bad for changing your code so much. While looking at
the actual plugins I thought ``hm, using regular expressions this should
be a lot simpler'' and started writing a `match' object to simplify the
handling of regular expressions. And then I got carried away from there
:/ I hope you forgive me ;)

On Mon, Feb 18, 2008 at 03:24:04PM -0800, Luke Heberling wrote:
> utils_tail.diff
>    A utility library for watching the end of a log file.

I've basically kept `utils_tail.[ch]' as they were, except for some
improvements in error handling.
Maybe the logic could be improved a bit though: If the inode of a file
changes, the old filehandle is closed and the file is reopened.
Shouldn't we read the old filehandle to the end first?
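
To illustrate the question, here is a minimal sketch of ``drain first,
then reopen''. This is not the code in `utils_tail.c'; the names
`reopen_after_drain', `line_cb_t' and `count_line' are made up for this
example, and it assumes the old handle is a plain stdio `FILE *'.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical callback type: receives every line that is read. */
typedef void (*line_cb_t) (const char *line, void *user_data);

/* Example callback: counts the lines it is given. */
static void count_line (const char *line, void *user_data)
{
  (void) line;
  (*(int *) user_data)++;
}

/* Checks whether `path' still refers to the file behind `old_fh'. If the
 * inode changed, the remaining lines of the old file are passed to `cb'
 * before the new file is opened. Returns the handle to use from now on,
 * or NULL on error (in which case `old_fh' is left open). */
static FILE *reopen_after_drain (FILE *old_fh, const char *path,
    line_cb_t cb, void *user_data)
{
  struct stat old_st;
  struct stat new_st;
  char buf[1024];
  FILE *new_fh;

  if (fstat (fileno (old_fh), &old_st) != 0)
    return (NULL);
  if (stat (path, &new_st) != 0)
    return (NULL);

  /* Same device and inode: not rotated, keep using the old handle. */
  if ((old_st.st_dev == new_st.st_dev) && (old_st.st_ino == new_st.st_ino))
    return (old_fh);

  /* Rotated: read whatever is left in the old file first ... */
  clearerr (old_fh);
  while (fgets (buf, sizeof (buf), old_fh) != NULL)
    cb (buf, user_data);

  /* ... and only then switch over to the new file. */
  new_fh = fopen (path, "r");
  if (new_fh == NULL)
    return (NULL);
  fclose (old_fh);
  return (new_fh);
}
```

A caller would invoke this before each read and continue with the
returned handle. Lines longer than the buffer are delivered in pieces
here; a real implementation would have to deal with that.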

> utils_parse.diff
>    A utility library for parsing a line of text.

I've renamed that module to `utils_match' and changed the functionality:
- The object is initialized with a regular expression and a callback
  function. You can then pass lines of text to the object, using
  `match_apply'. If the line matches, the provided callback is called
  with the part of the string that matched.
- A second constructor, `match_create_simple', takes a ``ds_type''
  and creates an object with a default callback function. This default
  function sums up matched numbers according to the ``ds_type''. For
  example, if you specify (the flags for) ``counter add'' here, the
  sub-match will be interpreted as an integer value that's added to
  the internal counter. This makes it easy to sum up the file sizes of
  files processed or something like that.
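
A rough sketch of what ``counter add'' boils down to, written against
plain POSIX regcomp()/regexec(). The names `simple_match_t',
`simple_match_init' and `simple_match_apply' are invented for this
example and are not the `utils_match' API.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical match object: a compiled regex plus an internal counter. */
typedef struct
{
  regex_t regex;
  unsigned long long counter;
} simple_match_t;

/* Compiles `pattern' (POSIX extended syntax). Returns 0 on success. */
static int simple_match_init (simple_match_t *m, const char *pattern)
{
  m->counter = 0;
  return (regcomp (&m->regex, pattern, REG_EXTENDED));
}

/* Applies the regex to one line. If it matches, the first sub-match is
 * parsed as an integer and added to the counter. Returns 1 on a match,
 * 0 if the line did not match and -1 if the sub-match is unusable. */
static int simple_match_apply (simple_match_t *m, const char *line)
{
  regmatch_t pm[2];
  char buf[32];
  size_t len;

  if (regexec (&m->regex, line, 2, pm, 0) != 0)
    return (0);
  if (pm[1].rm_so < 0)
    return (-1); /* the pattern has no (participating) sub-match */

  len = (size_t) (pm[1].rm_eo - pm[1].rm_so);
  if (len >= sizeof (buf))
    return (-1);
  memcpy (buf, line + pm[1].rm_so, len);
  buf[len] = '\0';

  m->counter += strtoull (buf, NULL, 10);
  return (1);
}
```

Feeding it the lines ``file size = 1200'' and ``file size = 34'' with the
pattern "file size = ([1-9][0-9]*)" leaves 1234 in `counter'.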

> utils_logtail.diff
>    A utility library which builds on utils_tail to provide what a
> plugin which parses a logfile will need. The postfix and amavis
> plugins have all the same needs other than how to parse each
> line from the log file. This library meets them.

I've renamed this module to `utils_tail_match'. Its structure is as
follows:
- The constructor takes a `utils_tail' object which it uses to read
  lines.
- The `tail_match_add_match' method takes a `utils_match' object and a
  callback function. Every line read is passed to the `match_apply'
  method of _every_ match, which in turn calls its internal callback if
  it matches. The callback that's passed to `tail_match_add_match' is
  called after all lines have been read - regardless of whether this match
  object matched any of the lines or not. This may be used to dispatch
  values to the daemon, naturally.
- The `tail_match_add_match_simple' method is an alternative way to add
  a match to the `tail_match' object: It takes a regular expression and a
  ``ds_type'', which are used to create a `match' object using the
  `match_create_simple' constructor. The default callback function
  that's used in this case dispatches the values calculated by the
  `match' object's callback function using the plugin(-instance) and
  type(-instance) passed to this method.
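
The control flow described above, stripped down to a sketch: every line
goes to every match, and afterwards every submit callback is called once,
whether its match saw anything or not. All names here (`dispatcher_t',
`substr_counter_t' and friends) are hypothetical, and a plain strstr()
stands in for the regular expression to keep the example short.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MATCHES 8

typedef int (*apply_cb_t) (const char *line, void *user_data);
typedef void (*submit_cb_t) (void *user_data);

/* One registered match: how to apply it and how to submit its value. */
typedef struct
{
  apply_cb_t apply;
  submit_cb_t submit;
  void *user_data;
} entry_t;

typedef struct
{
  entry_t entries[MAX_MATCHES];
  size_t num;
} dispatcher_t;

static int dispatcher_add (dispatcher_t *d, apply_cb_t apply,
    submit_cb_t submit, void *user_data)
{
  if (d->num >= MAX_MATCHES)
    return (-1);
  d->entries[d->num].apply = apply;
  d->entries[d->num].submit = submit;
  d->entries[d->num].user_data = user_data;
  d->num++;
  return (0);
}

/* Passes each line to _every_ match, then calls each submit callback
 * exactly once, regardless of whether that match hit any line. */
static void dispatcher_read (dispatcher_t *d, const char **lines,
    size_t lines_num)
{
  size_t i;
  size_t j;

  for (i = 0; i < lines_num; i++)
    for (j = 0; j < d->num; j++)
      d->entries[j].apply (lines[i], d->entries[j].user_data);

  for (j = 0; j < d->num; j++)
    d->entries[j].submit (d->entries[j].user_data);
}

/* Example match: counts the lines that contain a fixed substring. */
typedef struct
{
  const char *needle;
  unsigned int hits;
  unsigned int submit_calls;
} substr_counter_t;

static int substr_apply (const char *line, void *user_data)
{
  substr_counter_t *c = user_data;
  if (strstr (line, c->needle) == NULL)
    return (0);
  c->hits++;
  return (1);
}

static void substr_submit (void *user_data)
{
  substr_counter_t *c = user_data;
  c->submit_calls++; /* a real plugin would dispatch c->hits here */
}
```

Calling the submit callback unconditionally means a counter that did not
change is still reported, so the resulting graph has no gaps.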

The idea behind these general and ``simple'' functions is of course to
make simple things easy while complex stuff should still be possible.
So, for example, assume you have a logfile with file sizes in there,
possibly the logfile of an anti-virus software. You want to create a
graph of the number of bytes that were handled by the software. You
could do this as follows (just the meaty bits of course, all error
handling is omitted):
  /* Handle for the tail/match object; created once in init (). */
  static cu_tail_match_t *tm;

  static int init (void)
  {
    tm = tail_match_create ("/var/log/av.log");
    /* Add the number in the first sub-match to a counter. */
    tail_match_add_match_simple (tm, "file size = ([1-9][0-9]*)",
      UTILS_MATCH_DS_TYPE_COUNTER | UTILS_MATCH_CF_COUNTER_ADD,
      "av_software", NULL, "ipt_bytes", "total");
    return (0);
  }

  static int read (void)
  {
    /* Read the new lines, apply the match and dispatch the value. */
    tail_match_read (tm);
    return (0);
  }

  static int shutdown (void)
  {
    tail_match_destroy (tm);
    return (0);
  }

As you can see, the tail/parsing stuff is _completely_ out of the plugin
now. Now, since it comes down to a regular expression - why not let the
user do this? This way users can parse logfiles we never dreamed of. So
I've written a new `tail' plugin which is basically a configuration
frontend for the `tail_match_add_match_simple' function.

The config looks like this (simply copied from the collectd.conf(5)
manpage):
  <Plugin "tail">
    <File "/var/log/exim4/mainlog">
      Instance "exim"
      <Match>
        Regex "S=([1-9][0-9]*)"
        DSType "CounterAdd"
        Type "ipt_bytes"
        Instance "total"
      </Match>
      <Match>
        Regex "\\<R=local_user\\>"
        DSType "CounterInc"
        Type "email_count"
        Instance "local_user"
      </Match>
    </File>
  </Plugin>

Nice-to-have features would be:
- Make the submatch to use configurable. Then you could do something
  like:
  "relay=(cyrus|imap|pop3), delay=([1-9][0-9]*)", use submatch #2
- Use a submatch as the type-instance. E. g.:
  "R=([A-Za-z0-9_-]+)", ds_type = CounterInc, type_instance = #1
  (Automatically count how often each of the Exim `routers' was used;
  use its name as type-instance)

> {amavis,postfix,powerdns}.diff:
>    The main plugins.

These plugins are only needed if the data cannot be collected using the
`tail' plugin. Two situations come to mind where this may become
necessary:
- The regex may match multiple times in one line. E. g. there are
  multiple file sizes in one line. This could be implemented in the
  `match' object by reapplying the regular expression to the remainder
  of the string until it doesn't match anymore.
- You need to see multiple lines for one data point. For example the
  size of an email is in one line and the type, e. g. ham vs. spam, is
  on another.
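
For the first case, reapplying the regex to the remainder of the line
could look roughly like this. It is only a sketch using POSIX regexec();
`sum_all_matches' is a made-up name and the `match' object does not do
this today.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Applies `re' to `line' repeatedly, each time continuing right after the
 * previous match, and sums up the integer found in the first sub-match
 * of every match. */
static unsigned long long sum_all_matches (const regex_t *re,
    const char *line)
{
  regmatch_t pm[2];
  char buf[32];
  size_t len;
  unsigned long long sum = 0;
  const char *p = line;
  int eflags = 0;

  while (regexec (re, p, 2, pm, eflags) == 0)
  {
    if (pm[1].rm_so >= 0)
    {
      len = (size_t) (pm[1].rm_eo - pm[1].rm_so);
      if (len < sizeof (buf))
      {
        memcpy (buf, p + pm[1].rm_so, len);
        buf[len] = '\0';
        sum += strtoull (buf, NULL, 10);
      }
    }

    if (pm[0].rm_eo > 0)
      p += pm[0].rm_eo; /* continue after this match */
    else if (*p != '\0')
      p++;              /* empty match: step ahead to avoid looping */
    else
      break;

    /* The remainder is no longer the beginning of the line. */
    eflags = REG_NOTBOL;
  }

  return (sum);
}
```

For a line like ``size=10 size=20 other size=12'' and the pattern
"size=([1-9][0-9]*)" this returns 42. The second case, data spread over
multiple lines, needs state between lines and is not covered by this.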

Unfortunately I don't have amavis or postfix running anywhere, so I
can't tell if such a special plugin is necessary. Judging from the
sources the plugins could be turned into sample config files, though,
which I would of course love to include in contrib/.

I'll take a look at your `powerdns' plugin next.

Regards,
-octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/