[collectd] tail plugin (was: Plugins for postfix, amavisd-new and powerdns)
Florian Forster
octo at verplant.org
Sun Feb 24 17:31:05 CET 2008
Hi Luke,
I looked at your patches yesterday and today and made some changes to
them. I feel a bit bad for changing your code so much. While looking at
the actual plugins I thought ``hm, using regular expressions this should
be a lot simpler'' and started writing a `match' object to simplify the
handling of regular expressions. And then I got carried away from there
:/ I hope you forgive me ;)
On Mon, Feb 18, 2008 at 03:24:04PM -0800, Luke Heberling wrote:
> utils_tail.diff
> A utility library for watching the end of a log file.
I've basically kept `utils_tail.[ch]' as they were, except for some
improvements in error handling.
Maybe the logic could be improved a bit though: If the inode of a file
changes, the old filehandle is closed and the file is reopened.
Shouldn't we read the old filehandle to the end first?
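To make the question concrete, here is a rough sketch of the ``drain first, then reopen'' idea. It is not the actual `utils_tail' code: the names `drain_and_reopen', `line_cb_t' and `count_line' are made up for illustration, and it assumes a stdio stream plus POSIX `stat'/`fstat'.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Hypothetical callback type: receives every line that is read. */
typedef void (*line_cb_t) (const char *line, void *user_data);

/* Example callback: simply counts the lines it is handed. */
static void count_line (const char *line, void *user_data)
{
  (void) line;
  (*(int *) user_data)++;
}

/*
 * If `path' now refers to a different file than the open stream `*fh',
 * read the old stream to the end first and only then reopen `path'.
 * Returns 0 on success (including "nothing changed"), -1 on error.
 */
static int drain_and_reopen (FILE **fh, const char *path,
    line_cb_t cb, void *user_data)
{
  struct stat st_old, st_new;
  char buf[4096];
  FILE *fh_new;

  if (fstat (fileno (*fh), &st_old) != 0 || stat (path, &st_new) != 0)
    return (-1);

  /* Same device and inode: the file was not rotated. */
  if (st_old.st_dev == st_new.st_dev && st_old.st_ino == st_new.st_ino)
    return (0);

  /* Clear a possibly sticky EOF flag, then hand out the remaining lines. */
  clearerr (*fh);
  while (fgets (buf, sizeof (buf), *fh) != NULL)
    cb (buf, user_data);

  fh_new = fopen (path, "r");
  if (fh_new == NULL)
    return (-1);

  fclose (*fh);
  *fh = fh_new;
  return (0);
}
```

Opening the new file before closing the old one means a failed `fopen' leaves the caller with a still-usable old handle.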
> utils_parse.diff
> A utility library for parsing a line of text.
I've renamed that module to `utils_match' and changed the functionality:
- The object is initialized with a regular expression and a callback
function. You can then pass lines of text to the object, using
`match_apply'. If the line matches, the provided callback is called
with the part of the string that matched.
- A second constructor, `match_create_simple', takes a ``ds_type''
and creates an object with a default callback function. This default
function sums up matched numbers according to the ``ds_type''. For
example, if you specify (the flags for) ``counter add'' here, the
sub-match will be interpreted as an integer value that's added to
the internal counter. This makes it easy to sum up the file sizes of
files processed or something like that.
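For illustration, the ``counter add'' behaviour could look roughly like the sketch below. This is not the code in `utils_match.c'; the type `counter_match_t' and the `counter_match_*' functions are hypothetical stand-ins built directly on POSIX `regcomp'/`regexec'.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a match object in "counter add" mode. */
typedef struct
{
  regex_t regex;
  unsigned long long counter;
} counter_match_t;

/* Compiles `pattern' as an extended regex. Returns 0 on success. */
static int counter_match_init (counter_match_t *m, const char *pattern)
{
  m->counter = 0;
  return ((regcomp (&m->regex, pattern, REG_EXTENDED) == 0) ? 0 : -1);
}

/* Returns 1 if the line matched and its value was added to the counter,
 * 0 if the line did not match and -1 on error. */
static int counter_match_apply (counter_match_t *m, const char *line)
{
  regmatch_t re_match[2];
  char buf[32];
  char *endptr;
  size_t len;
  unsigned long long value;

  if (regexec (&m->regex, line, 2, re_match, 0) != 0)
    return (0);
  if (re_match[1].rm_so < 0)
    return (-1); /* the pattern has no first sub-match */

  /* Copy the sub-match so it can be parsed as a number of its own. */
  len = (size_t) (re_match[1].rm_eo - re_match[1].rm_so);
  if (len >= sizeof (buf))
    return (-1);
  memcpy (buf, line + re_match[1].rm_so, len);
  buf[len] = '\0';

  value = strtoull (buf, &endptr, 10);
  if (endptr == buf)
    return (-1); /* the sub-match is not a number */

  m->counter += value;
  return (1);
}

static void counter_match_destroy (counter_match_t *m)
{
  regfree (&m->regex);
}
```

A ``counter inc'' variant would presumably ignore the sub-match and just add one per matching line.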
> utils_logtail.diff
> A utility library which builds on utils_tail to provide what a
> plugin which parses a logfile will need. The postfix and amavis
> plugins have all the same needs other than how to parse each
> line from the log file. This library meets them.
I've renamed this module to `utils_tail_match'. Its structure is as
follows:
- The constructor takes a `utils_tail' object which it uses to read
lines.
- The `tail_match_add_match' method takes a `utils_match' object and a
callback function. Every line read is passed to the `match_apply'
method of _every_ match, which in turn calls its internal callback if
it matches. The callback that's passed to `tail_match_add_match' is
called after all lines have been read - regardless of whether this match
object matched any of the lines or not. This may be used to dispatch
values to the daemon, naturally.
- The `tail_match_add_match_simple' is an alternative method to add a
match to the `tail_match' object: It takes a regular expression and a
``ds_type'', which are used to create a `match' object using the
`match_create_simple' constructor. The default callback function
that's used in this case dispatches the values calculated by the
`match' object's callback function using the plugin(-instance) and
type(-instance) passed to this method.
The idea behind these general and ``simple'' functions is of course to
make simple things easy while complex stuff should still be possible.
So, for example, assume you have a logfile with file sizes in there.
Possibly the logfile of an anti-virus software. You want to create a
graph of the number of bytes that were handled by the software. You could
do this as follows (just the meaty bits of course, all error handling is
omitted):
  static cu_tail_match_t *tm;

  static int init (void)
  {
    tm = tail_match_create ("/var/log/av.log");
    tail_match_add_match_simple (tm, "file size = ([1-9][0-9]*)",
        UTILS_MATCH_DS_TYPE_COUNTER | UTILS_MATCH_CF_COUNTER_ADD,
        "av_software", NULL, "ipt_bytes", "total");
    return (0);
  }

  static int read (void)
  {
    tail_match_read (tm);
    return (0);
  }

  static int shutdown (void)
  {
    tail_match_destroy (tm);
    return (0);
  }
As you can see the tail/parsing stuff is _completely_ out of the plugin
now. Now, since it comes down to a regular expression - why not let the
user do this? That way they can parse logfiles we haven't even thought of. So
I've written a new `tail' plugin which is basically a configuration
frontend for the `tail_match_add_match_simple' function.
The config looks like this (simply copied from the collectd.conf(5)
manpage):
  <Plugin "tail">
    <File "/var/log/exim4/mainlog">
      Instance "exim"
      <Match>
        Regex "S=([1-9][0-9]*)"
        DSType "CounterAdd"
        Type "ipt_bytes"
        Instance "total"
      </Match>
      <Match>
        Regex "\\<R=local_user\\>"
        DSType "CounterInc"
        Type "email_count"
        Instance "local_user"
      </Match>
    </File>
  </Plugin>
Nice-to-have features would be:
- Make the submatch to use configurable. Then you could do something
like:
"relay=(cyrus|imap|pop3), delay=([1-9][0-9]*)", use submatch #2
- Use a submatch as the type-instance. E. g.:
"R=([A-Za-z0-9_-]+)", ds_type = CounterInc, type_instance = #1
(Automatically count how often each of the Exim `routers' was used;
use its name as type-instance)
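Both ideas boil down to pulling an arbitrary sub-match out of a matched line. A minimal sketch, assuming POSIX regular expressions; the helper `submatch_copy' and the limit `MAX_SUBMATCH' are hypothetical and not part of the current code:

```c
#include <regex.h>
#include <string.h>

#define MAX_SUBMATCH 8 /* arbitrary limit chosen for this sketch */

/* Copies sub-match number `index' (0 = the whole match) into `buf'.
 * Returns 0 on success, 1 if the line does not match, -1 on error. */
static int submatch_copy (const regex_t *re, const char *line,
    size_t index, char *buf, size_t buf_size)
{
  regmatch_t m[MAX_SUBMATCH + 1];
  size_t len;

  /* Reject indexes the pattern (or this sketch) cannot provide. */
  if ((index > MAX_SUBMATCH) || (index > re->re_nsub))
    return (-1);

  if (regexec (re, line, MAX_SUBMATCH + 1, m, 0) != 0)
    return (1);

  /* A sub-match in an unused alternative has an offset of -1. */
  if (m[index].rm_so < 0)
    return (-1);

  len = (size_t) (m[index].rm_eo - m[index].rm_so);
  if (len >= buf_size)
    return (-1);
  memcpy (buf, line + m[index].rm_so, len);
  buf[len] = '\0';
  return (0);
}
```

With the first example regex, index 2 would yield the delay as a string to be parsed as the value; with the second one, index 1 would yield the router name to be used as the type-instance.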
> {amavis,postfix,powerdns}.diff:
> The main plugins.
These plugins are only needed if the data cannot be collected using the
`tail' plugin. Two situations come to mind where this may become
necessary:
- The regex may match multiple times in one line. E. g. there are
multiple file sizes in one line. This could be implemented in the
`match' object by reapplying the regular expression to the remainder
of the string until it doesn't match anymore.
- You need to see multiple lines for one data point. For example the
size of an email is in one line and the type, e. g. ham vs. spam, is
on another.
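The first case, reapplying the regular expression to the remainder of the string, could be sketched like this. Again this is only an illustration on top of POSIX `regexec', not existing code; the function name `sum_all_matches' is made up, and it assumes the first sub-match is a greedy run of digits.

```c
#include <regex.h>
#include <stdlib.h>

/* Adds up the first sub-match of _every_ match of `re' in `line'. */
static unsigned long long sum_all_matches (const regex_t *re,
    const char *line)
{
  regmatch_t m[2];
  unsigned long long sum = 0;
  const char *p = line;
  int flags = 0;

  while (regexec (re, p, 2, m, flags) == 0)
  {
    /* strtoull stops at the first non-digit, i. e. at the end of the
     * sub-match, provided the sub-match is a greedy run of digits. */
    if (m[1].rm_so >= 0)
      sum += strtoull (p + m[1].rm_so, NULL, 10);

    /* Continue behind this match; make sure we always advance so an
     * empty match cannot cause an endless loop. */
    if (m[0].rm_eo > 0)
      p += m[0].rm_eo;
    else if (*p != '\0')
      p++;
    else
      break;

    /* The remainder no longer starts at the beginning of the line. */
    flags = REG_NOTBOL;
  }

  return (sum);
}
```

The second case (several lines per data point) needs state that survives between lines, so it does not fit this per-line model.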
Unfortunately I don't have amavis or postfix running anywhere, so I
can't tell if such a special plugin is necessary. Judging from the
sources the plugins could be turned into sample config files, though,
which I would of course love to include in contrib/.
I'll take a look at your `powerdns' plugin next.
Regards,
-octo
--
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/