[collectd] Negative lookahead/lookbehind

Florian Forster octo at verplant.org
Sun Jun 21 11:10:36 CEST 2009

Hi Brian,

On Sat, Jun 20, 2009 at 10:51:28AM -0700, Brian Long wrote:
> Does collectd support negative lookahead or lookbehind in regex's for
> the tail plugin?

the `tail' plugin and all other parts of collectd which use regular
expressions use the extended POSIX regular expressions (sometimes also
called “modern REs”). The syntax and features are described in the
regex(7) manual page.

> Lookbehind failure from logs:
> Compiling the regular expression "(?<!IncompatibleRemoteService)Exception"
> failed.

Just so we're sure to talk about the same thing: You want to count all
lines matching `Exception' except those that also match
`IncompatibleRemoteServiceException'?

> Lookahead failure from logs:
> Compiling the regular expression "Exception(?!.*This application is
> out of date)" failed.

This is: All lines matching `Exception' except those also matching
`Exception.*This application is out of date'.

Neither syntax is included in the POSIX regexen and therefore not
supported by the `tail' plugin or collectd in general.
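
A rough sketch of how the "match A except B" reading above could be
expressed with two extended POSIX regexes instead of one lookaround
expression. The helper name `match_include_exclude' is made up for this
illustration and is not part of collectd; only regcomp(3), regexec(3)
and regfree(3) are real API.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, not part of collectd.
 * Returns 1 if `line` matches the ERE `include` and does NOT match the
 * ERE `exclude`, 0 otherwise, and -1 if either pattern fails to compile. */
int match_include_exclude(const char *include, const char *exclude,
                          const char *line)
{
  regex_t re_inc;
  regex_t re_exc;
  int result;

  /* REG_EXTENDED selects the extended ("modern") POSIX syntax;
   * REG_NOSUB because only a yes/no answer is needed. */
  if (regcomp(&re_inc, include, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  if (regcomp(&re_exc, exclude, REG_EXTENDED | REG_NOSUB) != 0) {
    regfree(&re_inc);
    return -1;
  }

  /* regexec() returns 0 on a match. */
  result = (regexec(&re_inc, line, 0, NULL, 0) == 0) &&
           (regexec(&re_exc, line, 0, NULL, 0) != 0);

  regfree(&re_inc);
  regfree(&re_exc);
  return result;
}
```

Called with `Exception' and `IncompatibleRemoteServiceException' this
works on whole lines, so a line containing both an
IncompatibleRemoteServiceException and some other exception is skipped,
which a real lookbehind would still count.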

(Of course both cases are *theoretically* possible with the POSIX
regular expressions, the theoretical reason being: “Regular languages
are closed under complement and difference”. Unfortunately such insights
are not helpful ;)

We have thought about using the PCRE library for the tail plugin, but
ultimately decided against it. If we use the library there, we should
use it everywhere. We would therefore have a hard dependency on the
library (something we definitely don't want) or we'd have differing
behavior depending on whether the daemon was built with or without the
library. So in favor of consistent behavior all around we decided not to
use PCRE.

Maybe the following is a possibility to stay consistent and provide the
features offered by PCRE at the same time: Introduce a configuration
option, `RegexFlavor' that can be set to either `POSIX' or `PCRE'. If
set to `PCRE', collectd (or the `tail' plugin) tries to load the `PCRE'
library at runtime and uses it. This option could either be a global one
(for all of collectd) or within the <Match /> blocks (then other plugins
using those blocks, for example the `curl' plugin, would benefit as
well). Any ideas?
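
To make the proposal a bit more concrete, this is roughly what the
per-match variant might look like in the configuration. `RegexFlavor'
is only the option suggested above and does not exist; the file name
and instance names are placeholders as well.

```
<Plugin "tail">
  <File "/var/log/app.log">
    Instance "app"
    <Match>
      # Hypothetical option, not implemented: "POSIX" or "PCRE".
      RegexFlavor "PCRE"
      Regex "(?<!IncompatibleRemoteService)Exception"
      DSType "CounterInc"
      Type "counter"
      Instance "exceptions"
    </Match>
  </File>
</Plugin>
```

The global variant would instead be a single `RegexFlavor' line outside
of any <Plugin /> block.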

Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D