[collectd] 4.5.0 on solaris with gcc and the postgresql plugin
Admin
collectd-info at internode.com.au
Tue Oct 14 06:17:17 CEST 2008
Hi all
Here are a few things I've come across while running collectd 4.5.0 on
solaris.
Building with gcc fails for two reasons.
The first is that gcc complains about pragmas during compilation (as it does
with most builds on solaris), and with the AM_CFLAGS setting (-Wall -Werror)
those complaints cause the build to fail. Fixed by changing this to -Werror
only:
--- src/Makefile.am.orig Thu Sep 4 22:31:09 2008
+++ src/Makefile.am Tue Oct 7 14:39:57 2008
@@ -10,7 +10,7 @@
endif
if COMPILER_IS_GCC
-AM_CFLAGS = -Wall -Werror
+AM_CFLAGS = -Werror
endif
--- src/Makefile.in.orig Fri Sep 5 18:23:51 2008
+++ src/Makefile.in Tue Oct 7 14:40:22 2008
@@ -1150,7 +1150,7 @@
top_builddir = @top_builddir@
top_srcdir = @top_srcdir@
SUBDIRS = $(am__append_1) $(am__append_2) $(am__append_3)
-@COMPILER_IS_GCC_TRUE@AM_CFLAGS = -Wall -Werror
+@COMPILER_IS_GCC_TRUE@AM_CFLAGS = -Werror
AM_CPPFLAGS = -DPREFIX='"${prefix}"' \
-DCONFIGFILE='"${sysconfdir}/${PACKAGE_NAME}.conf"' \
-DLOCALSTATEDIR='"${localstatedir}"' \
The second is that any reference to the networking headers (specifically
sys/strft.h, which most of them include) causes the gcc build to fail with an
error like the following:
In file included from /usr/include/sys/stream.h:28,
from /usr/include/netinet/in.h:66,
from /usr/include/sys/socket.h:45,
from interface.c:32:
/usr/include/sys/strft.h:112:4: attempt to use poisoned "sprintf"
/usr/include/sys/strft.h:114:4: attempt to use poisoned "sprintf"
*** Error code 1
Stop.
Here is the offending code from sys/strft.h
for (_ix = 0; _tot != 0ll && _ix < tdelta_t_sz; _ix++) { \
if ((_ix + 1) == tdelta_t_sz) { \
*_t = '>'; \
} else if (_ix < 8) { \
sprintf(_t, "< 0.%09llds", _ns); \
} else { \
sprintf(_t, "< %lld.%.*ss", _ns / 1000000000ll, \
9 - (_ix - 8), "000000000"); \
} \
_n = ((tv[_ix][0] * 10000ll / _toc) + 5ll) / 10ll; \
_nl = _n / 10ll; \
Fixed by surrounding the collectd.h include with the DONT_POISON_SPRINTF_YET
trick from the perl plugin, and then poisoning sprintf only after the system
headers have been included. I don't know if this is the correct fix, but it
works.
E.g. for the interface plugin:
--- src/interface.c.orig Wed Sep 10 15:26:45 2008
+++ src/interface.c Wed Sep 10 15:27:27 2008
@@ -20,7 +20,9 @@
* Sune Marcher <sm at flork.dk>
**/
+#define DONT_POISON_SPRINTF_YET 1
#include "collectd.h"
+#undef DONT_POISON_SPRINTF_YET
#include "common.h"
#include "plugin.h"
#include "configfile.h"
@@ -32,6 +34,10 @@
# include <sys/socket.h>
#endif
+#if __GNUC__
+# pragma GCC poison sprintf
+#endif
+
/* One cannot include both. This sucks. */
#if HAVE_LINUX_IF_H
# include <linux/if.h>
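The ordering behind that patch can be shown in isolation. The following is only a minimal sketch, not collectd code: format_seconds is a hypothetical helper invented for illustration, and it assumes a non-negative nanosecond count. The point is that the system headers are included while sprintf is still usable, and the poison pragma comes afterwards, so only the plugin's own code is barred from calling sprintf.

```c
#include <assert.h>
#include <stdio.h>   /* system headers first, while sprintf is still allowed */

/* From here on, any use of the identifier sprintf is a compile-time error
 * under gcc. Headers included above are not affected. */
#if defined(__GNUC__)
# pragma GCC poison sprintf
#endif

/* Hypothetical helper (not part of collectd): format a non-negative
 * nanosecond count as seconds. Returns 0 on success, -1 if the output
 * did not fit into buf. */
static int format_seconds(char *buf, size_t buflen, long long ns)
{
    int n;

    assert(buf != NULL && buflen > 0);
    n = snprintf(buf, buflen, "%lld.%09llds",
                 ns / 1000000000LL, ns % 1000000000LL);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Calling format_seconds(buf, sizeof buf, ns) compiles as usual, while adding a sprintf call anywhere below the pragma stops the build, which is the behaviour the poison is meant to enforce.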
The postgresql plugin is also missing a couple of PQclear() calls, so the
query result is leaked on both the early return and the normal return:
--- src/postgresql.c.orig 2008-10-13 15:10:21.648228533 +1030
+++ src/postgresql.c 2008-10-13 17:26:08.884228200 +1030
@@ -417,8 +417,10 @@
}
rows = PQntuples (res);
- if (1 > rows)
+ if (1 > rows) {
+ PQclear (res);
return 0;
+ }
cols = PQnfields (res);
if (query->cols_num != cols) {
@@ -442,6 +444,7 @@
submit_gauge (db, col.type, col.type_instance,
value);
}
}
+ PQclear (res);
return 0;
} /* c_psql_exec_query */
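One way to make this kind of leak harder to reintroduce is to route every return through a single cleanup label. The sketch below only illustrates that control flow: result_t, result_new and result_clear are stand-ins invented here so the example compiles without libpq, and treating a column-count mismatch as an error is an assumption. In the plugin, the cleanup step would be the PQclear (res) call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PGresult; NOT the libpq type. */
typedef struct { int rows; int cols; } result_t;

static int live_results = 0;   /* results allocated but not yet released */

static result_t *result_new(int rows, int cols)
{
    result_t *res = malloc(sizeof(*res));
    if (res == NULL)
        return NULL;
    res->rows = rows;
    res->cols = cols;
    live_results++;
    return res;
}

/* Stand-in for PQclear(). */
static void result_clear(result_t *res)
{
    if (res == NULL)
        return;
    assert(live_results > 0);
    free(res);
    live_results--;
}

/* Every path taken after the result exists leaves via the cleanup label. */
static int exec_query(int rows, int cols, int expected_cols)
{
    int status = -1;
    result_t *res = result_new(rows, cols);

    if (res == NULL)
        return -1;

    if (res->rows < 1) {            /* nothing to report, not an error */
        status = 0;
        goto cleanup;
    }
    if (res->cols != expected_cols) /* assumed here to be an error path */
        goto cleanup;

    /* ... the values would be submitted here ... */
    status = 0;

cleanup:
    result_clear(res);
    return status;
}
```

With this shape, adding another early exit later means adding a goto rather than remembering a separate clear call at each return.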
On the subject of the postgresql plugin on solaris, I have it working nicely
on one machine (an 8/07 install of solaris) but not so well on an earlier
system (11/06). On the 11/06 system, collectd seems to be reinitializing the
postgresql plugin upon noticing some sort of kstat related change but keeping
the original database connections open.
I.e. after enabling loglevel debug, I see the following in the log:
[2008-10-13 19:40:56] postgresql: Sucessfully connected to database template1
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19121)
[2008-10-13 19:40:56] postgresql: Sucessfully connected to database database1
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19122)
[2008-10-13 19:40:56] postgresql: Sucessfully connected to database database2
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19123)
[2008-10-13 19:40:56] postgresql: Sucessfully connected to database database3
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19124)
[2008-10-13 19:40:56] postgresql: Sucessfully connected to database database4
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19125)
[2008-10-13 19:43:46] kstat chain has been updated
[2008-10-13 19:43:46] postgresql: Sucessfully connected to database template1
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19305)
[2008-10-13 19:43:46] postgresql: Sucessfully connected to database database1
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19306)
[2008-10-13 19:43:46] postgresql: Sucessfully connected to database database2
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19307)
[2008-10-13 19:43:46] postgresql: Sucessfully connected to database database3
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19308)
[2008-10-13 19:43:46] postgresql: Sucessfully connected to database database4
(user pgsql) at server localhost:5432 (server version: 8.3.1, protocol
version: 3, pid: 19309)
After the kstat change message and the notification of the new successful
database connections being made, the old connections are still open. So every
2 to 5 minutes or so the daemon makes (in this case) an additional 5
connections to the database.
This is the section of collectd.c where the message comes from
#if HAVE_LIBKSTAT
static void update_kstat (void)
{
if (kc == NULL)
{
if ((kc = kstat_open ()) == NULL)
ERROR ("Unable to open kstat control structure");
}
else
{
kid_t kid;
kid = kstat_chain_update (kc);
if (kid > 0)
{
INFO ("kstat chain has been updated");
plugin_init_all ();
}
else if (kid < 0)
ERROR ("kstat chain update failed");
/* else: everything works as expected */
}
return;
} /* static void update_kstat (void) */
#endif /* HAVE_LIBKSTAT */
I don't see the 'kstat chain has been updated' message on the 8/07 system so
it could be a solaris bug, but I wonder about whether the postgresql plugin
is missing something to tell it to drop the old connections (or whether
multiple copies of the plugin are being initialised or something like that.)
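If the init callback really is being run again by plugin_init_all (), one possible guard is to make it idempotent, so that a second call releases whatever the first one opened. This is a sketch of that idea only, not the plugin's actual code: conn_t, conn_open, conn_close, database_t and db_init are names made up for the example, standing in for the libpq connection handle and the connect and disconnect calls.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for a database connection handle; NOT the libpq type. */
typedef struct { int unused; } conn_t;

static int open_conns = 0;     /* connections currently open */

static conn_t *conn_open(void)
{
    conn_t *c = malloc(sizeof(*c));
    if (c != NULL)
        open_conns++;
    return c;
}

static void conn_close(conn_t *c)
{
    if (c == NULL)
        return;
    assert(open_conns > 0);
    free(c);
    open_conns--;
}

/* Hypothetical per-database state kept by the plugin. */
typedef struct {
    const char *name;
    conn_t *conn;
} database_t;

/* Safe to call repeatedly: an existing connection is closed before a new
 * one is opened, so repeated init calls do not accumulate connections. */
static int db_init(database_t *db)
{
    if (db->conn != NULL) {
        conn_close(db->conn);
        db->conn = NULL;
    }
    db->conn = conn_open();
    return (db->conn != NULL) ? 0 : -1;
}
```

An alternative with the same effect would be to return early when db->conn is already set and keep the existing connection. Which of the two is appropriate depends on whether the old connection is still expected to be healthy after the kstat update.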
Also on the subject of the postgresql plugin, would it be possible to make it
an option as to whether the connections to the database are persistent or
not?
Thanks for your input
Admin.