[collectd] New plugin for collectd : write_mysql

Daniel Hilst danielhilst at gmail.com
Thu May 3 20:05:32 CEST 2012


On 05/03/2012 03:46 PM, Daniel Hilst wrote:
> On 02/14/2012 09:54 AM, Cyril Feraudet wrote:
>> Hi all,
>>
>> Here the write_mysql output plugin patch (from 5.0.2) with modifications
>> suggested by Octo.
>>
>> By the way, notification are handled, all memory used is freed on
>> shutdown (thx Valgrind).
>>
>> Regards,
>>
>> Cyril
>>
>> Le 04.01.2012 10:49, Cyril Feraudet a écrit :
>>
>>> Hi Octo, Hi all,
>>>
>>> Thanks for your time. It will permit to me to improve this plugin and
>>> my
>>> knowledge of C.
>>>
>>> Le 22.12.2011 01:06, Florian Forster aécrit :
>>>> Hi Cyril, thank you very much for your patches! I'm very sorry, but I
>>>> can't give the plugin the thorough review it deserves right now. But
>>>> maybe you find the following comments helpful. On Tue, Dec 20, 2011
>>>> at 03:03:43PM +0200, Cyril Feraudet wrote:
>>>>> - with_mysql_libs=`$with_mysql_config --libs 2>/dev/null` +
>>>>> with_mysql_libs=`$with_mysql_config --libs_r 2>/dev/null`
>>>> This looks like a bug fix. Would it make sense to add this change to
>>>> the 4.10 branch? If so, would you be willing to send a separate patch
>>>> for this?
>>> You are right, I will submit a patch soon. write_mysql is often called
>>> than
>>> mysql plugin and non thread safe library make collectd crashing
>>> quickly.
>>>>> +CREATE TABLE `data` ( + `id` bigint(20) NOT NULL auto_increment, +
>>>>> `timestamp` double NOT NULL, + `host_id` int(11) NOT NULL, +
>>>>> `plugin_id` int(11) NOT NULL, + `plugin_instance` varchar(255)
>>>>> default NULL, + `type_id` int(11) NOT NULL, + `typeinstance`
>>>>> varchar(255) default NULL, + `dataset_id` int(11) NOT NULL, +
>>>>> `value` double NOT NULL,
>>>> One of the reasons why I didn't write a "write_dbi" or "write_mysql"
>>>> plugin before is because I was uncertain about the schema to use. Why
>>>> did you chose to have a separate table for host, plugin,
>>>> plugin_instance, type, and type_instance? I currently think that just
>>>> having an "identifier" table with five columns would be a better
>>>> trade- off between normalization and complexity.
>>> The idea was to save space in `data` table and improve querying in it.
>>> For now, I've 800MB `data` table (MEMORY type) just for the last value
>>> of
>>> each metrics (4207 servers / 462,395 metrics) using REPLACE instead of
>>> INSERT
>>> statement. I think that SQL join are faster on numeric id than varchar.
>>> But all of this things are questionable and dependent of the final use.
>>> I think about a customizable SQL statement in plugin configuration.
>>>>> ATE TABLE `dataset` (
>>>> Why do you store datasets in the database (in addition to the
>>>> types.db file)? Having the definition in more than one place is bound
>>>> to create inconsistencies at some point
>>> This table is for readonly purpose and to make SQL query independent of
>>> number
>>> of metrics in a type. It also contain type added on the way by a
>>> plugin.
>>>>> diff --git a/src/write_mysql.c b/src/w
>>>> /blockquote> typedef struct host_s host_t; +struct host_s { + char
>>>>> name[DATA_MAX_NAME_LEN]; + int id; + host_t *next_host; +};
>>>> You will be doing a lot of lookups in this cache(es). I think using a
>>>> binary search tree (an implementation if provided in
>>>> src/utils_avltree.h) would improve performance here.
>>> Thanks for the tips, Il will do that.
>>>> int port =
>>>>> tatic int write_mysql_config (const char *key, const char *value) {
>>>>> [...] + } else if (strcasecmp ("Port", key) == 0) { + port = value;
>>>>> You
>>>> a char* to an int. This won't produce what you expect. Use the
>>>> function service_name_to_port_number() from src/common.h. +static
>>>> int wri
>>> Your'e right I will fix it.
>>>>> t (void) { + conn = mysql_init(NULL); + if (mysql_real_connect(conn,
>>>>> host, user, passwd, database, port, NULL, 0) == NULL) { The conn
>>>> be (re-)established in the write() callback, when it is needed.
>>>> Otherwise you will never recover from connection failures. + if
>>>> (!mysql_th
>>> Your'e right, I just issued it. I will fix it too.
>>>>> { + ERROR("write_mysql plugin: mysqlclient Thread Safe OFF"); I wasn't
>>>> s function. Good to know, thanks! +static int add_host_id (
>>>>> me) { [...] + len = ssnprintf (query, sizeof (query), "SELECT id
>>>>> FROM host WHERE name = '%s'", hostname); I think we should
>>>> statement with bound arguments (or whatever the terminology). This
>>>> way (a) MySQL doesn't have to parse and optimize the query each time
>>>> we need to insert a host and (b) quoting is done for us by the mysql
>>>> library. +static int add_host_id (char *
>>> I will looking for. I never used it before.
>>>>> [...] + pthread_mutex_lock(&mutex); [...] + if (row =
>>>>> mysql_fetch_row(result)) { + id = atoi(row[0]); +
>>>>> mysql_free_result(result); + pthread_mutex_unlock(&mutex); [...] +
>>>>> pthread_mutex_unlock(&mutex); + return id; +} Mutex is released twice
>>>> +static int add_plugin_id (char
>>> I'll fix that.
>>>>> { +static int add_type_id (char *typename) { A *lot* of duplicate code
>>>> ite a function like: int add_string (const char *table, const char
>>>> *str); /* returns id on success, less than zero on failure. */
>>>> +static int write_mysql_write(con
>>>>> t *ds, const value_list_t *vl, + user_data_t __attribute__((unused))
>>>>> *user_data) { [...] + len = ssnprintf (tmpquery, sizeof (tmpquery),
>>>>> "INSERT INTO data " +
>>> "(timestamp,host_id,plugin_id,plugin_instance,type_id,typeinstance,dataset_id,value)"
>>>
>>>>> + "VALUES (%.3f,%d,%d,'%s',%d,'%s',%d,%%lf)", CDTIME_T_TO_DOUBLE
>>>>> (vl->time), host_id, + plugin_id, vl->plugin_instance, type_id,
>>>>> vl->type_instance, dataset_id );
>>>> You really should prepare this statement. Parsing this several
>>>> thousand times per second will be a major performance problem.
>>>>> +if (dso->type == DS_TYPE_GAUGE) { + len = ssnprintf (query, sizeof
>>>>> (query), tmpquery, vl->values[i].gauge); +} else { + if (rates ==
>>>>> NULL) { + rates = uc_get_rate (ds, vl); + } + if (isnan(rates[i])) {
>>>>> continue; } + len = ssnprintf (query, sizeof (query), tmpquery,
>>>>> rates[i]); +} +//INFO("toto: %d", toto);
>>>>> +pthread_mutex_lock(&mutex); +DEBUG("write_mysql plugin: %s",
>>>>> query); +mysql_real_query(conn, query, len);
>>>>> +pthread_mutex_unlock(&mutex);
>>>> So you're inserting each data source in a element? How can a client
>>>> distinguish between the different data sources? At least, you should
>>>> add a "ds_index" column or so. Thanks again and best
>>> There are dataset_id and type_id in `dataset` for that. Here a query
>>> sample to get
>>> all metric in a type in right order :
>>>
>>> select data.timestamp AS date, dataset.name AS dataset_name, data.value
>>> AS value
>>> from data, host, plugin, type, dataset
>>> where host.id = data.host_id
>>> and plugin.id = data.plugin_id
>>> and type.id = data.type_id
>>> and data.dataset_id = dataset.id
>>> and host.name = 'eqds3pcold001'
>>> and plugin.name = 'interface'
>>> and data.plugin_instance = 'eth0'
>>> and type.name = 'if_octets'
>>>
>>> In my case, write_mysql is used to make, for example, sum of cpu used
>>> on several thousand
>>> of server and re-inject it in collectd to graph it.
>>>
>>> I've made a more flexible than Ganglia application based on Collectd at
>>> work. My compagny
>>> allow me to share all my work on collectd in GPL like licence. I will
>>> Release a beta of
>>> my unamed application soon a possible.
>>>
>>> Regards,
>>>
>>> Cyril Feraudet
>>>
>>>
>>>
>>> _______________________________________________
>>> collectd mailing list
>>> collectd at verplant.org <mailto:collectd at verplant.org>
>>> http://mailman.verplant.org/listinfo/collectd
>>
>>
>>
>> _______________________________________________
>> collectd mailing list
>> collectd at verplant.org
>> http://mailman.verplant.org/listinfo/collectd
>
> Hello, I'm trying to compile collectd 5.0.2 with write_mysql pluing
>
> I've done this: http://pastebin.com/dPc6zgUH
> I'm on rhel 5.5 x86_64
>
> Any idea?
>
I have to run libtoolize --force before compile

here is:
tar vxf collectd-5.0.2.tar.bz2
cd collectd-5.0.2
patch -p1 -i ../0001-Adding-write_mysql-output-plugin.patch
aclocal
autoconf
automake
libtoolize  --force
./configure
make
make install all


Now I'm facing this error:
error] write_mysql plugin: Failed to bind param to statement : Statement 
not prepared / INSERT INTO data 
(date,host_id,plugin_id,plugin_instance,type_id,type_instance,dataset_id,value)VALUES 
(?,?,?,?,?,?,?,?)

What that means????

I tried Replace "false" without success

Thanks in advance,

Hilst

-- 
Follow the white rabbit!



More information about the collectd mailing list