[collectd] [PATCH] lpar plugin update

Aurélien Reynaud collectd at wattapower.net
Sat Sep 11 15:00:28 CEST 2010


Hi Florian,


> thanks for the update :) I've pushed the changes to the "ar/lpar"
> branch. [...] makes a lot more sense now that I understand what's going on ;)

No problem, I've been throught this myself! Your will to thoroughly
understand every part of the code is IMHO the main reason behind
collectd's excellent quality.

> > - the "consumed" metric might seem superfluous at first sight […]
> >   But I thought it might come in handy when dealing with dedicated
> >   partitions, where donated and stolen values are no easy concepts.
> 
> Would it make sense to activate this metric only if the partition is a
> dedicated partition and donations have been enabled? Likewise, would it make sense to submit "entitled" capacity only if the
> partition is a shared partition?

I'm not sure. I have been thinking about this since my last post, and
here is what I finally came out with:

'consumed' is a generic concept, materializing the hardware processing
power really allocated to the partition by the hypervisor. 

Be it a shared or dedicated partition, comparing this value to the
hard/soft limits you configured (entitlement, capping, donation) is
possibly what admins care the most about. Many will be satisfied with
just having the 'entitled' and 'consumed' metrics.

When you see things this way, knowing which part of these two main
metrics are in fact spent in io, system, user, stolen from other
partitions, donated to them, or just idling could appear as mere
details, useful but not essential.

So maybe we could collect only entitled and consumed by default, and
provide some 'Details = true/false' option to enable the collection of
the different processor states.


Does it make sense to you?

> > I posted a fix ("Fix errno thread-safety under AIX") [...] I applied
> the fix to the collectd-4.9 branch and will merge it to master
> eventually.

Thanks.

> > +		ssnprintf (typinst, sizeof (typinst), "pool-%X-total", lparstats.pool_id);
> > +		lpar_submit (typinst, (double) pool_max_ns / XINTFRAC / (double) ticks);
> 
> I'd prefer to account "busy" and "used" (non-busy) rather than "busy"
> and "total". Do you see any problem with changing that?

None. I can provide a patch, unless you want to code it yourself. I just
find a little confusing to have busy/used as metrics. Maybe busy/idle or
used/unused would be more straightforward?

> 
> > +	save_last_values (&lparstats);
> 
> I think it might be easier to keep a (module) global
> "perfstat_partition_total_t" around and simply do
> 
>   memcpy (&lparstats_old, lparstats_new, sizeof (lparstats_old));
> 
> in the "save values" function. What do you think?

Fine with me. Just need to make sure no struct members are pointers to
other structs. We would have to recursively duplicate them or document
that some members of the copy are not accessible. IMHO the
implementation requiring the fewest lines of code will be the best
here...


Regards,

Aurélien Reynaud






More information about the collectd mailing list