[collectd] [plugin:virt] - Confusing metrics value type.

Wed Sep 15 18:46:21 CEST 2021

Hi Matthias,

No worries about the answer timing, I’m busy too ;-)

My use case isn’t billing; it is to let our projects get basic
monitoring of their VMs without having to deploy yet another monitoring
system in addition to what we already have.

The problem with CPU shares is that I can’t get the host’s real CPU
availability, as we run in a hyperconverged mode that has resources
reserved for storage.

Currently we enabled the following extrastats:

- perf
- cpu_util
- pcpu
- vcpu
- vcpupin
- disk_err
- job_stats_background
- domaine_stats

For now, my problem with the collectd metrics is more about the units than
the data type, as the data type is well documented and/or exposed by the
Prometheus endpoint.

For instance, those accumulating counters of CPU shares used: are they an
amount of ns? Seconds? Instructions? What unit do they represent?
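
To make the question concrete, here is a minimal Python sketch of how I would
turn two samples of one of these counters into a utilisation figure, following
your advice to look only at differences over time. The nanosecond unit is
purely my assumption (libvirt reports domain CPU time in nanoseconds, but I
could not confirm that collectd passes it through unchanged), and the function
name and sample values are made up for illustration.

```python
# ASSUMPTION: the counter accumulates CPU time in nanoseconds.
# This unit is not confirmed anywhere in this thread.
NS_PER_SECOND = 1_000_000_000


def cpu_utilisation(prev_value, prev_ts, curr_value, curr_ts,
                    units_per_second=NS_PER_SECOND):
    """Return the fraction of one CPU used between two counter samples.

    prev_ts and curr_ts are sample times in seconds. Returns None when the
    counter went backwards (for example after a VM restart), because no
    meaningful rate can be derived from that pair of samples.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        return None
    # Convert the counter delta to CPU-seconds, then divide by wall time.
    return delta / units_per_second / elapsed


# Hypothetical samples taken 10 s apart: 5e9 units consumed in between.
print(cpu_utilisation(1_000_000_000, 100.0, 6_000_000_000, 110.0))  # 0.5
```

If the counter were in seconds instead, passing units_per_second=1 would be
the only change, which is exactly why I need to know the unit before I can
trust any graph built on these values.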

Kind regards!

On Sat, Sep 11, 2021 at 15:04, Matthias Runge <mrunge at matthias-runge.de> wrote:

>
>
> > On 09.09.2021 at 16:35, Gaël THEROND <gael.therond at bitswalk.com> wrote:
> >
> > Hi everyone,
> >
> > I'm currently building some monitoring pages for our OpenStack platform.
> > To do so, I use collectd with the collectd-virt and write_prometheus
> > plugins.
> >
> > Everything is fine, I'm receiving all metrics and I'm able to graph them
> > from Grafana using prometheus as our datasource.
> >
> > The only issue I'm facing is in manipulating those metrics.
> > Our platform is CentOS 8 based and uses Collectd 5.11 provided by CentOS.
> >
> > The first issue is that a few metric labels exposed to Prometheus are
> > either not listed here or have a different name (we have not relabeled
> > them yet): https://collectd.org/wiki/index.php/Plugin:virt
> >
> > The second issue is finding the appropriate unit for those metrics.
> > Yes, I have the data type, but do those floats and ints express ms? ns?
> > ms/s? Time since epoch? Or something else?
> >
> > Here are the problematic metrics:
> > collectd_virt_virt_cpu_total_total - prometheus type counter
> > collectd_virt_ps_cputime_user_total - prometheus type counter
> > collectd_virt_ps_cputime_syst_total - prometheus type counter
>
> Hi,
>
> Sorry for not getting back to you sooner.
>
> Since these metrics are total counters, I’d personally look only at
> differences over time.
> Are you using these values for billing purposes, like: your vm used
> 14533 cpu cycles, that’ll be 43,23 €?
>
> >
> > Subsidiary question:
> > Is there any way to get the vm internal memory usage?
> > Because the wiki page lists memory-rss etc., but our collectd with the
> > collectd-virt plugin only exposes allocated memory.
> >
>
> I would think we are exposing these metrics, but I haven’t checked lately.
> In any case, I see value in having those metrics.
> What kind of extrastats config do you use? See
> https://github.com/collectd/collectd/blob/main/src/collectd.conf.pod#plugin-virt
> You can pass the memory option.
>
> Matthias
>
> > Additional information:
> > OS Version: CentOS 8.4.2105
> > Kernel: 4.18.0-305.7.1
> > Collectd: 5.11.0-2
> > Collectd-virt: 5.11.0-2
> > Collectd-write_prometheus: 5.11.0-2
> >
> > Feel free to tell me if I missed something important.
> > Cheers!
> > _______________________________________________
> > collectd mailing list
> > collectd at verplant.org
> > https://mailman.verplant.org/listinfo/collectd
>
>