[collectd] VMware Guest SDK plugin for Collectd 5

Keith Chambers chambers_keith at yahoo.com
Sun Sep 18 19:24:11 CEST 2011


Thank you very much for the thorough review Octo!  I've committed the VMware plugin to Gerrit:  https://collectd.org/gerrit/#change,23

I've switched to using existing types only.

  I am using a mix of 'virt_vcpu GAUGE' and 'vcpu DERIVE'. Is 
this desirable?






Why I'm using a mix of 'virt_vcpu GAUGE' and 'vcpu DERIVE':




--Some VMware Guest SDK counters return CPU Quality of Service (QoS) policies applied to the VM.  i.e., 1000 MHz reserved for this VM.  For these 
CPU QoS statistics the type should be GAUGE.

--Other VMware Guest 
SDK counters provide CPU performance / consumption statistics in milliseconds.  These counters typically increase but will reset to 0 if the VM is live migrated from one host to another.  For these CPU performance / consumption statistics the type 
should be DERIVE.




NOTE: VMware Guest 
SDK counters for both Memory QoS and Memory performance / consumption use type 'memory GAUGE'.

Summary of types and VMware Guest SDK statistics:

    Collectd type:    total_time_in_ms    DERIVE
        VMware Guest SDK statistic:    elapsedMs

    Collectd type:    virt_vcpu    DERIVE
        VMware Guest SDK statistic:    cpuUsedMs
        VMware Guest SDK statistic:    cpuStolenMs

    Collectd type:    vcpu    GAUGE
        VMware Guest SDK statistic:    cpuReservationMHz  (QoS) 
        VMware Guest SDK statistic:    cpuLimitMHz              (QoS)
        VMware Guest SDK statistic:    cpuShares                  (QoS) 

    Collectd type:    cpufreq    GAUGE    
        VMware Guest SDK statistic:    hostMHz

    Collectd type:    memory    GAUGE
        VMware Guest SDK statistic:    memTargetSizeMB
        VMware Guest SDK statistic:    memUsedMB
        VMware Guest SDK statistic:    memMappedMB
        VMware Guest SDK statistic:    memActiveMB
        VMware Guest SDK statistic:    memOverheadMB
        VMware Guest SDK statistic:    memSharedMB
        VMware Guest SDK statistic:    memSharedSavedMB
        VMware Guest SDK statistic:    memBalloonedMB
        VMware Guest SDK statistic:    memSwappedMB

    Collectd type:    memory    GAUGE
        VMware Guest SDK statistic:    memReservationMB  (QoS) 
        VMware Guest SDK statistic:    memLimitMB              (QoS) 
        VMware Guest SDK statistic:    memShares                (QoS) 




----- Original Message -----
From: Florian Forster <octo at collectd.org>
To: Keith Chambers <chambers_keith at yahoo.com>
Cc: "collectd at verplant.org" <collectd at verplant.org>
Sent: Monday, August 15, 2011 4:26 AM
Subject: Re: [collectd] VMware Guest SDK plugin for Collectd 5

Hi Keith,

thank you very much for your work!

On Wed, Aug 03, 2011 at 01:23:14PM -0700, Keith Chambers wrote:
> I understand that this will not get merged in to the mainline due to
> the associated VMware licensing. 

I don't see any licensing problems regarding including this plugin in
the main distribution. Since we're not distributing binary packages,
even an incompatible license wouldn't necessarily keep us from including
that file in the distribution. That being said, having compatible
licenses is obviously desirable ;)

> diff -Nur a/configure b/configure

This is an automatically generated file.

> diff -Nur a/configure.in b/configure.in
> --- a/configure.in    1970-01-01 00:00:00.000000000 +0000
> +++ b/configure.in    1970-01-01 00:00:00.000000000 +0000
> @@ -4350,6 +4352,7 @@
> AC_PLUGIN([users],       [$plugin_users],      [User statistics])
> AC_PLUGIN([uuid],        [yes],                [UUID as hostname plugin])
> AC_PLUGIN([vmem],        [$plugin_vmem],       [Virtual memory statistics])
> +AC_PLUGIN([vmware],      [$plugin_vmware],     [VMware client statistics])
> AC_PLUGIN([vserver],     [$plugin_vserver],    [Linux VServer statistics])
> AC_PLUGIN([wireless],    [$plugin_wireless],   [Wireless statistics])
> AC_PLUGIN([write_http],  [$with_libcurl],      [HTTP output plugin])

You're not checking for the availability of the library to the linker,
but loading all required symbols yourself using dlsym(3) and friends.
I'd prefer if we could actually link against that library and let ld.so
do all the work. I don't have any experience with the VMWare API, but
from a quick glance it looks like this would be possible. What do you
think?

If there is a check for the library, instead of $plugin_vmware you
should pass $have_libvmware (or something along these lines) to
automatically enable the plugin *if* the library has been found.

> diff -Nur a/src/config.h.in b/src/config.h.in

This file is automatically generated.

> diff -Nur a/src/types.db b/src/types.db
> --- a/src/types.db    1970-01-01 00:00:00.000000000 +0000
> +++ b/src/types.db    1970-01-01 00:00:00.000000000 +0000
> @@ -150,6 +151,7 @@
> threads            value:GAUGE:0:U
> time_dispersion        value:GAUGE:-1000000:1000000
> timeleft        value:GAUGE:0:3600
> +time            value:DERIVE:0:U
> time_offset        value:GAUGE:-1000000:1000000
> total_bytes        value:DERIVE:0:U
> total_connections    value:DERIVE:0:U

Please use the already existing type "total_time_in_ms" instead.

> @@ -168,6 +170,8 @@
> vmpage_faults        minflt:DERIVE:0:U, majflt:DERIVE:0:U
> vmpage_io        in:DERIVE:0:U, out:DERIVE:0:U
> vmpage_number        value:GAUGE:0:4294967295
> +vmware_qos        value:GAUGE:0:U
> +vmware_host_cpu    value:GAUGE:0:U
> volatile_changes    value:GAUGE:0:U
> voltage_threshold    value:GAUGE:U:U, threshold:GAUGE:U:U
> voltage            value:GAUGE:U:U

I think it'd make sense to reuse the existing type "vcpu" rather than
introducing a VMWare-specific type. What do you think?

> diff -Nur a/src/vmware.c b/src/vmware.c
> --- a/src/vmware.c    2011-05-22 02:15:32.179086658 +0000
> +++ b/src/vmware.c    2011-05-22 02:12:35.340762567 +0000

This file is missing a license and copyright header.

> @@ -0,0 +1,395 @@
> +#include <stdlib.h>
> +#include <stdarg.h>
> +#include <stdio.h>
> +#include <dlfcn.h>
> +#include "collectd.h"
> +#include "common.h"
> +#include "plugin.h"

"collectd.h" *must* be the first file to be included. It may set defines
which configure the standard library.

> +/* handle for use with shared library */
> +void *dlHandle = NULL;

This vaiable is not used outside of "LoadFunctions". Please make it an
automatically scoped (stack) variable.

> +VMGuestLibError glError;

This variable doesn't need to be global. Please make is an automatically
scoped (stack) variable.

> +         return FALSE;                                    \

Please don't use the TRUE and FALSE macros. Use 1 and 0 instead.

> +Bool
> +LoadFunctions(void)

Please use the C99 built-in type "_Bool" instead of "Bool". Also, this
function should be static.

> +static int vmware_init (void)
> +{
> +    if (!LoadFunctions()) {
> +        ERROR ("vmware guest plugin: Unable to load GuistLib functions");

s/GuistLib/GuestLib/

> +        return (-1);
> +    }
> +
> +    /* try to load the library */
> +    glError = GuestLib_OpenHandle(&glHandle);
> +    if (glError != VMGUESTLIB_ERROR_SUCCESS) {
> +        ERROR ("OpenHandle failed: %s", GuestLib_GetErrorText(glError));

Please use a common prefix for all messages, for example "vmware guest
plugin:".

> +    if (sessionId == 0) {
> +        sessionId = tmpSession;
> +        DEBUG ("Initial session ID is 0x%"FMT64"x", sessionId);

Since the API makes no guarantee about the type of VMSessionId, I'd
prefer to case here. Also, FMT64 is some VMWare trickery, I'd rather use
the C99 standard:

  DEBUG ("Initial session ID is %#"PRIx64, (uint64_t) sessionId);

> +    submit_vmw_counter ("cpu", "used_ms", value);
> +    submit_vmw_counter ("cpu", "stolen_ms", value);

I'd drop the "…_ms" in the type instance.

> +    submit_vmw_gauge ("vmware_qos", "reservation_mhz", value);
> +    submit_vmw_gauge ("vmware_qos", "limit_mhz", value);

> +    if (glError != VMGUESTLIB_ERROR_SUCCESS) {
> +        DEBUG ("Failed to get cpu shares: %s\n", GuestLib_GetErrorText(glError));
> +    }
> +    value = (gauge_t) cpuShares;
> +    submit_vmw_gauge ("vmware_qos", "shares", value);

The first two "vmware_qos" values appear to be in MHz, the last seems to
be something else (fraction of CPUs?). They shouldn't have the same
type.

"value" is of type "derive_t", but the third argument to
"submit_vmw_gauge" is of type "gauge_t". Please don't make this an
implicit cast. I'd use "derive_t dval" and "gauge_t gval" instead or
just add the cast to the submit_vmw_{derive,gauge}() invocation, for
example:

  submit_vmw_gauge ("vmware_qos", "reservation_mhz",
      (gauge_t) cpuReservationMHz);

> +    submit_vmw_gauge ("vmware_host_cpu", "clock_speed_mhz", value);

Please use the existing type "cpufreq" instead. It records the CPU
frequency in Hertz, so the call should be something like:

  submit_vmw_gauge ("cpufreq", "", 1.0e6 * value);

> +    submit_vmw_gauge ("memory", "target_mb", value);
> +    submit_vmw_gauge ("memory", "used_mb", value);
> +    submit_vmw_gauge ("memory", "mapped_mb", value);
> +    submit_vmw_gauge ("memory", "active_mb", value);
> +    submit_vmw_gauge ("memory", "overhead_mb", value);
> +    submit_vmw_gauge ("memory", "shared_mb", value);
> +    submit_vmw_gauge ("memory", "shared_saved_mb", value);
> +    submit_vmw_gauge ("memory", "ballooned_mb", value);
> +    submit_vmw_gauge ("memory", "swapped_mb", value);

Please convert to bytes, i.e. multiply by 2^20. The graphing front-ends
should take care to convert that back to MByte / GByte. Also, lose the
"…_mb" suffix.

> +    submit_vmw_gauge ("vmware_qos", "reservation_mb", value);
> +    submit_vmw_gauge ("vmware_qos", "limit_mb", value);

This time "vmware_qos" is MBytes. Please remove the "…_mb" suffix and
use a type that describes the type of data. You could use "memory" again
and set a different plugin instance, for example.

> +    submit_vmw_gauge ("vmware_qos", "shares", value);

A value with this name has already been submitted. What kind of data is
is this time? Fractions of memory? Number of pages?

Best regards and thanks again for all the work you've put into this!
—octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x0C705A15
http://octo.it/




More information about the collectd mailing list