[collectd] overflow of procstat_t cpu_user/system_counter in processes
Florian Forster
octo at verplant.org
Wed Jul 15 09:42:29 CEST 2009
Hi James,
On Tue, Jul 14, 2009 at 10:30:13AM -0700, james at jwarner.org wrote:
> However, when I was reading the source for the processes plugin I
> noticed that the cpu_user_counter and cpu_system_counter value in
> ps_read_process are unsigned long long values and that the procstat_t
> values for cpu_user_counter and cpu_system_counter are unsigned long
> only.
This is done on purpose, but I wouldn't be at all surprised if there was
a bug in there somewhere.
The basic problem here is that we want to add counters to one another. If
all counters have the same size, all works well enough:
32bit = (32bit + 32bit + ... + 32bit) mod 2^32
This also works if the counters being added up are larger than the
destination:
32bit = (64bit + 64bit + ... + 64bit) mod 2^32
What does not work is if even one of the counters is smaller than the
destination counter:
64bit = (32bit + 32bit + ... + 32bit) mod 2^64 <--- WRONG!
If one of the 32bit counters overflows, the code will think the 64bit
counter overflowed, too, resulting in a huge spike.
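To make the cases above concrete, here is a minimal sketch in C. It is not the actual collectd code: the function names are made up for illustration, and the "assume a wrap at the destination width" logic is modelled simply as unsigned subtraction in that width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum 32bit source counters into a 32bit destination (wraps mod 2^32). */
uint32_t sum_into_32(const uint32_t *counters, size_t n) {
  assert(counters != NULL || n == 0);
  uint32_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += counters[i];
  return sum;
}

/* Sum the same 32bit source counters into a 64bit destination. */
uint64_t sum_into_64(const uint32_t *counters, size_t n) {
  assert(counters != NULL || n == 0);
  uint64_t sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += counters[i];
  return sum;
}

/* Increase between two readings, assuming the counter wraps at 2^32. */
uint32_t delta_32(uint32_t prev, uint32_t cur) {
  return (uint32_t)(cur - prev);
}

/* Increase between two readings, assuming the counter wraps at 2^64. */
uint64_t delta_64(uint64_t prev, uint64_t cur) {
  return cur - prev;
}
```

Take two source counters that read {4294967290, 100} and then {4, 110}, i.e. each advanced by 10 and the first one wrapped. With the 32bit destination, delta_32() of the two sums is 20, as expected. With the 64bit destination, the sum drops from 4294967390 to 114, and delta_64() reports 2^64 - 2^32 + 20, which is the huge spike.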
It isn't a problem if a counter is added: You can assume it was
there all along but was zero all the time. It *is* a problem if a
counter is removed, though, because the sum then decreases, which looks
just like an overflow. I currently don't see how that case is handled
(if it's handled at all).
Maybe the easiest and most straightforward method would be to simply
calculate a rate for each PID and then add all those rates to a private
counter.
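A rough sketch of that idea in C follows. It is only an illustration under several assumptions: the per-PID source counters are taken to be 32bit, the sketch adds per-interval increments rather than true rates (the two are equivalent for a fixed interval), and all type and function names (cpu_accum_t, accum_update, ...) as well as the fixed table size are hypothetical, not taken from the processes plugin.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PIDS 64 /* hypothetical fixed table size, for the sketch only */

typedef struct {
  int in_use;
  int pid;
  uint32_t last; /* last raw value, kept in the source counter's width */
} pid_entry_t;

typedef struct {
  pid_entry_t entries[MAX_PIDS];
  uint64_t total; /* private counter; only ever increases */
} cpu_accum_t;

void accum_init(cpu_accum_t *acc) {
  assert(acc != NULL);
  memset(acc, 0, sizeof(*acc));
}

/* Feed one raw per-PID reading into the private counter. */
void accum_update(cpu_accum_t *acc, int pid, uint32_t raw) {
  assert(acc != NULL);
  pid_entry_t *free_slot = NULL;
  for (size_t i = 0; i < MAX_PIDS; i++) {
    pid_entry_t *e = &acc->entries[i];
    if (e->in_use && e->pid == pid) {
      /* Wrap-safe increment, computed in the *source* width. */
      acc->total += (uint32_t)(raw - e->last);
      e->last = raw;
      return;
    }
    if (!e->in_use && free_slot == NULL)
      free_slot = e;
  }
  if (free_slot == NULL)
    return; /* table full: the sketch simply ignores the new PID */
  /* New PID: treat it as if it had been there all along, at zero. */
  free_slot->in_use = 1;
  free_slot->pid = pid;
  free_slot->last = raw;
  acc->total += raw;
}

/* Forget a PID that went away; the private counter is left untouched. */
void accum_remove(cpu_accum_t *acc, int pid) {
  assert(acc != NULL);
  for (size_t i = 0; i < MAX_PIDS; i++)
    if (acc->entries[i].in_use && acc->entries[i].pid == pid)
      acc->entries[i].in_use = 0;
}
```

Because each increment is computed in the width of the source counter, a wrap of a single per-PID counter does not disturb the total, and a vanished PID simply stops contributing instead of making the sum drop.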
Another problem is that `unsigned long' may be 64bit wide on 64bit
architectures. If the counters provided by the operating system are only
32bit wide, we will have problems as described above.
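One way to sidestep the platform dependence, sketched here in C as an assumption rather than as what the plugin does, is to state the width of the source counter explicitly instead of relying on the size of `unsigned long'. The helper names below are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Width of `unsigned long' on this platform, in bits. */
unsigned ulong_bits(void) {
  return (unsigned)(sizeof(unsigned long) * CHAR_BIT);
}

/* Increase between two readings of a counter that wraps at 2^width_bits.
 * The caller states the width of the *source* counter explicitly. */
uint64_t counter_delta(uint64_t prev, uint64_t cur, unsigned width_bits) {
  assert(width_bits >= 1 && width_bits <= 64);
  uint64_t mask = (width_bits == 64)
                      ? UINT64_MAX
                      : ((UINT64_C(1) << width_bits) - 1);
  return (cur - prev) & mask;
}
```

For a source counter that went from 4294967290 to 4, counter_delta(..., 32) yields 10, whereas treating the same readings as a 64bit counter yields a value close to 2^64.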
Regards,
-octo
--
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/