[collectd] possible bug in disk.c module - value overflow

James Valente dev at zupercomputer.net
Tue Jun 13 20:41:34 CEST 2006


collectd     - 3.9.2
Fedora Core  - 3
Linux Kernel - 2.6.11-1.14_FC3smp

I want to report a possible bug in the disk.c module.  I am not exactly
sure what is causing it.

At some point, on a server with significant disk write activity,
disk.c starts logging "write bytes" values of 0, even though writes
continue to take place on the server.  When I reboot the server, the
values in /proc/diskstats are reset and, naturally, disk.c begins saving
correct values to the rrd file again.

1. A snapshot of the graphs that exhibit the problem is located here:

    http://www.zupercomputer.net/misc/collectd-bug/

While the hourly graph no longer shows the time period in question, the 
daily graph does.  Just before 04:50 am on Tuesday, write activity was 
no longer being recorded.  (The read/write activity observed in the 
graphs is generated by a disk-load utility that runs continuously.)


2. The next two lines of data were captured from the kernel's
/proc/diskstats at 04:46:52 and 04:47:55, the window when the disk.c
module began logging "write" values of 0.

8  0  sda     4,000,155 3,460,313   898,693,495 33,076,264     8,053,734
   260,301,302 2,146,843,980 1,279,735,825            99    26,224,099
1,312,825,360

8  0  sda     4,004,585 3,464,317   899,754,079 33,153,610     8,074,680
   260,978,724 2,152,430,156 1,282,563,658             1    26,279,900
1,315,718,119
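
As a quick sanity check on the two samples above, the following Python
sketch parses both lines and reports which counters moved past the signed
32-bit limit (2^31 - 1) between them.  The field names are taken from the
kernel's iostats documentation and are not part of the original capture.
It is also an assumption that the thousands separators were added for
readability, so the sketch strips them before converting.

```python
# Two /proc/diskstats samples for sda, copied from the report
# (thousands separators kept as posted; stripped during parsing).
SAMPLE_1 = ("8 0 sda 4,000,155 3,460,313 898,693,495 33,076,264 8,053,734 "
            "260,301,302 2,146,843,980 1,279,735,825 99 26,224,099 "
            "1,312,825,360")
SAMPLE_2 = ("8 0 sda 4,004,585 3,464,317 899,754,079 33,153,610 8,074,680 "
            "260,978,724 2,152,430,156 1,282,563,658 1 26,279,900 "
            "1,315,718,119")

# Names for the 11 per-device counters (assumed from the kernel's
# iostats documentation; the report itself does not label them).
FIELDS = ["reads_completed", "reads_merged", "sectors_read", "ms_reading",
          "writes_completed", "writes_merged", "sectors_written",
          "ms_writing", "ios_in_progress", "ms_doing_io",
          "weighted_ms_doing_io"]

INT32_MAX = 2**31 - 1  # 2,147,483,647


def parse(line):
    """Return {field name: integer value} for one diskstats line."""
    tokens = line.split()
    # tokens[0:3] are major, minor and device name; the rest are counters.
    values = [int(tok.replace(",", "")) for tok in tokens[3:]]
    return dict(zip(FIELDS, values))


def crossed_int32(before, after):
    """List the counters that were <= INT32_MAX before and > INT32_MAX after."""
    return [name for name in FIELDS
            if before[name] <= INT32_MAX < after[name]]


before, after = parse(SAMPLE_1), parse(SAMPLE_2)
crossed = crossed_int32(before, after)
print(crossed)
```

For these two samples the only counter that crosses the limit is the
sectors-written field (2,146,843,980 to 2,152,430,156).  That is
consistent with an overflow somewhere in the chain, but it does not by
itself show where the overflow happens.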


3. Next, a snapshot of the rrd file dump, for disk-8-0, showing the
progression of values to the point when they began logging values of 0.
(If you prefer, I can also send the full dump of the rrd file, as well
as all continuous data I have captured from /proc/diskstats.)

You'll notice that between 04:47:10 and 04:47:20, RRD begins recording
a value of 0.0000000000e+00 (the seventh value in each row) and never
recovers.  I suspect that the values in /proc/diskstats are causing
some kind of numeric overflow, either within disk.c or within rrdtool;
I don't know enough about either to determine which.  Even after
stopping collectd and deleting all the RRD files, the problem
continues: write bytes are still recorded as 0.
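
To put numbers on the overflow suspicion, here is a small Python sketch
using only the sectors-written counter from the two /proc/diskstats
samples in item 2.  It estimates the real write rate over that 63-second
window, then shows what the second reading would look like if it were
squeezed into a signed 32-bit integer.  The 512-byte sector size is an
assumption, and so is the signed 32-bit storage: nothing here confirms
that disk.c or rrdtool actually handles the counter that way.

```python
SECTOR_BYTES = 512           # assumption: diskstats counts 512-byte sectors
sectors_t1 = 2_146_843_980   # sda sectors written at 04:46:52
sectors_t2 = 2_152_430_156   # sda sectors written at 04:47:55
elapsed_s = 63               # 04:46:52 -> 04:47:55

# Average write rate implied by the kernel counters, in bytes per second.
write_rate = (sectors_t2 - sectors_t1) * SECTOR_BYTES / elapsed_s


def as_int32(value):
    """Reinterpret the low 32 bits of value as a signed two's-complement int."""
    value &= 0xFFFFFFFF
    return value - 2**32 if value >= 2**31 else value


# Hypothetical: what a signed 32-bit variable would hold for each reading.
stored_t1 = as_int32(sectors_t1)
stored_t2 = as_int32(sectors_t2)

print(f"write rate from diskstats: {write_rate / 1e6:.1f} MB/s")
print(f"as int32: {stored_t1} -> {stored_t2}")
```

The kernel counters imply roughly 45 MB/s of writes during the window in
which the RRD starts showing 0.  Under the signed 32-bit assumption the
first reading is unchanged while the second becomes -2,142,537,140, so a
consumer expecting a monotonically increasing counter would see it jump
backwards at about the same time the zeros begin.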

It may be easier to read the rrd dump at this URL:
http://www.zupercomputer.net/misc/collectd-bug/rrd-dump.txt


<!-- 2006-06-13 04:45:30 PDT / 1150199130 --> <row><v> 3.4944000000e+02
</v><v> 3.3788000000e+02 </v><v> 4.4367462400e+07 </v><v>
1.1025800000e+03 </v><v> 2.0940000000e+01 </v><v> 1.1860000000e+01
</v><v> 1.3451264000e+05 </v><v> 1.0313600000e+03 </v></row>

<!-- 2006-06-13 04:45:40 PDT / 1150199140 --> <row><v> 3.5688000000e+02
</v><v> 3.4424000000e+02 </v><v> 4.5192970240e+07 </v><v>
1.1443400000e+03 </v><v> 1.1800000000e+01 </v><v> 6.5800000000e+00
</v><v> 7.5284480000e+04 </v><v> 3.9832000000e+02 </v></row>

<!-- 2006-06-13 04:45:50 PDT / 1150199150 --> <row><v> 2.0826000000e+02
</v><v> 1.8614000000e+02 </v><v> 2.4807096320e+07 </v><v>
7.4966000000e+02 </v><v> 7.7200000000e+00 </v><v> 5.8000000000e+00
</v><v> 5.5377920000e+04 </v><v> 2.7622000000e+02 </v></row>

<!-- 2006-06-13 04:46:00 PDT / 1150199160 --> <row><v> 2.2112000000e+02
</v><v> 2.9200000000e+01 </v><v> 5.3672345600e+06 </v><v>
1.3236800000e+03 </v><v> 5.2800000000e+01 </v><v> 5.2980000000e+01
</v><v> 4.3327488000e+05 </v><v> 6.3996800000e+03 </v></row>

<!-- 2006-06-13 04:46:10 PDT / 1150199170 --> <row><v> 6.2360000000e+01
</v><v> 0.0000000000e+00 </v><v> 4.8316416000e+05 </v><v>
3.9448000000e+02 </v><v> 6.6520000000e+01 </v><v> 1.6264000000e+03
</v><v> 6.9551718400e+06 </v><v> 6.7595400000e+03 </v></row>

<!-- 2006-06-13 04:46:20 PDT / 1150199180 --> <row><v> 4.5400000000e+00
</v><v> 0.0000000000e+00 </v><v> 6.6109440000e+04 </v><v>
2.0950000000e+02 </v><v> 3.8398000000e+02 </v><v> 1.4032400000e+04
</v><v> 5.9076362240e+07 </v><v> 1.1444134000e+05 </v></row>

<!-- 2006-06-13 04:46:30 PDT / 1150199190 --> <row><v> 1.6000000000e-01
</v><v> 0.0000000000e+00 </v><v> 8.8473600000e+03 </v><v>
4.6380000000e+01 </v><v> 4.5360000000e+02 </v><v> 1.6521100000e+04
</v><v> 6.9519687680e+07 </v><v> 1.2407434000e+05 </v></row>

<!-- 2006-06-13 04:46:40 PDT / 1150199200 --> <row><v> 4.8000000000e+00
</v><v> 4.0800000000e+00 </v><v> 2.0709376000e+05 </v><v>
9.6792000000e+02 </v><v> 4.7152000000e+02 </v><v> 1.7347260000e+04
</v><v> 7.2984494080e+07 </v><v> 1.6330636000e+05 </v></row>

<!-- 2006-06-13 04:46:50 PDT / 1150199210 --> <row><v> 1.9200000000e+00
</v><v> 1.0200000000e+00 </v><v> 5.5377920000e+04 </v><v>
1.0367800000e+03 </v><v> 3.3084000000e+02 </v><v> 1.1426240000e+04
</v><v> 4.8128081920e+07 </v><v> 7.9184760000e+04 </v></row>

<!-- 2006-06-13 04:47:00 PDT / 1150199220 --> <row><v> 6.6000000000e-01
</v><v> 1.6000000000e-01 </v><v> 2.0889600000e+04 </v><v>
4.1670000000e+02 </v><v> 4.8528000000e+02 </v><v> 1.7297900000e+04
</v><v> 2.7680645120e+07 </v><v> 9.5287500000e+04 </v></row>

<!-- 2006-06-13 04:47:10 PDT / 1150199230 --> <row><v> 4.5200000000e+00
</v><v> 1.2400000000e+00 </v><v> 1.2689408000e+05 </v><v>
9.7274000000e+02 </v><v> 2.6228000000e+02 </v><v> 9.1928000000e+03
</v><v> 4.8369971200e+06 </v><v> 5.0399680000e+04 </v></row>

<!-- 2006-06-13 04:47:20 PDT / 1150199240 --> <row><v> 1.0300000000e+01
</v><v> 8.0600000000e+00 </v><v> 1.0878976000e+06 </v><v>
8.8580000000e+02 </v><v> 3.6956000000e+02 </v><v> 1.2430100000e+04
</v><v> 0.0000000000e+00 </v><v> 3.9692300000e+04 </v></row>

<!-- 2006-06-13 04:47:30 PDT / 1150199250 --> <row><v> 1.2654000000e+02
</v><v> 1.1354000000e+02 </v><v> 1.5540142080e+07 </v><v>
1.4431800000e+03 </v><v> 3.9010000000e+02 </v><v> 1.0269400000e+04
</v><v> 0.0000000000e+00 </v><v> 3.5053860000e+04 </v></row>

<!-- 2006-06-13 04:47:40 PDT / 1150199260 --> <row><v> 1.4754000000e+02
</v><v> 1.3510000000e+02 </v><v> 1.8334187520e+07 </v><v>
1.3901800000e+03 </v><v> 1.6462000000e+02 </v><v> 4.6180200000e+03
</v><v> 0.0000000000e+00 </v><v> 1.5310560000e+04 </v></row>

<!-- 2006-06-13 04:47:50 PDT / 1150199270 --> <row><v> 1.2744000000e+02
</v><v> 1.1920000000e+02 </v><v> 1.6036741120e+07 </v><v>
1.4152000000e+03 </v><v> 2.3946000000e+02 </v><v> 7.8937000000e+03
</v><v> 0.0000000000e+00 </v><v> 2.9200320000e+04 </v></row>

<!-- 2006-06-13 04:48:00 PDT / 1150199280 --> <row><v> 1.0642000000e+02
</v><v> 9.8620000000e+01 </v><v> 1.3367214080e+07 </v><v>
9.7938000000e+02 </v><v> 2.4606000000e+02 </v><v> 7.6081000000e+03
</v><v> 0.0000000000e+00 </v><v> 1.8369940000e+04 </v></row>

<!-- 2006-06-13 04:48:10 PDT / 1150199290 --> <row><v> 1.2758000000e+02
</v><v> 1.2040000000e+02 </v><v> 1.6189603840e+07 </v><v>
4.9320000000e+02 </v><v> 2.4320000000e+02 </v><v> 7.7751400000e+03
</v><v> 0.0000000000e+00 </v><v> 1.0256640000e+04 </v></row>

<!-- 2006-06-13 04:48:20 PDT / 1150199300 --> <row><v> 1.1990000000e+02
</v><v> 1.1346000000e+02 </v><v> 1.4981447680e+07 </v><v>
1.0237000000e+03 </v><v> 2.4034000000e+02 </v><v> 8.0873800000e+03
</v><v> 0.0000000000e+00 </v><v> 1.9117940000e+04 </v></row>

<!-- 2006-06-13 04:48:30 PDT / 1150199310 --> <row><v> 8.3840000000e+01
</v><v> 7.9220000000e+01 </v><v> 1.0366648320e+07 </v><v>
1.2098400000e+03 </v><v> 2.8236000000e+02 </v><v> 9.1867000000e+03
</v><v> 0.0000000000e+00 </v><v> 2.2236340000e+04 </v></row>

<!-- 2006-06-13 04:48:40 PDT / 1150199320 --> <row><v> 1.5986000000e+02
</v><v> 1.5222000000e+02 </v><v> 1.9941376000e+07 </v><v>
1.9680400000e+03 </v><v> 2.8014000000e+02 </v><v> 9.5538600000e+03
</v><v> 0.0000000000e+00 </v><v> 3.3767060000e+04 </v></row>

<!-- 2006-06-13 04:48:50 PDT / 1150199330 --> <row><v> 1.1482000000e+02
</v><v> 1.1048000000e+02 </v><v> 1.4502051840e+07 </v><v>
7.0730000000e+02 </v><v> 2.2274000000e+02 </v><v> 7.4730000000e+03
</v><v> 0.0000000000e+00 </v><v> 1.7301980000e+04 </v></row>

<!-- 2006-06-13 04:49:00 PDT / 1150199340 --> <row><v> 1.7806000000e+02
</v><v> 1.7004000000e+02 </v><v> 2.2318940160e+07 </v><v>
1.5447200000e+03 </v><v> 2.7192000000e+02 </v><v> 9.0075800000e+03
</v><v> 0.0000000000e+00 </v><v> 3.0953320000e+04 </v></row>

<!-- 2006-06-13 04:49:10 PDT / 1150199350 --> <row><v> 1.8648000000e+02
</v><v> 1.7896000000e+02 </v><v> 2.3477534720e+07 </v><v>
8.6108000000e+02 </v><v> 2.1600000000e+02 </v><v> 7.1583400000e+03
</v><v> 0.0000000000e+00 </v><v> 1.5901060000e+04 </v></row>

<!-- 2006-06-13 04:49:20 PDT / 1150199360 --> <row><v> 1.0296000000e+02
</v><v> 9.3940000000e+01 </v><v> 1.2547522560e+07 </v><v>
9.1720000000e+02 </v><v> 2.5154000000e+02 </v><v> 8.3847200000e+03
</v><v> 0.0000000000e+00 </v><v> 2.5104540000e+04 </v></row>

<!-- 2006-06-13 04:49:30 PDT / 1150199370 --> <row><v> 1.7368000000e+02
</v><v> 1.5946000000e+02 </v><v> 2.1283880960e+07 </v><v>
1.3132600000e+03 </v><v> 2.0258000000e+02 </v><v> 6.4919800000e+03
</v><v> 0.0000000000e+00 </v><v> 2.2207940000e+04 </v></row>

<!-- 2006-06-13 04:49:40 PDT / 1150199380 --> <row><v> 2.8608000000e+02
</v><v> 2.7532000000e+02 </v><v> 3.6203479040e+07 </v><v>
1.0423600000e+03 </v><v> 4.5240000000e+01 </v><v> 1.1860800000e+03
</v><v> 0.0000000000e+00 </v><v> 4.3408000000e+03 </v></row>

<!-- 2006-06-13 04:49:50 PDT / 1150199390 --> <row><v> 3.2818000000e+02
</v><v> 3.1594000000e+02 </v><v> 4.1479372800e+07 </v><v>
1.0179600000e+03 </v><v> 6.6800000000e+00 </v><v> 5.6600000000e+00
</v><v> 0.0000000000e+00 </v><v> 2.0362000000e+02 </v></row>

<!-- 2006-06-13 04:50:00 PDT / 1150199400 --> <row><v> 2.6814000000e+02
</v><v> 2.5428000000e+02 </v><v> 3.3394032640e+07 </v><v>
8.6700000000e+02 </v><v> 7.7400000000e+00 </v><v> 7.2600000000e+00
</v><v> 0.0000000000e+00 </v><v> 2.3384000000e+02 </v></row>

<!-- 2006-06-13 04:50:10 PDT / 1150199410 --> <row><v> 2.5702000000e+02
</v><v> 2.4660000000e+02 </v><v> 3.2066191360e+07 </v><v>
9.2914000000e+02 </v><v> 1.7400000000e+01 </v><v> 1.0020000000e+01
</v><v> 0.0000000000e+00 </v><v> 7.6438000000e+02 </v></row>

<!-- 2006-06-13 04:50:20 PDT / 1150199420 --> <row><v> 3.6606000000e+02
</v><v> 3.4896000000e+02 </v><v> 4.5922713600e+07 </v><v>
1.2527000000e+03 </v><v> 2.3860000000e+01 </v><v> 2.4760000000e+01
</v><v> 0.0000000000e+00 </v><v> 1.1065600000e+03 </v></row>

<!-- 2006-06-13 04:50:30 PDT / 1150199430 --> <row><v> 1.9396000000e+02
</v><v> 1.3592000000e+02 </v><v> 1.8265374720e+07 </v><v>
7.3758000000e+02 </v><v> 1.1300000000e+01 </v><v> 1.2940000000e+01
</v><v> 0.0000000000e+00 </v><v> 4.1210000000e+02 </v></row>

<!-- 2006-06-13 04:50:40 PDT / 1150199440 --> <row><v> 1.8572000000e+02
</v><v> 1.5760000000e+01 </v><v> 3.3942732800e+06 </v><v>
1.2231600000e+03 </v><v> 7.5820000000e+01 </v><v> 6.8060000000e+01
</v><v> 0.0000000000e+00 </v><v> 8.2602400000e+03 </v></row>

<!-- 2006-06-13 04:50:50 PDT / 1150199450 --> <row><v> 4.0020000000e+01
</v><v> 1.2000000000e-01 </v><v> 3.3841152000e+05 </v><v>
2.8212000000e+02 </v><v> 1.3984000000e+02 </v><v> 4.3652000000e+03
</v><v> 0.0000000000e+00 </v><v> 1.6755080000e+04 </v></row>

<!-- 2006-06-13 04:51:00 PDT / 1150199460 --> <row><v> 5.2000000000e-01
</v><v> 0.0000000000e+00 </v><v> 5.3248000000e+03 </v><v>
4.7202000000e+02 </v><v> 4.2744000000e+02 </v><v> 1.5663240000e+04
</v><v> 0.0000000000e+00 </v><v> 1.1500156000e+05 </v></row>

<!-- 2006-06-13 04:51:10 PDT / 1150199470 --> <row><v> 1.7600000000e+00
</v><v> 8.0000000000e-01 </v><v> 5.8736640000e+04 </v><v>
4.6340000000e+02 </v><v> 4.5240000000e+02 </v><v> 1.6712340000e+04
</v><v> 0.0000000000e+00 </v><v> 1.5014364000e+05 </v></row>

<!-- 2006-06-13 04:51:20 PDT / 1150199480 --> <row><v> 6.3836363636e+00
</v><v> 2.6727272727e+00 </v><v> 2.8983296000e+05 </v><v>
1.4613490909e+03 </v><v> 2.9322545455e+02 </v><v> 1.0697625455e+04
</v><v> 0.0000000000e+00 </v><v> 1.1446194727e+05 </v></row>

<!-- 2006-06-13 04:51:30 PDT / 1150199490 --> <row><v> 2.3763636364e+00
</v><v> 9.2727272727e-01 </v><v> 1.0379264000e+05 </v><v>
6.8912090909e+02 </v><v> 5.1176454545e+02 </v><v> 1.7872574545e+04
</v><v> 0.0000000000e+00 </v><v> 1.6839390273e+05 </v></row>

<!-- 2006-06-13 04:51:40 PDT / 1150199500 --> <row><v> 3.2800000000e+00
</v><v> 1.6800000000e+00 </v><v> 2.5944064000e+05 </v><v>
7.4490000000e+02 </v><v> 4.8571000000e+02 </v><v> 1.7000200000e+04
</v><v> 0.0000000000e+00 </v><v> 1.3837673000e+05 </v></row>

<!-- 2006-06-13 04:51:50 PDT / 1150199510 --> <row><v> 4.5066666667e+00
</v><v> 3.1000000000e+00 </v><v> 4.6394026667e+05 </v><v>
1.4024233333e+03 </v><v> 3.3599333333e+02 </v><v> 1.1851906667e+04
</v><v> 0.0000000000e+00 </v><v> 6.1147620000e+04 </v></row>

<!-- 2006-06-13 04:52:00 PDT / 1150199520 --> <row><v> 3.5193333333e+01
</v><v> 3.2480000000e+01 </v><v> 4.3280520533e+06 </v><v>
1.5280866667e+03 </v><v> 2.7060666667e+02 </v><v> 9.2191733333e+03
</v><v> 0.0000000000e+00 </v><v> 3.3531560000e+04 </v></row>

<!-- 2006-06-13 04:52:10 PDT / 1150199530 --> <row><v> 1.4644000000e+02
</v><v> 1.3746000000e+02 </v><v> 1.8141757440e+07 </v><v>
1.2226600000e+03 </v><v> 2.0186000000e+02 </v><v> 5.2211600000e+03
</v><v> 0.0000000000e+00 </v><v> 1.6387280000e+04 </v></row>

<!-- 2006-06-13 04:52:20 PDT / 1150199540 --> <row><v> 1.1116000000e+02
</v><v> 1.0406000000e+02 </v><v> 1.3746503680e+07 </v><v>
9.3252000000e+02 </v><v> 3.2246000000e+02 </v><v> 9.0490000000e+03
</v><v> 0.0000000000e+00 </v><v> 2.5585400000e+04 </v></row>

<!-- 2006-06-13 04:52:30 PDT / 1150199550 --> <row><v> 1.5018000000e+02
</v><v> 1.4474000000e+02 </v><v> 1.9046645760e+07 </v><v>
5.9924000000e+02 </v><v> 2.3572000000e+02 </v><v> 7.5410800000e+03
</v><v> 0.0000000000e+00 </v><v> 1.4281340000e+04 </v></row>

<!-- 2006-06-13 04:52:40 PDT / 1150199560 --> <row><v> 1.2606000000e+02
</v><v> 1.1768000000e+02 </v><v> 1.5927541760e+07 </v><v>
9.6674000000e+02 </v><v> 2.4914000000e+02 </v><v> 7.7688600000e+03
</v><v> 0.0000000000e+00 </v><v> 1.5147200000e+04 </v></row>

<!-- 2006-06-13 04:52:50 PDT / 1150199570 --> <row><v> 1.4094000000e+02
</v><v> 1.3264000000e+02 </v><v> 1.7870929920e+07 </v><v>
1.0318000000e+03 </v><v> 2.5456000000e+02 </v><v> 8.7017600000e+03
</v><v> 0.0000000000e+00 </v><v> 2.3799120000e+04 </v></row>

<!-- 2006-06-13 04:53:00 PDT / 1150199580 --> <row><v> 1.7334000000e+02
</v><v> 1.6426000000e+02 </v><v> 2.1756887040e+07 </v><v>
1.2245800000e+03 </v><v> 2.4722000000e+02 </v><v> 8.3767800000e+03
</v><v> 0.0000000000e+00 </v><v> 2.4053820000e+04 </v></row>

<!-- 2006-06-13 04:53:10 PDT / 1150199590 --> <row><v> 1.4576000000e+02
</v><v> 1.3558000000e+02 </v><v> 1.7786798080e+07 </v><v>
1.0554600000e+03 </v><v> 2.2854000000e+02 </v><v> 7.3302800000e+03
</v><v> 0.0000000000e+00 </v><v> 1.7958680000e+04 </v></row>

<!-- 2006-06-13 04:53:20 PDT / 1150199600 --> <row><v> 9.2580000000e+01
</v><v> 8.9540000000e+01 </v><v> 1.1738152960e+07 </v><v>
3.7098000000e+02 </v><v> 2.7220000000e+02 </v><v> 8.9677800000e+03
</v><v> 0.0000000000e+00 </v><v> 1.3195500000e+04 </v></row>

<!-- 2006-06-13 04:53:30 PDT / 1150199610 --> <row><v> 1.6360000000e+02
</v><v> 1.5750000000e+02 </v><v> 2.0686520320e+07 </v><v>
9.2692000000e+02 </v><v> 2.5470000000e+02 </v><v> 8.5426400000e+03
</v><v> 0.0000000000e+00 </v><v> 2.0314200000e+04 </v></row>

<!-- 2006-06-13 04:53:40 PDT / 1150199620 --> <row><v> 9.9520000000e+01
</v><v> 9.5740000000e+01 </v><v> 1.2565708800e+07 </v><v>
4.5402000000e+02 </v><v> 2.3864000000e+02 </v><v> 7.7053800000e+03
</v><v> 0.0000000000e+00 </v><v> 1.6218100000e+04 </v></row>

<!-- 2006-06-13 04:53:50 PDT / 1150199630 --> <row><v> 1.5040000000e+02
</v><v> 1.4376000000e+02 </v><v> 1.8909511680e+07 </v><v>
1.0378000000e+03 </v><v> 2.6438000000e+02 </v><v> 8.9482200000e+03
</v><v> 0.0000000000e+00 </v><v> 2.4949780000e+04 </v></row>

<!-- 2006-06-13 04:54:00 PDT / 1150199640 --> <row><v> 1.4756000000e+02
</v><v> 1.3576000000e+02 </v><v> 1.8108907520e+07 </v><v>
1.1153000000e+03 </v><v> 2.3734000000e+02 </v><v> 8.0448000000e+03
</v><v> 0.0000000000e+00 </v><v> 2.5243160000e+04 </v></row>






