[collectd] patch for email stats from postfix and amavisd-new, dns stats from powerdns
Luke Heberling
collectd at c-ware.com
Sun Dec 2 09:28:21 CET 2007
Florian Forster wrote:
> Why primes? 2^k-1 I would have understood, basically anything that
> has to with padding and avoiding extra bytes, but primes? What do
> they have to do with it?
I've done some experimenting. Here are some results.
I've abstracted the calc_table_len code into a function that is passed
into the creating of the hash table. I then added two more hash
algorithms and pulled the hash related functions into their own source
file, utils_hash.{c,h}. Then did some testing of the hash distribution
for each hash/table_length algorithm with the 98569 words in the
american english list from the scowl project
(http://wordlist.sourceforge.net). The mru cache uses a chained hash
table so it's easy to look at the number of collisions.
Collisions
Hash/Table_Length Algo 0 1 2 3 4 5 6
sdbm/prime 60883 14826 2293 257 23 2
sdbm/power_two 46313 17353 4528 815 125 10 3
sdbm/power_two_minus_1 46180 17615 4348 839 129 19
djb2/prime 60846 14606 2411 288 24 1
djb2/power_two 46913 17408 4354 787 114 10
djb2/power_two_minus_1 46684 17470 4351 805 110 18 2
jenkins/prime 60949 14593 2369 290 31 2
jenkins/power_two 46254 17602 4292 846 148 15 3
jenkins/power_two_minus_1 60883 14826 2293 257 23 2
Clearly the prime table length provides the best distribution among all
the hash algorithms. But the jenkins hash performed pretty much as well
with the power-of-two-minus-one table length, indicating that it is a
preferred hash function, at least with this data set.
I like the idea of a separate utils_hash file, as well as the pluggable
table length calculation functions. If you have the time, let me know
anything that jumps out at you about that organization or the utils_hash
and utils_mrucache modules. Meantime, I'm continuing to test the
rewritten amavis/postfix/powerdns plugins.
Luke
-------------- next part --------------
A non-text attachment was scrubbed...
Name: tests.patch
Type: text/x-diff
Size: 9843 bytes
Desc: not available
Url : http://mailman.verplant.org/pipermail/collectd/attachments/20071202/3e089517/attachment-0003.patch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: utils_hash.patch
Type: text/x-diff
Size: 14497 bytes
Desc: not available
Url : http://mailman.verplant.org/pipermail/collectd/attachments/20071202/3e089517/attachment-0004.patch
-------------- next part --------------
A non-text attachment was scrubbed...
Name: utils_mrucache.patch
Type: text/x-diff
Size: 17177 bytes
Desc: not available
Url : http://mailman.verplant.org/pipermail/collectd/attachments/20071202/3e089517/attachment-0005.patch
More information about the collectd
mailing list