
"If memory overhead (load factor) isn't a big issue, readers may also find cuckoo hashing rather interesting."

Memory overhead (load factor) is actually something that Cuckoo hashing solves compared to linear probing: bucketized Cuckoo hashing supports much higher load factors than linear probing. You can have your cake and eat it too.

I have implemented hash tables with open addressing and tombstones in the past (linear probing, triangular probing, etc.), and I much prefer Cuckoo hashing now.

When Cuckoo hashing is combined with a bloom filter in the first position to reduce cache misses to the second position, it's almost perfect in every way: fast (drastically fewer cache misses), constant worst-case lookup time (not so for linear probing), and higher load factors (80% or more for Cuckoo hashing vs. at most 50% for linear probing, which also affects amortized resize latency), not to mention its elegant simplicity.

In case anyone is interested, here are some design details and decisions for an optimized bucketized Cuckoo hash table with bloom filters implemented for Node.js:

https://github.com/ronomon/hash-table
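To make the bloom-filter trick concrete, here is a minimal sketch of a bucketized cuckoo lookup where each first bucket carries a tiny bloom filter of keys displaced to their second position. All names and sizes here are hypothetical illustrations, not the linked library's actual API or layout:

```javascript
// Sketch of a bucketized cuckoo lookup (hypothetical, not the linked
// library's code). Each key hashes to two candidate buckets; a small
// bloom filter stored in the first bucket records which keys were
// displaced to the second, so most lookups touch only one cache line.

const BUCKETS = 1024;           // power of two for cheap masking
const SLOTS = 4;                // slots per bucket

// Two independent hash functions over integer keys (toy mixers).
function h1(key) { return (key * 2654435761 >>> 0) % BUCKETS; }
function h2(key) { return ((key ^ 61) * 2246822519 >>> 0) % BUCKETS; }

// A bucket: fixed slots plus an 8-bit bloom filter of displaced keys.
const table = Array.from({ length: BUCKETS }, () => ({ keys: [], bloom: 0 }));

function bloomBit(key) { return 1 << (key % 8); }

function lookup(key) {
  const b1 = table[h1(key)];
  if (b1.keys.includes(key)) return true;
  // Only probe the second bucket (a likely cache miss) if the bloom
  // filter says a key with this fingerprint was ever displaced there.
  if ((b1.bloom & bloomBit(key)) === 0) return false;
  return table[h2(key)].keys.includes(key);
}

function insert(key) {
  const b1 = table[h1(key)];
  if (b1.keys.includes(key)) return true;
  if (b1.keys.length < SLOTS) { b1.keys.push(key); return true; }
  // First bucket full: record the displacement in its bloom filter
  // and fall back to the second bucket (no eviction chain shown here).
  const b2 = table[h2(key)];
  if (b2.keys.includes(key)) return true;
  if (b2.keys.length < SLOTS) {
    b1.bloom |= bloomBit(key);
    b2.keys.push(key);
    return true;
  }
  return false; // both buckets full: a real table would evict or resize
}
```

The bloom filter can report a false positive (forcing an unnecessary second probe) but never a false negative, so lookups stay correct while most misses are answered from the first cache line.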



How does cuckoo hashing work at an 80% load factor? Both positions may fill up, and then your eviction algorithm is no longer constant time in the worst case.

AFAIK, cuckoo hashing works best when the load factor is less than 50%. Wikipedia seems to agree with me.


Did you read the link?

> Each bucket contains up to 8 elements to support a hash table load factor of 80% or higher, unlike linear or chained hash tables which support a load factor (elements / capacity) of at most 50% and which waste the other 50% of memory.
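The multi-slot buckets are what make the high load factor workable: with 8 slots per bucket an insert usually finds room without any eviction, and the eviction ("kick") loop is capped so only insertion cost is amortized, while lookups remain worst-case constant (at most two buckets). A minimal sketch, with hypothetical names and sizes rather than the linked library's actual code:

```javascript
// Sketch of bucketized cuckoo insertion with 8-slot buckets and a
// bounded eviction loop (hypothetical, not the linked library's code).
// Lookups are always worst-case constant (at most two buckets); the
// kick limit bounds *insertion* work, falling back to a resize when hit.

const BUCKETS = 64;
const SLOTS = 8;                // 8 slots/bucket enables high load factors
const MAX_KICKS = 32;           // cap on displacements per insert

const table = Array.from({ length: BUCKETS }, () => []);

function h1(key) { return (key * 2654435761 >>> 0) % BUCKETS; }
function h2(key) { return ((key ^ 61) * 2246822519 >>> 0) % BUCKETS; }

function insert(key) {
  let k = key;
  let i = h1(k);
  for (let kicks = 0; kicks < MAX_KICKS; kicks++) {
    const bucket = table[i];
    if (bucket.length < SLOTS) { bucket.push(k); return true; }
    // Bucket full: evict a random resident, take its slot, and
    // continue by re-homing the victim in its alternate bucket.
    const victimSlot = Math.floor(Math.random() * SLOTS);
    const victim = bucket[victimSlot];
    bucket[victimSlot] = k;
    k = victim;
    i = (i === h1(k)) ? h2(k) : h1(k);
  }
  return false; // kick limit hit: a real table would resize and retry
}

function contains(key) {
  // Worst-case constant: at most two bucket probes.
  return table[h1(key)].includes(key) || table[h2(key)].includes(key);
}

function loadFactor() {
  return table.reduce((n, b) => n + b.length, 0) / (BUCKETS * SLOTS);
}
```

With single-slot buckets, insertions start failing well below 50% load; with 8-slot buckets the same random-walk eviction comfortably sustains loads above 80%.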



