Double hashing

Author

Clayton Cafiero

Published

2025-01-05

Double hashing is designed to reduce clustering. It does this by calculating the stride (the step between successive probes) for a given key using a second, independent hash function. Thus, two keys will have the same probe sequence only if they collide in the output of both the primary hash function and the secondary hash function. If these functions are well designed, then the probability of this occurring should be very small: on the order of 1 / (n^2), where n is the table size.
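To make the idea concrete, here is a minimal sketch in Python. The names `primary_hash`, `secondary_hash`, and `probe_sequence`, the table size of 11, and the particular hash functions are assumptions chosen for illustration only; they are not taken from any particular library or textbook implementation.

```python
# Illustrative sketch of double hashing; all names and constants are hypothetical.

def primary_hash(key, table_size):
    """Start of the probe sequence: the index where we first try to insert."""
    return key % table_size


def secondary_hash(key):
    """Stride for this key. Returns a value in 1..7, so it is never zero."""
    return 7 - (key % 7)


def probe_sequence(key, table_size, count):
    """Return the first `count` indices probed for `key`."""
    start = primary_hash(key, table_size)
    stride = secondary_hash(key)
    return [(start + i * stride) % table_size for i in range(count)]


# Keys 15 and 26 collide under the primary hash (both start at index 4),
# but their strides differ (6 and 2), so their probe sequences diverge.
print(probe_sequence(15, 11, 4))
print(probe_sequence(26, 11, 4))
```

In this sketch the table size is prime, so any stride between 1 and 10 eventually visits every slot. The two keys share a probe sequence only if both hash functions collide, which is the situation the paragraph above describes as unlikely.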

Comprehension check:

With double hashing we calculate not only the index where we wish to insert (that is, the start of our probe sequence), but also the _____________.
Our secondary hash function should never return the value ________ because if it did, we would never probe beyond the initial position.
Ideally, our primary and secondary hash functions should be ___________________ of one another.
If our primary hash function is f(x) = x mod 7, and our secondary hash function is g(x) = 5 - (x mod 5), then for a key value of 8 the first three steps in our probe sequence would be ___________, ____________, ___________.
Double hashing is designed to prevent _______________.

Answers: ƃuᴉɹǝʇsnlɔ / ϛ ’Ɛ ’Ɩ / ʇuǝpuǝdǝpuᴉ ǝsᴉʍɹᴉɐd / oɹǝz / ǝpᴉɹʇs

No generative AI was used in producing this material. This was written the old-fashioned way.

Reuse

CC BY-NC-SA 4.0