Double hashing

Published

2023-08-05

Double hashing is designed to reduce clustering. It does this by calculating the stride (the step between successive probes) for a given key using a second, independent hash function. Thus, two keys will have the same probe sequence only if they collide in the output of both the primary hash function and the secondary hash function. If these functions are well designed, the probability of this occurring for any given pair of keys should be very small: on the order of 1 / n^2, where n is the table size.
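To make the idea concrete, here is a minimal sketch in Python of an open-addressing table that uses double hashing. The class name DoubleHashTable, the table size of 11, and both hash functions (key mod size for the start, 1 + key mod (size - 1) for the stride) are assumptions chosen only for illustration. A prime table size is assumed so that every stride visits every slot.

```python
class DoubleHashTable:
    """Illustrative open-addressing table using double hashing (integer keys)."""

    def __init__(self, size=11):
        # Assumption: size is prime, so any stride in 1..size-1 is coprime
        # to size and the probe sequence reaches every slot.
        self.size = size
        self.slots = [None] * size

    def _start(self, key):
        # Primary hash: where the probe sequence begins.
        return key % self.size

    def _stride(self, key):
        # Secondary hash: step between probes. Always in 1..size-1, never zero.
        return 1 + (key % (self.size - 1))

    def probe_sequence(self, key):
        start, stride = self._start(key), self._stride(key)
        return [(start + i * stride) % self.size for i in range(self.size)]

    def insert(self, key):
        # Walk the probe sequence until an empty slot (or the key itself) is found.
        for index in self.probe_sequence(key):
            if self.slots[index] is None or self.slots[index] == key:
                self.slots[index] = key
                return index
        raise RuntimeError("table is full")

    def contains(self, key):
        for index in self.probe_sequence(key):
            if self.slots[index] is None:
                return False  # an empty slot ends the search
            if self.slots[index] == key:
                return True
        return False


table = DoubleHashTable()
print(table.insert(0))   # start 0, stride 1: lands in slot 0
print(table.insert(11))  # start 0, stride 2: slot 0 is taken, lands in slot 2
print(table.insert(22))  # start 0, stride 3: slot 0 is taken, lands in slot 3
```

The keys 0, 11, and 22 all collide on the primary hash, but each gets a different stride, so they do not pile up in adjacent slots. With these particular functions, two keys share a whole probe sequence only when they agree on both hashes (for example, 0 and 110). This sketch omits deletion and resizing.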

Comprehension check:

  1. With double hashing we calculate not only the index where we wish to insert (that is, the start of our probe sequence), but we calculate the _____________ as well.
  2. Our secondary hash function should never return the value ________ because if it did, we would never probe beyond the initial position.
  3. Ideally, our primary and secondary hash functions should be ___________________ of one another.
  4. If our primary hash function is f(x) = x mod 7, and our secondary hash function is g(x) = 5 - (x mod 5), then for a key value of 8 the first three steps in our probe sequence would be ___________, ____________, ___________.
  5. Double hashing is designed to prevent _______________.

Answers: ƃuᴉɹǝʇsnlɔ / ϛ ’Ɛ ’Ɩ / ʇuǝpuǝdǝpuᴉ ǝsᴉʍɹᴉɐd / oɹǝz / ǝpᴉɹʇs
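If you would like to check your work on question 4, a short sketch like the following may help. It uses the f and g given in the question. The table size of 7 is an assumption (suggested by the mod 7 in f), and the helper name first_probes is made up for this example.

```python
def f(x):
    # Primary hash function from question 4: start of the probe sequence.
    return x % 7


def g(x):
    # Secondary hash function from question 4: the stride.
    # Its value is always in 1..5, so it is never zero.
    return 5 - (x % 5)


def first_probes(key, table_size=7, count=3):
    # Assumption: the table has 7 slots, matching the modulus in f.
    start, stride = f(key), g(key)
    return [(start + i * stride) % table_size for i in range(count)]


print(first_probes(8))
```

Run it and compare the printed list with your own answer.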

Original author: Clayton Cafiero < [given name] DOT [surname] AT uvm DOT edu >

No generative AI was used in producing this material. This was written the old-fashioned way.

All materials copyright © 2020–2023, The University of Vermont. All rights reserved.