Linear probing

Published

2023-08-05

Linear probing

Linear probing is a collision resolution strategy. When a collision occurs on insert, we probe the hash table in a linear, stepwise fashion to find the next available position in which to store our new object. The sequence of indices we visit during this procedure is called the “probe sequence.” We follow the same probe sequence when finding and removing objects. There is a twist when it comes to removing objects from our hash table: if we reset a position to the empty state, we can break the probe sequence, because a later search stops at the first empty position it meets and never reaches objects stored beyond it. Accordingly, we mark these positions as deleted rather than empty. This way we preserve our probe sequence.
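The idea above can be sketched as a small fixed-capacity set of integers. This is a minimal illustration under stated assumptions, not the course's implementation: the names `ProbingSet`, `Slot`, and `SlotState` are hypothetical, the hash is a stand-in (`key % capacity`), and the stride is 1.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a sketch of linear probing with "deleted" markers.
enum class SlotState { Empty, Occupied, Deleted };

struct Slot {
    int key = 0;
    SlotState state = SlotState::Empty;
};

class ProbingSet {
public:
    explicit ProbingSet(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Returns true if key was stored; false if it was already present
    // or no empty/deleted position exists.
    bool insert(int key) {
        const std::size_t n = slots_.size();
        const std::size_t start = home(key);
        std::size_t target = n;  // n means "no reusable position seen yet"
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t idx = (start + i) % n;  // stride of 1
            const Slot& s = slots_[idx];
            if (s.state == SlotState::Occupied) {
                if (s.key == key) return false;  // duplicate
                continue;
            }
            if (target == n) target = idx;  // first empty or deleted position
            if (s.state == SlotState::Empty) break;  // key cannot lie beyond here
        }
        if (target == n) return false;  // table is full
        slots_[target].key = key;
        slots_[target].state = SlotState::Occupied;
        return true;
    }

    bool contains(int key) const { return locate(key) != slots_.size(); }

    // Marks the position as Deleted (not Empty) so probe sequences that
    // pass through it still reach objects stored further along.
    bool remove(int key) {
        const std::size_t idx = locate(key);
        if (idx == slots_.size()) return false;
        slots_[idx].state = SlotState::Deleted;
        return true;
    }

private:
    // Stand-in hash function (an assumption for this sketch).
    std::size_t home(int key) const {
        return static_cast<std::size_t>(key) % slots_.size();
    }

    // Follows the probe sequence; returns slots_.size() if key is absent.
    std::size_t locate(int key) const {
        const std::size_t n = slots_.size();
        const std::size_t start = home(key);
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t idx = (start + i) % n;
            const Slot& s = slots_[idx];
            if (s.state == SlotState::Empty) break;  // end of probe sequence
            if (s.state == SlotState::Occupied && s.key == key) return idx;
        }
        return n;
    }

    std::vector<Slot> slots_;
};
```

With a capacity of 7, the keys 1, 8, and 15 all hash to index 1 and land at indices 1, 2, and 3. Removing 8 leaves a deleted marker at index 2, so a search for 15 still probes past it; had index 2 been reset to empty, that search would stop early and miss 15.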

Rehashing

When our hash table becomes too crowded, performance degrades due to increased frequency of collisions and longer probe sequences. Therefore, when our hash table is too full, it’s time to rehash. When we rehash we create a new, larger hash table, and then insert all the objects from the old hash table into the new hash table, rehashing to produce a new index for each object. When this is complete, we delete the old table. Now our data reside in a less crowded hash table. That’s the idea behind rehashing.
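A rough sketch of rehashing follows. Several details are assumptions made for illustration rather than facts from these notes: the names `RehashingSet` and `nextPrimeAfter`, the stand-in hash `key % capacity`, a rehash threshold of one half, and counting deleted positions toward fullness. The new size is the next prime after twice the old size.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

enum class SlotState { Empty, Occupied, Deleted };

struct Slot {
    int key = 0;
    SlotState state = SlotState::Empty;
};

// Smallest prime strictly greater than n (trial division; fine for a sketch).
std::size_t nextPrimeAfter(std::size_t n) {
    for (std::size_t c = n + 1;; ++c) {
        bool prime = c >= 2;
        for (std::size_t d = 2; d * d <= c; ++d) {
            if (c % d == 0) { prime = false; break; }
        }
        if (prime) return c;
    }
}

// Hypothetical class; the threshold of 1/2 is an assumption.
class RehashingSet {
public:
    explicit RehashingSet(std::size_t capacity = 7) : slots_(capacity) {
        assert(capacity > 0);
    }

    void insert(int key) {
        // Check for rehash only when a previously Empty position was filled;
        // reusing a Deleted position does not make the table more crowded.
        if (place(slots_, key)) {
            ++nonEmpty_;
            if (2 * nonEmpty_ > slots_.size()) rehash();
        }
    }

    bool remove(int key) {
        const std::size_t n = slots_.size();
        for (std::size_t i = 0; i < n; ++i) {
            Slot& s = slots_[(index(key, n) + i) % n];
            if (s.state == SlotState::Empty) return false;
            if (s.state == SlotState::Occupied && s.key == key) {
                s.state = SlotState::Deleted;  // capacity is unchanged
                return true;
            }
        }
        return false;
    }

    bool contains(int key) const {
        const std::size_t n = slots_.size();
        for (std::size_t i = 0; i < n; ++i) {
            const Slot& s = slots_[(index(key, n) + i) % n];
            if (s.state == SlotState::Empty) return false;
            if (s.state == SlotState::Occupied && s.key == key) return true;
        }
        return false;
    }

    std::size_t capacity() const { return slots_.size(); }

private:
    static std::size_t index(int key, std::size_t n) {
        return static_cast<std::size_t>(key) % n;  // stand-in hash
    }

    // Stores key in table; returns true only if an Empty position was filled.
    static bool place(std::vector<Slot>& table, int key) {
        const std::size_t n = table.size();
        std::size_t target = n;
        for (std::size_t i = 0; i < n; ++i) {
            const std::size_t idx = (index(key, n) + i) % n;
            const Slot& s = table[idx];
            if (s.state == SlotState::Occupied) {
                if (s.key == key) return false;  // already present
                continue;
            }
            if (target == n) target = idx;
            if (s.state == SlotState::Empty) break;
        }
        if (target == n) return false;
        const bool wasEmpty = table[target].state == SlotState::Empty;
        table[target].key = key;
        table[target].state = SlotState::Occupied;
        return wasEmpty;
    }

    void rehash() {
        std::vector<Slot> bigger(nextPrimeAfter(2 * slots_.size()));
        std::size_t live = 0;
        for (const Slot& s : slots_) {
            if (s.state == SlotState::Occupied) {  // deleted markers are dropped
                place(bigger, s.key);              // new index in the new table
                ++live;
            }
        }
        slots_ = std::move(bigger);  // the old table's storage is released
        nonEmpty_ = live;
    }

    std::vector<Slot> slots_;
    std::size_t nonEmpty_ = 0;  // Occupied plus Deleted positions
};
```

Starting from a capacity of 7, inserting 1, 8, 15, and 22 pushes the table past half full on the fourth insert, so it grows to 17 (the next prime after 14). In the old table 22 sat at index 4; in the new one it hashes to index 5, which shows why entries must be rehashed rather than copied to the same indices.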

Where we implement linear probing and rehashing in C++.

Comprehension check:

  1. We’ve seen how we must mark positions as deleted rather than empty in order to preserve probe sequences that might include the deleted position. On insert, is it safe to overwrite a deleted position with a new object? Yes or no.
  2. True or false? When we rehash we increase the table size to the next prime after 2 x table size, and then we copy entries from the old table into the new table at the same indices.
  3. In a hash table with linear probing, if we remove items and don’t mark their positions as “removed” or “deleted”, then we can break the ________________________.
  4. The amount by which we increment our index when linear probing is called the ______________. This is typically set to 1.
  5. True or false? We check to see if we need to rehash on each insertion where we are not overwriting an index previously flagged as “removed.”
  6. True or false? When probing on an insert we are looking for a position that is flagged as “empty” or “removed.”
  7. True or false? When we remove an object from our hash table we reduce our hash table size by one.

Answers: ǝslɐɟ / ǝnɹʇ / ǝnɹʇ / ǝpᴉɹʇs / ǝɔuǝnbǝs ǝqoɹd / ǝslɐɟ / sǝʎ

Original author: Clayton Cafiero < [given name] DOT [surname] AT uvm DOT edu >

No generative AI was used in producing this material. This was written the old-fashioned way.

All materials copyright © 2020–2023, The University of Vermont. All rights reserved.