Hashtable
# What
- Associate array or dictionary
std::unordered_map
in C++ STL- Good for looking up
# Implementation
The size of the bucket will be power of 2, since it guarantee to be all 1 in binary. What we only need to do is a Bitwise AND operation and that’s the module trick.
E.g. 16 is a power of 2. If you subtract 1 from it, you get 15, whose binary representation is 1111. Now, if you do a bitwise AND of any number with 1111, you’re going to get the last 4 bits of the number which, in other words, is equivalent to the modulo of the number by 16 (Division operation is usually an expensive operation. Hence, bitwise operation is usually preferred over division). These last 4 bits will evaluate to any number from 0 to 15 which are the indexes of your underlying array.
# Load Factor
The number of keys in bucket.
# Collision Resolution
Use Separate Chaining or Open Addressing to deal with it.
# Why Use a Prime Number as a Mod in a Hashing Function
We can not control the output of the hash function, so if it is a biased hash function that are likely to return multiple of n
, and after we mod n
it all goes to the same bucket which is not good.
- language agnostic - Why should hash functions use a prime number modulus? - Stack Overflow
- data structures - Why is it best to use a prime number as a mod in a hashing function? - Computer Science Stack Exchange