What
A space-efficient probabilistic data structure to test whether an element is a member of a set.
Only false positive, no false negative.
Implementation
Use bits array and hashing functions. Apply all functions to the input and , then we can set all results bits to 1.
Design
- Use random keys in order to minimize the chance that a false positive for one user is a false positive for all.
- If it returns negative, then we can guarantee that it’s not in the bloom filter.
- If it returns positive, then it may be in bloom filter.
Links
- chrome/browser/safe_browsing/bloom_filter.cc - chromium/chromium - Git at Google
- Bloom filter calculator
- Algorithm
- Probability
- Hashing
- Counting Filter / Cuckoo Filter
There is also another filter called cuckoo filter, which should be practically better than bloom filter since it supports deletion.