← Lessons

quiz vs the machine

Gold1440

Algorithms

Bloom Filter Deep Dive

A bit array and several hashes give compact set membership with no false negatives.

6 min read · core · beat Gold to climb

A compact probabilistic set

A Bloom filter answers membership questions using a bit array and several independent hash functions. It can say an item is definitely not present or possibly present, trading a small false positive rate for tiny memory.

Insert and query

To insert an element, hash it with each function, map each hash to a bit position, and set those bits to one. To query, hash the same way and check those bits.

  • If any bit is zero, the item is certainly absent.
  • If all bits are one, the item is probably present.

False positives happen when other insertions happen to set all the queried bits.

Tuning the parameters

The false positive rate depends on the number of bits, the number of items, and the number of hash functions. For a target rate there is an optimal hash count that balances filling the array too fast against checking too few positions.

  • More bits per item lowers the error.
  • Too many hashes saturate the array and raise errors.

What it cannot do

A standard Bloom filter has no deletion, since clearing a bit could erase other elements. Counting variants store small counters instead of single bits to allow removal.

Key idea

A Bloom filter sets bits chosen by several hashes per item, so a zero bit proves absence and all ones suggests presence, giving compact membership with tunable false positives but no deletion in the basic form.

Check yourself

Answer to earn rating on the learn ladder.

1. What can a Bloom filter never produce?

2. Why does a basic Bloom filter not support deletion?

3. What happens if too many hash functions are used?