The query
A rider opens the app and the system must return the nearest available drivers within a small radius, fast, while drivers move constantly. The hard part is doing this for millions of rapidly changing positions.
The standard recipe
- Index by cell. Bucket each driver into a spatial cell using a geohash, quadtree, or S2 cell id.
- Read by cell. For a rider, compute the covering cell and pull candidates from that cell and its neighbors.
- Refine. Compute exact great circle distance to each candidate, filter by radius, and rank.
Keeping it fast under churn
- Store positions in an in memory store keyed by cell so writes are cheap and reads touch few keys.
- Expire stale positions so a driver who went offline disappears quickly.
- Choose cell size so a typical cell holds a manageable candidate count, not too few and not too many.
Key idea
Proximity search is a coarse cell lookup over neighbors followed by an exact distance refine, all over a fast moving in memory index.