Machine Learning

Vector Database Architecture

The moving parts that turn an index into a queryable service.

More than an index

An ANN index finds nearby vectors, but a vector database wraps that index in the machinery needed for real use. It must ingest vectors, persist them, serve queries, and stay consistent as data changes.

The core components

  • Ingestion: vectors and their metadata arrive and are written to storage.
  • Index builder: constructs and updates the ANN structure, such as an HNSW graph or an IVF index.
  • Query engine: accepts a query vector, or embeds the raw query into one, then searches the index and applies metadata filters.
  • Storage: keeps both the raw vectors and the index, often with persistence to disk.
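
The four components above can be sketched in a few lines of Python. This is a toy illustration, not how any particular database is built: the class name `TinyVectorStore`, its method names, and the `where` filter argument are all hypothetical, and an exhaustive scan stands in for the ANN index so the example stays self-contained.

```python
import math
from dataclasses import dataclass, field


@dataclass
class TinyVectorStore:
    """Toy in-memory store (hypothetical); real systems use HNSW or IVF and persist to disk."""

    vectors: dict = field(default_factory=dict)   # storage: id -> list of floats
    metadata: dict = field(default_factory=dict)  # storage: id -> metadata dict

    def ingest(self, vec_id, vector, meta=None):
        # Ingestion: write the vector and its metadata to storage.
        self.vectors[vec_id] = list(vector)
        self.metadata[vec_id] = dict(meta or {})

    def query(self, query_vector, k=3, where=None):
        # Query engine: apply the metadata filter, score candidates, return top-k ids.
        # Assumption: a brute-force scan replaces the ANN index lookup.
        hits = []
        for vec_id, vector in self.vectors.items():
            meta = self.metadata[vec_id]
            if where and any(meta.get(key) != val for key, val in where.items()):
                continue
            hits.append((math.dist(query_vector, vector), vec_id))
        hits.sort()
        return [vec_id for _, vec_id in hits[:k]]


store = TinyVectorStore()
store.ingest("a", [0.0, 0.0], {"lang": "en"})
store.ingest("b", [1.0, 0.0], {"lang": "de"})
store.ingest("c", [0.1, 0.1], {"lang": "en"})

nearest = store.query([0.0, 0.0], k=2)                           # unfiltered search
nearest_de = store.query([0.0, 0.0], k=2, where={"lang": "de"})  # filtered search
```

Here the filter is applied before scoring. With a real ANN index, whether to filter before, during, or after the index search is a design decision with its own trade-offs.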

Operational concerns

  • Updates: graph indexes such as HNSW can be costly to modify in place, so many systems batch writes or rebuild the index periodically.
  • Sharding: large collections are split across nodes; each shard is searched in parallel, and the partial results are merged into a final top-k list.
  • Consistency: newly added vectors must become searchable within an acceptable delay.
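
A minimal sketch of how these three concerns can interact, under stated assumptions: the `Shard` class, its `pending` write buffer, and the round-robin placement are hypothetical choices made for illustration, and a brute-force scan again stands in for the ANN index. Writes land in a buffer and are folded into the index in batches. Scanning the buffer at query time is one possible way to make new vectors searchable before the next flush.

```python
import heapq
import math
from concurrent.futures import ThreadPoolExecutor


class Shard:
    """Hypothetical shard: a 'built' index plus a buffer of pending writes."""

    def __init__(self):
        self.indexed = {}  # vectors already folded into the (pretend) ANN index
        self.pending = {}  # recent writes not yet merged into the index

    def add(self, vec_id, vector):
        # Updates are buffered rather than applied to the index one by one.
        self.pending[vec_id] = list(vector)

    def flush(self):
        # Batch step: fold pending writes into the index (a rebuild in a real system).
        self.indexed.update(self.pending)
        self.pending.clear()

    def search(self, query, k, include_pending=True):
        # Including the buffer makes unflushed vectors visible to queries.
        pool = dict(self.indexed)
        if include_pending:
            pool.update(self.pending)
        scored = ((math.dist(query, vec), vec_id) for vec_id, vec in pool.items())
        return heapq.nsmallest(k, scored)


def sharded_search(shards, query, k):
    # Scatter: query every shard in parallel. Gather: merge the local top-k lists.
    with ThreadPoolExecutor(max_workers=len(shards)) as executor:
        partials = list(executor.map(lambda shard: shard.search(query, k), shards))
    merged = heapq.nsmallest(k, (hit for part in partials for hit in part))
    return [vec_id for _, vec_id in merged]


shards = [Shard(), Shard()]
points = {"a": [0.0, 0.0], "b": [5.0, 5.0], "c": [0.2, 0.0], "d": [1.0, 1.0]}
for i, (vec_id, vector) in enumerate(points.items()):
    shards[i % len(shards)].add(vec_id, vector)  # simple round-robin placement

top2 = sharded_search(shards, [0.0, 0.0], k=2)
# Without the buffer scan, unflushed vectors are invisible until flush() runs.
stale = shards[0].search([0.0, 0.0], k=2, include_pending=False)
```

Each shard returns its own top-k, so the merge step sees at most k hits per shard and still recovers the global top-k.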

Why this matters

Choosing a vector database is not only about ANN speed. Filtering, updates, scaling, and durability often decide whether it fits production.

Key idea

A vector database surrounds an ANN index with ingestion, storage, a query engine, and scaling logic, so production fit depends on far more than raw search speed.

Check yourself

1. What distinguishes a vector database from a bare ANN index?

2. Why is updating graph indexes an operational concern?