The Feature Store

What it is

A feature store is a central place where teams define, store, and serve the input features for machine learning models. Instead of recomputing features from raw data in every project, each project pulls ready-made features from one shared system.

Two access patterns

A feature store usually has two halves.

The offline store holds large histories of features for training. It is optimized for big batch reads.
The online store holds the latest feature values for low-latency lookups during serving. It is optimized for fast reads of a single key.

The same feature definition feeds both halves, which is the main reason feature stores exist.

Why it helps

It removes duplicate pipelines, so two teams compute customer spend the same way.
It reduces training-serving skew, because training and serving read features produced by the same logic.
It supports point-in-time joins, so each training row only sees feature values that existed when its label was created.
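
The point-in-time join is the least obvious of these benefits, so here is a minimal sketch of the idea in plain Python. The data, the integer timestamps, and the helper name value_as_of are all made up for illustration; a real feature store does this at scale with its own API. The sketch also assumes a feature value written at exactly the label's timestamp counts as already available, which is a design choice rather than a rule.

```python
from bisect import bisect_right

# Hypothetical feature history for one customer, sorted by timestamp:
# (timestamp, average_order_value). Timestamps are plain integers here.
feature_history = [(1, 20.0), (5, 35.0), (9, 50.0)]

# Hypothetical label events: (label_timestamp, label).
labels = [(0, 0), (4, 0), (5, 1), (10, 1)]


def value_as_of(history, ts):
    """Return the latest feature value with timestamp <= ts, or None."""
    timestamps = [t for t, _ in history]
    i = bisect_right(timestamps, ts)  # number of entries at or before ts
    return history[i - 1][1] if i > 0 else None


# Each training row pairs a label with the feature value known at that time.
training_rows = [(ts, value_as_of(feature_history, ts), y) for ts, y in labels]

for row in training_rows:
    print(row)
```

The label at timestamp 4 gets the value written at timestamp 1, not the later value from timestamp 5. Joining on the latest value instead would leak future data into training.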

A simple flow

A pipeline computes a feature such as average order value. It writes the full history to the offline store and the freshest value to the online store. A model trains on the offline data, then looks up the online value at request time.
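
The flow above can be sketched with two in-memory containers standing in for the two stores. Everything here is an assumption for illustration: the names offline_store, online_store, and run_pipeline, the sample orders, and the choice of a running average. Real stores would be a data warehouse and a key-value database, not a list and a dict.

```python
from collections import defaultdict

offline_store = []  # append-only history: (customer_id, timestamp, value)
online_store = {}   # latest value only: customer_id -> value


def average_order_value(order_amounts):
    """The single feature definition shared by training and serving."""
    return sum(order_amounts) / len(order_amounts)


def run_pipeline(orders):
    """Process (customer_id, timestamp, amount) tuples sorted by timestamp."""
    amounts_so_far = defaultdict(list)
    for customer_id, ts, amount in orders:
        amounts_so_far[customer_id].append(amount)
        value = average_order_value(amounts_so_far[customer_id])
        offline_store.append((customer_id, ts, value))  # keep full history
        online_store[customer_id] = value               # overwrite with freshest


# Hypothetical raw orders.
orders = [("c1", 1, 10.0), ("c2", 2, 40.0), ("c1", 3, 30.0)]
run_pipeline(orders)

# Training: batch read of the history for one customer.
training_data = [row for row in offline_store if row[0] == "c1"]

# Serving: single-key lookup of the latest value.
serving_value = online_store["c1"]
```

Both stores are filled by the same average_order_value function, so the value served at request time matches the last value in the training history.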

Key idea

A feature store defines a feature once and serves it consistently to both training and production.

Check yourself