Online vs Offline Features
Features come in two flavors based on when and how they are computed. Understanding the split is central to designing a serving system.
Offline features
Offline features are computed in batch over historical data, often on a nightly schedule. They feed model training and can be expensive to calculate, because no request is waiting on the result. An example is a customer's average order value over the last ninety days.
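As a rough illustration of the ninety-day example, the sketch below computes the feature in one pass over an in-memory list of orders. The function name `avg_order_value_90d`, the `(customer_id, order_date, amount)` tuple layout, and the sample data are assumptions made for this example; a real batch job would typically run over a warehouse table rather than a Python list.

```python
from collections import defaultdict
from datetime import date, timedelta


def avg_order_value_90d(orders, as_of):
    """Return {customer_id: mean order amount} over the 90 days ending at as_of.

    orders is an iterable of (customer_id, order_date, amount) tuples
    (a hypothetical layout chosen for this sketch).
    """
    window_start = as_of - timedelta(days=90)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for customer_id, order_date, amount in orders:
        # Keep only orders inside the window (window_start, as_of].
        if window_start < order_date <= as_of:
            totals[customer_id] += amount
            counts[customer_id] += 1
    return {cid: totals[cid] / counts[cid] for cid in totals}


# Made-up sample data for illustration only.
orders = [
    ("c1", date(2024, 1, 10), 40.0),
    ("c1", date(2024, 3, 1), 60.0),
    ("c1", date(2023, 6, 1), 500.0),  # older than 90 days, ignored
    ("c2", date(2024, 2, 15), 25.0),
]
features = avg_order_value_90d(orders, as_of=date(2024, 3, 31))
print(features)
```

Because the job scans the full history, its cost grows with the data volume, which is acceptable here only because nothing is waiting on the answer.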
Online features
Online features must be available at prediction time with very low latency, often within milliseconds. They are served from a fast store such as an in-memory cache. An example is the number of clicks in the last minute, which changes constantly.
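One way to picture the clicks-in-the-last-minute example is a sliding-window counter held in memory. The sketch below is a minimal single-process version; the class name `SlidingWindowCounter` and its methods are hypothetical, and it assumes events arrive in non-decreasing timestamp order. Timestamps are passed in explicitly so the behavior is easy to follow.

```python
from collections import deque


class SlidingWindowCounter:
    """Counts events that fall inside a trailing time window (illustrative only)."""

    def __init__(self, window_seconds=60.0):
        self.window_seconds = window_seconds
        self._events = deque()  # event timestamps, oldest on the left

    def record(self, timestamp):
        # Assumes timestamps arrive in non-decreasing order.
        self._events.append(timestamp)

    def count(self, now):
        # Evict events at or before the cutoff, then count what remains.
        cutoff = now - self.window_seconds
        while self._events and self._events[0] <= cutoff:
            self._events.popleft()
        return len(self._events)


clicks = SlidingWindowCounter(window_seconds=60.0)
for t in (0.0, 10.0, 50.0):
    clicks.record(t)

count_at_55 = clicks.count(now=55.0)    # all three clicks are within the window
count_at_65 = clicks.count(now=65.0)    # the click at t=0 has expired
count_at_120 = clicks.count(now=120.0)  # every click has expired
```

Each read touches only the events that have expired since the previous read, which is what keeps the lookup cheap enough for the request path.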
The bridge
A common architecture computes features in batch and materializes them into an online store so that serving reads are fast. Fresh real-time signals are computed on the fly and merged with these precomputed values. A feature store is the system that manages both paths and ensures that training and serving use the same feature definitions.
- Offline favors completeness and cost efficiency.
- Online favors speed and freshness.
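The bridge between the two paths can be sketched with a plain dictionary standing in for the online store. Everything named here (`materialize`, `get_feature_vector`, the feature names, and the sample values) is an assumption for illustration and does not reflect any particular feature store's API.

```python
def materialize(batch_features, online_store):
    """Copy batch-computed values into the online store, keyed by entity id."""
    for entity_id, values in batch_features.items():
        online_store.setdefault(entity_id, {}).update(values)


def get_feature_vector(entity_id, online_store, realtime_features):
    """Merge precomputed values with signals computed at request time."""
    # Copy so the stored record is not mutated by the merge.
    vector = dict(online_store.get(entity_id, {}))
    # Fresh signals are applied last, so they win on any name collision.
    vector.update(realtime_features)
    return vector


# A dict stands in for a low-latency key-value store in this sketch.
online_store = {}

# Batch path: output of a (hypothetical) nightly job, pushed to the store.
materialize({"c1": {"avg_order_value_90d": 50.0}}, online_store)

# Serving path: one fast lookup plus a signal computed on the fly.
vector = get_feature_vector("c1", online_store, {"clicks_last_minute": 3})
print(vector)
```

The serving path never recomputes the ninety-day average; it pays only for a key lookup and a small merge, while the expensive work stays in the batch job.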
Key idea
Offline features are batch-computed for completeness, while online features are served fast for freshness; a feature store often unifies the two.