Point-in-Time Correctness

Why training features must reflect only what was known at the moment of each event.

The leakage trap

When you build a training table, each row should pair a label with the feature values as they stood when the labeled event occurred. If you attach a feature value that was computed after the event, you leak future information into training. This is a point-in-time correctness failure.

A concrete example

Suppose you predict, on day thirty, whether a customer will churn.
A feature is total lifetime spend.
If you compute lifetime spend over the full history, it includes purchases made after day thirty, which the model could never see at prediction time.
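
The difference can be made concrete with a small sketch. The purchase data, the day numbering, and the function names below are invented for illustration; the only point is that the leaky version sums the full history while the correct version stops at the prediction day.

```python
# Hypothetical purchase history for one customer:
# (day since signup, amount spent). All values are made up.
purchases = [
    (5, 40.0),
    (20, 60.0),
    (45, 100.0),  # happens after the prediction day
]

prediction_day = 30  # the model must make its prediction on this day


def lifetime_spend_leaky(purchases):
    """Sum over the full history, including purchases after the cutoff."""
    return sum(amount for _, amount in purchases)


def lifetime_spend_as_of(purchases, cutoff_day):
    """Sum only purchases made on or before the cutoff day."""
    return sum(amount for day, amount in purchases if day <= cutoff_day)


leaky = lifetime_spend_leaky(purchases)
correct = lifetime_spend_as_of(purchases, prediction_day)

print(leaky)    # 200.0, includes the day-45 purchase
print(correct)  # 100.0, only what was known on day 30
```

The leaky value is twice the correct one here, and the extra spend comes entirely from a purchase the model could not have seen on day thirty.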

The correct join

For every label timestamp, look up each feature value as of that timestamp or earlier.
This is called a point-in-time join or an as-of join. For each feature, it selects the most recent value recorded at or before the event.
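
As a rough sketch of what such a join does, the following uses only the Python standard library. The function name `as_of_join`, the tuple layouts, and the sample data are assumptions made for this example, not a reference to any particular library's API.

```python
from bisect import bisect_right


def as_of_join(labels, feature_history):
    """Attach to each label the latest feature value at or before its timestamp.

    labels: list of (entity_id, label_ts, label)
    feature_history: dict mapping entity_id to a list of (feature_ts, value),
        sorted by feature_ts in ascending order.
    Returns a list of (entity_id, label_ts, feature_value, label); the feature
    value is None when no value existed yet at label_ts.
    """
    rows = []
    for entity_id, label_ts, label in labels:
        history = feature_history.get(entity_id, [])
        timestamps = [ts for ts, _ in history]
        # Number of feature versions with feature_ts <= label_ts.
        idx = bisect_right(timestamps, label_ts)
        value = history[idx - 1][1] if idx > 0 else None
        rows.append((entity_id, label_ts, value, label))
    return rows


# Hypothetical data: timestamps are plain integers (for example, day numbers).
feature_history = {
    "a": [(1, 10.0), (10, 25.0), (40, 90.0)],
    "b": [(50, 5.0)],
}
labels = [("a", 30, 1), ("a", 10, 0), ("b", 30, 0)]

rows = as_of_join(labels, feature_history)
for row in rows:
    print(row)
```

For entity "a" at timestamp 30, the join returns 25.0 (written at timestamp 10) and ignores the later value 90.0. Entity "b" has no value yet at timestamp 30, so the feature is None instead of being filled in from the future. If a value stamped exactly at the event time could already reflect the event itself, swapping `bisect_right` for `bisect_left` makes the lookup strictly earlier than the label timestamp.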

Why it matters

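Without point-in-time joins, offline metrics look far too optimistic because the model has trained on future information. Performance then drops in production, where that information is not available.
It is one of the most common and damaging data bugs in real ML systems.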

How feature stores help

Feature stores typically version every feature value with a timestamp and provide as-of joins, so training rows respect the timeline without hand-written join logic.
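
The idea of timestamp-versioned storage can be illustrated with a toy in-memory store. This is a minimal sketch under stated assumptions: the class `TinyFeatureStore` and its `write` and `read_as_of` methods are hypothetical and do not mirror the API of any real feature store.

```python
from bisect import bisect_right, insort
from collections import defaultdict


class TinyFeatureStore:
    """Toy store that keeps every version of a feature, ordered by timestamp."""

    def __init__(self):
        # (entity_id, feature_name) -> sorted list of (ts, value)
        self._versions = defaultdict(list)

    def write(self, entity_id, feature, ts, value):
        """Record a new version instead of overwriting the old one."""
        insort(self._versions[(entity_id, feature)], (ts, value))

    def read_as_of(self, entity_id, feature, ts):
        """Return the latest value written at or before ts, or None."""
        versions = self._versions.get((entity_id, feature), [])
        timestamps = [t for t, _ in versions]
        idx = bisect_right(timestamps, ts)
        return versions[idx - 1][1] if idx > 0 else None


store = TinyFeatureStore()
# Hypothetical writes; timestamps are day numbers and arrive out of order.
store.write("cust_1", "lifetime_spend", 20, 100.0)
store.write("cust_1", "lifetime_spend", 5, 40.0)
store.write("cust_1", "lifetime_spend", 45, 200.0)

print(store.read_as_of("cust_1", "lifetime_spend", 30))  # 100.0
print(store.read_as_of("cust_1", "lifetime_spend", 3))   # None
```

Because old versions are never overwritten, a training job can ask for the value as of any label timestamp, while online serving simply asks for the value as of now.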

Key idea

Point-in-time correctness means each training row uses only feature values known at or before its event, enforced with as-of joins to avoid leaking the future.