The Training-Serving Skew

When features are computed differently in training and serving, accuracy quietly drops.

What it is

Training-serving skew is a difference between how features are produced during training and how they are produced during serving. The model learns from one version of a feature but sees a slightly different version in production, so it underperforms.

Where it comes from

Code skew: training uses one library or query and serving uses a different one.
Data skew: training reads a clean batch table but serving reads a noisier live source.
Time skew: a training feature accidentally uses information that is not available at serving time.

That last case is a hidden form of leakage: offline scores look great while online results disappoint.
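
A minimal sketch of time skew, assuming a made-up purchase log and a "purchase count" feature (the data and both function names are hypothetical, not from any particular system). The leaky version counts every row in the table; the point-in-time version counts only what would have been known when the prediction was made.

```python
from datetime import datetime

# Hypothetical purchase log: (user_id, purchase timestamp)
PURCHASES = [
    ("u1", datetime(2024, 1, 1)),
    ("u1", datetime(2024, 1, 5)),
    ("u1", datetime(2024, 1, 9)),
]

def purchase_count_leaky(user_id, purchases):
    """Counts every purchase in the table, including ones made after prediction time."""
    return sum(1 for uid, _ in purchases if uid == user_id)

def purchase_count_as_of(user_id, purchases, as_of):
    """Counts only purchases strictly before `as_of`, which is all serving could see."""
    return sum(1 for uid, ts in purchases if uid == user_id and ts < as_of)

prediction_time = datetime(2024, 1, 6)
leaky = purchase_count_leaky("u1", PURCHASES)                      # 3
correct = purchase_count_as_of("u1", PURCHASES, prediction_time)   # 2
```

A model trained on the leaky value learns from a purchase that had not happened yet, so it can never see the same signal at serving time.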

Why it is dangerous

The skew is silent. Offline metrics still look fine because the evaluation data goes through the same training pipeline as the training data. The gap only shows up as worse-than-expected live performance, which is hard to trace back to a feature.
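
One way to make the skew visible is to compute the same feature through both paths for the same inputs and compare the results. The sketch below is illustrative only: the values, the two z-score functions, and `mismatch_rate` are hypothetical. It plants a small code skew, with training dividing by the population standard deviation and serving by the sample standard deviation.

```python
import math
import statistics

VALUES = [10.0, 12.0, 14.0, 16.0]  # hypothetical raw feature values

def zscore_training(x, values):
    # Training path: normalizes with the population standard deviation.
    return (x - statistics.mean(values)) / statistics.pstdev(values)

def zscore_serving(x, values):
    # Serving path: normalizes with the sample standard deviation (the planted skew).
    return (x - statistics.mean(values)) / statistics.stdev(values)

def mismatch_rate(train_vals, serve_vals, rel_tol=1e-6):
    """Fraction of paired feature values that differ beyond the tolerance."""
    pairs = list(zip(train_vals, serve_vals))
    diffs = sum(
        1 for a, b in pairs
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-9)
    )
    return diffs / len(pairs)

train_features = [zscore_training(x, VALUES) for x in VALUES]
serve_features = [zscore_serving(x, VALUES) for x in VALUES]
rate = mismatch_rate(train_features, serve_features)  # 1.0: every value differs
```

Neither function is wrong on its own, which is why this kind of skew tends to survive code review and only appears when the two outputs are compared side by side.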

How to prevent it

The strongest fix is to share one feature computation between training and serving, which is exactly what a feature store provides. Logging the actual features served and replaying them for training also keeps the two paths aligned.
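
Both ideas can be sketched in a few lines, assuming a toy setup: `compute_features`, `serve`, `build_training_rows`, the in-memory `FEATURE_LOG`, and the feature names are all hypothetical stand-ins, not the API of any feature store. Serving calls the single shared feature function and logs exactly what it fed the model; training rows are then built from that log rather than recomputed.

```python
import json

def compute_features(raw):
    """Single feature definition shared by training and serving (hypothetical)."""
    return {
        "amount_digits": len(str(int(raw["amount"]))),
        "is_weekend": raw["day_of_week"] in (5, 6),
    }

FEATURE_LOG = []  # stand-in for a durable log or table

def serve(request_id, raw, model):
    """Computes features once, logs them, and returns the model's prediction."""
    features = compute_features(raw)
    FEATURE_LOG.append(json.dumps({"request_id": request_id, "features": features}))
    return model(features)

def build_training_rows(feature_log, labels):
    """Replays logged features and joins them with labels that arrived later."""
    rows = []
    for line in feature_log:
        record = json.loads(line)
        label = labels.get(record["request_id"])
        if label is not None:
            rows.append((record["features"], label))
    return rows

def dummy_model(features):
    # Placeholder model, used only to make the sketch runnable.
    return 1 if features["is_weekend"] else 0

prediction = serve("r1", {"amount": 250.0, "day_of_week": 6}, dummy_model)
rows = build_training_rows(FEATURE_LOG, {"r1": 1})
```

Because the training rows come from the log, the model is trained on exactly the values it saw in production, including any noise in the live source.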

Key idea