Machine Learning

The Training Serving Skew

When features are computed differently in training and serving, accuracy quietly drops.

What it is

Training serving skew is a difference between how features are produced during training and how they are produced during serving. The model learns on one version of a feature but sees a slightly different version in production, so it underperforms.

Where it comes from

  • Code skew: training uses one library or query and serving uses a different one, so the same input yields different feature values.
  • Data skew: training reads a clean batch table but serving reads a noisier live source.
  • Time skew: a training feature accidentally uses information that is not available at serving time.

That last case is a hidden form of leakage that makes offline scores look great and online results disappoint.
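To make the time skew case concrete, here is a minimal sketch. The purchase log, the user id, and both function names are hypothetical, made up for illustration. The leaky version counts every row in the table, while the point-in-time version counts only rows that existed before the prediction was made.

```python
from datetime import datetime

# Hypothetical purchase log: (user_id, purchase_time).
PURCHASES = [
    ("u1", datetime(2024, 1, 1)),
    ("u1", datetime(2024, 1, 5)),
    ("u1", datetime(2024, 1, 9)),
]

def purchase_count_leaky(user_id, purchases):
    """Count every purchase in the table, including ones made after the prediction."""
    return sum(1 for uid, _ in purchases if uid == user_id)

def purchase_count_point_in_time(user_id, as_of, purchases):
    """Count only purchases strictly before `as_of`, which is all serving could have seen."""
    return sum(1 for uid, ts in purchases if uid == user_id and ts < as_of)

# The prediction is made on Jan 6, so the Jan 9 purchase is not yet known.
prediction_time = datetime(2024, 1, 6)
leaky = purchase_count_leaky("u1", PURCHASES)
correct = purchase_count_point_in_time("u1", prediction_time, PURCHASES)
print(leaky, correct)
```

A model trained on the leaky count learns from a value that serving can never reproduce at prediction time.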

Why it is dangerous

The skew is silent. Offline metrics still look fine because they are computed on features from the training pipeline, not the serving one. The gap only shows up as worse-than-expected live performance, which is hard to trace back to a single feature.
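One way to surface the gap is to compare, for the same examples, the feature values the training pipeline produced with the values that were actually served. The sketch below assumes both sides are available as dictionaries keyed by example id; the function name `skew_report`, the tolerance, and the sample data are illustrative assumptions, not part of any particular tool.

```python
import math

def skew_report(training_rows, serving_rows, tolerance=1e-6):
    """Return {feature_name: mismatch_rate} over example ids present in both inputs."""
    counts = {}
    mismatches = {}
    for example_id in training_rows.keys() & serving_rows.keys():
        train_features = training_rows[example_id]
        serve_features = serving_rows[example_id]
        # Only compare features that both pipelines produced.
        for name in train_features.keys() & serve_features.keys():
            counts[name] = counts.get(name, 0) + 1
            same = math.isclose(
                train_features[name], serve_features[name],
                rel_tol=0.0, abs_tol=tolerance,
            )
            if not same:
                mismatches[name] = mismatches.get(name, 0) + 1
    return {name: mismatches.get(name, 0) / counts[name] for name in counts}

# Hypothetical data: the two pipelines disagree on spend_7d for example "b".
training = {"a": {"age": 30.0, "spend_7d": 12.5}, "b": {"age": 41.0, "spend_7d": 0.0}}
serving = {"a": {"age": 30.0, "spend_7d": 12.5}, "b": {"age": 41.0, "spend_7d": 3.0}}
report = skew_report(training, serving)
print(report)
```

A nonzero mismatch rate for a feature points at the specific computation that differs between the two paths.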

How to prevent it

The strongest fix is to share one feature computation between training and serving, which is what a feature store is designed to provide. Logging the actual features served and replaying them for training also keeps the two paths aligned, because the model then trains on exactly the values it saw in production.
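Both ideas can be sketched in a few lines. This is a minimal illustration, not a feature store: the feature names, the raw record fields (`amount`, `day_of_week`), and the in-memory `FEATURE_LOG` are all assumptions made for the example.

```python
import math

def compute_features(raw):
    """Single feature definition, called by both the training and the serving path."""
    return {
        "log_amount": math.log1p(max(raw["amount"], 0.0)),
        # Assumes day_of_week runs 0 (Monday) to 6 (Sunday).
        "is_weekend": 1 if raw["day_of_week"] in (5, 6) else 0,
    }

# Hypothetical in-memory log; a real system would write to durable storage.
FEATURE_LOG = []

def build_training_rows(raw_records):
    """Training path: batch over historical records using the shared function."""
    return [compute_features(record) for record in raw_records]

def serve_features(raw_request, log=FEATURE_LOG):
    """Serving path: same function, plus a log of what was actually served."""
    features = compute_features(raw_request)
    log.append(features)  # Logged so training can replay the served values.
    return features

record = {"amount": 99.0, "day_of_week": 6}
served = serve_features(record)
trained = build_training_rows([record])[0]
print(served == trained)
```

Because both paths call `compute_features`, a change to the feature logic reaches training and serving together, and `FEATURE_LOG` holds the exact values that could be replayed as training data.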

Key idea

Training serving skew comes from computing features two ways; sharing one feature path is the durable fix.

Check yourself

1. What causes training serving skew?

2. What is the strongest fix for training serving skew?