Start dumb on purpose
A baseline is the simplest model that produces a valid prediction: always predicting the majority class, the mean target, or the last known value, or fitting a tiny logistic regression. It sets the bar every later model must clear.
- It tells you whether the problem is learnable at all.
- It exposes pipeline bugs before complexity hides them.
- It gives a reference point for measuring real gains.
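To make the idea concrete, here is a minimal sketch of two trivial baselines using only the Python standard library. The function names (`majority_class_baseline`, `mean_baseline`, `accuracy`) and the toy spam/ham labels are hypothetical, chosen for illustration rather than taken from any particular library.

```python
from collections import Counter
from statistics import mean


def majority_class_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label, _count = Counter(train_labels).most_common(1)[0]
    return lambda _features: most_common_label


def mean_baseline(train_targets):
    """Return a predictor that always outputs the mean training target."""
    average = mean(train_targets)
    return lambda _features: average


def accuracy(predict, features, labels):
    """Fraction of examples where the prediction matches the label."""
    hits = sum(predict(x) == y for x, y in zip(features, labels))
    return hits / len(labels)


# Toy data (made up): the baseline ignores the features entirely.
train_labels = ["ham", "ham", "spam", "ham"]
test_features = [None, None, None, None]
test_labels = ["ham", "spam", "ham", "ham"]

predict = majority_class_baseline(train_labels)
baseline_accuracy = accuracy(predict, test_features, test_labels)
print(f"majority-class accuracy: {baseline_accuracy:.2f}")
```

Because these predictors never look at the features, any score they reach comes from the label distribution alone. That is what makes them a useful bar and a cheap end-to-end check of the evaluation pipeline.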
What baselines reveal
If a fancy model barely beats predicting the mean, the features carry little signal. If the baseline is already strong, extra accuracy may not be worth the complexity and serving cost.
- A human baseline shows a practical ceiling.
- A trivial baseline shows the floor.
- The gap between them is your room to improve.
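One way to put a number on that room is to express a model's score as a share of the floor-to-ceiling gap. The helper below is a sketch under assumptions: `gap_closed` is a hypothetical name, the scores are invented, and it assumes a metric where higher is better.

```python
def gap_closed(floor, ceiling, model_score):
    """Share of the floor-to-ceiling gap that a model has closed.

    Assumes a higher-is-better metric such as accuracy.
    """
    room = ceiling - floor
    if room <= 0:
        raise ValueError("ceiling must be greater than floor")
    return (model_score - floor) / room


# Invented numbers: trivial baseline 0.70, human baseline 0.95, model 0.80.
share = gap_closed(floor=0.70, ceiling=0.95, model_score=0.80)
print(f"share of the gap closed: {share:.0%}")
```

With these numbers the model has closed roughly 40% of the available gap, which is often a more honest summary than the raw score.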
The flow
Only after a baseline exists does added model complexity earn its keep.
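The flow can be sketched as: score the baseline, score the candidate on the same held-out data, then keep the candidate only if it wins by a margin you chose in advance. Everything below is illustrative: the toy data, the `min_gain` threshold, and the helper names are assumptions, and a one-feature least-squares line stands in for "the more complex model".

```python
from statistics import mean


def mse(predictions, targets):
    """Mean squared error between two equal-length sequences."""
    return mean([(p - t) ** 2 for p, t in zip(predictions, targets)])


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar


# Toy data (made up for illustration).
train_x, train_y = [1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]
test_x, test_y = [5.0, 6.0], [10.0, 12.0]

# Step 1: the baseline always predicts the training mean.
train_mean = mean(train_y)
baseline_mse = mse([train_mean for _ in test_x], test_y)

# Step 2: the candidate model, scored on the same held-out data.
slope, intercept = fit_line(train_x, train_y)
candidate_mse = mse([slope * x + intercept for x in test_x], test_y)

# Step 3: keep the candidate only if it beats the baseline by the margin.
min_gain = 1.0  # hypothetical threshold, chosen before looking at results
keeps_its_place = candidate_mse < baseline_mse - min_gain
print(baseline_mse, candidate_mse, keeps_its_place)
```

Fixing `min_gain` before looking at the results keeps the comparison honest. A candidate that wins by less than the margin is probably not worth its extra complexity.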
Key idea
A simple baseline reveals whether the problem is learnable, catches pipeline bugs early, and defines the bar that every more complex model must beat.