Two levers, one goal
You can raise performance by changing the model or by improving the data. Model-centric work tunes architecture, loss, and hyperparameters on fixed data. Data-centric work fixes labels, adds examples, and sharpens label definitions on a fixed model.
- Model-centric: new layers, regularization, optimizer changes.
- Data-centric: relabeling, deduplication, class balancing, better collection.
- Both are valid; the question is which pays more now.
When data-centric wins
In many real systems the data is messier than the model is weak. Inconsistent labels and missing slices cap accuracy no matter how clever the architecture.
- Noisy or inconsistent labels confuse any model.
- Missing slices leave whole subgroups unlearned.
- A small clean dataset often beats a large dirty one.
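One cheap way to surface inconsistent labels is to look for near-identical inputs that carry different labels. The sketch below is only an illustration: the function name `find_label_conflicts`, the whitespace and case normalization, and the toy sentiment examples are all assumptions, not part of any particular pipeline.

```python
from collections import Counter, defaultdict


def find_label_conflicts(examples):
    """Group examples by normalized text and return the groups whose labels disagree.

    `examples` is an iterable of (text, label) pairs (a hypothetical format).
    """
    labels_by_text = defaultdict(Counter)
    for text, label in examples:
        # Assumed normalization: lowercase and collapse runs of whitespace.
        key = " ".join(text.lower().split())
        labels_by_text[key][label] += 1
    # Keep only the inputs that were given more than one distinct label.
    return {
        key: dict(counts)
        for key, counts in labels_by_text.items()
        if len(counts) > 1
    }


# Toy data, invented for illustration.
examples = [
    ("Great product", "positive"),
    ("great   product", "negative"),
    ("Terrible support", "negative"),
    ("terrible support", "negative"),
]
conflicts = find_label_conflicts(examples)
print(conflicts)
```

Groups returned here are candidates for relabeling or for a clearer label definition. Exact-match grouping misses paraphrases, so treat the result as a lower bound on label inconsistency.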
When model-centric wins
If labels are clean and data is plentiful, the bottleneck is capacity or inductive bias, and model changes help most.
Diagnose the bottleneck before choosing a lever.
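A simple starting point for that diagnosis is to break validation errors down by slice. The sketch below assumes each record carries a slice name, a true label, and a predicted label; the function name `error_rate_by_slice` and the toy records are hypothetical.

```python
from collections import defaultdict


def error_rate_by_slice(records):
    """Return {slice_name: (error_rate, example_count)}.

    `records` is an iterable of (slice_name, true_label, predicted_label)
    tuples (an assumed format for this sketch).
    """
    totals = defaultdict(int)
    errors = defaultdict(int)
    for slice_name, y_true, y_pred in records:
        totals[slice_name] += 1
        if y_true != y_pred:
            errors[slice_name] += 1
    return {name: (errors[name] / totals[name], totals[name]) for name in totals}


# Toy validation records, invented for illustration.
records = [
    ("daytime", 1, 1), ("daytime", 0, 0), ("daytime", 1, 1), ("daytime", 0, 1),
    ("night", 1, 0), ("night", 0, 1),
]
report = error_rate_by_slice(records)
for name, (rate, count) in sorted(report.items()):
    print(f"{name}: error rate {rate:.2f} over {count} examples")
```

As a rough reading, a slice with few examples and a high error rate points toward a data gap, while errors spread evenly across well-populated, cleanly labeled slices point toward the model. This is a heuristic, not a proof of where the bottleneck lies.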
Key idea
Improving data and improving the model are complementary levers; error analysis tells you whether noisy data or limited model capacity is the binding constraint right now.