Machine Learning

Bias Mitigation Preprocessing

Fixing fairness by transforming the data before training.

Fix the data first

Preprocessing mitigation changes the training data so that any model trained on it tends to be fairer. The model and training algorithm stay untouched, which makes these methods flexible and easy to bolt onto an existing pipeline.

Common techniques

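  • Reweighting: assign each sample a weight based on its group and label combination, so that in the weighted data the label no longer depends on group membership, countering historical imbalance.
  • Resampling: oversample underrepresented group and outcome combinations, or undersample overrepresented ones.
  • Relabeling: flip a small number of labels near the decision boundary, where the original label is least certain, to reduce bias in the labels.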
  • Representation learning: transform features into a space where the protected attribute is hard to recover.
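
To make reweighting concrete, here is a minimal sketch. It assumes one common formulation, in which each sample's weight is P(group) × P(label) / P(group, label). The function names and the toy data are hypothetical, chosen only for illustration.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Return one weight per sample: P(group) * P(label) / P(group, label)."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    joint_counts = Counter(zip(groups, labels))
    # With counts instead of probabilities the ratio becomes
    # count(g) * count(y) / (n * count(g, y)).
    return [
        (group_counts[g] * label_counts[y]) / (n * joint_counts[(g, y)])
        for g, y in zip(groups, labels)
    ]


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of positive labels inside one group."""
    positive = sum(
        w for g, y, w in zip(groups, labels, weights) if g == group and y == 1
    )
    total = sum(w for g, w in zip(groups, weights) if g == group)
    return positive / total


# Hypothetical toy data: group "a" is mostly positive, group "b" mostly negative.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
labels = [1, 1, 1, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_a = weighted_positive_rate(groups, labels, weights, "a")
rate_b = weighted_positive_rate(groups, labels, weights, "b")
print(rate_a, rate_b)  # the two weighted rates should now match
```

Rare combinations (here a negative label in group "a") get weights above 1 and common ones get weights below 1. The weights can then be handed to any trainer that accepts per-sample weights.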

Strengths and limits

Preprocessing is model-agnostic and keeps downstream training simple. But it is applied without knowing what the model will learn, so it cannot guarantee that a specific fairness metric is met, and aggressive edits can distort the data.
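
Because preprocessing cannot guarantee a metric, it can help to measure the metric on the trained model's predictions. The sketch below assumes demographic parity difference (the gap in positive prediction rates between groups) as the metric of interest. The function names and the toy predictions are hypothetical, not output from a real model.

```python
def positive_rate(predictions, groups, group):
    """Share of positive predictions within one group."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)


def demographic_parity_difference(predictions, groups):
    """Largest gap in positive prediction rate between any two groups."""
    rates = [positive_rate(predictions, groups, g) for g in set(groups)]
    return max(rates) - min(rates)


# Hypothetical predictions from a model trained on preprocessed data.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
predictions = [1, 1, 0, 1, 1, 0, 0, 1]

gap = demographic_parity_difference(predictions, groups)
print(gap)  # 0.25: group "a" is predicted positive 75% of the time, group "b" 50%
```

A gap that remains after preprocessing suggests the model picked up the disparity through other features, which is the limit described above.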

Key idea

Preprocessing mitigation reweights, resamples, relabels, or transforms the data so that any downstream model tends to be fairer. It is a model-agnostic fix, but it cannot fully guarantee a chosen fairness metric.

Check yourself

1. What does preprocessing mitigation modify?

2. Which is a preprocessing technique?