Machine Learning

Differential Privacy In Training

Adding calibrated noise so no single record changes the model much.

6 min read · advanced

A formal guarantee

Differential privacy gives a precise promise: the output distribution of a randomized algorithm should be almost the same whether or not any single individual's record is included in the data. If one person's record can barely change what the algorithm is likely to output, then the output cannot reveal much about that person.
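For readers who want the promise in symbols, here is the standard textbook statement of pure differential privacy. The notation is an assumption added for illustration and does not come from this lesson: M is the randomized algorithm, D and D' are any two datasets that differ in one record, S is any set of possible outputs, and ε is the privacy parameter.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
```

When ε is close to zero, the factor e^ε is close to 1, so the two probabilities are forced to be nearly equal.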

The privacy budget

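The strength of the guarantee is controlled by a parameter usually called epsilon. A small epsilon means strong privacy and more noise; a large epsilon means weaker privacy and less noise. Epsilon acts as a budget that is spent as the data is queried: each query or training step consumes part of it, and the privacy losses add up.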
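To make the budget concrete, here is a minimal sketch of a noisy counting query using the Laplace mechanism, a common way to achieve differential privacy for numeric answers. It is not taken from this lesson. The names `laplace_noise`, `noisy_count`, and `PrivacyBudget` are hypothetical. The accounting assumes basic sequential composition, where the epsilons of successive queries simply add.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_count(data, predicate, epsilon, rng):
    """Counting query with Laplace noise; a count has sensitivity 1."""
    true_count = sum(1 for x in data if predicate(x))
    scale = 1.0 / epsilon  # smaller epsilon -> larger noise scale
    return true_count + laplace_noise(scale, rng)


class PrivacyBudget:
    """Hypothetical tracker: epsilons add up under basic composition."""

    def __init__(self, total_epsilon):
        self.total = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon):
        if self.spent + epsilon > self.total + 1e-12:
            raise ValueError("privacy budget exhausted")
        self.spent += epsilon


rng = random.Random(0)
ages = [23, 35, 41, 52, 67, 29, 38]  # made-up records
budget = PrivacyBudget(total_epsilon=1.0)

for eps in (0.5, 0.5):
    budget.spend(eps)  # each query consumes part of the budget
    answer = noisy_count(ages, lambda a: a >= 40, eps, rng)
    print(f"epsilon={eps}: noisy count = {answer:.2f}")
```

After the two queries the budget of 1.0 is fully spent, so a third call to `budget.spend` would raise an error. Tighter accounting methods exist, but simple addition is enough to show the idea.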

How it enters training

  • During gradient descent, each per-example gradient is clipped to a maximum norm, which bounds that example's influence on the update.
  • Noise calibrated to the clipping bound is added to the summed gradients before the update.
  • Together, these steps bound how much any one record can move the model, giving a per-example guarantee.
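The steps above can be sketched as a single noisy update in plain Python. This is a minimal illustration under assumptions, not a production implementation. The function names and the `max_norm` and `noise_multiplier` parameters are hypothetical, and Gaussian noise is assumed. The sketch does not compute the resulting epsilon, which requires a separate privacy accountant.

```python
import math
import random


def clip(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def dp_sgd_step(weights, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One update: clip each gradient, sum, add Gaussian noise, average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    summed = [sum(column) for column in zip(*clipped)]
    # Noise is calibrated to the clipping bound, the most one example can add.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    batch_size = len(per_example_grads)
    return [w - lr * n / batch_size for w, n in zip(weights, noisy)]


rng = random.Random(0)
# Made-up gradients; the second example is an outlier with norm 50.
grads = [[0.3, -0.4], [30.0, 40.0], [-0.6, 0.8]]
new_weights = dp_sgd_step(
    [0.0, 0.0], grads, lr=0.1, max_norm=1.0, noise_multiplier=1.1, rng=rng
)
print(new_weights)
```

Clipping shrinks the outlier gradient to norm 1 before it is summed, so it cannot dominate the update. The noise standard deviation scales with the same bound.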

What it buys and costs

  • Buys: a mathematical bound on leakage that holds regardless of the attacker's method, including attacks not yet invented.
  • Costs: noise lowers accuracy, especially for rare patterns, so there is a real privacy-utility tradeoff tuned through epsilon.
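A rough way to see why rare patterns suffer most is to compare the expected noise with the size of the signal. The sketch below uses a Laplace-noised count as a stand-in for training, which is a simplifying assumption. The function name and the group sizes are hypothetical. It relies on the fact that a Laplace variable with scale b has expected absolute value b.

```python
def expected_relative_error(true_count, epsilon, sensitivity=1.0):
    """Expected |noise| divided by the true count, for Laplace noise."""
    scale = sensitivity / epsilon  # expected |noise| equals this scale
    return scale / true_count


for epsilon in (0.1, 1.0, 10.0):
    common = expected_relative_error(true_count=1000, epsilon=epsilon)
    rare = expected_relative_error(true_count=5, epsilon=epsilon)
    print(f"epsilon={epsilon:>4}: common {common:.2%}, rare {rare:.2%}")
```

At epsilon 0.1 the expected error is 1% of the common count but 200% of the rare count. The same noise that barely touches a frequent pattern can swamp an infrequent one.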

Key idea

Differential privacy bounds how much any single record can change a model by clipping gradients and adding calibrated noise tuned by an epsilon budget, giving a provable leakage guarantee at the cost of some accuracy.

Check yourself

1. What does the parameter epsilon control in differential privacy?

2. How is differential privacy added during gradient descent?

3. What is the main cost of differential privacy in training?