A formal guarantee
Differential privacy gives a precise promise: the output distribution of a randomized algorithm should be almost the same whether or not any single individual's record is included in the input. If one person's record can barely change the distribution of results, then the result cannot reveal much about that person.
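The promise above is usually written as an inequality. The statement below is the standard pure form of the definition. The symbols are notation introduced here, not taken from the text: M is the randomized algorithm, D and D' are any two datasets that differ in one record, S is any set of possible outputs, and ε is the privacy parameter.

```latex
\Pr[\,M(D) \in S\,] \;\le\; e^{\varepsilon} \cdot \Pr[\,M(D') \in S\,]
\quad \text{for all neighboring } D, D' \text{ and all output sets } S.
```

When ε is close to zero, the factor e^ε is close to 1, so the two output distributions are nearly indistinguishable.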
The privacy budget
The strength of the guarantee is controlled by a parameter usually called epsilon. A small epsilon means strong privacy and more noise; a large epsilon means weaker privacy and less noise. Epsilon also acts as a budget: each query on the data spends part of it, and the privacy losses of successive queries accumulate.
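A minimal sketch of how a budget might be spent, assuming the Laplace mechanism on a counting query and simple additive accounting (each query's epsilon is summed). The names `PrivacyBudget`, `laplace_noise`, and `private_count` are hypothetical and chosen for illustration. The sketch also assumes that neighboring datasets differ by adding or removing one record, so a count has sensitivity 1.

```python
import random


class PrivacyBudget:
    """Tracks cumulative epsilon, assuming per-query epsilons simply add up."""

    def __init__(self, total_epsilon: float):
        self.total_epsilon = total_epsilon
        self.spent = 0.0

    def spend(self, epsilon: float) -> None:
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        # Small tolerance guards against floating-point round-off.
        if self.spent + epsilon > self.total_epsilon + 1e-12:
            raise RuntimeError("privacy budget exhausted")
        self.spent += epsilon


def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, budget, rng):
    """Noisy count of matching records; a count has sensitivity 1."""
    budget.spend(epsilon)  # refuse the query if the budget is used up
    true_count = sum(1 for r in records if predicate(r))
    # Noise scale = sensitivity / epsilon: smaller epsilon, more noise.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Usage: two queries at epsilon = 0.5 use up a total budget of 1.0.
rng = random.Random(0)
budget = PrivacyBudget(total_epsilon=1.0)
ages = [23, 35, 41, 29, 62, 57]
over_40 = private_count(ages, lambda a: a >= 40, 0.5, budget, rng)
under_30 = private_count(ages, lambda a: a < 30, 0.5, budget, rng)
```

A third call to `private_count` with this budget would raise `RuntimeError`, which is the point of the accounting: once the budget is gone, no further answers are released.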
How it enters training
- During gradient descent, each example's gradient is clipped to a maximum norm to bound its influence.
- Noise calibrated to that clipping norm is added to the sum of the clipped gradients before the update.
- This bounds how much any one record can move the model, giving a per-example guarantee.
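The steps above can be sketched as a single update in plain Python. This is an illustrative sketch, not a reference implementation: the function names `clip` and `dp_sgd_step` and the parameters `max_norm` and `noise_multiplier` are assumptions introduced here, and Gaussian noise is assumed as the calibrated noise.

```python
import math
import random


def clip(grad, max_norm):
    """Scale grad down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    factor = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * factor for g in grad]


def dp_sgd_step(params, per_example_grads, lr, max_norm, noise_multiplier, rng):
    """One noisy update from a batch of per-example gradients."""
    # 1. Clip each example's gradient separately.
    clipped = [clip(g, max_norm) for g in per_example_grads]
    # 2. Sum the clipped gradients coordinate by coordinate.
    summed = [sum(col) for col in zip(*clipped)]
    # 3. Add Gaussian noise whose std is proportional to the clipping norm.
    sigma = noise_multiplier * max_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    # 4. Average over the batch and take a gradient step.
    batch_size = len(per_example_grads)
    return [p - lr * n / batch_size for p, n in zip(params, noisy)]


# Usage: two examples, two parameters.
rng = random.Random(0)
params = [0.0, 0.0]
grads = [[3.0, 4.0], [0.3, 0.4]]  # the first has norm 5 and gets clipped to 1
params = dp_sgd_step(params, grads, lr=0.1, max_norm=1.0,
                     noise_multiplier=1.1, rng=rng)
```

The sketch does not compute epsilon. Translating a noise multiplier, batch size, and number of steps into an epsilon value takes a separate privacy accounting step, which is omitted here.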
What it buys and costs
- Buys: a mathematical bound on leakage that holds regardless of the attacker's method or side information, including attacks not yet invented.
- Costs: noise lowers accuracy, especially for rare patterns, so there is a real privacy-utility tradeoff tuned through epsilon.
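The effect on rare patterns is easiest to see in a setting simpler than model training. The sketch below assumes a counting query answered with Laplace noise of scale 1/epsilon, whose expected absolute error equals that scale. The function name `expected_relative_error` is hypothetical.

```python
def expected_relative_error(true_count, epsilon, sensitivity=1.0):
    """Expected |noise| divided by the true count, for Laplace noise."""
    # For Laplace(0, b), the expected absolute value is b = sensitivity / epsilon.
    return (sensitivity / epsilon) / true_count


# Same epsilon, very different impact on a rare group and a common one.
rare = expected_relative_error(true_count=10, epsilon=0.1)
common = expected_relative_error(true_count=10_000, epsilon=0.1)
```

The absolute noise is the same in both cases, but relative to a count of 10 it is as large as the signal, while relative to a count of 10,000 it is negligible. Raising epsilon shrinks the error for both at the price of a weaker guarantee.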
Key idea
Differential privacy bounds how much any single record can change a model by clipping gradients and adding calibrated noise tuned by an epsilon budget, giving a provable leakage guarantee at the cost of some accuracy.