Machine Learning

Privacy-Preserving ML

Training useful models without exposing individuals' sensitive data.

5 min read · advanced

The privacy problem

Models trained on personal data can leak it. A model may memorize rare records, and attackers can sometimes recover training examples or test whether a specific person was in the data. Privacy-preserving ML aims to learn useful patterns while limiting what the model reveals about any individual.

Threats to guard against

  • Membership inference: deciding whether a given record was in the training set.
  • Model inversion: reconstructing sensitive features from model behavior.
  • Memorization leakage: a generative model reciting verbatim training text.
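Membership inference often exploits the fact that a model fits its own training records better than unseen ones. The sketch below is a toy illustration under stated assumptions, not a real attack: `MemorizingModel`, `guess_member`, the data, and the 0.1 threshold are all hypothetical, and the "model" is a 1-nearest-neighbour lookup chosen because it overfits by construction.

```python
class MemorizingModel:
    """Toy overfit model: a 1-nearest-neighbour lookup over its training set."""

    def __init__(self, train):
        self._train = list(train)  # (feature, label) pairs, hidden from the attacker

    def loss(self, x, y):
        # Squared error of the nearest-neighbour prediction for (x, y).
        _, nearest_y = min(self._train, key=lambda pair: abs(pair[0] - x))
        return (nearest_y - y) ** 2


def guess_member(model, x, y, threshold):
    """Attacker's rule: an unusually low loss suggests the record was trained on."""
    return model.loss(x, y) < threshold


# Hypothetical data: three training records and one record that was held out.
train = [(1.0, 3.0), (2.0, 5.1), (3.0, 6.8)]
model = MemorizingModel(train)

member_guess = guess_member(model, 2.0, 5.1, threshold=0.1)      # record in the training set
non_member_guess = guess_member(model, 2.4, 5.9, threshold=0.1)  # record the model never saw
```

The attacker only calls `loss` and never reads the training set directly. Against a model that generalizes well the gap between member and non-member loss shrinks, which is why limiting memorization also blunts this attack.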

The main toolbox

  • Differential privacy: add calibrated noise so that adding or removing any single record changes the distribution of outputs only slightly.
  • Federated learning: keep raw data on devices and share only model updates.
  • Secure computation: compute on encrypted or secret-shared data so the server never sees raw values.
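To make the federated and secure-computation ideas concrete, here is a minimal sketch of one training round in which clients send only updates, and those updates are hidden behind pairwise masks that cancel in the sum. Everything here is an assumption for illustration: the function names, the one-parameter model, the learning rate, and the use of floating-point masks. Production secure aggregation typically derives masks from pairwise key agreement and works in modular integer arithmetic, and federated averaging usually weights clients by dataset size; this sketch does neither.

```python
import random


def local_update(global_weight, local_data, lr=0.05):
    """One gradient step for y ~ w * x, computed on the client's own data."""
    grad = sum(2 * (global_weight * x - y) * x for x, y in local_data) / len(local_data)
    return -lr * grad  # only this delta leaves the device, never local_data


def mask_updates(updates, rng):
    """Add pairwise random masks: client i adds m, client j subtracts the same m."""
    masked = list(updates)
    for i in range(len(updates)):
        for j in range(i + 1, len(updates)):
            m = rng.uniform(-100.0, 100.0)
            masked[i] += m
            masked[j] -= m
    return masked  # individually meaningless, but the sum is preserved


def federated_round(global_weight, clients, rng):
    """Server side: average the masked updates and apply them to the global weight."""
    updates = [local_update(global_weight, data) for data in clients]
    masked = mask_updates(updates, rng)
    return global_weight + sum(masked) / len(masked)


# Hypothetical clients whose private data all follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(3.0, 6.0)],
    [(1.0, 2.0), (4.0, 8.0)],
]

rng = random.Random(0)
weight = 0.0
for _ in range(20):
    weight = federated_round(weight, clients, rng)
# weight is now close to 2.0, although the server saw only masked updates.
```

Note that masking hides individual updates from the server but does not by itself bound what the final model reveals; that is the job of differential privacy, which can be combined with this scheme.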

The core tradeoff

Stronger privacy usually means more noise, which can lower accuracy, or more communication and computation overhead. The goal is to bound each individual's exposure while preserving aggregate utility.
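The tradeoff can be seen numerically with differential privacy as the example. The sketch below releases a count through the Laplace mechanism and measures the average error at several privacy levels. The names `laplace_noise`, `private_count`, and `mean_abs_error`, the true count of 100, and the chosen epsilon values are all hypothetical. A counting query has sensitivity 1, so the noise scale is 1/epsilon. Naive floating-point sampling like this is for illustration only and is not considered safe for production use.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng):
    """Release a count under epsilon-differential privacy (sensitivity 1)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(1.0 / epsilon, rng)


def mean_abs_error(epsilon, trials=20000, seed=0):
    """Average absolute error of the noisy count over many simulated releases."""
    rng = random.Random(seed)
    true_count = 100  # hypothetical query answer
    total = sum(abs(private_count(true_count, epsilon, rng) - true_count)
                for _ in range(trials))
    return total / trials


# Smaller epsilon means stronger privacy and a noisier answer.
for eps in (0.1, 1.0, 10.0):
    print(f"epsilon={eps}: mean absolute error ~ {mean_abs_error(eps):.2f}")
```

The expected absolute error of Laplace noise equals its scale, so the measured error should land near 1/epsilon: tightening the privacy guarantee tenfold costs roughly tenfold in accuracy for this query.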

Key idea

Privacy-preserving ML defends against membership inference, model inversion, and memorization leakage by using differential privacy, federated learning, and secure computation, trading some accuracy to bound any individual's exposure.

Check yourself

1. What is membership inference?

2. What is the core tradeoff in privacy-preserving ML?