Machine Learning

Binning and Discretization

Group continuous values into discrete buckets to capture nonlinearity and reduce noise.

Binning, or discretization, converts a continuous feature into a small set of ordered buckets. Age might become child, adult, and senior groups instead of an exact number.

Common binning schemes

  • Equal width splits the value range into intervals of the same size.
  • Equal frequency chooses bin edges so each bucket holds roughly the same count of rows.
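  • Supervised binning places edges where the relationship between the feature and the target changes, for example at the split points of a decision tree fitted on that feature, so the buckets are chosen for predictive power rather than by the feature's distribution alone.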
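
The three schemes above can be sketched with the Python standard library. This is a minimal illustration, not a reference implementation: the helper names, the `ages` and `bought` sample data, and the single-edge supervised rule (one split chosen to minimize squared error in the target) are all assumptions made for the example. In practice, pandas `cut` and `qcut` or scikit-learn's `KBinsDiscretizer` cover the unsupervised schemes.

```python
from bisect import bisect_right
from statistics import quantiles


def equal_width_edges(values, n_bins):
    """Interior edges that cut [min, max] into n_bins intervals of equal size."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def equal_frequency_edges(values, n_bins):
    """Interior edges at quantiles, so each bin holds roughly the same count."""
    return quantiles(values, n=n_bins, method="inclusive")


def sum_squared_error(targets):
    """Squared deviation of the targets from their own mean."""
    center = sum(targets) / len(targets)
    return sum((t - center) ** 2 for t in targets)


def supervised_edge(values, targets):
    """One edge placed where splitting makes the target most homogeneous."""
    pairs = sorted(zip(values, targets))
    best_edge, best_cost = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no edge fits between identical feature values
        cost = (sum_squared_error([t for _, t in pairs[:i]])
                + sum_squared_error([t for _, t in pairs[i:]]))
        if cost < best_cost:
            best_edge = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_cost = cost
    return best_edge


def assign_bins(values, edges):
    """Bin index per value; a value equal to an edge goes to the upper bin."""
    return [bisect_right(edges, v) for v in values]


def bin_counts(bins, n_bins):
    """Number of rows that landed in each bin."""
    return [bins.count(i) for i in range(n_bins)]


# Hypothetical sample data: ages and whether each person bought a product.
ages = [2, 5, 9, 14, 21, 25, 30, 34, 41, 47, 58, 90]
bought = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

width_bins = assign_bins(ages, equal_width_edges(ages, 3))
freq_bins = assign_bins(ages, equal_frequency_edges(ages, 3))

print(bin_counts(width_bins, 3))     # [7, 4, 1]
print(bin_counts(freq_bins, 3))      # [4, 4, 4]
print(supervised_edge(ages, bought)) # 37.5
```

Equal width leaves the buckets unevenly filled because the single age of 90 stretches the range, equal frequency balances the counts, and the supervised edge lands exactly where the target flips from 0 to 1.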

Why bin at all

  • It lets models that only fit linear relationships capture nonlinear effects, because each bucket can receive its own coefficient.
  • It reduces the influence of small fluctuations and noise.
  • It produces interpretable groups that stakeholders understand.
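
The first benefit can be seen on a small U-shaped example. The sketch below is illustrative only: the data (`y = x²`), the bin edges, and the helper names are assumptions. It compares a straight-line least-squares fit on the raw feature with per-bin means, which are the predictions a linear model would produce if the bins were one-hot encoded.

```python
from bisect import bisect_right
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
             / sum((x - x_bar) ** 2 for x in xs))
    return y_bar - slope * x_bar, slope


def fit_bin_means(xs, ys, edges):
    """Mean target per bin: the least-squares fit on one-hot encoded bins."""
    grouped = {}
    for x, y in zip(xs, ys):
        grouped.setdefault(bisect_right(edges, x), []).append(y)
    return {b: mean(group) for b, group in grouped.items()}


# Hypothetical U-shaped relationship that a straight line cannot follow.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
edges = [-1.5, 1.5]  # assumed edges giving low, middle, and high buckets

intercept, slope = fit_line(xs, ys)
bin_means = fit_bin_means(xs, ys, edges)

line_sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
binned_sse = sum((y - bin_means[bisect_right(edges, x)]) ** 2
                 for x, y in zip(xs, ys))

print(slope)                   # 0.0, the line sees no trend at all
print(line_sse > binned_sse)   # True, the binned fit follows the U shape
```

The fitted slope is zero because the left and right halves cancel out, while the three bucket means recover the high-low-high pattern.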

The cost is lost resolution. Collapsing fine detail into a few buckets discards information, and bad edge choices can hide real structure. Equal width bins are also sensitive to outliers that stretch the range, leaving most data in one crowded bucket.
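
The outlier problem is easy to reproduce. In this small sketch the data and helper names are made up for illustration: nine ordinary values and one extreme value are binned both ways using only the standard library.

```python
from bisect import bisect_right
from statistics import quantiles


def count_per_bin(values, edges):
    """Count how many values fall into each bin defined by interior edges."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        counts[bisect_right(edges, v)] += 1
    return counts


# Hypothetical feature: nine ordinary values plus a single outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1000]
n_bins = 4

lo, hi = min(values), max(values)
width = (hi - lo) / n_bins
equal_width = [lo + width * i for i in range(1, n_bins)]
equal_frequency = quantiles(values, n=n_bins, method="inclusive")

print(count_per_bin(values, equal_width))      # [9, 0, 0, 1]
print(count_per_bin(values, equal_frequency))  # [3, 2, 2, 3]
```

With equal width, the outlier claims a bucket of its own and two buckets stay empty, so all the ordinary values share one bucket. Quantile-based edges keep the buckets populated, at the price of intervals with very different widths.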

Key idea

Binning groups continuous values into discrete buckets to capture nonlinearity and reduce noise, trading resolution for robustness and interpretability.

Check yourself

1. How does equal frequency binning choose its edges?

2. What is the main cost of binning?