Machine Learning

Gaussian Mixture Clustering

Modeling data as a blend of Gaussian components with soft assignments.

A Gaussian mixture model, or GMM, assumes the data was generated by a mixture of several Gaussian distributions. Each Gaussian is one cluster, with its own mean, covariance, and weight. The weight is the share of the data that component accounts for, and the weights sum to one.
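The generative story can be sketched in a few lines of Python. This is a minimal one-dimensional illustration, not a reference implementation: the function names and the two-component parameters are made up for the example, and a standard deviation stands in for the covariance.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, weights, means, stds):
    """Mixture density: the weighted sum of the component densities."""
    return sum(w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds))

def sample_mixture(n, weights, means, stds, seed=0):
    """Draw n points: pick a component by its weight, then sample from that Gaussian."""
    rng = random.Random(seed)
    points = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        points.append(rng.gauss(means[k], stds[k]))
    return points

# Illustrative parameters (assumed for this sketch): two clusters, weights sum to 1.
weights, means, stds = [0.3, 0.7], [-2.0, 3.0], [1.0, 0.5]
data = sample_mixture(1000, weights, means, stds)
```

Fitting a GMM runs this story in reverse: given only `data`, recover the weights, means, and spreads.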

Soft assignments

Unlike k-means, which gives each point a single label, a GMM produces soft assignments. Every point receives a probability of belonging to each component, called its responsibility, and for a given point these probabilities sum to one. A point near the boundary might be sixty percent in one cluster and forty percent in another.
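A responsibility is the result of applying Bayes' rule to the weighted component densities. The sketch below assumes one-dimensional data and two illustrative components (equal weights and widths, centered at 0 and 4). The helper names are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def responsibilities(x, weights, means, stds):
    """Probability that each component generated x (Bayes' rule)."""
    joint = [w * gaussian_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)  # the mixture density at x
    return [j / total for j in joint]

# Illustrative components (assumed for this sketch).
weights, means, stds = [0.5, 0.5], [0.0, 4.0], [1.0, 1.0]

# Exactly halfway between the two means: an even split.
midpoint = responsibilities(2.0, weights, means, stds)

# Slightly left of the midpoint, at the point where the density ratio is 1.5,
# the split is sixty percent to forty percent.
x_boundary = 2.0 - math.log(1.5) / 4.0
boundary = responsibilities(x_boundary, weights, means, stds)
```

For points very far from every component the densities can underflow to zero, so practical implementations usually do this computation with logarithms.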

Fitting with EM

GMMs are fit with the expectation-maximization (EM) algorithm, which mirrors the k-means loop of assigning points and then updating clusters.

  • E step: compute each point's responsibility for every component, given the current parameters.
  • M step: update each component's mean, covariance, and weight using those responsibilities.

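Each iteration never decreases the data likelihood, and the alternation repeats until the likelihood stops improving. The result is a local optimum, not necessarily the best possible fit.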
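The two steps can be sketched as a short loop. This is a simplified illustration under stated assumptions: one-dimensional data, a fixed number of iterations instead of a convergence test, hand-picked starting values, and a small variance floor. The function name `em_fit` and the synthetic data are invented for the example.

```python
import math
import random

def gaussian_pdf(x, mean, std):
    """Density of a one-dimensional Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def em_fit(data, means, stds, weights, n_iter=50):
    """Run n_iter EM iterations on 1-D data; return parameters and log-likelihood history."""
    means, stds, weights = list(means), list(stds), list(weights)  # do not mutate inputs
    k = len(means)
    history = []
    for _ in range(n_iter):
        # E step: responsibility of each component for each point.
        resp = []
        log_lik = 0.0
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], stds[j]) for j in range(k)]
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)  # log-likelihood under the parameters used in this E step
        # M step: responsibility-weighted mean, spread, and share for each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            stds[j] = math.sqrt(max(var, 1e-6))  # floor keeps a component from collapsing
            weights[j] = nj / len(data)
    return means, stds, weights, history

# Synthetic data (assumed for this sketch): 150 points near -2 and 350 points near 3.
rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(150)] + [rng.gauss(3.0, 0.5) for _ in range(350)]

fitted_means, fitted_stds, fitted_weights, history = em_fit(
    data, means=[-1.0, 1.0], stds=[1.0, 1.0], weights=[0.5, 0.5]
)
```

Printing `history` shows the log-likelihood rising and then flattening, which is the usual signal to stop iterating.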

Why covariance matters

Because each component has its own covariance matrix, a GMM can model stretched and tilted ellipses, not just spheres. That flexibility lets it fit clusters that k-means would split or merge incorrectly. The cost is more parameters to estimate and sensitivity to initialization.
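To get a feel for the parameter cost, the sketch below counts free parameters for a mixture with one full covariance matrix per component and compares it with a restricted variant that allows only one variance per component. The function name and the labels "full" and "spherical" are choices made for this example.

```python
def gmm_free_parameters(n_components, n_dims, covariance="full"):
    """Count the free parameters of a Gaussian mixture."""
    mean_params = n_components * n_dims
    if covariance == "full":
        # A symmetric d x d matrix has d * (d + 1) / 2 distinct entries.
        cov_params = n_components * n_dims * (n_dims + 1) // 2
    elif covariance == "spherical":
        # One variance per component: round clusters only.
        cov_params = n_components
    else:
        raise ValueError("covariance must be 'full' or 'spherical'")
    weight_params = n_components - 1  # weights sum to one, so the last is determined
    return mean_params + cov_params + weight_params

full_2d = gmm_free_parameters(3, 2, "full")            # 6 + 9 + 2 = 17
spherical_2d = gmm_free_parameters(3, 2, "spherical")  # 6 + 3 + 2 = 11
full_10d = gmm_free_parameters(3, 10, "full")          # 30 + 165 + 2 = 197
```

The covariance term grows with the square of the dimension, so full covariances need noticeably more data to estimate well as the number of features increases.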

Key idea

A Gaussian mixture models data as overlapping Gaussians fit by expectation-maximization, giving soft probabilistic cluster memberships.

Check yourself

1. What is a responsibility in a Gaussian mixture model?

2. Which algorithm is used to fit a GMM?

3. Why can a GMM fit stretched clusters that k-means cannot?