The Silhouette Score

Measuring how well each point fits its cluster versus the nearest other.

The silhouette score measures the quality of a clustering without needing ground-truth labels. It captures whether each point sits comfortably inside its own cluster or close to the boundary with another.

The per-point silhouette

For each point we compute two distances.

  • a is the average distance to other points in its own cluster, a measure of cohesion.
  • b is the average distance to the points of the nearest other cluster, that is, the other cluster for which this average is smallest. It is a measure of separation.

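The silhouette of the point is (b - a) / max(a, b): b minus a, divided by the larger of the two. The value ranges from minus one to plus one. A point that is alone in its cluster has no a, and is conventionally given a silhouette of zero.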
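The per-point calculation can be sketched in a few lines of plain Python. This is a minimal illustration, not a reference implementation: the function name `silhouette_values`, the use of Euclidean distance, and the tiny one-dimensional dataset are all assumptions made for the example.

```python
from math import dist


def silhouette_values(points, labels):
    """Return one silhouette value per point (needs at least two clusters)."""
    # Group point indices by their cluster label.
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    values = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            # A point alone in its cluster has no a; use the conventional 0.
            values.append(0.0)
            continue
        # a: average distance to the other points in the same cluster.
        a = sum(dist(points[i], points[j]) for j in own) / len(own)
        # b: smallest average distance to the points of any other cluster.
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        # Guard against identical points, where both a and b are zero.
        values.append((b - a) / denom if denom else 0.0)
    return values


# Hypothetical data: two tight groups on a line.
points = [(0.0,), (2.0,), (10.0,), (12.0,)]
labels = [0, 0, 1, 1]
values = silhouette_values(points, labels)
print([round(v, 3) for v in values])  # [0.818, 0.778, 0.778, 0.818]
```

For the first point, a = 2 and b = (10 + 12) / 2 = 11, so its silhouette is 9 / 11, about 0.818.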

Reading the values

  • Near plus one, the point is well inside its cluster and far from the others.
  • Near zero, the point sits on the boundary between two clusters.
  • Negative values mean the point is, on average, closer to a neighboring cluster than to its own, which suggests it may be in the wrong cluster.
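As a rough illustration of the three regimes, here is the arithmetic for some made-up distances (not taken from any dataset), writing s for the silhouette of a point with own-cluster distance a and nearest-cluster distance b:

```latex
\begin{align*}
a = 2,\ b = 10 &: \quad s = \frac{10 - 2}{\max(2, 10)} = 0.8 \\
a = 5,\ b = 5 &: \quad s = \frac{5 - 5}{\max(5, 5)} = 0 \\
a = 6,\ b = 3 &: \quad s = \frac{3 - 6}{\max(6, 3)} = -0.5
\end{align*}
```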

Using it to choose k

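Averaging the silhouette over all points gives a single score for the clustering. Trying several values of k (at least two, since the score needs another cluster to compare against) and keeping the one with the highest average silhouette is often considered a more principled alternative to the elbow method, since it balances cohesion against separation rather than relying on a visual judgment of where a curve bends.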
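The selection loop might look like the sketch below. To keep it self-contained, the candidate labelings are written by hand as stand-ins for what a clustering algorithm such as k-means would return for each k. The data, the labelings, and the helper name `mean_silhouette` are all hypothetical, and Euclidean distance is assumed.

```python
from math import dist
from statistics import mean


def mean_silhouette(points, labels):
    """Average silhouette over all points (needs at least two clusters)."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    def avg_dist(i, members):
        return mean(dist(points[i], points[j]) for j in members)

    scores = []
    for i, label in enumerate(labels):
        own = [j for j in clusters[label] if j != i]
        if not own:
            scores.append(0.0)  # singleton cluster: conventional value
            continue
        a = avg_dist(i, own)
        b = min(avg_dist(i, m) for other, m in clusters.items() if other != label)
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return mean(scores)


# Hypothetical data: three compact groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),      # group near the origin
    (10, 0), (10, 1), (11, 0),   # group to the right
    (0, 10), (1, 10), (0, 11),   # group above
]

# Hand-written stand-ins for the labels a clustering algorithm would produce.
candidate_labelings = {
    2: [0, 0, 0, 1, 1, 1, 0, 0, 0],  # first and third groups merged
    3: [0, 0, 0, 1, 1, 1, 2, 2, 2],  # one cluster per group
    4: [0, 0, 3, 1, 1, 1, 2, 2, 2],  # first group split in two
}

scores = {k: mean_silhouette(points, labels) for k, labels in candidate_labelings.items()}
best_k = max(scores, key=scores.get)
print(best_k)  # 3
```

Merging two groups inflates a for their points, and splitting a group shrinks b for its points, so both alternatives score lower than k = 3 on this data.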

Key idea

The silhouette score compares within-cluster cohesion to nearest-cluster separation for each point, giving a label-free way to judge and tune a clustering.

Check yourself

1. What does a silhouette value near plus one indicate?

2. What do the values a and b represent?