UMAP for Visualization

A faster manifold embedding that preserves more global structure than t-SNE.

UMAP (Uniform Manifold Approximation and Projection) is a nonlinear dimensionality reduction method often used in place of t-SNE for visualization. It is typically faster, especially on large datasets, and tends to preserve more of the data's global structure.

How it works

UMAP first builds a weighted nearest-neighbor graph in the high-dimensional space, treating the data as samples from a manifold and giving each edge a fuzzy membership strength between 0 and 1. It then optimizes a low-dimensional layout whose own fuzzy graph matches the original as closely as possible.

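  • Edge weights in a fuzzy graph are graded rather than all-or-nothing, so the graph records both strong connections to the closest neighbors and weaker connections to neighbors farther away.
  • The layout is found by an attraction and repulsion optimization, similar in spirit to a force-directed graph layout: connected points pull together and unconnected points push apart.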
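
The graph-building step can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the reference implementation: the helper names (knn, smooth_sigma, fuzzy_graph) are hypothetical, neighbors are found by brute force, and the weighting follows the commonly described scheme in which each point's nearest neighbor gets weight 1, farther neighbors decay exponentially, and the two directed weights of an edge are merged as w + w_back - w * w_back.

```python
import math


def knn(points, k):
    """Brute-force k nearest neighbors: a sorted list of (distance, index) per point."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        result.append(dists[:k])
    return result


def smooth_sigma(dists, rho, target, n_iter=64):
    """Binary-search a scale sigma so the neighbor weights sum to roughly `target`."""
    lo, hi = 1e-6, 1e3
    for _ in range(n_iter):
        mid = (lo + hi) / 2
        total = sum(math.exp(-max(d - rho, 0.0) / mid) for d in dists)
        if total > target:
            hi = mid  # weights too large: shrink the scale
        else:
            lo = mid  # weights too small: grow the scale
    return (lo + hi) / 2


def fuzzy_graph(points, k):
    """Return symmetric fuzzy edge weights as {(i, j): weight} with i < j."""
    directed = {}
    for i, nbrs in enumerate(knn(points, k)):
        dists = [d for d, _ in nbrs]
        rho = dists[0]  # distance to the nearest neighbor, which gets weight 1
        sigma = smooth_sigma(dists, rho, math.log2(k))
        for d, j in nbrs:
            directed[(i, j)] = math.exp(-max(d - rho, 0.0) / sigma)

    # Merge the two directed weights of each edge with a fuzzy union.
    graph = {}
    for (i, j), w in directed.items():
        w_back = directed.get((j, i), 0.0)
        graph[(min(i, j), max(i, j))] = w + w_back - w * w_back
    return graph


# Two small, well-separated groups of points (made-up data for illustration).
points = [(0, 0), (1, 0), (0, 2), (10, 10), (11, 10), (10, 12)]
graph = fuzzy_graph(points, k=3)
for edge, weight in sorted(graph.items()):
    print(edge, round(weight, 4))
```

Edges inside each group come out with weights near 1, while the few edges that cross between groups get weights near 0. Those weak connections are the "slightly weaker" links the fuzzy graph keeps instead of discarding.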

Key hyperparameters

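  • Number of neighbors controls the balance between local detail and global shape. Small values emphasize fine local structure; large values reveal the big picture at the cost of some local detail.
  • Minimum distance sets how tightly points may pack in the embedding, which affects how clumped the clusters appear. Low values give dense, compact clusters; higher values spread points out more evenly.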
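
To make the effect of minimum distance concrete, here is a small sketch of the similarity curves involved in the layout step. It rests on assumptions: the function names are hypothetical, the spread value of 1.0 is an illustrative default, and the piecewise target (similarity 1 up to the minimum distance, exponential decay beyond it) follows how the umap-learn package is generally described as deriving its low-dimensional curve 1 / (1 + a * d^(2b)). In that package the two hyperparameters are exposed as n_neighbors and min_dist.

```python
import math


def target_similarity(d, min_dist, spread=1.0):
    """Target curve: fully similar within min_dist, exponential decay beyond it."""
    if d <= min_dist:
        return 1.0
    return math.exp(-(d - min_dist) / spread)


def low_dim_similarity(d, a, b):
    """Smooth curve used in the embedding; a and b are chosen to approximate the target."""
    return 1.0 / (1.0 + a * d ** (2 * b))


# Compare two min_dist settings at a few embedding distances.
for min_dist in (0.01, 0.5):
    row = [round(target_similarity(d, min_dist), 3) for d in (0.0, 0.25, 0.5, 1.0)]
    print("min_dist =", min_dist, "->", row)

# With a = b = 1 the smooth curve reduces to 1 / (1 + d^2),
# the same heavy-tailed kernel that t-SNE uses in the embedding space.
print(low_dim_similarity(1.0, a=1.0, b=1.0))
```

With a larger minimum distance, points already count as fully similar while still some way apart, so the optimizer has no reason to pull them closer and clusters look looser. With a small value, similarity drops as soon as points separate, which packs neighbors tightly.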

Compared to t SNE

UMAP usually runs faster on large datasets and tends to keep the relative arrangement of clusters more faithfully. Still, as with t-SNE, distances in a UMAP plot are only approximate, so it is a tool for exploration rather than precise measurement.

Key idea

UMAP embeds data by matching a fuzzy neighbor graph between the high-dimensional and low-dimensional spaces. It typically runs faster than t-SNE and preserves more global structure, but distances in the plot stay approximate.

Check yourself

1. What structure does UMAP build first?

2. How does UMAP typically compare to t-SNE?