Machine Learning

Contrastive Learning

Teaching a model to pull similar items together and push different ones apart.

The goal of the objective

Contrastive learning trains an encoder so that semantically related inputs map to nearby vectors and unrelated inputs map to distant ones. It learns a useful embedding space without needing explicit class labels for every example.

Positives and negatives

Each anchor example is paired with:

  • A positive: an example that should end up close to the anchor, such as an augmented view or a paraphrase of it.
  • One or more negatives: examples that should end up far from the anchor.

The loss rewards high similarity to the positive and low similarity to the negatives.
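
As a rough illustration of what "close" and "far" mean here, the sketch below scores an anchor against a positive and two negatives with cosine similarity. The three-dimensional vectors are hand-picked, hypothetical embeddings, and cosine similarity is an assumed choice of similarity measure, not something the lesson prescribes.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings, chosen by hand for illustration only.
anchor = np.array([1.0, 0.0, 0.0])
positive = np.array([0.9, 0.1, 0.0])       # e.g. an augmented view of the anchor
negatives = np.array([[0.0, 1.0, 0.0],     # unrelated inputs
                      [-1.0, 0.0, 0.2]])

pos_sim = cosine_similarity(anchor, positive)
neg_sims = [cosine_similarity(anchor, n) for n in negatives]

# A well-trained encoder should make pos_sim larger than every entry of neg_sims.
print(pos_sim, neg_sims)
```

A contrastive loss is low when the positive similarity dominates the negative ones, as it does for these vectors, and high when it does not.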

The InfoNCE loss

A common choice is the InfoNCE loss, which treats the problem as picking the true positive out of a batch. For each anchor it forms a softmax over the anchor's similarities to every candidate in the batch, and the loss is the negative log of the weight the softmax assigns to the positive. Because every other item in the batch acts as a negative, larger batches supply more negatives per anchor and often a stronger training signal.
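
The softmax-over-the-batch idea can be sketched in a few lines of NumPy. This is a minimal version under stated assumptions: row i of `positives` is the positive for row i of `anchors`, every other row serves as a negative, similarities are cosine similarities, and the `temperature` parameter (with its default of 0.1) is an assumed, commonly used scaling factor that the lesson itself does not mention.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss; row i of `positives` is the positive for row i of `anchors`."""
    # L2-normalise so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # logits[i, j] = similarity of anchor i to candidate j, scaled by temperature.
    logits = a @ p.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability

    # Row-wise log-softmax; the diagonal holds the true positives.
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))                      # hypothetical batch of 8 embeddings

aligned = info_nce(x, x + 0.01 * rng.normal(size=x.shape))  # positives near anchors
shuffled = info_nce(x, x[::-1])                             # positives mismatched
print(aligned, shuffled)
```

The aligned batch should score a much lower loss than the mismatched one. If the encoder cannot tell candidates apart at all (every similarity equal), the loss equals log N for a batch of N, so a larger batch makes the task harder and leaves more room for the loss to reward a good encoder.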

Why it works

By repeatedly contrasting positives against negatives, the encoder learns the features that distinguish meaningfully different inputs, producing a space where downstream similarity search and classification are easy.
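
To make "similarity search is easy" concrete, here is a small sketch of nearest-neighbour lookup over embeddings. The function name `nearest_neighbors` and the two-dimensional vectors are hypothetical, and a real system would use embeddings produced by a trained encoder rather than hand-written ones.

```python
import numpy as np

def nearest_neighbors(query, index, k=2):
    """Indices of the k rows of `index` most cosine-similar to `query`."""
    q = query / np.linalg.norm(query)
    idx = index / np.linalg.norm(index, axis=1, keepdims=True)
    sims = idx @ q                       # one cosine similarity per indexed row
    return np.argsort(-sims)[:k].tolist()  # highest similarity first

# Hypothetical embeddings: rows 0-1 form one cluster, rows 2-3 another.
index = np.array([[1.0, 0.1],
                  [0.9, 0.2],
                  [-0.1, 1.0],
                  [0.0, 0.9]])

top = nearest_neighbors(np.array([1.0, 0.0]), index, k=2)
print(top)
```

Because related items already sit close together in the embedding space, the search needs nothing beyond a similarity computation and a sort.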

Key idea

Contrastive learning shapes an embedding space by pulling positives together and pushing negatives apart, and the InfoNCE loss with many in-batch negatives is the workhorse that makes this scale.

Check yourself

1. What does a contrastive loss do with a positive and its negatives?

2. Why do larger batches often help InfoNCE training?