Contrastive Learning

The goal

Contrastive learning trains an encoder so that semantically related inputs map to nearby vectors and unrelated inputs map to distant ones. It learns a useful embedding space without needing explicit class labels for every example.

Positives and negatives

Each anchor example is paired with:

A positive: an example that should lie close to the anchor, such as an augmented view of the same input or a paraphrase.
One or more negatives: examples that should lie far from the anchor.

The loss rewards high similarity to the positive and low similarity to the negatives.
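As a rough illustration of what "high similarity to the positive and low similarity to the negatives" means, the sketch below compares an anchor against one positive and two negatives. Cosine similarity is assumed as the similarity measure (the text does not name one), and the vectors are hypothetical toy embeddings chosen by hand, not the output of a trained encoder.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors, a value in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical 2-D embeddings, picked by hand for illustration only.
anchor = np.array([1.0, 0.0])
positive = np.array([0.9, 0.1])          # points almost the same way as the anchor
negatives = [
    np.array([0.0, 1.0]),                # orthogonal to the anchor
    np.array([-1.0, 0.2]),               # points roughly the opposite way
]

pos_sim = cosine_similarity(anchor, positive)
neg_sims = [cosine_similarity(anchor, n) for n in negatives]

# A well-trained encoder should produce this ordering: the positive
# scores higher than every negative.
positive_wins = pos_sim > max(neg_sims)
```

In a well-trained embedding space, `positive_wins` would hold for most anchors; the loss is what pushes the encoder toward that ordering.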

The InfoNCE loss

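A common choice is the InfoNCE loss, which frames training as a classification problem: pick the anchor's true positive out of the candidates in a batch. For each anchor, it applies a softmax over the anchor's similarities to all candidates and penalizes the encoder when the positive does not receive the most weight, which amounts to a cross-entropy loss with the positive as the correct class. Because every other item in the batch acts as a negative, larger batches supply more negatives and typically a stronger training signal.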
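The softmax-over-similarities idea can be sketched in a few lines of NumPy. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `info_nce_loss` is hypothetical, cosine similarity is assumed, row i of `positives` is taken to be the positive for row i of `anchors`, and the `temperature` scaling factor is a common addition that the text above does not mention.

```python
import numpy as np


def info_nce_loss(anchors, positives, temperature=0.1):
    """In-batch InfoNCE sketch.

    anchors, positives: arrays of shape (N, D). Row i of `positives` is the
    positive for row i of `anchors`; every other row serves as a negative.
    `temperature` is an assumed hyperparameter, not taken from the text.
    """
    # L2-normalize so that dot products are cosine similarities.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)

    # (N, N) similarity matrix: entry [i, j] compares anchor i with candidate j.
    logits = (a @ p.T) / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # Cross-entropy with the diagonal (the true positive) as the target class.
    return float(-np.mean(np.diag(log_probs)))


# Usage with random toy data: well-aligned pairs should give a lower loss
# than pairs whose positives have been shifted to the wrong rows.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
aligned_loss = info_nce_loss(x, x + 0.01 * rng.normal(size=x.shape))
shuffled_loss = info_nce_loss(x, np.roll(x, 1, axis=0))
```

Each row of the similarity matrix holds one positive and N - 1 negatives, so increasing the batch size N directly increases the number of negatives each anchor is contrasted against.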

Why it works

By repeatedly contrasting positives against negatives, the encoder learns the features that distinguish meaningfully different inputs. The result is an embedding space in which downstream similarity search and classification become easier.

Key idea