Hard Negative Mining

How surfacing the toughest negatives sharpens decision boundaries in retrieval and metric learning.

Easy negatives stop teaching

Once a model can separate the obvious cases, randomly sampled negatives mostly produce near-zero loss, and therefore near-zero gradient, so they waste compute. Hard negative mining instead feeds the model negatives that it currently confuses with the positive, where the loss and gradient are large and each example teaches more.
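To make the "near-zero loss" point concrete, here is a minimal sketch that assumes a triplet margin loss over Euclidean distances. The loss choice, the margin of 0.2, and the toy vectors are illustrative assumptions, not something this lesson prescribes.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: max(0, d(a, p) - d(a, n) + margin).

    The margin value is an illustrative assumption.
    """
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.3, 0.0]
easy_negative = [5.0, 0.0]  # far from the anchor: already well separated
hard_negative = [0.4, 0.0]  # almost as close to the anchor as the positive

easy_loss = triplet_loss(anchor, positive, easy_negative)
hard_loss = triplet_loss(anchor, positive, hard_negative)

print("easy negative loss:", easy_loss)
print("hard negative loss:", hard_loss)
```

The easy negative is clipped to zero by the hinge, so it contributes no gradient, while the hard negative still produces a positive loss.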

What a hard negative is

A negative that the model scores almost as high as the true positive.
In retrieval, an item that is close to the query in embedding space but is actually wrong.
These sit right at the decision boundary the model has not yet learned.

How to mine them

Offline mining: periodically search the corpus with the current model to find high-scoring negatives, then train on them.
In-batch mining: within each training batch, pick the hardest negative for each anchor, which is cheap because the batch embeddings are already computed.
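Both strategies can be sketched in a few lines. This is a simplified illustration that assumes dot-product scoring over small in-memory lists; the function names `mine_offline` and `hardest_in_batch`, the document ids, and the toy vectors are all hypothetical. A real offline miner would typically use an approximate nearest-neighbor index rather than a full scan.

```python
def dot(u, v):
    """Dot-product similarity between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mine_offline(query_vec, corpus_vecs, positive_ids, k=2):
    """Offline mining sketch: score the whole corpus with the current
    embeddings and return the ids of the k highest-scoring items that
    are not labeled positive for this query."""
    scored = [
        (dot(query_vec, vec), doc_id)
        for doc_id, vec in corpus_vecs.items()
        if doc_id not in positive_ids
    ]
    scored.sort(reverse=True)  # highest score first
    return [doc_id for _, doc_id in scored[:k]]

def hardest_in_batch(embeddings, labels):
    """In-batch mining sketch: for each anchor, return the index of the
    highest-scoring batch item with a different label (None if absent)."""
    hardest = []
    for i, anchor in enumerate(embeddings):
        candidates = [
            (dot(anchor, other), j)
            for j, other in enumerate(embeddings)
            if labels[j] != labels[i]
        ]
        hardest.append(max(candidates)[1] if candidates else None)
    return hardest

# Offline: hypothetical corpus where "d1" is the labeled positive.
corpus = {"d1": [1.0, 0.0], "d2": [0.9, 0.1], "d3": [0.0, 1.0], "d4": [0.7, 0.3]}
offline_negatives = mine_offline([1.0, 0.0], corpus, positive_ids={"d1"}, k=2)

# In-batch: hypothetical batch of four embeddings with class labels.
batch = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2], [0.0, 1.0]]
batch_labels = ["a", "a", "b", "c"]
in_batch_negatives = hardest_in_batch(batch, batch_labels)

print(offline_negatives)
print(in_batch_negatives)
```

The offline variant sees the whole corpus but goes stale as the model changes, which is why it is rerun periodically. The in-batch variant always uses current embeddings but can only pick from whatever happens to be in the batch.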

The danger of going too hard

The very hardest item may actually be a false negative, a correct answer that was never labeled positive. Training against it pushes the model the wrong way.
Semi-hard negatives, which still score below the positive but fall within the margin of it, so they are hard without being the absolute hardest, are often the safest and most effective choice.
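One common way to make "semi-hard" precise, assuming a triplet-style setup with distances, is to keep only negatives that are farther from the anchor than the positive but by less than the margin: d(a, p) < d(a, n) < d(a, p) + margin. The sketch below applies that filter; the function name, the margin of 0.2, and the toy vectors are illustrative assumptions.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def semi_hard_negatives(anchor, positive, negatives, margin=0.2):
    """Return indices of negatives n with d(a, p) < d(a, n) < d(a, p) + margin.

    Negatives closer than the positive (the very hardest, and the most
    likely to be unlabeled positives) are skipped, as are easy negatives
    that already clear the margin.
    """
    d_pos = euclidean(anchor, positive)
    picked = []
    for idx, neg in enumerate(negatives):
        d_neg = euclidean(anchor, neg)
        if d_pos < d_neg < d_pos + margin:
            picked.append(idx)
    return picked

# Toy 2-D embeddings (hypothetical values).
anchor = [0.0, 0.0]
positive = [0.5, 0.0]
negatives = [
    [0.3, 0.0],  # closer than the positive: hardest, possibly a false negative
    [0.6, 0.0],  # just beyond the positive, inside the margin: semi-hard
    [2.0, 0.0],  # far beyond the margin: easy
]

selected = semi_hard_negatives(anchor, positive, negatives)
print(selected)
```

Skipping the closest candidates does not guarantee that false negatives are avoided; it only reduces the chance of training against them.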

Key idea

Hard negative mining trains on confusing, near-boundary negatives for fast learning, but the very hardest can be mislabeled positives, so semi-hard negatives are often the safest choice.
