Learning to Rank

Ranking is not regression

Ordering items is different from predicting a single number. Learning-to-rank methods optimize the relative order of items, and they come in three families that differ in the unit of data the loss is computed over: a single item, a pair of items, or a whole list.

Pointwise

Treats each item independently, predicting a score or relevance label.
Loss is per item, such as squared error for scores or cross-entropy for relevance labels.
Simple and scalable, but it ignores that ranking only cares about order, not exact scores.
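
As a minimal sketch of this limitation, the snippet below scores three items with two hypothetical models. The labels, the scores, and the function names are illustrative assumptions, not taken from any library. The model with the lower squared error has the wrong order, and the model with the higher squared error has the right one.

```python
def pointwise_squared_error(scores, labels):
    """Mean squared error, computed independently for each item."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(labels)

def ranking(scores):
    """Item indices sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

# Hypothetical graded relevance for three items (higher is more relevant).
labels = [3.0, 2.0, 0.0]

# Close to the labels numerically, but items 0 and 1 come out swapped.
model_a = [2.4, 2.5, 0.1]
# Far from the labels numerically, but the order is exactly right.
model_b = [9.0, 5.0, 1.0]

loss_a = pointwise_squared_error(model_a, labels)
loss_b = pointwise_squared_error(model_b, labels)

print("model_a loss:", loss_a, "ranking:", ranking(model_a))
print("model_b loss:", loss_b, "ranking:", ranking(model_b))
```

A pointwise loss prefers model_a here, even though only model_b ranks the items correctly.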

Pairwise

Looks at pairs of items and learns which should rank higher.
Loss penalizes pairs that are ordered wrongly, as in RankNet.
Captures relative preference directly, but gives a misordered pair the same weight wherever it falls in the list, even though ranking metrics care most about the top positions.
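
A minimal sketch of a RankNet-style pairwise loss, assuming the usual logistic form log(1 + exp(-(s_i - s_j))) for a pair in which item i should rank above item j. The function names and example scores are hypothetical. The two imperfect score lists contain one adjacent swap each, at the top and at the bottom, with the same score margins.

```python
import math

def ranknet_pair_loss(s_hi, s_lo):
    """Logistic loss for one pair whose first item should rank higher.

    Equals log(1 + exp(-(s_hi - s_lo))), written in a numerically stable form.
    """
    diff = s_hi - s_lo
    return max(-diff, 0.0) + math.log1p(math.exp(-abs(diff)))

def pairwise_loss(scores, labels):
    """Average pair loss over all pairs with different relevance labels."""
    total, n_pairs = 0.0, 0
    for i in range(len(labels)):
        for j in range(len(labels)):
            if labels[i] > labels[j]:  # item i should outrank item j
                total += ranknet_pair_loss(scores[i], scores[j])
                n_pairs += 1
    return total / n_pairs if n_pairs else 0.0

# Hypothetical graded relevance for four items.
labels = [3, 2, 1, 0]

perfect = [3.0, 2.0, 1.0, 0.0]
top_swap = [2.0, 3.0, 1.0, 0.0]     # the two most relevant items are swapped
bottom_swap = [3.0, 2.0, 0.0, 1.0]  # the two least relevant items are swapped

loss_perfect = pairwise_loss(perfect, labels)
loss_top = pairwise_loss(top_swap, labels)
loss_bottom = pairwise_loss(bottom_swap, labels)

print(loss_perfect, loss_top, loss_bottom)
```

Both swaps raise the loss above the perfect ordering, but by the same amount: the plain pairwise loss has no notion of position, so a mistake at the top costs no more than one at the bottom.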

Listwise

Considers the whole ranked list at once.
Optimizes a list-level objective, often a smooth surrogate for a ranking metric, as in ListNet or LambdaMART-style approaches.
Aligns best with metrics like normalized discounted cumulative gain (NDCG), but is more complex to train.
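
The sketch below pairs a ranking metric with a smooth list-level surrogate. It assumes one common NDCG variant (gain 2^rel - 1, log2 position discount) and a ListNet-style top-one loss, taken here as the cross-entropy between a softmax over labels and a softmax over scores. Function names and example scores are hypothetical.

```python
import math

def dcg(relevances):
    """Discounted cumulative gain: gain 2^rel - 1, discount log2(position + 1)."""
    return sum((2 ** r - 1) / math.log2(pos + 2) for pos, r in enumerate(relevances))

def ndcg(scores, labels):
    """DCG of the order induced by the scores, divided by the ideal DCG."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0

def log_softmax(xs):
    """Numerically stable log of the softmax distribution over xs."""
    m = max(xs)
    log_z = m + math.log(sum(math.exp(x - m) for x in xs))
    return [x - log_z for x in xs]

def listnet_top1_loss(scores, labels):
    """Cross-entropy between softmax(labels) and softmax(scores)."""
    p_true = [math.exp(v) for v in log_softmax(labels)]
    log_p_pred = log_softmax(scores)
    return -sum(p * lq for p, lq in zip(p_true, log_p_pred))

# Hypothetical graded relevance for four items.
labels = [3, 2, 1, 0]

perfect = [3.0, 2.0, 1.0, 0.0]
top_swap = [2.0, 3.0, 1.0, 0.0]     # the two most relevant items are swapped
bottom_swap = [3.0, 2.0, 0.0, 1.0]  # the two least relevant items are swapped

for name, scores in [("perfect", perfect), ("top_swap", top_swap), ("bottom_swap", bottom_swap)]:
    print(name, "NDCG:", ndcg(scores, labels), "loss:", listnet_top1_loss(scores, labels))
```

In this example the swap at the top lowers NDCG more than the swap at the bottom, and the list-level loss moves the same way, which is the kind of metric alignment a listwise objective aims for. The loss is smooth in the scores, whereas NDCG changes only when the sort order changes.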

Choosing a family

Pointwise is a fast baseline. Pairwise adds sensitivity to order at modest extra cost. Listwise aligns most closely with the target metric and is worth choosing when ranking quality justifies the added training complexity.

Key idea

Learning to rank optimizes order rather than exact scores, with pointwise, pairwise, and listwise losses trading simplicity for ever-tighter alignment with ranking metrics.