Grading an ordered list
Search and recommendation systems return a ranked list, and what matters is putting relevant items near the top. Plain accuracy ignores order, so we need metrics that reward good positions.
Mean average precision
Average precision (AP) for one query computes precision at each rank where a relevant item appears, then averages those values over the query's relevant items, so a relevant item that is never retrieved contributes zero. Mean average precision, or MAP, averages AP across all queries. It rewards retrieving relevant items and placing them early.
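As a minimal sketch of the calculation described above, assume each query's result list is given as 0/1 relevance flags in rank order. The function names and the optional `total_relevant` parameter are illustrative choices, not part of any particular library.

```python
from typing import List, Optional


def average_precision(flags: List[int], total_relevant: Optional[int] = None) -> float:
    """AP for one query. flags[i] is 1 if the item at rank i + 1 is relevant, else 0.

    total_relevant is an assumed optional argument: the number of relevant items
    that exist for the query. If omitted, only the relevant items found in the
    list are counted.
    """
    hits = 0
    precision_sum = 0.0
    for rank, flag in enumerate(flags, start=1):
        if flag:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    denominator = total_relevant if total_relevant is not None else hits
    return precision_sum / denominator if denominator else 0.0


def mean_average_precision(queries: List[List[int]]) -> float:
    """MAP: the mean of per-query AP values."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)
```

For example, `average_precision([1, 0, 1, 0])` averages the precisions 1/1 and 2/3 to give about 0.833, while `average_precision([0, 1, 0, 0])` gives 0.5, so the MAP over both queries is about 0.667.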
Normalized discounted cumulative gain
NDCG handles graded relevance, where items can be highly, mildly, or not relevant.
- Cumulative gain sums the relevance of the returned items.
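- Discounting divides each item's gain by a logarithm of its position, commonly log2(rank + 1), so lower positions count less and the top item is left undiscounted.
- Normalizing divides by the discounted cumulative gain of the best possible ordering, so the score lands between zero and one.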
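The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions: gains are the raw relevance grades (some formulations use 2^relevance - 1 instead), the discount is log2(rank + 1), and the ideal ordering is built only from the items that were returned. The function names are hypothetical.

```python
import math
from typing import Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain: each gain divided by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains: Sequence[float]) -> float:
    """DCG of the given order divided by the DCG of the best possible order."""
    ideal = dcg(sorted(gains, reverse=True))  # best ordering: highest grades first
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

With grades on a 0 to 3 scale, `ndcg([3, 2, 1])` is 1.0 because the list is already in the best order, while the reversed list `ndcg([1, 2, 3])` scores about 0.79.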
NDCG shines when relevance has degrees, while MAP suits binary relevant-or-not judgments.
Key idea
MAP and NDCG grade ranked lists by rewarding relevant items placed high. MAP fits binary relevance, while NDCG handles graded relevance and discounts items that sit lower in the list.