Link Prediction

Predict which edges are missing or will form, the heart of graph recommendation.

The task

Link prediction asks whether an edge should exist between two nodes. Will these two users become friends? Should this user be connected to this product? Recommendation is itself link prediction on a user item graph.

How it is framed

Learn an embedding for each node, often with a GNN.
Score a candidate pair by combining their two embeddings, commonly a dot product or a small network over their concatenation.
A high score means the link is likely.

Training with negatives

Real edges are positives. But the graph has only a few edges out of all possible pairs, so you must sample negatives: non existent edges treated as negative examples. The model learns to score true edges above sampled non edges, a ranking objective.

Evaluation pitfalls

Leakage is dangerous: if a test edge stays in the graph during training, the model cheats. Remove test edges from the message passing graph.
Metrics use ranking measures such as area under the curve and mean reciprocal rank against sampled negatives.

Key idea

Link prediction scores node pairs from their embeddings to find missing or future edges, trained with sampled negatives and guarded against edge leakage.

The task

How it is framed

Training with negatives

Evaluation pitfalls

Key idea

Check yourself