The intuition
PageRank scores the importance of each node in a directed graph. The recursive idea is elegant: a node is important if important nodes point to it. Originally built to rank web pages, it now ranks nodes in many graphs.
The random surfer
Imagine a surfer who, at each step, either follows a random outgoing link or, with a small probability, teleports to a random node. PageRank is the long run fraction of time the surfer spends at each node, the stationary distribution of this walk.
The update rule
- Each node passes its current score evenly across its outgoing links.
- A node new score sums the contributions it receives from nodes pointing to it.
- A damping factor, typically around 0.85, mixes in the teleport probability so the walk does not get stuck in dead ends or loops.
- Iterate this update until the scores converge.
Why teleport matters
Without teleportation, dangling nodes with no out links and isolated loops would trap all the score. The teleport term guarantees a well defined, unique stationary distribution.
Key idea
PageRank ranks nodes by a random surfer stationary distribution, where importance flows along links and a damping teleport ensures convergence.