A unifying framework
Almost every GNN variant fits one template called message passing. Understanding it lets you read graph convolutional networks, GraphSAGE, and attention models as instances of the same loop.
The three steps per layer
- Message: for each edge, compute a message from the neighbor's vector, possibly transformed by a weight matrix.
- Aggregate: combine all messages arriving at a node with a permutation-invariant function such as sum, mean, or max, since neighbors have no order.
- Update: merge the aggregate with the node's current vector, often through a small neural layer, to produce the node's next vector.
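The three steps can be sketched in a few lines of plain Python. This is a toy illustration, not any library's API: the function name `message_passing_layer` is made up, the scalars `w_msg` and `w_self` stand in for learned weight matrices, and sum aggregation with a ReLU update is just one possible choice.

```python
# Toy sketch of one message-passing layer (pure Python, nothing is learned).
# All names are illustrative and not taken from any library.

def message_passing_layer(features, edges, w_msg=0.5, w_self=0.5):
    """Run message, aggregate, and update once.

    features: dict mapping node -> list of floats (current node vectors)
    edges: list of (src, dst) pairs; a message flows from src to dst
    w_msg, w_self: scalar stand-ins for learned weight matrices (assumption)
    """
    dim = len(next(iter(features.values())))
    # Aggregation buffer: one zero vector per node.
    agg = {node: [0.0] * dim for node in features}
    for src, dst in edges:
        # Message: the neighbor's vector, scaled by w_msg.
        message = [w_msg * x for x in features[src]]
        # Aggregate: a running sum, which does not depend on edge order.
        agg[dst] = [a + m for a, m in zip(agg[dst], message)]
    # Update: mix the aggregate with the node's own vector, then apply ReLU.
    return {
        node: [max(0.0, w_self * h + a) for h, a in zip(features[node], agg[node])]
        for node in features
    }


# Undirected path graph 0 - 1 - 2, stored as directed edges in both directions.
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
out = message_passing_layer(features, edges)
print(out)
```

Calling the function again on `out` would correspond to stacking a second layer, with each node then seeing information from two hops away.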
Why permutation invariance matters
Neighbors form a set, not a sequence. The aggregate must return the same result for any ordering of the neighbors, so sum, mean, and max qualify, while plain concatenation, whose output changes when the neighbors are reordered, does not.
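A quick way to see the difference is to try every ordering of a small neighbor set. The sketch below uses made-up vectors and hypothetical helper names (`agg_sum`, `agg_concat`, `is_permutation_invariant`); it checks one example rather than proving the property in general.

```python
from itertools import permutations

# Three made-up neighbor vectors for a single node.
neighbor_vectors = [[1.0, 2.0], [3.0, 0.0], [0.0, 5.0]]


def agg_sum(vectors):
    # Element-wise sum across neighbors.
    return [sum(col) for col in zip(*vectors)]


def agg_mean(vectors):
    # Element-wise mean across neighbors.
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def agg_max(vectors):
    # Element-wise max across neighbors.
    return [max(col) for col in zip(*vectors)]


def agg_concat(vectors):
    # Plain concatenation: the output depends on neighbor order.
    return [x for vec in vectors for x in vec]


def is_permutation_invariant(aggregate, vectors):
    # Compare the result for every ordering against the original ordering.
    reference = aggregate(vectors)
    return all(aggregate(list(p)) == reference for p in permutations(vectors))


results = {
    "sum": is_permutation_invariant(agg_sum, neighbor_vectors),
    "mean": is_permutation_invariant(agg_mean, neighbor_vectors),
    "max": is_permutation_invariant(agg_max, neighbor_vectors),
    "concat": is_permutation_invariant(agg_concat, neighbor_vectors),
}
print(results)
```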
Depth and its dangers
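Each layer extends a node's view by one hop, so after k layers its vector depends on nodes up to k hops away. But stacking too many layers causes over-smoothing: node vectors converge toward nearly the same value and the distinctions between nodes vanish. Many practical GNNs therefore use only two or three layers.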
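The effect is easy to reproduce in a stripped-down setting. The sketch below assumes the simplest possible layer, a plain mean over each node and its neighbors with no weights or nonlinearity, on a made-up four-node path graph. Real GNN layers are more complex, so treat this as an intuition aid rather than a model of any specific architecture.

```python
def mean_layer(values, neighbors):
    # Each node averages its own value with its neighbors' values.
    return {
        node: (values[node] + sum(values[n] for n in nbrs)) / (1 + len(nbrs))
        for node, nbrs in neighbors.items()
    }


def spread(values):
    # Gap between the largest and smallest node value.
    return max(values.values()) - min(values.values())


# Path graph 0 - 1 - 2 - 3 with one scalar feature per node.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
values = {0: 0.0, 1: 0.0, 2: 1.0, 3: 1.0}

# Record the spread before any layer and after each of 20 stacked layers.
spreads = [spread(values)]
for _ in range(20):
    values = mean_layer(values, neighbors)
    spreads.append(spread(values))

for depth in (0, 2, 5, 10, 20):
    print(depth, round(spreads[depth], 4))
```

The spread never grows from one layer to the next, and after enough layers the four nodes become nearly indistinguishable, which is the over-smoothing behavior described above.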
Variations
Graph attention learns a weight for each neighbor instead of applying a flat mean. GraphSAGE samples a fixed number of neighbors per node so that it scales to huge graphs.
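Both ideas can be sketched loosely in plain Python. The helper names `sample_neighbors` and `attention_aggregate` are hypothetical, and the dot-product score is a simple stand-in for a learned scoring function, so this is not how GraphSAGE or graph attention networks are actually implemented.

```python
import math
import random


def sample_neighbors(neighbors, k, rng):
    # Keep at most k neighbors, chosen uniformly without replacement.
    if len(neighbors) <= k:
        return list(neighbors)
    return rng.sample(neighbors, k)


def attention_aggregate(node_vec, neighbor_vecs):
    # Score each neighbor with a dot product against the center node.
    # (Assumption: a real attention layer would learn this scoring function.)
    scores = [sum(a * b for a, b in zip(node_vec, vec)) for vec in neighbor_vecs]
    # Softmax turns the scores into positive weights that sum to 1.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of neighbor vectors replaces the flat mean.
    dim = len(node_vec)
    agg = [sum(w * vec[i] for w, vec in zip(weights, neighbor_vecs)) for i in range(dim)]
    return weights, agg


rng = random.Random(0)
# A node with 100 neighbors is cut down to a fixed budget of 5.
sampled = sample_neighbors(list(range(100)), k=5, rng=rng)

# The neighbor most aligned with the center node gets the largest weight.
weights, agg = attention_aggregate([1.0, 0.0], [[2.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
print(sampled, weights, agg)
```

Sampling bounds the cost per node regardless of degree, while the softmax weights still form a permutation-invariant aggregate because each weight travels with its neighbor.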
Key idea
Message passing runs message, aggregate, and update in each layer, using permutation-invariant aggregation; too many layers cause over-smoothing.