Node Classification

The task

Node classification assigns a label to each node in a graph. Typical examples are classifying a paper by topic in a citation graph, a user as bot or human in a social network, or a protein by function. The signal comes from each node's own features plus the graph structure around it.

The key assumption

Many graphs show homophily: connected nodes tend to share labels. A paper that cites many machine learning papers is probably a machine learning paper itself. GNNs exploit this by smoothing information across edges, so that connected nodes end up with similar representations.
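One simple way to check how strongly a graph follows this assumption is to measure the fraction of edges whose endpoints share a label. The sketch below is a minimal illustration; the function name `edge_homophily` and the toy citation graph are made up for this example.

```python
def edge_homophily(edges, labels):
    """Return the fraction of edges whose two endpoints share a label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)


# Hypothetical toy citation graph: nodes 0-2 are "ML" papers, nodes 3-4 are "DB" papers.
labels = ["ML", "ML", "ML", "DB", "DB"]
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (2, 3)]

ratio = edge_homophily(edges, labels)
print(ratio)  # 4 of the 5 edges connect same-label nodes
```

A ratio near 1 suggests that smoothing across edges will help; a ratio near the value expected from random label assignment suggests the structure carries little label signal by itself.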

The transductive setting

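Often you have one big graph where some nodes are labeled and most are not. Training and prediction happen on the same graph: you learn from the labeled nodes and propagate to the unlabeled ones. This differs from the usual supervised setup, where the test examples are held out entirely. Here the unlabeled nodes' features and edges are visible during training, and only their labels are hidden.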
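In practice this setting is often represented with a boolean mask over the nodes of the single graph, not with two separate datasets. The sketch below assumes that representation; `make_train_mask`, the node count, and the labels are hypothetical.

```python
import random


def make_train_mask(num_nodes, num_labeled, seed=0):
    """Pick which nodes keep their labels; every node stays in the graph."""
    rng = random.Random(seed)
    labeled = set(rng.sample(range(num_nodes), num_labeled))
    return [i in labeled for i in range(num_nodes)]


# Hypothetical ground-truth labels for a 10-node graph.
true_labels = [0, 1, 0, 0, 1, 1, 0, 1, 0, 1]
train_mask = make_train_mask(len(true_labels), num_labeled=3)

# Labels the model is allowed to see: None marks a hidden label.
visible_labels = [y if keep else None for y, keep in zip(true_labels, train_mask)]

# Nodes whose labels must be predicted. They are still part of the graph,
# so their features and edges remain available during training.
predict_nodes = [i for i, keep in enumerate(train_mask) if not keep]
```

The mask only hides labels. Nothing is removed from the graph, which is what distinguishes this from a conventional train/test split.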

How a GNN does it

Run several message passing layers so each node absorbs its neighborhood; with k layers, a node receives information from nodes up to k hops away.
Attach a classifier head that maps the final node vector to label probabilities.
Train with cross entropy on the labeled nodes only, letting structure carry information to the rest.
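The three steps above can be sketched in plain Python as a forward pass and loss computation. This is a simplified illustration, not a full training loop: it assumes mean aggregation with a self-loop as the message passing rule, uses fixed identity weights where a real model would use learned ones, and all names and the toy 4-node path graph are hypothetical.

```python
import math


def mean_aggregate(features, neighbors):
    """Average each node's vector with the vectors of its neighbors."""
    out = []
    for i, h in enumerate(features):
        group = [h] + [features[j] for j in neighbors[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


def linear(features, weight):
    """Multiply each node vector (length d_in) by a d_in x d_out matrix."""
    cols = list(zip(*weight))
    return [[sum(x * w for x, w in zip(h, col)) for col in cols] for h in features]


def gnn_layer(features, neighbors, weight):
    """One message passing layer: aggregate, transform, apply ReLU."""
    transformed = linear(mean_aggregate(features, neighbors), weight)
    return [[max(0.0, x) for x in h] for h in transformed]


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def masked_cross_entropy(probs, labels, train_mask):
    """Mean cross entropy over labeled nodes only."""
    losses = [-math.log(p[y]) for p, y, keep in zip(probs, labels, train_mask) if keep]
    return sum(losses) / len(losses)


# Toy path graph 0 - 1 - 2 - 3 with 2-dimensional node features.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
labels = [0, None, None, 1]              # only the two end nodes are labeled
train_mask = [True, False, False, True]

identity = [[1.0, 0.0], [0.0, 1.0]]      # stand-in for learned weights

# Step 1: several message passing layers.
h = gnn_layer(features, neighbors, identity)
h = gnn_layer(h, neighbors, identity)

# Step 2: classifier head mapping node vectors to label probabilities.
probs = [softmax(logits) for logits in linear(h, identity)]

# Step 3: cross entropy on the labeled nodes only.
loss = masked_cross_entropy(probs, labels, train_mask)

# Predictions are produced for every node, labeled or not.
predictions = [p.index(max(p)) for p in probs]
```

In a real model the weight matrices would be updated by gradient descent on `loss`. The unlabeled nodes never contribute to the loss directly, but they still shape it through the aggregation step.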

Key idea

Node classification labels graph nodes from features and structure, exploiting homophily and often training transductively on one partly labeled graph.
