Machine Learning

Node Classification

Label nodes in a graph using their features and the labels of their neighbors.

The task

Node classification assigns a label to each node in a graph. Typical examples are classifying a paper by topic in a citation graph, a user as bot or human in a social network, or a protein by function. The signal comes from each node's own features plus the graph structure around it.

The key assumption

Many graphs show homophily: connected nodes tend to share labels. A paper citing many machine learning papers is probably a machine learning paper. Graph neural networks (GNNs) exploit this by smoothing node representations across edges, so that connected nodes end up with similar predictions.
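One common way to put a number on homophily is the edge homophily ratio: the fraction of edges whose two endpoints carry the same label. The sketch below computes it on a made-up six-node graph; the function name `edge_homophily` and the toy edges and labels are illustrative assumptions, not part of the lesson.

```python
def edge_homophily(edges, labels):
    """Fraction of edges whose two endpoints carry the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(1 for u, v in edges if labels[u] == labels[v])
    return same / len(edges)

# Toy graph (assumed for illustration): two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
labels = ["ML", "ML", "ML", "Bio", "Bio", "Bio"]

ratio = edge_homophily(edges, labels)
print(round(ratio, 3))  # 6 of the 7 edges join same-label nodes
```

A ratio near 1 suggests that smoothing across edges will help; a ratio near what random labeling would give suggests the homophily assumption is weak for that graph.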

The transductive setting

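Often you have one big graph where some nodes are labeled and most are not. Training and prediction happen on the same graph: you learn from the labeled nodes and propagate to the unlabeled ones. This differs from the usual supervised setup, where a model is trained on one set of examples and evaluated on a separate set it never saw. Here the unlabeled nodes, with their features and edges, are visible during training; only their labels are hidden.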
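To make the setting concrete, here is a minimal sketch that uses classical label propagation rather than a GNN: labeled nodes stay fixed, and every unlabeled node repeatedly takes the average of its neighbors' class scores. The function name `propagate_labels`, the `known` dictionary, and the toy graph are assumptions made for illustration.

```python
def propagate_labels(num_nodes, edges, known, num_classes, iterations=20):
    """Spread labels from the nodes in `known` to all other nodes of the same graph."""
    neighbors = [[] for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    def one_hot(c):
        return [1.0 if k == c else 0.0 for k in range(num_classes)]

    # Labeled nodes start one-hot; unlabeled nodes start undecided.
    uniform = [1.0 / num_classes] * num_classes
    scores = [one_hot(known[i]) if i in known else list(uniform)
              for i in range(num_nodes)]

    for _ in range(iterations):
        updated = []
        for i in range(num_nodes):
            if i in known or not neighbors[i]:
                updated.append(scores[i])  # clamp labeled nodes, skip isolated ones
            else:
                updated.append([
                    sum(scores[j][k] for j in neighbors[i]) / len(neighbors[i])
                    for k in range(num_classes)
                ])
        scores = updated

    return [max(range(num_classes), key=lambda k: row[k]) for row in scores]

# One graph, two labeled nodes, four unlabeled ones (toy data).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
known = {0: 0, 5: 1}
predicted = propagate_labels(6, edges, known, num_classes=2)
print(predicted)  # [0, 0, 0, 1, 1, 1]
```

There is no separate test graph: the four nodes being predicted were part of the graph the whole time, which is what makes the procedure transductive.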

How a GNN does it

  • Run several message passing layers so each node's vector absorbs information from its neighborhood; k layers reach neighbors up to k hops away.
  • Attach a classifier head that maps the final node vector to label probabilities.
  • Train with cross-entropy loss on the labeled nodes only, letting structure carry information to the rest.
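The three steps above can be sketched in plain Python under strong simplifying assumptions. The message passing layers here are parameter-free mean aggregation over a node and its neighbors, whereas real GNN layers also have learned weights and nonlinearities; the classifier head is a single linear map with softmax, trained by gradient descent. All function names, the toy graph, and the toy features are hypothetical. The unlabeled nodes are deliberately given uninformative features, so any correct prediction for them has to come from structure.

```python
import math

def build_neighborhoods(num_nodes, edges):
    """Neighbor lists for an undirected graph, with a self-loop on every node."""
    hoods = [{i} for i in range(num_nodes)]
    for u, v in edges:
        hoods[u].add(v)
        hoods[v].add(u)
    return [sorted(h) for h in hoods]

def message_pass(vectors, hoods):
    """One layer: each node's new vector is the mean over itself and its neighbors."""
    dim = len(vectors[0])
    return [[sum(vectors[j][d] for j in hood) / len(hood) for d in range(dim)]
            for hood in hoods]

def softmax(logits):
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def class_probs(vector, weights):
    """Classifier head: linear map followed by softmax."""
    num_classes = len(weights[0])
    logits = [sum(x * weights[d][c] for d, x in enumerate(vector))
              for c in range(num_classes)]
    return softmax(logits)

def fit_and_predict(features, edges, labels, num_classes,
                    num_layers=2, lr=0.5, epochs=200):
    # Step 1: message passing over the whole graph.
    hoods = build_neighborhoods(len(features), edges)
    hidden = features
    for _ in range(num_layers):
        hidden = message_pass(hidden, hoods)

    # Step 2: classifier head, initialized to zero.
    dim = len(hidden[0])
    weights = [[0.0] * num_classes for _ in range(dim)]
    labeled = [i for i, y in enumerate(labels) if y is not None]

    # Step 3: gradient descent on cross-entropy, using labeled nodes only.
    for _ in range(epochs):
        grad = [[0.0] * num_classes for _ in range(dim)]
        for i in labeled:
            probs = class_probs(hidden[i], weights)
            for c in range(num_classes):
                error = probs[c] - (1.0 if c == labels[i] else 0.0)
                for d in range(dim):
                    grad[d][c] += error * hidden[i][d] / len(labeled)
        for d in range(dim):
            for c in range(num_classes):
                weights[d][c] -= lr * grad[d][c]

    predictions = []
    for vector in hidden:
        probs = class_probs(vector, weights)
        predictions.append(max(range(num_classes), key=lambda c: probs[c]))
    return predictions

# Toy data: two triangles joined by one edge; only nodes 0 and 5 are labeled.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
features = [[1.0, 0.0], [0.5, 0.5], [0.5, 0.5],
            [0.5, 0.5], [0.5, 0.5], [0.0, 1.0]]
labels = [0, None, None, None, None, 1]
predictions = fit_and_predict(features, edges, labels, num_classes=2)
print(predictions)  # [0, 0, 0, 1, 1, 1]
```

After two rounds of averaging, nodes 1 and 2 have drifted toward node 0's features and nodes 3 and 4 toward node 5's, so a head trained on just two labeled nodes separates all six.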

Key idea

Node classification labels graph nodes from features and structure, exploiting homophily and often training transductively on one partly labeled graph.

Check yourself

1. What graph property does node classification typically exploit?

2. What is special about the transductive setting?