The core task
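Text classification assigns one or more labels from a fixed set to a document. It powers spam filtering, topic routing, and intent detection. The shape of the problem (how many labels exist and how many can apply at once) determines the output layer and the evaluation metric.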
Problem variants
- Binary assigns one of two labels, such as spam or not spam.
- Multiclass picks exactly one label from many, such as a news topic.
- Multilabel allows several labels at once, since an article can be both sports and politics.
The output layer differs accordingly: a single sigmoid (or a two-way softmax) for binary, a softmax over all classes for multiclass, and one independent sigmoid per label for multilabel.
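To make the softmax versus sigmoid distinction concrete, here is a minimal sketch in plain Python. The function names, the example logits, and the 0.5 threshold are illustrative assumptions, not part of any particular library.

```python
import math

def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1 (multiclass)."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Map one logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_multiclass(logits, labels):
    """Pick exactly one label: the one with the highest softmax probability."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]

def predict_multilabel(logits, labels, threshold=0.5):
    """Keep every label whose own sigmoid probability clears the threshold.

    The 0.5 threshold is an assumption; in practice it is often tuned per label.
    """
    return [lab for z, lab in zip(logits, labels) if sigmoid(z) >= threshold]

# Hypothetical logits for one article over three topics.
labels = ["sports", "politics", "tech"]
logits = [2.0, 1.5, -3.0]

print(predict_multiclass(logits, labels))  # one label only
print(predict_multilabel(logits, labels))  # possibly several labels
```

The same logits give one answer under softmax and two under independent sigmoids, because softmax makes the labels compete while each sigmoid scores its label on its own.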
From features to transformers
- Classic pipelines feed bag-of-words or TF-IDF features into a linear model such as logistic regression or a linear SVM, which is fast to train and a strong baseline on simple tasks.
- Modern systems fine-tune a pretrained transformer encoder and pass a pooled vector (for example, the first-token representation or a mean over token embeddings) to a classifier head.
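The classic pipeline can be sketched without any dependencies. This example covers only the TF-IDF and linear scoring steps, not the transformer route. The whitespace tokenizer, the smoothed idf variant, the toy documents, and the hand-set weights are all assumptions for illustration; a real system would learn the weights from labeled data.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per document.

    Weight = raw term count * smoothed idf, where
    idf(t) = log((1 + N) / (1 + df(t))) + 1.
    Other TF-IDF variants exist; this one is an assumption of the sketch.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors

def linear_score(vector, weights, bias=0.0):
    """Dot product of a sparse feature vector with per-term weights."""
    return bias + sum(weights.get(t, 0.0) * v for t, v in vector.items())

# Toy corpus (hypothetical).
docs = [
    "win a free prize now",
    "meeting notes for the team",
    "free prize inside",
]
vectors = tfidf(docs)

# Hand-set weights standing in for a trained linear model (hypothetical).
weights = {"free": 1.0, "prize": 1.0, "meeting": -1.0, "team": -1.0}
scores = [linear_score(v, weights) for v in vectors]

for doc, score in zip(docs, scores):
    print("spam" if score > 0 else "not spam", "|", doc)
```

Terms that appear in fewer documents get a larger idf, so a rare word counts for more than a common one at the same term count.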
Handling imbalance and metrics
- Real data is often skewed, with rare positive classes.
- Accuracy misleads under imbalance, because a model that always predicts the majority class scores high while missing every rare case, so report precision, recall, and F1 per class.
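- Class weighting (a larger loss penalty for errors on rare classes) or resampling (oversampling rare classes or undersampling common ones) helps the model attend to rare labels.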
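The points above can be checked on a small example: a degenerate classifier that never predicts the rare class. The 95/5 split, the label names, and the inverse-frequency weighting formula n / (k * count) are assumptions chosen for illustration; other weighting schemes are equally valid.

```python
from collections import Counter

def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, and F1 for each label, one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    report = {}
    for lab in labels:
        tp = sum(1 for t, p in pairs if t == lab and p == lab)
        fp = sum(1 for t, p in pairs if t != lab and p == lab)
        fn = sum(1 for t, p in pairs if t == lab and p != lab)
        # Define a metric as 0.0 when its denominator is zero.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        report[lab] = {"precision": precision, "recall": recall, "f1": f1}
    return report

def balanced_class_weights(y_true):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(y_true)
    n, k = len(y_true), len(counts)
    return {lab: n / (k * c) for lab, c in counts.items()}

# Skewed toy data: 95 "ham" and 5 "spam" (hypothetical).
y_true = ["ham"] * 95 + ["spam"] * 5
# A degenerate model that always predicts the majority class.
y_pred = ["ham"] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_metrics(y_true, y_pred)
weights = balanced_class_weights(y_true)

print("accuracy:", accuracy)
print("spam metrics:", report["spam"])
print("class weights:", weights)
```

Accuracy looks strong here even though the model finds no spam at all, which only the per-class recall and F1 for spam reveal. The weights show how much more each spam error would count in a weighted loss.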
Key idea
Text classification spans binary, multiclass, and multilabel settings, each with a matching output layer, and under class imbalance per-class precision, recall, and F1 are more trustworthy than accuracy.