From line to probability
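Despite its name, logistic regression is a classifier. It computes a linear score from the features (a weighted sum plus a bias), then squashes that score through the sigmoid to produce a probability between zero and one. Thresholding the probability, commonly at 0.5, gives the predicted class.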
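As a minimal sketch of that pipeline, the snippet below computes a linear score and passes it through the sigmoid. The weights, bias, and feature values are hypothetical, chosen only for illustration, and the helper names (sigmoid, predict_proba) are not taken from any particular library.

```python
import math

def sigmoid(z: float) -> float:
    """Map a real-valued score to a value between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, features):
    """Linear score (w . x + b) passed through the sigmoid."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score)

# Hypothetical parameters and input, for illustration only.
weights = [0.8, -0.5]
bias = 0.1
p = predict_proba(weights, bias, [2.0, 1.0])  # score = 1.6 - 0.5 + 0.1 = 1.2
label = 1 if p >= 0.5 else 0                  # threshold the probability
```

A score of zero maps to a probability of exactly 0.5, so thresholding the probability at 0.5 is the same as checking the sign of the linear score.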
The loss
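Logistic regression is trained with cross entropy, also called log loss. It rewards confident correct predictions and heavily punishes confident wrong ones: the penalty for a positive example is -log(p), which grows without bound as p approaches zero. This loss is convex in the weights, so it has no spurious local minima, and gradient descent with a suitable step size converges to the global minimum. That minimum is unique when the features are not linearly dependent; on perfectly separable data no finite minimum exists unless regularization is added.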
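The sketch below illustrates both points under stated assumptions: it evaluates the log loss for a confident correct, an unsure, and a confident wrong prediction, then runs plain batch gradient descent on a tiny one-feature dataset. The data, learning rate, and step count are invented for illustration, and the clipping constant eps is an implementation convenience, not part of the definition.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(y_true, p_pred, eps=1e-12):
    """Mean binary cross entropy for labels in {0, 1} and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log never receives 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# The true label is 1 in all three cases.
confident_right = log_loss([1], [0.99])  # about 0.01
unsure = log_loss([1], [0.5])            # about 0.69
confident_wrong = log_loss([1], [0.01])  # about 4.61

# Toy data, invented for illustration. The classes overlap, so a finite optimum exists.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 1, 0, 1, 1]

w, b, lr = 0.0, 0.0, 0.1  # start at zero; lr is an assumed step size
initial_loss = log_loss(ys, [sigmoid(w * x + b) for x in xs])
for _ in range(2000):
    ps = [sigmoid(w * x + b) for x in xs]
    # Gradient of the mean log loss: average of (p - y) times each input.
    grad_w = sum((p - y) * x for p, y, x in zip(ps, ys, xs)) / len(xs)
    grad_b = sum(p - y for p, y in zip(ps, ys)) / len(xs)
    w -= lr * grad_w
    b -= lr * grad_b
final_loss = log_loss(ys, [sigmoid(w * x + b) for x in xs])
```

Because the loss is convex, the starting point does not change which minimum is reached here; it only affects how many steps are needed.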
Reading the weights
- A positive weight pushes the probability toward class one as its feature grows.
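- The weights act on the log odds: the linear score equals log(p / (1 - p)), the logarithm of the odds of class one.
- Exponentiating a weight gives the odds ratio: the factor by which the odds multiply for a one-unit increase in that feature, with the other features held fixed.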
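A small numeric check of the points above, assuming a single feature with a hypothetical weight of 0.7 and bias of -1.0: raising the feature by one unit should multiply the odds by exp(0.7), roughly 2.01, whatever the starting value of the feature.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of class one: p / (1 - p)."""
    return p / (1.0 - p)

# Hypothetical weight and bias for a one-feature model.
w, b = 0.7, -1.0

p_before = sigmoid(w * 2.0 + b)  # feature value 2
p_after = sigmoid(w * 3.0 + b)   # feature value 3, one unit higher

log_odds_before = math.log(odds(p_before))   # recovers the linear score w * 2 + b
ratio = odds(p_after) / odds(p_before)       # matches math.exp(w)
```

The probability itself does not change by a fixed amount per unit; only the odds scale by a constant factor, which is why weights are read on the log odds scale.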
Why it is a workhorse
- Fast to train, usually well calibrated, and interpretable.
- A strong baseline before reaching for complex models.
- Extends to more than two classes with the softmax generalization (multinomial logistic regression).
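The softmax generalization mentioned in the last point can be sketched as follows. Each class gets its own linear score, and softmax turns the scores into probabilities that sum to one. The score values are hypothetical, and subtracting the maximum is a common numerical stability trick rather than part of the definition.

```python
import math

def softmax(scores):
    """Convert a list of real-valued class scores into probabilities."""
    m = max(scores)  # shift by the maximum so exp never overflows
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for three classes.
probs = softmax([2.0, 1.0, 0.1])
predicted_class = probs.index(max(probs))

# With two classes and scores [z, 0], softmax reduces to the sigmoid of z.
two_class = softmax([1.2, 0.0])
sigmoid_of_score = 1.0 / (1.0 + math.exp(-1.2))
```

The two-class case shows why this is a generalization: fixing one class score at zero recovers the binary model.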
Key idea
Logistic regression maps a linear score through the sigmoid to a probability, trained with convex cross entropy loss. Its weights describe effects on the log odds.