What it is
Logistic regression is a classifier, despite its name. It computes a linear score from the features (a weighted sum plus a bias), then squashes that score into a probability between zero and one using the sigmoid function.
From score to probability
- The linear part produces a number called the logit that can range over all real values.
- The sigmoid maps large positive logits to probabilities near one and large negative logits to probabilities near zero. A logit of zero maps to exactly one half.
- A threshold, often one half, turns the probability into a class label.
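The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names (sigmoid, predict_proba, predict_label) and the example weights are assumptions chosen for this note.

```python
import math

def sigmoid(z):
    """Map a real-valued logit to a probability between 0 and 1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def predict_proba(weights, bias, features):
    """Linear score (the logit), then sigmoid."""
    logit = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(logit)

def predict_label(weights, bias, features, threshold=0.5):
    """Turn the probability into a class label with a threshold."""
    return 1 if predict_proba(weights, bias, features) >= threshold else 0

# Hypothetical weights and features, for illustration only.
weights, bias = [1.0, -2.0], 0.5
features = [2.0, 0.5]          # logit = 0.5 + 2.0 - 1.0 = 1.5
print(predict_proba(weights, bias, features))
print(predict_label(weights, bias, features))
```

Raising the threshold above one half makes the classifier more reluctant to predict the positive class, and lowering it has the opposite effect.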
The loss
We do not use squared error here. Instead we minimize log loss, also called cross entropy. It heavily penalizes a confident prediction that turns out wrong.
- Predicting probability near zero for a true positive gives a huge loss.
- The loss is convex in the weights, so any local minimum is a global minimum and gradient-based optimization is reliable.
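To see how sharply log loss punishes a confident mistake, here is a small sketch for binary labels. The function name log_loss and the clipping constant eps are assumptions for this note. Clipping is a common guard against taking the log of zero, not part of the definition.

```python
import math

def log_loss(y_true, p_pred, eps=1e-15):
    """Average binary cross entropy over a list of examples."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        # Clip so that log(0) never occurs (an assumed safeguard).
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# One true positive, scored with increasingly wrong probabilities.
for p in (0.99, 0.5, 0.01):
    print(p, log_loss([1], [p]))
```

The loss for a true positive is the negative log of the predicted probability, so it grows without bound as that probability approaches zero.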
Interpreting weights
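Each weight shifts the log odds of the positive class: a one-unit increase in a feature adds its weight to the log odds, holding the other features fixed. A positive weight means that increasing the feature raises the chance of the positive label, and a negative weight lowers it. This makes logistic regression easy to explain to non-experts.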
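Because a weight acts on the log odds, exponentiating it gives the factor by which the odds are multiplied. The sketch below shows that conversion. The helper names odds_ratio and shift_probability are hypothetical, introduced only for this illustration.

```python
import math

def odds_ratio(weight, delta=1.0):
    """Factor by which the odds change when a feature increases by delta."""
    return math.exp(weight * delta)

def shift_probability(p, weight, delta=1.0):
    """New probability after the feature increases by delta, starting from p."""
    odds = p / (1.0 - p)                 # convert probability to odds
    new_odds = odds * odds_ratio(weight, delta)
    return new_odds / (1.0 + new_odds)   # convert odds back to probability

# A weight of log(2) doubles the odds for each one-unit increase.
w = math.log(2)
print(odds_ratio(w))
print(shift_probability(0.5, w))   # odds go from 1 to 2
```

The same weight moves the probability by different amounts depending on the starting point, which is why the effect is usually stated in terms of odds.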
Key idea
Logistic regression squashes a linear score through a sigmoid into a probability, trained with convex log loss for stable classification.