What it is
A support vector machine finds the boundary that separates two classes with the largest margin, the widest gap between the boundary and the nearest points of each class.
Why margin matters
A wide margin tends to generalize better. A boundary jammed against the data is fragile, while one centered in a gap leaves room for new points.
- The closest points that touch the margin are the support vectors.
- Only these points define the boundary. Removing far away points changes nothing.
- This makes the solution sparse and stable.
Hard versus soft margin
Real data overlaps, so a perfectly clean split is rare.
- A hard margin demands zero errors and fails on overlapping data.
- A soft margin allows some violations, controlled by a penalty knob.
- A larger penalty pushes toward fewer errors but a narrower, riskier margin.
The loss
SVMs minimize hinge loss plus a regularization term. Hinge loss is zero for points safely on the correct side and grows linearly for points inside the margin or misclassified.
Key idea
A support vector machine maximizes the margin between classes, defined only by the support vectors, with a soft margin to tolerate overlap.