The maximum margin idea
A support vector machine picks the separating hyperplane with the largest margin, the widest possible gap to the nearest points of each class. A wide margin tends to generalize better than a boundary that hugs the data.
Support vectors
- Only the closest points, the support vectors, define the boundary.
- Points far from the margin can move without changing the solution.
- This makes the model compact and memory efficient at prediction time.
Soft margins
Real data overlaps, so a hard separation is impossible. A soft margin allows some points inside or across the margin, controlled by a penalty often called C.
- Large C punishes violations hard, a narrow margin that may overfit.
- Small C tolerates violations, a wide margin that may underfit.
Strengths and limits
- Effective in high dimensions and with clear margins.
- Pairs with kernels for nonlinear boundaries.
- Slower to train on very large datasets and sensitive to C.
Key idea
A support vector machine maximizes the margin between classes, with only support vectors shaping the boundary. The soft margin penalty C trades violations against margin width.