A linear separator with a twist
A support vector machine finds the hyperplane that maximizes the margin between classes. On data that is not linearly separable, the kernel trick lets it draw curved boundaries.
The kernel trick
In its dual form, the SVM optimization depends on the data only through inner products between pairs of points. A kernel function returns the inner product those points would have in a high-dimensional feature space, without ever mapping them there explicitly. This is what makes nonlinear boundaries affordable.
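A small numeric check can make the trick concrete. The sketch below assumes 2-D inputs and a degree-2 polynomial kernel without a constant term; the helper names `dot`, `phi`, and `poly2_kernel` are illustrative, not from any library. It compares the inner product taken after an explicit feature map with the kernel value computed directly in the input space.

```python
import math

def dot(u, v):
    """Plain inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def phi(x):
    """Explicit degree-2 feature map for a 2-D point (x1, x2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def poly2_kernel(x, z):
    """Kernel that yields the same inner product without calling phi."""
    return dot(x, z) ** 2

x, z = (1.0, 2.0), (3.0, -1.0)

explicit = dot(phi(x), phi(z))  # map to 3-D first, then take the inner product
implicit = poly2_kernel(x, z)   # stays in the original 2-D space

# The two agree up to floating-point rounding.
print(explicit, implicit, math.isclose(explicit, implicit))
```

Here the explicit map has only three coordinates, so the saving is small. For higher degrees or for the RBF kernel the feature space grows very large or infinite, while the kernel evaluation stays a cheap function of the original vectors.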
Common kernels
- Linear: the plain inner product. Best when features are already informative or very high-dimensional.
- Polynomial: captures feature interactions up to a chosen degree.
- RBF (Gaussian): the workhorse, with a gamma parameter that sets how far each point's influence reaches.
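The three kernels above can be written in a few lines each. This is a minimal sketch in plain Python using one common parameterization; the function names and the `degree`, `coef0`, and `gamma` defaults are assumptions chosen for illustration, and libraries may scale these terms differently.

```python
import math

def linear_kernel(x, z):
    """Plain inner product."""
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=3, coef0=1.0):
    """(x . z + coef0) ** degree: interactions up to the given degree."""
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """exp(-gamma * ||x - z||^2): similarity that decays with distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

# How gamma changes the reach of a point: same pair, three gamma values.
x, z = (0.0, 0.0), (1.0, 1.0)
for gamma in (0.1, 1.0, 10.0):
    print(gamma, rbf_kernel(x, z, gamma))
```

The printed values shrink as gamma grows: with a large gamma, a point one unit away in each coordinate already contributes almost nothing, which is the "reach" the RBF bullet describes.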
Tuning the RBF kernel
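- C trades margin width against training errors. A large C penalizes misclassified points heavily and fits the training data tightly; a small C allows more slack and a wider margin.
- Gamma sets the reach of each support vector. A large gamma shrinks that reach and produces tight, wiggly boundaries that can overfit; a small gamma produces smoother boundaries that can underfit.
- The two interact, so tune them together with a grid or random search rather than one at a time.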
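A joint search over C and gamma might look like the following sketch, which assumes scikit-learn is installed. The toy dataset, the candidate values, the 5-fold cross-validation, and the feature-scaling step are all illustrative choices, not recommendations from the text.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Toy two-class data that is not linearly separable (illustrative only).
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# The RBF kernel is distance-based, so features are scaled first (an assumption
# of this sketch; the pipeline keeps scaling inside each cross-validation fold).
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Log-spaced candidates for both parameters, searched jointly.
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.01, 0.1, 1, 10],
}

search = GridSearchCV(model, param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

Swapping `GridSearchCV` for `RandomizedSearchCV` from the same module gives the random-search variant, which tends to be cheaper when the grid is large.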
Key idea
SVMs maximize the margin and use kernels to compute inner products in a high-dimensional space implicitly, which lets them draw nonlinear boundaries. The RBF kernel is the usual default, tuned through C for margin slack and gamma for reach.