The SVM Kernels Deep

How the kernel trick lets support vector machines draw nonlinear boundaries without explicit feature maps.

A linear separator with a twist

A support vector machine finds the hyperplane that maximizes the margin between classes. On data that is not linearly separable, the kernel trick lets it draw curved boundaries.
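To make the contrast concrete, here is a minimal sketch that fits a linear SVM and an RBF-kernel SVM to two concentric rings of points, a layout no straight line can separate. It assumes scikit-learn is installed; the dataset (make_circles), its noise level, and the train/test split are illustrative choices, not part of the original discussion.

```python
from sklearn.datasets import make_circles
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two concentric rings: an assumed toy dataset that is not linearly separable.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Same estimator, two kernels: a straight-line boundary versus a curved one.
linear_acc = SVC(kernel="linear").fit(X_train, y_train).score(X_test, y_test)
rbf_acc = SVC(kernel="rbf").fit(X_train, y_train).score(X_test, y_test)

print("linear kernel accuracy:", linear_acc)
print("RBF kernel accuracy:   ", rbf_acc)
```

On data like this the linear kernel should hover near chance while the RBF kernel separates the rings almost perfectly. Exact numbers depend on the random seed and noise level.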

The kernel trick

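The SVM optimization depends on the data only through inner products between pairs of points. A kernel function returns the inner product those points would have in a high-dimensional feature space without ever mapping them there explicitly. The cost therefore depends on the number of training points rather than on the size of the feature space, which makes nonlinear boundaries affordable.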
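A small worked case shows the equivalence. The sketch below uses the homogeneous degree-2 polynomial kernel on 2-D points, for which the explicit feature map is small enough to write out by hand. The function names (poly2_kernel, phi, explicit_inner_product) and the sample points are illustrative assumptions.

```python
import math

def poly2_kernel(x, z):
    """Kernel route: square the ordinary inner product, (x . z)^2."""
    dot = sum(a * b for a, b in zip(x, z))
    return dot ** 2

def phi(x):
    """Explicit feature map for 2-D input that corresponds to poly2_kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)

def explicit_inner_product(x, z):
    """Explicit route: map both points to 3-D, then take the inner product."""
    return sum(a * b for a, b in zip(phi(x), phi(z)))

x = (1.0, 2.0)
z = (3.0, -1.0)

k_implicit = poly2_kernel(x, z)
k_explicit = explicit_inner_product(x, z)
print(k_implicit, k_explicit)  # the two values agree up to floating-point rounding
```

Here the feature space has only three dimensions, so both routes are cheap. For higher degrees or for the RBF kernel the explicit map becomes enormous or infinite-dimensional, while the kernel evaluation stays a short computation on the original coordinates.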

Common kernels

Linear: the plain inner product, best when features are already informative or very high-dimensional.
Polynomial: captures feature interactions up to a chosen degree.
RBF (Gaussian): the workhorse, with a gamma parameter that sets how far each point's influence reaches.
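The three kernels above can be written in a few lines each. This is a sketch in plain Python rather than a library implementation. The polynomial form (gamma * x.z + coef0)^degree and the parameter names gamma, coef0, and degree follow a common convention and are assumptions here, since the text does not fix a parameterization.

```python
import math

def dot(x, z):
    return sum(a * b for a, b in zip(x, z))

def linear_kernel(x, z):
    """Plain inner product."""
    return dot(x, z)

def polynomial_kernel(x, z, degree=3, gamma=1.0, coef0=1.0):
    """(gamma * x.z + coef0) ** degree: interactions up to `degree`."""
    return (gamma * dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=1.0):
    """exp(-gamma * ||x - z||^2): similarity decays with squared distance."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

x = (1.0, 0.0)
z = (0.0, 1.0)

# Larger gamma makes the same pair of points look less similar,
# i.e. each point's influence reaches a shorter distance.
for g in (0.1, 1.0, 10.0):
    print(g, rbf_kernel(x, z, gamma=g))
```

The loop at the end shows the role of gamma: for a fixed pair of points, the RBF value shrinks as gamma grows.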

Tuning the RBF kernel

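C trades margin width against training errors. Large C penalizes training errors heavily and fits tightly; small C allows more slack and a wider margin.
Gamma sets the reach of each support vector. Large gamma shortens that reach, producing tight, wiggly boundaries that can overfit; small gamma gives smoother boundaries that can underfit.
C and gamma interact, so tune them together with a grid or random search.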
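A joint search over C and gamma might look like the sketch below. It assumes scikit-learn is installed. The dataset (make_moons), the grid values, the 5-fold cross-validation, and the feature-scaling step are all illustrative choices rather than recommendations from the text.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumed toy dataset: two interleaved half-moons with some noise.
X, y = make_moons(n_samples=300, noise=0.25, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Scale features first, since RBF distances are sensitive to feature scale.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Search C and gamma jointly on a coarse logarithmic grid (assumed values).
param_grid = {
    "svc__C": [0.1, 1, 10, 100],
    "svc__gamma": [0.01, 0.1, 1, 10],
}
search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, test_accuracy)
```

Every (C, gamma) pair in the grid is scored by cross-validation on the training split only, and the held-out test split is used once at the end. For larger grids, RandomizedSearchCV from the same module samples combinations instead of trying all of them.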

Key idea

SVMs maximize the margin and use kernels to compute inner products in a high-dimensional space implicitly, which lets them draw nonlinear boundaries. The RBF kernel is the usual default, tuned through C for margin slack and gamma for reach.
