Cross Validation K Fold

K fold cross validation estimates how well a model generalizes by splitting the data into k roughly equal parts, called folds, and rotating which fold is held out for validation.

The procedure

Split the training data into k folds of roughly equal size.
For each fold, train on the other k minus one folds and validate on the held out fold.
Average the k validation scores to get one performance estimate. The spread of the k scores shows how stable that estimate is.
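
The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names k_fold_indices and cross_validate are made up for this example, the "model" is just a baseline that predicts the mean of its training targets, and the score is mean squared error. It also assumes the data has already been shuffled, since the folds are contiguous slices.

```python
from statistics import mean

def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)  # the first `extra` folds get one more example
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

def cross_validate(y, k):
    """Score a mean-predictor baseline; return the per-fold MSEs and their average."""
    fold_scores = []
    for train_idx, val_idx in k_fold_indices(len(y), k):
        prediction = mean(y[i] for i in train_idx)              # "train" on k minus one folds
        mse = mean((y[i] - prediction) ** 2 for i in val_idx)   # validate on the held out fold
        fold_scores.append(mse)
    return fold_scores, mean(fold_scores)

targets = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0]
scores, average = cross_validate(targets, k=3)
print(scores, average)
```

With these six toy targets and k = 3, the per-fold errors are 37.0, 1.0 and 37.0, and the averaged estimate is 25.0. A real model would replace the mean predictor inside the loop, and the rest of the structure would stay the same.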

Why it helps

A single split can be lucky or unlucky. Averaging k results gives a lower variance estimate than any one train and validation split.
Every example is used for validation exactly once and for training k minus one times.
It uses data efficiently, which matters most on small datasets.
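
The counting claim can be checked directly. The sketch below assumes a simple round robin fold assignment (example i goes to fold i mod k), which is just one possible way to form folds. It tallies how often each example lands in the validation and training roles.

```python
from collections import Counter

n_samples, k = 10, 5
fold_of = [i % k for i in range(n_samples)]  # assumed round robin fold assignment

val_counts, train_counts = Counter(), Counter()
for held_out in range(k):
    for i in range(n_samples):
        if fold_of[i] == held_out:
            val_counts[i] += 1    # example i is validated in this round
        else:
            train_counts[i] += 1  # example i is trained on in this round

# Each example is validated exactly once and trained on k minus one times.
assert all(val_counts[i] == 1 for i in range(n_samples))
assert all(train_counts[i] == k - 1 for i in range(n_samples))
```

The same counts hold for any assignment that places each example in exactly one fold, because every fold is held out exactly once.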

Practical tips

Common choices are 5 or 10 folds. More folds mean more training runs, since the model is fit k times, in exchange for a larger training set in each run.
Use stratified folds for classification so each fold keeps roughly the same class ratios as the full dataset.
Keep a separate test set untouched. Cross validation guides model and hyperparameter choices, and the test set gives the final unbiased number.
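
Stratification can be sketched by dealing the examples of each class across the folds in turn. The function name stratified_folds, the seed parameter and the 80/20 toy labels are assumptions made for illustration. Libraries such as scikit-learn provide a tested equivalent, so treat this only as a picture of the idea.

```python
import random
from collections import Counter, defaultdict

def stratified_folds(labels, k, seed=0):
    """Assign indices to k folds so each fold roughly mirrors the overall class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)                  # randomize order within each class
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)   # deal this class round robin across folds
    return folds

labels = ["neg"] * 40 + ["pos"] * 10          # toy data with an 80/20 class imbalance
folds = stratified_folds(labels, k=5)
fold_class_counts = [Counter(labels[i] for i in fold) for fold in folds]
print(fold_class_counts)
```

Here every fold receives 8 "neg" and 2 "pos" examples, matching the 80/20 ratio of the whole set. When class counts do not divide evenly by k, the folds differ by at most one example per class. In practice the folds would be built only from the non-test portion of the data, so the held out test set stays untouched.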

Key idea

K fold cross validation rotates the validation slice across k folds and averages the scores, giving a lower variance, data efficient estimate of generalization.
