The problem
You need to choose hyperparameters such as tree depth or regularization strength. If you pick them by test set performance, information about the test set leaks into model selection, and the test score you report becomes optimistically biased.
K-fold cross-validation
K-fold splits the training data into k roughly equal parts, called folds.
- Train on k - 1 folds and validate on the held-out fold.
- Rotate so every fold is the validation set exactly once.
- Average the k scores for an estimate that is more stable than a single split, since every sample is used for both training and validation.
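The steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the helper names (k_fold_indices, kfold_scores, fit_mean, neg_mse) and the toy data are assumptions made up for this example, and the "model" is just the mean of the training targets.

```python
import random
from statistics import mean

def k_fold_indices(n_samples, k, seed=0):
    """Shuffle indices 0..n_samples-1 and split them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds

def kfold_scores(fit, score, X, y, k=5, seed=0):
    """Return one validation score per fold; each fold is held out exactly once."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, val_idx in enumerate(folds):
        # Train on the other k - 1 folds.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        model = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        scores.append(score(model, [X[j] for j in val_idx], [y[j] for j in val_idx]))
    return scores

# Hypothetical toy model: predict the training mean for every input.
def fit_mean(X_train, y_train):
    return mean(y_train)

# Negative mean squared error, so that higher is better.
def neg_mse(model, X_val, y_val):
    return -mean((yi - model) ** 2 for yi in y_val)

X = list(range(20))
y = [float(v) for v in X]
scores = kfold_scores(fit_mean, neg_mse, X, y, k=5)
average_score = mean(scores)  # the cross-validated estimate
```

Shuffling before splitting assumes the samples are exchangeable; for ordered data such as time series, a different split is needed.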
Tuning safely
- Run cross-validation on the training data for each candidate hyperparameter setting in the search.
- Pick the setting with the best average validation score.
- Keep a separate untouched test set for the final honest estimate.
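As a rough sketch of this workflow, the example below tunes the penalty of a one-dimensional ridge regression with a small grid search. Everything here is an assumption for illustration: the synthetic data, the grid values, the helper names (fit_ridge_1d, mse, cv_mse), and the interleaved fold assignment. Refitting on the full training set before the final test score is a common practice rather than something the notes above prescribe.

```python
import random
from statistics import mean

def fit_ridge_1d(xs, ys, lam):
    """Closed-form 1-D ridge without intercept: w = sum(x*y) / (sum(x*x) + lam)."""
    return sum(x * y for x, y in zip(xs, ys)) / (sum(x * x for x in xs) + lam)

def mse(w, xs, ys):
    """Mean squared error of the prediction w * x."""
    return mean((y - w * x) ** 2 for x, y in zip(xs, ys))

def cv_mse(xs, ys, lam, k=5):
    """Average validation MSE over k folds; fold i holds out samples with index % k == i."""
    scores = []
    for i in range(k):
        train = [j for j in range(len(xs)) if j % k != i]
        val = [j for j in range(len(xs)) if j % k == i]
        w = fit_ridge_1d([xs[j] for j in train], [ys[j] for j in train], lam)
        scores.append(mse(w, [xs[j] for j in val], [ys[j] for j in val]))
    return mean(scores)

# Synthetic data (illustrative assumption): y = 2x + Gaussian noise.
rng = random.Random(0)
xs = [rng.uniform(-1, 1) for _ in range(100)]
ys = [2 * x + rng.gauss(0, 0.3) for x in xs]

# Set the test set aside first; the search below never touches it.
x_train, y_train = xs[:80], ys[:80]
x_test, y_test = xs[80:], ys[80:]

# Cross-validate every candidate setting on the training data only.
grid = [0.0, 0.1, 1.0, 10.0, 100.0]
cv_results = {lam: cv_mse(x_train, y_train, lam) for lam in grid}
best_lam = min(cv_results, key=cv_results.get)  # lowest average validation MSE

# Refit with the chosen setting on all training data, then score once on the test set.
final_w = fit_ridge_1d(x_train, y_train, best_lam)
test_mse = mse(final_w, x_test, y_test)
```

The test set is used exactly once, after the setting is fixed, so test_mse is not influenced by the search.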
Variants
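- Stratified folds keep each class's proportion roughly the same in every fold, which matters for imbalanced classification.
- For time series, use forward chaining (train on earlier data, validate on later data) so you never train on the future.
- Nested cross-validation runs the hyperparameter search in an inner loop and scores the tuned model in an outer loop, so tuning does not bias the performance estimate.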
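The first two variants differ from plain k-fold only in how the indices are split, which a short sketch can show. The function names (stratified_folds, forward_chaining_splits) and the round-robin and equal-block schemes are assumptions chosen for brevity; libraries may split differently. Nested cross-validation is not shown here.

```python
from collections import defaultdict

def stratified_folds(labels, k):
    """Assign sample indices to k folds so each fold gets a near-equal share of every class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        # Deal each class's samples round-robin across the folds.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds

def forward_chaining_splits(n_samples, n_splits):
    """Yield (train, val) index lists in which validation always follows training in time.

    Samples are assumed to be in time order. Any remainder after dividing into
    n_splits + 1 equal blocks is left unused in this sketch.
    """
    block = n_samples // (n_splits + 1)
    for i in range(1, n_splits + 1):
        train = list(range(0, i * block))              # everything seen so far
        val = list(range(i * block, (i + 1) * block))  # the next block in time
        yield train, val

# Six samples of class "a" and three of class "b", split into three folds.
labels = ["a"] * 6 + ["b"] * 3
class_folds = stratified_folds(labels, 3)
# Each fold ends up with two "a" samples and one "b" sample.

time_splits = list(forward_chaining_splits(10, 4))
# First split trains on [0, 1] and validates on [2, 3]; the training window then grows.
```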
Key idea
Cross-validation rotates the validation fold through the training data to estimate generalization, so you can tune hyperparameters while a separate test set stays untouched for the final estimate.