The search problem
Models have hyperparameters: settings such as learning rate or tree depth that you fix before training and that strongly affect quality. Tuning them means trying combinations and scoring each with cross-validation. Grid search and random search are the two simplest strategies.
Grid search
Grid search defines a fixed set of values per hyperparameter and tries every combination. Its trade-offs:
- Thorough and easy to reason about
- The number of runs grows exponentially with the number of hyperparameters (k values for each of d hyperparameters means k^d runs), a curse of dimensionality
- Effort is wasted on parameters that barely affect the result
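As a rough illustration of the procedure described above, here is a minimal sketch of grid search using only the Python standard library. The grid values, the parameter names, and `toy_score` are all hypothetical; `toy_score` stands in for a real cross-validated score and is built so that `subsample` barely affects the result.

```python
import itertools
import math

# Hypothetical grid: names and values are illustrative only.
grid = {
    "learning_rate": [0.01, 0.1, 1.0],
    "max_depth": [2, 4, 8],
    "subsample": [0.5, 1.0],
}

def toy_score(params):
    """Assumed stand-in for a cross-validated score (higher is better)."""
    return (
        -abs(math.log10(params["learning_rate"]) + 1.0)  # best near 0.1
        - 0.1 * abs(params["max_depth"] - 4)             # best at depth 4
        + 0.001 * params["subsample"]                    # barely matters
    )

def grid_search(grid, score_fn):
    """Score every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    n_runs = 0
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        n_runs += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_runs

best_params, best_score, n_runs = grid_search(grid, toy_score)
print(n_runs)       # 3 * 3 * 2 = 18 runs
print(best_params)
```

Adding a fourth hyperparameter with three values would triple the run count to 54, and half of the 18 runs here differ only in `subsample`, which this toy score hardly rewards.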
Random search
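Random search instead samples combinations at random from chosen ranges. Surprisingly, for a fixed budget it usually finds better settings, because most of the gain comes from a few important parameters and random sampling explores more distinct values of each one. For example, with two hyperparameters and a budget of 9 runs, a 3-by-3 grid tests only 3 distinct values of each parameter, whereas 9 random draws test 9 distinct values of each.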
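The sampling idea can be sketched in a few lines of standard-library Python. Everything named here is an assumption made for illustration: the ranges, the log-uniform choice for the learning rate, the fixed seed, and `toy_score`, which again stands in for a real cross-validated score.

```python
import math
import random

def toy_score(params):
    """Assumed stand-in for a cross-validated score (higher is better)."""
    return (
        -abs(math.log10(params["learning_rate"]) + 1.0)  # best near 0.1
        - 0.1 * abs(params["max_depth"] - 4)             # best at depth 4
        + 0.001 * params["subsample"]                    # barely matters
    )

def sample_params(rng):
    """Draw one random combination from hypothetical ranges."""
    return {
        # Log-uniform between 0.001 and 1, a common choice for rates.
        "learning_rate": 10 ** rng.uniform(-3, 0),
        "max_depth": rng.randint(2, 8),  # inclusive on both ends
        "subsample": rng.uniform(0.5, 1.0),
    }

def random_search(score_fn, n_trials, seed=0):
    """Score n_trials random combinations and return the best one."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    trials = []
    for _ in range(n_trials):
        params = sample_params(rng)
        trials.append((score_fn(params), params))
    best_score, best_params = max(trials, key=lambda t: t[0])
    return best_params, best_score, trials

best_params, best_score, trials = random_search(toy_score, n_trials=18)
distinct_rates = {params["learning_rate"] for _, params in trials}
print(len(trials), len(distinct_rates))
print(best_params)
```

With the same budget of 18 runs, each trial draws a fresh learning rate, so the important parameter is probed at many more points than a grid with three values per axis would allow.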
Beyond the basics
When each run is expensive, Bayesian optimization uses past results to choose the next trial, and it typically reaches good settings in fewer trials than grid or random search.
Key idea
Grid search tries every combination and scales badly, while random search samples the space and usually finds better hyperparameters for the same budget.