Hyperparameter Tuning: Grid Search
Grid search tunes hyperparameters by defining a finite set of candidate values for each one and evaluating every combination. It is simple, exhaustive, and easy to parallelize.
How it works
- Choose a list of candidate values for each hyperparameter.
- Form the Cartesian product, giving every possible combination.
- Evaluate each combination with cross-validation and keep the one with the best validation score.
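The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: `grid_search`, `toy_score`, and the hyperparameter names `learning_rate` and `max_depth` are hypothetical, and `toy_score` is only a stand-in for a real cross-validated score.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate every combination in param_grid and return the best one.

    param_grid: dict mapping hyperparameter name -> list of candidate values.
    score_fn:   callable that takes a dict of hyperparameters and returns a
                score where higher is better (for example, a mean
                cross-validation score).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    # Cartesian product of the candidate lists: one tuple per combination.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for a cross-validated score.
# By construction it peaks at learning_rate=0.1, max_depth=4.
def toy_score(params):
    return -abs(params["learning_rate"] - 0.1) - 0.1 * abs(params["max_depth"] - 4)


grid = {"learning_rate": [0.01, 0.1, 1.0], "max_depth": [2, 4, 8]}
best_params, best_score = grid_search(grid, toy_score)
print(best_params)
```

Each call to `score_fn` depends only on its own `params`, which is why the loop body can be distributed across workers without coordination.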
Strengths
- Exhaustive within the grid, so it cannot miss a listed combination.
- Fully parallel, since each combination is independent.
- Transparent and easy to reason about.
Weaknesses
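- The number of combinations grows exponentially with the number of hyperparameters (the curse of dimensionality): k candidate values for each of d hyperparameters gives k^d combinations.
- It wastes effort on unimportant hyperparameters: every extra candidate value for a hyperparameter that barely affects the score multiplies the number of runs without improving the result.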
- Resolution is limited to the grid points you chose in advance.
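To make the exponential growth concrete, the short sketch below counts combinations for grids with 5 candidate values per hyperparameter. The helper name `grid_size` and the choice of 5 values are assumptions made for illustration only.

```python
from math import prod


def grid_size(values_per_param):
    """Number of combinations in a grid, given the candidate count per hyperparameter."""
    return prod(values_per_param)


# Assume 5 candidate values for each hyperparameter and vary how many there are.
sizes = {d: grid_size([5] * d) for d in (1, 2, 4, 8)}
for d, n in sizes.items():
    print(f"{d} hyperparameters -> {n} combinations")
```

Going from 2 to 8 hyperparameters takes the grid from 25 to 390,625 combinations, and each combination costs one full cross-validation run.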
For more than a couple of hyperparameters, random search or Bayesian optimization usually finds good values with less compute.
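One way to see why random search can be more economical is to compare how many distinct values of a single hyperparameter each method tries under the same budget. The sketch below is illustrative only: the budget of 9 runs, the sampling ranges, and the hyperparameter names are assumptions, and no model is actually trained.

```python
import random
from itertools import product

budget = 9  # total number of evaluations allowed for either method

# Grid search: 3 x 3 grid over (learning_rate, max_depth).
grid_points = list(product([0.01, 0.1, 1.0], [2, 4, 8]))

# Random search: draw each hyperparameter independently for every trial.
rng = random.Random(0)  # seeded only so the sketch is repeatable
random_points = [
    (10 ** rng.uniform(-2, 0), rng.randint(2, 8))  # log-uniform learning rate
    for _ in range(budget)
]

# Count the distinct learning rates each method actually tried.
grid_distinct_lr = len({lr for lr, _ in grid_points})
random_distinct_lr = len({lr for lr, _ in random_points})
print(grid_distinct_lr, random_distinct_lr)
```

With the same 9 runs, the grid probes only 3 learning rates, while random sampling probes a different one on every run. If the learning rate is the hyperparameter that matters and the depth is not, the random runs cover it at a finer resolution.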
Key idea
Grid search exhaustively evaluates every combination on a predefined grid, which is simple and parallel but scales exponentially and wastes effort on unimportant hyperparameters.