Principle: Scikit-learn Grid Search
Overview
A model selection strategy that evaluates every combination of hyperparameters using cross-validation to find the optimal configuration.
Description
Exhaustive Search with Cross-Validation
Grid search with cross-validation (GridSearchCV) is the canonical approach to hyperparameter tuning in scikit-learn. Given a parameter grid that specifies a finite set of candidate values for each hyperparameter, the algorithm evaluates every combination from the Cartesian product of those value sets. For each combination, the estimator is trained and scored using k-fold cross-validation, producing a mean score and standard deviation across folds.
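The size of the Cartesian product can be checked directly with scikit-learn's ParameterGrid helper; the grid values below are illustrative:

```python
from sklearn.model_selection import ParameterGrid

# Two values of C times three values of gamma -> 6 combinations.
grid = {"C": [0.1, 1.0], "gamma": [0.01, 0.1, 1.0]}

combinations = list(ParameterGrid(grid))
print(len(combinations))  # 6
```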
The procedure works as follows:
- Define the parameter grid -- specify which hyperparameters to tune and their candidate values.
- Choose a cross-validation strategy -- decide how to split the data (e.g., 5-fold stratified CV for classification).
- For each parameter combination:
  - Clone the base estimator and set the candidate parameters.
  - Split the data into k train/test folds.
  - Fit the estimator on each training fold and score it on the held-out test fold.
  - Record the scores across all folds.
- Select the best configuration -- rank candidates by their mean cross-validated score.
- Refit -- train a final model on the full dataset using the best parameters.
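The steps above can be sketched end-to-end; the dataset, estimator, and grid values here are illustrative choices, not prescriptions:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Step 1: define the parameter grid (2 x 3 = 6 candidate combinations).
param_grid = {"C": [0.1, 1.0], "gamma": [0.01, 0.1, 1.0]}

# Steps 2-5: 5-fold CV over every combination; refit=True (the default)
# then retrains the best configuration on the full dataset.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_)   # configuration with the highest mean CV score
print(search.best_score_)    # its mean score across the 5 folds
```

After fitting, `search.cv_results_` holds the per-fold scores for every candidate, and `search.predict` delegates to the refitted best estimator.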
Why Grid Search Works
Grid search is effective because it provides exhaustive coverage of the specified parameter space. When the grid is well-designed (i.e., the true optimum lies within or near the specified values), grid search is guaranteed to find the best configuration among the candidates. The cross-validation wrapper ensures that the selected configuration generalizes well by using held-out data for evaluation rather than training data.
When Grid Search is Feasible vs. When Random Search is Better
Grid search is practical when:
- The number of hyperparameters is small (typically 1-3).
- Each parameter has a small number of candidate values.
- The estimator is fast to fit.
Grid search becomes impractical when:
- The parameter space is high-dimensional -- the number of combinations grows exponentially.
- Parameters are continuous -- discretizing them into a grid requires arbitrary choices about granularity.
- Computation is expensive -- each combination requires k full fit-score cycles.
In such cases, RandomizedSearchCV is preferred because it samples a fixed number of configurations regardless of dimensionality, and it can draw from continuous distributions without discretization.
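A minimal sketch of the randomized alternative, using continuous log-uniform distributions from scipy (the ranges below are illustrative assumptions):

```python
from scipy.stats import loguniform
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Continuous distributions: no need to discretize into a grid.
param_distributions = {
    "C": loguniform(1e-2, 1e2),
    "gamma": loguniform(1e-3, 1e1),
}

# n_iter fixes the budget at 10 candidates x 5 folds = 50 fits,
# regardless of how many parameters are being searched.
search = RandomizedSearchCV(SVC(), param_distributions,
                            n_iter=10, cv=5, random_state=0)
search.fit(X, y)
print(search.best_params_)
```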
Theoretical Basis
Nested Resampling
Grid search with cross-validation implements nested resampling: the outer loop iterates over parameter candidates while the inner loop (cross-validation) iterates over data splits. Evaluating each candidate on held-out folds rather than on its own training data avoids the gross optimism of training-set scores, and the mean cross-validated score is an approximately unbiased estimate of each configuration's generalization performance. Note, however, that the score of the selected winner is still mildly optimistic, precisely because it was chosen as the maximum over candidates; an additional outer cross-validation loop (nested cross-validation) is required for an unbiased estimate of the tuned pipeline as a whole.
Cross-Validation Within Search
The cross-validation strategy determines how training data is partitioned. scikit-learn defaults to 5-fold cross-validation. For classifiers (with binary or multiclass targets), StratifiedKFold is used automatically to preserve class proportions in each fold, which matters especially when classes are imbalanced. The choice of CV strategy affects both the bias-variance tradeoff of performance estimates and the computational cost of the search.
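Passing an explicit splitter via the `cv` parameter makes the strategy visible and configurable; the estimator and grid below are illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold

X, y = load_iris(return_X_y=True)
param_grid = {"C": [0.1, 1.0, 10.0]}

# cv=5 on a classifier resolves to StratifiedKFold(5) automatically;
# an explicit splitter also lets you control shuffling and the seed.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(LogisticRegression(max_iter=1000), param_grid, cv=cv)
search.fit(X, y)
print(search.best_params_)
```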
Refit Strategy
After identifying the best parameter combination, the refit step trains a final estimator on the entire dataset (not just the training folds). This is important because:
- Cross-validation uses only (k-1)/k of the data for training in each fold.
- The final model benefits from seeing all available training data.
- The refitted model is what gets deployed or used for predictions.
The refit behavior can be customized: it can be disabled (refit=False), set to a specific metric name for multi-metric evaluation, or set to a callable that implements custom selection logic (e.g., choosing the simplest model within one standard error of the best).
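A callable passed to `refit` receives `cv_results_` and must return the index of the candidate to refit. The one-standard-error heuristic mentioned above might be sketched as follows (the threshold logic is one reasonable interpretation, not scikit-learn's own):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
# Candidates sorted ascending by C, so lower index = simpler model.
param_grid = {"C": [0.01, 0.1, 1.0, 10.0]}

def simplest_within_one_se(cv_results):
    """Return the index of the smallest C whose mean CV score is
    within one standard error of the best mean score."""
    means = cv_results["mean_test_score"]
    best = int(np.argmax(means))
    se = cv_results["std_test_score"][best] / np.sqrt(5)  # 5 folds
    candidates = np.flatnonzero(means >= means[best] - se)
    return int(candidates[0])

search = GridSearchCV(SVC(kernel="linear"), param_grid, cv=5,
                      refit=simplest_within_one_se)
search.fit(X, y)
print(search.best_params_)
```

With a callable refit, `best_estimator_` and `best_params_` reflect the callable's choice; `best_score_` is not defined in that case.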