Hyperparameter tuning is one of the crucial phases of machine learning. ML algorithms do not always give their best accuracy out of the box, so to get the highest accuracy a model can reach, you must tune its hyperparameters.
I’ll talk about Grid Search CV in this post. CV stands for cross-validation. Grid Search CV tests every possible combination of the parameter values you provide and selects the best one.
Take the example below. If you provide a list of values for three hyperparameters, it will try every combination. Here that means 5 × 2 × 2 = 20 hyperparameter combinations. Adding one more hyperparameter multiplies the number of combinations by the number of values you list for it, so the time required grows very quickly. Choose carefully and tune only the most important hyperparameters.
# Parameters to try
Parameter_Trials = {'n_estimators': [100, 200, 300, 500, 1000],
                    'criterion': ['gini', 'entropy'],
                    'max_depth': [2, 3]}
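To make the count concrete, here is a minimal sketch that enumerates the grid with the standard library only. The helper `count_combinations` and the extra `min_samples_leaf` entry are hypothetical, added purely to show how one more hyperparameter multiplies the grid size.

```python
from itertools import product

# Same grid as in the post, written with plain ASCII quotes.
parameter_trials = {
    "n_estimators": [100, 200, 300, 500, 1000],
    "criterion": ["gini", "entropy"],
    "max_depth": [2, 3],
}


def count_combinations(grid):
    """Size of the grid: the product of the lengths of the value lists."""
    total = 1
    for values in grid.values():
        total *= len(values)
    return total


# Enumerate every combination explicitly, one dict per combination.
names = list(parameter_trials)
combinations = [
    dict(zip(names, values)) for values in product(*parameter_trials.values())
]

print(count_combinations(parameter_trials))  # 5 * 2 * 2 = 20
print(combinations[0])

# Hypothetical fourth hyperparameter with three candidate values:
# the grid grows by a factor of three, from 20 to 60 combinations.
bigger_grid = dict(parameter_trials, min_samples_leaf=[1, 5, 10])
print(count_combinations(bigger_grid))
```

The growth is multiplicative: each new list of n values makes the search n times larger, which is why it pays to keep the grid small.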
GridSearchCV tries every possible combination of these parameter values. There are 20 combinations in this case.
GridSearchCV additionally performs cross-validation for each combination. The ‘cv’ argument sets the number of cross-validation folds.
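As a rough sketch of how this looks with scikit-learn, the snippet below runs a grid search with cv=5. Several things here are assumptions and not from the post: the estimator (a `RandomForestClassifier`, chosen because it accepts these three hyperparameters), the iris toy dataset, and a reduced grid so the example finishes quickly.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Toy data, used only so the sketch runs quickly (assumption).
X, y = load_iris(return_X_y=True)

# Reduced grid (assumption) to keep the run short: 2 * 2 * 2 = 8 combinations.
param_grid = {
    "n_estimators": [10, 50],
    "criterion": ["gini", "entropy"],
    "max_depth": [2, 3],
}

search = GridSearchCV(
    estimator=RandomForestClassifier(random_state=0),
    param_grid=param_grid,
    cv=5,                # 5-fold cross-validation for every combination
    scoring="accuracy",  # score each fold by classification accuracy
)
search.fit(X, y)

print(search.best_params_)                # best combination found
print(search.best_score_)                 # its mean cross-validated accuracy
print(len(search.cv_results_["params"]))  # number of combinations tried
```

Swapping in the full 20-combination grid from the post only requires changing `param_grid`; expect the run to take noticeably longer.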
cv=5 means the data is split into five equal parts: four are used for training and the remaining one for testing. This is also called K-fold cross-validation with K=5. The process is repeated five times, with a different part held out as test data each time. The average of the five scores is the final accuracy.
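The splitting scheme can be sketched in plain Python as follows. This is an illustration only: `k_fold_splits` and `mean_score` are hypothetical helpers, the per-fold scores are made up to show the averaging step, and real libraries may also shuffle or stratify the data before splitting.

```python
def k_fold_splits(n_samples, k=5):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                # the held-out part
        train = indices[:start] + indices[start + size:]  # the other k-1 parts
        yield train, test
        start += size


def mean_score(scores):
    """Average the per-fold scores into a single estimate."""
    return sum(scores) / len(scores)


splits = list(k_fold_splits(10, k=5))
for train, test in splits:
    print("train:", train, "test:", test)

# Made-up per-fold accuracies, only to show the final averaging step.
fold_scores = [0.90, 0.85, 0.95, 0.80, 0.90]
print(mean_score(fold_scores))
```

Every sample lands in the test part exactly once, so each of the five scores is measured on data the model did not see during that round of training.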
For cross-validation, a value between 5 and 10 is a common choice. Keep in mind that the larger the value, the longer the computation takes.
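A quick back-of-the-envelope calculation shows why. The hypothetical helper below assumes one model is trained per combination per fold and ignores any final refit on the full data.

```python
def total_fits(n_combinations, cv):
    """Models trained by an exhaustive grid search: one per combination per fold."""
    return n_combinations * cv


# The 20-combination grid from this post, at two common fold counts.
for cv in (5, 10):
    print(f"cv={cv}: {total_fits(20, cv)} model fits")
```

Going from cv=5 to cv=10 doubles the number of fits for the same grid, from 100 to 200.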