AdaBoost Classifier
# Optimize AdaBoostClassifier
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import make_scorer, fbeta_score, accuracy_score
# Note: base_estimator was renamed to estimator in scikit-learn 1.2
clf = AdaBoostClassifier(base_estimator=DecisionTreeClassifier())
# Provide hyperparameters
h_params = {'n_estimators': [50, 120],
            'learning_rate': [0.1, 0.5, 1.0],
            'base_estimator__min_samples_split': np.arange(2, 8, 2),
            'base_estimator__max_depth': np.arange(1, 4, 1)}
base_estimator defaults to a DecisionTreeClassifier, whose min_samples_split defaults to 2. Nested hyperparameters of the base estimator are addressed with the base_estimator__<param> double-underscore syntax. See DecisionTreeClassifier.
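A quick way to confirm the exact nested parameter names GridSearchCV expects is to list `get_params()` on the ensemble (the keyword names shift across scikit-learn versions, so the tree is passed positionally here as a sketch):

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

# Pass the tree positionally; the keyword was renamed from
# base_estimator to estimator in scikit-learn 1.2.
clf = AdaBoostClassifier(DecisionTreeClassifier())

# Nested hyperparameters appear as <estimator>__<param> keys,
# exactly the names that belong in the grid-search dictionary.
tree_params = sorted(k for k in clf.get_params() if 'min_samples_split' in k)
print(tree_params)
```

Any key printed here can be used directly in the `h_params` dictionary.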
scorer = make_scorer(fbeta_score, beta=0.5)
grid_obj = GridSearchCV(clf, h_params, scoring=scorer)
grid_fit = grid_obj.fit(X_train, y_train)
best_clf = grid_fit.best_estimator_
make_scorer wraps a metric function into a scorer object that GridSearchCV can use. See make_scorer. fbeta_score is the weighted harmonic mean of precision and recall; the beta parameter controls the weighting, with beta < 1 favoring precision and beta > 1 favoring recall. See fbeta_score.
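The effect of beta can be checked directly on a small set of made-up labels (chosen so precision and recall differ):

```python
from sklearn.metrics import fbeta_score, precision_score, recall_score

# Toy labels for illustration only: precision = 2/3, recall = 1/2
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]

print(precision_score(y_true, y_pred))        # 0.666...
print(recall_score(y_true, y_pred))           # 0.5
print(fbeta_score(y_true, y_pred, beta=0.5))  # 0.625  (pulled toward precision)
print(fbeta_score(y_true, y_pred, beta=2.0))  # ~0.526 (pulled toward recall)
```

With beta=0.5, as in the grid search above, the score sits closer to precision than to recall.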
predictions = best_clf.predict(X_test)
print(f'Accuracy: {accuracy_score(y_test, predictions):.4f}')
print(f'F-score: {fbeta_score(y_test, predictions, beta=0.5):.4f}')
print('\nBest Model\n-----')
print(best_clf)
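The pieces above can be run end to end on a synthetic dataset; the make_classification data, the 80/20 split, and the trimmed parameter grid below are assumptions for a self-contained sketch, not part of the original notes:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import make_scorer, fbeta_score, accuracy_score

# Synthetic data standing in for X_train/X_test from the notes
X, y = make_classification(n_samples=400, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# Tree passed positionally to stay compatible across the
# base_estimator -> estimator rename; nested tree parameters are
# left out of the grid here for the same reason.
clf = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1), random_state=0)
h_params = {'n_estimators': [50, 120], 'learning_rate': [0.1, 0.5, 1.0]}

scorer = make_scorer(fbeta_score, beta=0.5)
grid_fit = GridSearchCV(clf, h_params, scoring=scorer).fit(X_train, y_train)
best_clf = grid_fit.best_estimator_

predictions = best_clf.predict(X_test)
print(f'Accuracy: {accuracy_score(y_test, predictions):.4f}')
print(f'F-score: {fbeta_score(y_test, predictions, beta=0.5):.4f}')
```

Refitting on the full training set happens automatically (`refit=True` is the GridSearchCV default), so `best_estimator_` is ready to predict.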