Document Classification: how to boost your classifier

AdaBoost.M1 tries to improve the accuracy of a classifier step by step by analyzing its behavior on the training set. (Of course, you cannot improve the classifier by working with the test set!)
Here lies the problem: if we choose an SVM as the "weak algorithm", we know that it almost always returns excellent accuracy on the training set, with results close to 100% (in terms of true positives).
In this scenario, trying to improve the accuracy of the classifier by assigning different weights to the training examples doesn't work!
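
To make the problem concrete, here is a minimal sketch of a single AdaBoost.M1 reweighting step in plain Python. The function name `adaboost_m1_update` and the way the degenerate cases are reported (returning `None`) are illustrative assumptions, not part of any library. It shows that when the weak classifier makes no mistakes on the training set, the weighted error is zero, beta is zero, and there is nothing left to emphasize.

```python
def adaboost_m1_update(weights, correct):
    """One AdaBoost.M1 reweighting step (illustrative sketch).

    weights: current example weights, assumed to sum to 1
    correct: booleans, True where the weak classifier was right
    Returns (beta, new_weights). new_weights is None when the step
    is degenerate and no meaningful reweighting is possible.
    """
    # Weighted error: total weight of the misclassified examples.
    error = sum(w for w, ok in zip(weights, correct) if not ok)

    if error == 0:
        # Perfect fit on the training set: beta = 0, so every weight
        # would be scaled to zero and no example can be emphasized.
        return 0.0, None
    if error >= 0.5:
        # AdaBoost.M1 stops when the weak classifier is this bad.
        return None, None

    beta = error / (1.0 - error)
    # Shrink the weight of correctly classified examples by beta,
    # leave the misclassified ones untouched, then renormalize.
    scaled = [w * beta if ok else w for w, ok in zip(weights, correct)]
    total = sum(scaled)
    return beta, [w / total for w in scaled]


# One mistake out of four: the misclassified example gains weight.
beta, new_weights = adaboost_m1_update([0.25] * 4, [True, True, True, False])

# No mistakes (the typical SVM case described above): nothing to reweight.
beta_perfect, weights_perfect = adaboost_m1_update([0.25] * 4, [True] * 4)
```

With an SVM that fits the training set almost perfectly, nearly every round ends up in or close to the second case, which is why the reweighting loop has so little to work with.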

The idea of AdaBoost.M1 is still valid with SVM classifiers, but I noticed that instead of improving a single SVM classifier step by step, we can train different SVM classifiers (with different parameters) and assign a score to each classifier based on the error it commits on a part of the training set not used to train the SVMs.
Let me explain the procedure in more detail:

  1. Split the training set into two sets: a training set (80% of the original training set) and a boosting set (the remaining 20%).
  2. Train SVM classifiers with different parameters on the new training set.
  3. Test each of these SVMs on the boosting set and calculate its beta score as done in the original implementation of AdaBoost.M1.
  4. The final answer of the boosted classifier is obtained as proposed by the original implementation of AdaBoost.M1.
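
The four steps above can be sketched in plain Python as follows. This is only a rough illustration under some assumptions: the helper names (`split_training_set`, `beta_score`, `boosted_predict`) are hypothetical, each trained SVM is represented by any callable that maps an example to a label, and the error is clipped by a small `floor` value (my own addition) so that beta never becomes exactly 0 when a classifier makes no mistakes on the boosting set.

```python
import math
import random


def split_training_set(examples, boost_fraction=0.2, seed=0):
    """Step 1: shuffle and split into (training set, boosting set)."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_boost = int(round(len(shuffled) * boost_fraction))
    return shuffled[n_boost:], shuffled[:n_boost]


def beta_score(classifier, boosting_set, floor=1e-6):
    """Step 3: beta = error / (1 - error), measured on the boosting set.

    `floor` is an assumption of this sketch: it keeps beta finite and
    nonzero when the error is exactly 0 or 1.
    """
    mistakes = sum(1 for x, y in boosting_set if classifier(x) != y)
    error = mistakes / len(boosting_set)
    error = min(max(error, floor), 1.0 - floor)
    return error / (1.0 - error)


def boosted_predict(classifiers, betas, x):
    """Step 4: each classifier votes for its label with weight log(1/beta).

    A classifier with beta >= 1 (error of 50% or more) gets a
    non-positive weight here; AdaBoost.M1 itself would discard it.
    """
    votes = {}
    for classifier, beta in zip(classifiers, betas):
        label = classifier(x)
        votes[label] = votes.get(label, 0.0) + math.log(1.0 / beta)
    return max(votes, key=votes.get)


# Toy stand-ins for step 2 (in practice: SVMs trained with different
# parameters on the 80% training set).
def good(x):
    return "pos" if x > 0 else "neg"


def always_pos(x):
    return "pos"


boosting_set = [(1, "pos"), (-1, "neg"), (2, "pos"), (-2, "neg"), (3, "pos")]
classifiers = [good, always_pos, always_pos]
betas = [beta_score(c, boosting_set) for c in classifiers]
answer = boosted_predict(classifiers, betas, -5)
```

In this toy run the single accurate classifier has a much smaller beta than the two sloppy ones, so its vote outweighs theirs even though it is in the minority.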

As you can see, the method basically assigns a "trustability score" to each of the k classifiers.

Precision/Recall for the four weak classifiers.

Precision and Recall for the boosted classifier.
