Hi,

My question is more about modeling :)
Are too many predictors good for a model? We usually end up pushing too many predictors into a decision tree on the assumption that the algorithm will select only the important ones. This may be true, but does anyone have a view on it?

Thanks,
Zabi

Replies to This Discussion

I agree. As far as I know, decision tree algorithms in tools like Clementine have built-in splitting and pruning criteria that help the model use only the important predictors. The problem, as I see it, is that letting the algorithm alone decide when to stop growing the tree will not by itself give us the optimal predictors in the model. Often, using too many predictors gradually reduces predictive accuracy and thereby increases the misclassification error on a real-time dataset.

Thanks,
Zabi
I'm not sure I understand your question. Wouldn't you want only the important predictors in a decision tree? In most cases, fewer predictors make for a more robust model. But if you are saying that there are generally more predictors than needed, you can always prune the tree yourself; that's where human judgment comes in. Also, use techniques such as k-fold cross-validation to help you pick the optimal predictors.

-Ralph Winters
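
To make the k-fold cross-validation suggestion concrete, here is a minimal sketch in plain Python. It is not how Clementine or any other tool does it: a one-split "stump" stands in for a full decision tree, the data are synthetic (one informative predictor plus nine noise predictors), and every function name is made up for illustration. The idea is only to show how held-out accuracy lets you compare candidate predictor sets instead of trusting the tree's own choice.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def fit_stump(X, y, features):
    """Depth-1 tree: choose the (feature, threshold) with the lowest weighted Gini."""
    majority = int(sum(y) * 2 >= len(y))
    best = (gini(y), None, None, majority, majority)  # fallback: no split at all
    for f in features:
        for t in sorted({row[f] for row in X}):
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            if not left or not right:
                continue
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best[0]:
                best = (score, f, t,
                        int(sum(left) * 2 >= len(left)),
                        int(sum(right) * 2 >= len(right)))
    return best

def predict_stump(model, row):
    """Predict 0/1 for one row using a fitted stump."""
    _, f, t, left_label, right_label = model
    if f is None:
        return left_label
    return left_label if row[f] <= t else right_label

def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cv_accuracy(X, y, features, k=5, seed=0):
    """Mean held-out accuracy of the stump when restricted to `features`."""
    scores = []
    for held_out in kfold_indices(len(X), k, seed):
        held = set(held_out)
        train = [i for i in range(len(X)) if i not in held]
        model = fit_stump([X[i] for i in train], [y[i] for i in train], features)
        hits = sum(predict_stump(model, X[i]) == y[i] for i in held_out)
        scores.append(hits / len(held_out))
    return sum(scores) / len(scores)

# Synthetic data (an assumption for this sketch): column 0 drives the label,
# columns 1-9 are pure noise.
rng = random.Random(42)
X = [[rng.random() for _ in range(10)] for _ in range(200)]
y = [int(row[0] + rng.gauss(0, 0.2) > 0.5) for row in X]

print("informative predictor only:", cv_accuracy(X, y, [0]))
print("all ten predictors:        ", cv_accuracy(X, y, list(range(10))))
```

With a real tree learner you would swap `fit_stump` and `predict_stump` for your tool's fit and predict steps and keep the fold loop unchanged. A predictor set that scores clearly worse on the held-out folds is a candidate for dropping.
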
@Ralph - Thanks for mentioning the k-fold cross-validation approach. In case I have not conveyed the issue properly: what I have seen many people do with classification trees is push as many variables as possible into the model, assuming that the algorithm will pick the best predictors in the end. Is this right? From what I have learned, the variable chosen at each step of a decision tree depends on the variables chosen in the previous steps. If this is true, shouldn't we be cautious in selecting our predictors and feed them to the algorithm wisely?
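
The point about each choice depending on the previous one can be illustrated with a small sketch. It assumes a greedy tree that scores binary predictors by weighted Gini impurity (one common criterion; other tools may use a different one). The eight-row dataset and the function names are invented for illustration, and the rows are chosen so that the predictor picked at the root differs from the one picked inside a child node.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def weighted_gini(rows, feature):
    """Weighted impurity after splitting `rows` on a binary feature.

    Returns None when the feature is constant in `rows` (no valid split).
    """
    left = [r["y"] for r in rows if r[feature] == 0]
    right = [r["y"] for r in rows if r[feature] == 1]
    if not left or not right:
        return None
    return (len(left) * gini(left) + len(right) * gini(right)) / len(rows)

def best_feature(rows, features):
    """Greedy choice: the feature with the lowest weighted impurity on these rows."""
    scored = [(weighted_gini(rows, f), f) for f in features]
    scored = [(s, f) for s, f in scored if s is not None]
    return min(scored)[1] if scored else None

# Hypothetical toy data: three binary predictors a, b, c and a 0/1 target y.
rows = [
    {"a": 0, "b": 0, "c": 0, "y": 0},
    {"a": 0, "b": 0, "c": 1, "y": 0},
    {"a": 0, "b": 0, "c": 0, "y": 0},
    {"a": 0, "b": 1, "c": 1, "y": 1},
    {"a": 1, "b": 0, "c": 1, "y": 1},
    {"a": 1, "b": 1, "c": 1, "y": 1},
    {"a": 1, "b": 0, "c": 1, "y": 1},
    {"a": 1, "b": 1, "c": 0, "y": 0},
]
features = ["a", "b", "c"]

root_choice = best_feature(rows, features)
# Only the rows that reach the child node are scored at the next step.
child_rows = [r for r in rows if r[root_choice] == 1]
child_choice = best_feature(child_rows, features)

print("chosen at root: ", root_choice)
print("chosen in child:", child_choice)
```

Because each node is scored only on the rows that reach it, a different first split changes which predictors look useful later on. This is one reason the candidate set you feed in can matter even when the algorithm does its own selection.
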
Thanks Tom. Based on your comment, I will dig deeper into this concept. Do you know of any online source where I can get more clarification on this?

© 2019   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC   Powered by
