
Hi all, I would like to get the group's view on the advantages and disadvantages of Random Forests and MARS modelling vs linear regression. It would be interesting to compare them both at the level of statistical principles and in terms of their usefulness in econometrics.

Tags: Econometrics, MARS, Random Analytics, Regression


Replies to This Discussion

There's no way to give you a good answer within a forum post, so I'll summarize my thoughts in a few short sentences.  RF can be a very powerful modeling approach, but it is pretty much a black box.  To put it in terms of linear regression, it is like building 200 models (decision trees, in the case of RF), each fit on randomly chosen predictors and a random sample of the data, and letting the overall prediction be the average (or majority vote) of all 200 models.  With linear regression, you have one model built on all predictors, or on predictors chosen by a selection approach such as forward/backward selection, stepwise, or best subsets.  You can also see from that example how different the prediction equations would be: a linear regression equation is fairly easy to understand.  With RF...well...there really isn't an equation per se.  The utility really comes down to what your purpose is.  Are you primarily focused on accurate predictions?  If so, RF may be your answer.  Do you need to understand how the variables work together towards a prediction?  If so, you may need linear regression (or another easily interpretable model).
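To make the "many randomized models, averaged" idea concrete, here is a minimal pure-Python sketch of a random-forest-style regressor. It is only an illustration under stated assumptions, not a reference implementation: the function names (`build_tree`, `fit_forest`, and so on), the depth limit, the minimum node size, and the toy data are all made up for this example. Each tree is grown on a bootstrap sample of the rows, each split looks at a random subset of the predictors, and the forest prediction is the plain average over trees.

```python
import random

def build_tree(X, y, n_features, depth, rng):
    """Grow a regression tree; each split considers a random subset of predictors."""
    mean = sum(y) / len(y)
    if depth == 0 or len(y) < 4:
        return mean  # leaf: predict the mean target of the rows that reached it
    best = None  # (sse, feature, threshold, left_indices, right_indices)
    for f in rng.sample(range(len(X[0])), n_features):
        values = sorted(set(row[f] for row in X))
        for t in values[:-1]:  # split at every observed value except the largest
            left = [i for i, row in enumerate(X) if row[f] <= t]
            right = [i for i, row in enumerate(X) if row[f] > t]
            sse = 0.0
            for idx in (left, right):
                m = sum(y[i] for i in idx) / len(idx)
                sse += sum((y[i] - m) ** 2 for i in idx)
            if best is None or sse < best[0]:
                best = (sse, f, t, left, right)
    if best is None:  # no usable split (all sampled predictors constant)
        return mean
    _, f, t, left, right = best
    return (f, t,
            build_tree([X[i] for i in left], [y[i] for i in left],
                       n_features, depth - 1, rng),
            build_tree([X[i] for i in right], [y[i] for i in right],
                       n_features, depth - 1, rng))

def predict_tree(tree, row):
    """Walk from the root to a leaf; leaves are floats, internal nodes are tuples."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def fit_forest(X, y, n_trees=200, n_features=1, depth=4, seed=0):
    """Fit n_trees trees, each on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 n_features, depth, rng))
    return forest

def predict_forest(forest, row):
    """The forest prediction is the average of the individual tree predictions."""
    return sum(predict_tree(tree, row) for tree in forest) / len(forest)

# Toy data (an assumption for illustration): a step in the first predictor,
# while the second predictor is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(60)]
y = [1.0 if row[0] > 0.5 else 0.0 for row in X]

forest = fit_forest(X, y)
high = predict_forest(forest, [0.9, 0.5])
low = predict_forest(forest, [0.1, 0.5])
print(round(high, 2), round(low, 2))
```

Notice that the fitted "model" is a list of 200 nested tuples rather than a single equation with coefficients, which is the black-box trade-off described above. In practice you would reach for an established library rather than hand-rolled code like this.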

Linear regression can be difficult to interpret (for example, when predictors are correlated), is subject to over-fitting, is sensitive to outliers, and only works in contexts in which associations are nearly linear. If you only have a few variables and a few hundred observations, it might be enough.
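Two of these weaknesses, sensitivity to outliers and the reliance on near-linear associations, are easy to see with a small example. The sketch below uses a hand-written one-predictor least-squares fit (`fit_line` is a hypothetical helper written for this illustration) on made-up data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

xs = [float(i) for i in range(10)]

# 1) Perfectly linear data: the fit recovers slope 2 and intercept 1.
ys = [2.0 * x + 1.0 for x in xs]
clean_slope, clean_intercept = fit_line(xs, ys)

# 2) Same data with a single corrupted point (last y changed from 19 to 100):
#    the slope jumps from 2 to about 6.4.
ys_outlier = ys[:-1] + [100.0]
outlier_slope, _ = fit_line(xs, ys_outlier)

# 3) A strong but non-linear (symmetric, U-shaped) relationship:
#    the fitted slope is essentially zero, so the line sees "no effect".
ys_curved = [(x - 4.5) ** 2 for x in xs]
curved_slope, _ = fit_line(xs, ys_curved)

print(clean_slope, outlier_slope, curved_slope)
```

One bad observation out of ten more than triples the estimated slope, and the U-shaped relationship is missed entirely unless you add a transformed term by hand.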


© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
