Looking for validation on this one:
Is it legitimate / useful to calculate R-squared on a decision tree model, both in general and specifically with the following methodology (with example):
1) Calculate the SSTO on the testing set as SUM(yi - y-bar)^2
2) Calculate the SSE on the testing set by computing SUM(yi - y-bar of the leaf node)^2 within every leaf node and then simply adding up the leaf-level SSEs.
3) Calculate R-squared as 1- (SSE/SSTO).
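To make the three steps concrete, here is a minimal Python sketch of what I mean. The function name, the leaf labels, and the toy data are made up for illustration; they are not from my actual model. It also assumes the leaf mean is taken over the testing records that land in each leaf. If the leaf mean comes from the training data instead, the SSE would differ.

```python
from collections import defaultdict


def r_squared_from_leaves(y, leaf_ids):
    """Return 1 - SSE/SSTO, where SSE is summed leaf by leaf.

    y        : observed values of the dependent variable (testing set)
    leaf_ids : the leaf node each record falls into (hypothetical labels)
    """
    n = len(y)
    y_bar = sum(y) / n

    # Step 1: SSTO = SUM(yi - y-bar)^2 over the whole testing set.
    ssto = sum((yi - y_bar) ** 2 for yi in y)

    # Group the observed values by leaf node.
    by_leaf = defaultdict(list)
    for yi, leaf in zip(y, leaf_ids):
        by_leaf[leaf].append(yi)

    # Step 2: SSE within each leaf (around that leaf's own mean), then add up.
    # Assumption: the leaf mean is computed on these same testing records.
    sse = 0.0
    for values in by_leaf.values():
        leaf_mean = sum(values) / len(values)
        sse += sum((v - leaf_mean) ** 2 for v in values)

    # Step 3: R-squared = 1 - (SSE/SSTO).
    return 1.0 - sse / ssto


# Toy data, made up for illustration: two leaves, one of them all zeros.
y = [0.0, 0.0, 0.0, 10.0, 12.0, 14.0]
leaves = ["A", "A", "A", "B", "B", "B"]
r2 = r_squared_from_leaves(y, leaves)
print(r2)
```

In this toy case SSTO is 224 and the summed leaf SSE is 8, so the result is 1 - 8/224.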
For example, I have a testing set with an SSTO of 199375602089438.
Adding up the SSEs of the 24 leaf nodes gives 7460083730039.81.
So R-squared is 1 - (7460083730039.81/199375602089438) = 0.96.
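As a quick check of the arithmetic, using only the two totals quoted above (the variable names are just for illustration):

```python
# Totals from my testing set, as quoted in the example.
ssto = 199375602089438
sse = 7460083730039.81

# R-squared = 1 - (SSE/SSTO)
r_squared = 1 - sse / ssto
print(round(r_squared, 2))  # 0.96
```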
1) I used a CHAID model with a numeric dependent variable and a couple of predictors.
2) 70% of the records have 0 sales in the period, and sales is the dependent variable. I think that decision trees are not affected by a mass at zero; thoughts on this are appreciated.