A Data Science Central Community
I am currently building a short-stay model for inpatient claims. The dependent variable is defined as the claims for which we recovered an amount (such a claim can be a waste or fraud claim) through the manual audit conducted last year.
Although the C statistic and the other statistics look good for the model, the problem starts when I look at the significance of the intercept term, which is as below:
| Parameter | DF | Estimate | Standard Error | Wald Chi-Square | Pr > ChiSq |
I'm not sure what to do.
Can anyone please suggest what can be done? I understand that the intercept is the average value of the dependent variable when all the independent variables are equal to zero, but should I include the intercept when scoring my model, or do something else?
Thanks in advance!
You should always include the intercept. The intercept is just a baseline term (in a logistic model, the log-odds when every predictor is zero) and generally means little on its own. All the test is saying is that the intercept cannot be distinguished from zero, and that's it. You can safely ignore its significance.
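To make the scoring question concrete, here is a minimal sketch of how a fitted logistic model scores one claim. The intercept, coefficients, and predictor values are made-up numbers for illustration only, not estimates from the model discussed above.

```python
import math

def score(intercept, coefs, x):
    """Predicted probability from a logistic model: sigmoid(b0 + sum(b_i * x_i))."""
    linear = intercept + sum(b * v for b, v in zip(coefs, x))
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical values, chosen only to show the mechanics
intercept = -0.05        # small, "non-significant" looking intercept
coefs = [0.8, -0.3]      # coefficients for two predictors
claim = [1.0, 2.0]       # predictor values for one claim

p_with = score(intercept, coefs, claim)   # score as the model was fitted
p_without = score(0.0, coefs, claim)      # same coefficients, intercept dropped

print(f"with intercept:    {p_with:.4f}")
print(f"without intercept: {p_without:.4f}")
```

The other coefficients were estimated jointly with the intercept, so leaving it out at scoring time shifts every predicted probability, even when the intercept is small.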
In my opinion, the intercept absorbs unknown factors and error when the coefficients of the linear part of the model are estimated. If it is mainly error, this may be caused by noise in your training data set. I think you can try cleaning the data and adding more factors.
I may be late in suggesting a course of action, but I agree with Sagar: cross-validation is probably the best approach. Build your model with about 90 percent of your observations, use it to predict the other 10 percent with and without the intercept, and keep the model that predicts most accurately.
Though I think you'll be safe not using the intercept; it's zero anyway.
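A rough sketch of that 90/10 comparison is below. Everything in it is an assumption for illustration: the data are simulated (one predictor, a true intercept of -1.5 and slope of 1.0), the fitting routine is plain gradient descent standing in for whatever software you actually use, and holdout log loss is just one possible accuracy measure.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def design_row(x, use_intercept):
    """Prepend a constant 1.0 when the model has an intercept."""
    return [1.0] + x if use_intercept else list(x)

def fit_logistic(X, y, use_intercept, steps=300, lr=0.5):
    """Fit logistic regression by batch gradient descent (illustration only)."""
    rows = [design_row(x, use_intercept) for x in X]
    w = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(w)
        for r, t in zip(rows, y):
            err = sigmoid(sum(wi * ri for wi, ri in zip(w, r))) - t
            for j, rj in enumerate(r):
                grad[j] += err * rj
        w = [wi - lr * g / len(rows) for wi, g in zip(w, grad)]
    return w

def log_loss(w, X, y, use_intercept):
    """Mean negative log-likelihood on a data set (lower is better)."""
    total = 0.0
    for x, t in zip(X, y):
        r = design_row(x, use_intercept)
        p = sigmoid(sum(wi * ri for wi, ri in zip(w, r)))
        p = min(max(p, 1e-12), 1 - 1e-12)  # guard against log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y)

random.seed(0)
# Simulated data: assumed true model is logit(p) = -1.5 + 1.0 * x
X = [[random.gauss(0, 1)] for _ in range(1000)]
y = [1 if random.random() < sigmoid(-1.5 + x[0]) else 0 for x in X]

# 90 percent to fit, 10 percent held out
cut = len(X) * 9 // 10
X_train, y_train = X[:cut], y[:cut]
X_test, y_test = X[cut:], y[cut:]

results = {}
for use_intercept in (True, False):
    w = fit_logistic(X_train, y_train, use_intercept)
    loss = log_loss(w, X_test, y_test, use_intercept)
    results[use_intercept] = (w, loss)
    label = "with intercept" if use_intercept else "without intercept"
    print(f"{label}: weights={w}, holdout log loss={loss:.4f}")
```

Note that the no-intercept model here is refitted without the constant, which is different from fitting with an intercept and then dropping it at scoring time. A single 90/10 split can also be noisy, so repeating it over several folds gives a steadier comparison.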