A Data Science Central Community
I am currently studying a binary economic dependent variable using logit regression. My data set is large (N = 340,000), but the yes cases are only about 1% of the data, so the goodness of fit of my model is very low, mainly because of that. Could you please help me understand whether there are other binary regression models I should use to obtain better results? Or do you think I should transform my data, or do you have any other ideas for this type of situation? :)
A common rule of thumb is to have at least a 2% response rate in your data. You can boost the response rate by oversampling: repeat the rows of the responders (the yes cases) enough times to raise the response rate to the level you want.
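To make the oversampling idea concrete, here is a minimal sketch in plain Python. The function name `oversample_responders`, the label column `y`, and the toy data are all made up for illustration; the 2% target is simply the figure mentioned above, not a value the method requires.

```python
import math
import random


def oversample_responders(rows, label_key="y", target_rate=0.02, seed=0):
    """Repeat responder rows (sampled with replacement) until they make up
    at least `target_rate` of the returned data. Hypothetical helper."""
    rng = random.Random(seed)
    responders = [r for r in rows if r[label_key] == 1]
    non_responders = [r for r in rows if r[label_key] == 0]
    if not responders:
        raise ValueError("no responders to oversample")

    # Solve n_pos / (n_pos + n_neg) >= target_rate for n_pos.
    needed = math.ceil(target_rate * len(non_responders) / (1 - target_rate))
    extra = max(0, needed - len(responders))

    # Keep every original responder and add `extra` duplicated ones.
    resampled = responders + rng.choices(responders, k=extra)
    out = non_responders + resampled
    rng.shuffle(out)
    return out


# Toy data (an assumption): 10,000 rows with a 1% response rate.
rows = [{"id": i, "y": 1 if i % 100 == 0 else 0} for i in range(10_000)]
balanced = oversample_responders(rows, target_rate=0.02)
rate = sum(r["y"] for r in balanced) / len(balanced)
print(f"response rate after oversampling: {rate:.4f}")
```

Two things to keep in mind if you try this: duplicate rows only in the data you fit on, not in the data you evaluate on, and remember that a logit model fitted on oversampled data will predict probabilities that are too high relative to the original 1% rate, because the intercept shifts with the response rate.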
You can also use a decision tree to identify segments of customers that contain almost no responders and leave those segments out of the modeling data. Removing those nonresponders increases the response rate in the data that remains.
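The segment-exclusion step could look roughly like the sketch below. It does not fit a tree: it assumes each row already carries a `segment` label (for example, the leaf a decision tree assigned it to) and only shows the filtering. The helper names, the `min_rate` cutoff of 0.5%, and the toy counts are all assumptions for illustration.

```python
from collections import defaultdict


def drop_low_response_segments(rows, segment_key="segment", label_key="y",
                               min_rate=0.005):
    """Keep only rows from segments whose response rate is at least
    `min_rate`. Returns the kept rows and the set of kept segments."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for r in rows:
        counts[r[segment_key]][0] += r[label_key]
        counts[r[segment_key]][1] += 1
    keep = {s for s, (pos, n) in counts.items() if pos / n >= min_rate}
    return [r for r in rows if r[segment_key] in keep], keep


def make_segment(name, n_rows, n_responders):
    """Build toy rows for one segment (hypothetical data)."""
    return [{"segment": name, "y": 1 if i < n_responders else 0}
            for i in range(n_rows)]


# Toy data: 10,000 rows, 100 responders overall (1%).
rows = (make_segment("A", 2000, 70)     # 3.5% response rate
        + make_segment("B", 3000, 27)   # 0.9% response rate
        + make_segment("C", 5000, 3))   # 0.06% response rate

kept, kept_segments = drop_low_response_segments(rows)
kept_responders = sum(r["y"] for r in kept)
kept_rate = kept_responders / len(kept)
print(sorted(kept_segments), len(kept), kept_responders, f"{kept_rate:.4f}")
```

In this toy case, dropping segment C removes half of the rows but only 3 of the 100 responders, so the response rate in the remaining data roughly doubles. The model then only applies to the segments that were kept; the excluded segment is effectively scored as nonresponders.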