"Since you are dealing with categorical variables, and since you have only two of them, a chi-square test of independence would be the natural choice for its simplicity of application and interpretation.
If you want more details, you can find some…"
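For anyone who wants to see the mechanics, here is a minimal sketch of the chi-square test of independence in plain Python. The 2x2 counts are made up purely for illustration, the function name is hypothetical, and the p-value shortcut only holds for one degree of freedom (a 2x2 table).

```python
import math

def chi_square_independence(table):
    """Return the Pearson chi-square statistic and degrees of freedom
    for a contingency table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if the two variables were independent
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed_count - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

# Hypothetical counts: rows are levels of variable A, columns of variable B
observed = [[30, 10], [20, 40]]
stat, dof = chi_square_independence(observed)

# With 1 degree of freedom the upper-tail p-value has a closed form
p_value = math.erfc(math.sqrt(stat / 2))
print(stat, dof, p_value)
```

In practice `scipy.stats.chi2_contingency` does the same job for tables of any size. Note that it applies Yates' continuity correction to 2x2 tables by default, so its statistic will differ from the uncorrected one computed here.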
I agree totally that in a larger dataset the probability of finding spurious or accidental correlations is higher than in a smaller one. In this context, "larger" implies higher k. But when you state "curse of big…"
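A small simulation may make the point concrete. This is only a sketch, and it assumes k means the number of variables (columns), which the comment does not spell out. Every column is independent noise, so any correlation found is spurious by construction; the sample size, seed, and values of k are arbitrary.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def max_abs_corr(columns):
    """Largest absolute pairwise correlation among the given columns."""
    best = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            best = max(best, abs(pearson(columns[i], columns[j])))
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows = 30             # assumed sample size
columns = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(40)]

# Use the first k columns each time, so every larger set contains the smaller ones
results = {k: max_abs_corr(columns[:k]) for k in (5, 10, 20, 40)}
for k, value in results.items():
    print(k, round(value, 3))
```

Because the column sets are nested, the strongest spurious correlation can only stay the same or grow as k increases, even though no real relationship exists anywhere in the data.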
Thanks for your explanation. We have also struggled with this issue, and when we tried support vector machines, they were quite sensitive to imbalanced data. We balanced the data by undersampling, but the results were sub-par.
I am not sure…"
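For readers unfamiliar with the undersampling step mentioned above, here is a rough sketch of random undersampling in plain Python. It is not the poster's actual procedure: the function name, the toy data, and the choice to cut every class down to the size of the smallest one are all assumptions for illustration.

```python
import random
from collections import defaultdict

def undersample(rows, labels, seed=0):
    """Randomly drop rows so every class has as many rows as the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        # Sample without replacement from each class
        for row in rng.sample(group, n_min):
            balanced.append((row, label))
    rng.shuffle(balanced)  # avoid leaving the classes in blocks
    return balanced

# Hypothetical toy data: 8 rows of class 0 and 2 rows of class 1
rows = list(range(10))
labels = [0] * 8 + [1] * 2
balanced = undersample(rows, labels)
print(balanced)
```

The obvious cost is that most of the majority class is thrown away (6 of 8 rows in this toy case), which may be one reason results after undersampling can disappoint.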
"I second this opinion. There are also a couple of excellent discussion items in the LinkedIn sister group which indicate this trend among practitioners.
One key challenge that people seem to be facing is determining answers to questions like what data…"
Appreciate the response. Thanks.
Are you (or anyone) aware of the performance of such models? We have used several models/techniques, but unfortunately, the speed of business keeps us from really quantifying the ROI and helping us…"