I have been working on a churn prediction model (for the telco market) for the last 2-3 months. Using logistic regression (NN and DT were also tried, but logistic regression gave the best results), I built a model with very high predictive accuracy. All seemed to go well until I observed that most of the predicted churners had already churned before the prediction was made.
The structure of my model is as follows:
1) Churn is defined as 20 consecutive days of inactivity, since these are not subscription accounts
2) Only high-revenue customers are included (>= 75 USD revenue per month)
3) Each customer has 3 prior months of historical usage data (usage aggregates for each of the 3 months)
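
To make the definition in point 1 concrete, here is a minimal sketch of how the 20-day inactivity rule could be applied to one customer's activity dates. The function name, its arguments and the use of plain date lists are assumptions for illustration only, not the actual pipeline.

```python
from datetime import date, timedelta

INACTIVITY_DAYS = 20  # churn threshold from point 1


def churn_date(active_days, start, end, threshold=INACTIVITY_DAYS):
    """Return the first date in [start, end] on which the customer has been
    inactive for `threshold` consecutive days, or None if that never happens.

    `active_days` is a hypothetical iterable of dates with any usage.
    """
    active = set(active_days)
    run = 0  # length of the current streak of inactive days
    day = start
    while day <= end:
        run = 0 if day in active else run + 1
        if run == threshold:
            return day
        day += timedelta(days=1)
    return None


# Example: active for the first 10 days of January, then silent.
usage = [date(2024, 1, d) for d in range(1, 11)]
print(churn_date(usage, date(2024, 1, 1), date(2024, 3, 31)))
```

Under this reading, the churn date is the day the 20th inactive day is reached, so the last real activity is already about three weeks old when the label fires.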
The model is trained on a sample of 100k subscribers (50k churners and 50k non-churners). The total high-revenue base is 2.7 million. The lift of the model is very good, but its practical accuracy is poor because of 'in-actionable' churners, i.e. customers who had already churned by the time of the prediction. Nothing seems to predict these people 'before' they churn.
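
One rough way to quantify the 'in-actionable' share is sketched below. It assumes that a predicted churner counts as in-actionable when the last recorded activity is at least 20 days before the scoring date; that cutoff, the function names and the dictionary layout are all assumptions made for this illustration.

```python
from datetime import date


def inactive_days_at_scoring(last_active, score_date):
    """Days elapsed between the last activity and the scoring date."""
    return (score_date - last_active).days


def in_actionable_share(last_active_by_sub, predicted_churners, score_date,
                        cutoff_days=20):
    """Fraction of predicted churners already inactive for `cutoff_days`
    or more on the scoring date (assumed definition of 'in-actionable')."""
    if not predicted_churners:
        return 0.0
    flagged = [
        sub for sub in predicted_churners
        if inactive_days_at_scoring(last_active_by_sub[sub], score_date) >= cutoff_days
    ]
    return len(flagged) / len(predicted_churners)


# Hypothetical subscribers and last-activity dates.
last_active = {"a": date(2024, 1, 1), "b": date(2024, 1, 25)}
print(in_actionable_share(last_active, ["a", "b"], date(2024, 1, 31)))
```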
Please advise if anyone has faced a similar problem in churn modelling. Thanks.