

I have been working on a churn prediction model (for the telco market) for the last 2-3 months. Using logistic regression (NN and DT were also tried, but logistic regression gave the best results), I built a model with very high predictive accuracy. All seemed to go well until I observed that most of the predicted churners had already churned before the prediction was made.

The structure of my model is as follows:

1) Churn is defined as 20 consecutive days of inactivity, since these are not subscription accounts.
2) Only high-revenue customers (>= 75 USD revenue per month) are included.
3) Three prior months of historical usage data are used for each customer (usage aggregates for each of the 3 months).
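
For anyone reading along, here is a rough sketch of how the 20-day rule could be turned into a label at a fixed scoring date, with customers who are already inactive at that date set aside instead of being scored. It is not the poster's actual pipeline. The function name, the 30-day outcome window, the "gap of at least 20 days between events" convention, and the assumption that activity data is complete through the end of the outcome window are all mine.

```python
from datetime import date, timedelta

INACTIVITY_DAYS = 20  # churn definition from the post


def label_at_cutoff(activity_dates, cutoff, outcome_days=30,
                    inactivity_days=INACTIVITY_DAYS):
    """Return "already_churned", 1 (churns inside the outcome window) or 0.

    Assumption: a gap of at least `inactivity_days` between two activity
    dates (or after the last one) counts as churn, and the activity data
    is complete up to cutoff + outcome_days.
    """
    gap = timedelta(days=inactivity_days)
    past = sorted(d for d in activity_dates if d <= cutoff)
    future = sorted(d for d in activity_dates if d > cutoff)

    # Already inactive for 20+ days at scoring time: nothing left to
    # predict, so keep these customers out of training and scoring.
    if not past or cutoff - past[-1] >= gap:
        return "already_churned"

    window_end = cutoff + timedelta(days=outcome_days)
    previous = past[-1]
    # None marks "no further activity" after the last observed event.
    for current in future + [None]:
        churn_date = previous + gap  # day on which inactivity reaches 20 days
        if current is None or current >= churn_date:
            # First qualifying gap found: it is a target only if it
            # completes inside the outcome window.
            return 1 if churn_date <= window_end else 0
        previous = current


cutoff = date(2020, 3, 31)
print(label_at_cutoff([date(2020, 3, 1)], cutoff))   # inactive 30 days at cutoff
print(label_at_cutoff([date(2020, 3, 25)], cutoff))  # goes quiet after cutoff
```

The usage aggregates would then be computed only from data on or before the cutoff, so the features cannot "see" churn that has already happened.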

The model is trained on a sample of 100k subscribers, with 50k churners and 50k non-churners. The total high-revenue base is 2.7 million. The lift of the model is very good, but in practice the inaccuracy is very high because of 'in-actionable' churners, i.e. customers who had already churned by the time they were scored. Nothing seems to predict these people 'before' they churn.

Please advise if anyone has faced a similar problem in churn modelling. Thanks.


Views: 6333


Replies to This Discussion

Because P(churn | already churned) = 1.

Hi Talha,

I am also trying to build a similar model for a telecom operator.
I would be glad if we can exchange some thoughts on the process.
If you don't mind, please share your email ID; mine is [email protected]
I am working through the steps below:
1. Calculate the LTV (lifetime value) of each customer.
2. Assign a churn score based on a logit model.
3. Segment customers based on LTV, RFM and usage.
4. Develop campaigns to prevent churn.

Churn is a concept very similar to survival theory or lifetime value analysis.
Predicting the probability of churn with logistic regression may not be very robust since, as you say, it is tough to predict someone before they churn. If you had more powerful data, data that assesses churn itself, you could probably get better predictive power from logistic regression.

If you're not able to use logistic regression, it may be better to predict the lifetime value of the customer, in other words, to understand how long he or she is going to stay with you as a customer.
There are many ways to do this, and survival probability is one of them. PROC PHREG does this in SAS, but I've never used it myself.
Customer Lifetime Value (CLV) is another method that tells you what a customer is worth by computing his or her discounted future value, using the concept of the time value of money. That lets you identify the most probable churners even before the churn happens.
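
To make the "discounted future value" idea concrete, here is a minimal sketch of one common textbook form of CLV: expected margin per month, weighted by the chance the customer is still active, discounted back to today. The function name, the constant monthly retention rate, and the fixed horizon are simplifying assumptions of mine, not something taken from this thread.

```python
def customer_lifetime_value(monthly_margin, monthly_retention,
                            monthly_discount_rate, horizon_months):
    """Sum of discounted expected margins over a fixed horizon.

    Assumes a constant retention probability per month and that margin
    is earned at the end of each month the customer survives.
    """
    clv = 0.0
    for t in range(1, horizon_months + 1):
        survival = monthly_retention ** t              # P(still active in month t)
        discount = (1 + monthly_discount_rate) ** t    # time value of money
        clv += monthly_margin * survival / discount
    return clv


# Hypothetical customer: 30 USD margin/month, 95% monthly retention,
# 1% monthly discount rate, 24-month horizon.
print(round(customer_lifetime_value(30.0, 0.95, 0.01, 24), 2))
```

In practice the retention term would come from a fitted churn or survival model per customer, which is what ties CLV back to the churn question.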

I'd be glad if you share details if you do anything like this.

I hadn't seen the earlier replies when I wrote the above. I see that you've actually tried survival prediction already! Can you share how you did it? I believe survival analysis should best establish the churn probability using an exponential distribution. If that's not working as it should, the pattern/graph of churn probably needs to be analysed for other shapes/distributions.

Beyond that, having read the other replies, I think the way you've defined churn, and hence your 'Y' variable, is leading you to such erroneous predictions. You say you want to predict churn in the 70-90 day interval, when most of your customers have already churned well before that.

Ideally, if you used a well-fitted survival model, you should be able to predict something like
P(survival > 90 days | customer survives till 70 days).
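
As a sketch of what that conditional probability looks like under the exponential assumption mentioned earlier: with S(t) = exp(-rate * t), P(T > 90 | T > 70) = S(90) / S(70). The helper names and the toy durations below are made up for illustration, and the rate estimate is the standard maximum-likelihood one for right-censored data (churn events divided by total observed days).

```python
import math


def fit_exponential_rate(durations, churned):
    """MLE of a constant churn hazard with right censoring.

    durations: days each customer was observed
    churned:   1 if the customer churned at that time, 0 if still active
    """
    events = sum(churned)
    exposure = sum(durations)
    return events / exposure


def conditional_survival(rate, t_from, t_to):
    """P(T > t_to | T > t_from) = S(t_to) / S(t_from), S(t) = exp(-rate * t)."""
    return math.exp(-rate * t_to) / math.exp(-rate * t_from)


# Toy data: four customers, two observed churns, 100 customer-days in total.
rate = fit_exponential_rate([10, 20, 30, 40], [1, 1, 0, 0])
print(rate)
print(conditional_survival(rate, 70, 90))
```

Note that with an exponential model this only depends on the 20-day difference (the distribution is memoryless), so if churn risk clearly changes with tenure, a Weibull or a non-parametric Kaplan-Meier curve may be a better fit.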

Hopefully, this should answer your question. I'd be glad to help if you have any specific questions.

