Issues with Predicting Churn - AnalyticBridge<br />
Feed updated: 2020-06-04T02:44:54Z<br />
https://www.analyticbridge.datasciencecentral.com/forum/topics/issues-with-predicting-churn?id=2004291%3ATopic%3A72341&feed=yes&xn_auth=no<br />
<br />
Arun (2010-08-23T17:23:12.103Z):<br />
I hadn't seen the replies to this until now. I see that you've already tried survival prediction! Can you share how you did it? I believe survival analysis should be the best way to establish the probability of churn, for example using an exponential distribution. If that isn't working as it should, the pattern or graph of churn probably needs to be analysed for other shapes or distributions.<br />
<br />
Beyond that, having read the other replies, I think the way you've defined churn, and hence your 'Y' variable, is leading you to these erroneous predictions. You say you want to predict churn in the 70-90 day interval, when most of your customers have already churned well before then.<br />
<br />
Ideally, with a well-fitted survival model, you should be able to estimate something like<br />
P(survival > 90 days | customer survives to 70 days).<br />
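As a rough illustration of that conditional probability, here is a minimal sketch assuming a Weibull survival curve (shape 1 reduces to the exponential case mentioned above). The function names and the scale/shape values are hypothetical and not fitted to any data from this thread.

```python
import math

def weibull_survival(t, scale, shape):
    """S(t) = exp(-(t / scale) ** shape); shape == 1 is the exponential case."""
    return math.exp(-((t / scale) ** shape))

def conditional_survival(t_future, t_now, scale, shape=1.0):
    """P(T > t_future | T > t_now) = S(t_future) / S(t_now)."""
    if t_future < t_now:
        raise ValueError("t_future must be >= t_now")
    return weibull_survival(t_future, scale, shape) / weibull_survival(t_now, scale, shape)

# Hypothetical parameters, chosen only to show the calculation.
p_exp = conditional_survival(90, 70, scale=60.0, shape=1.0)  # exponential
p_wbl = conditional_survival(90, 70, scale=60.0, shape=1.5)  # rising hazard
```

With the exponential curve the result depends only on the 20-day gap (the distribution is memoryless), which is one reason to check other shapes when the fit is poor.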
<br />
Hopefully, this should answer your question. I'd be glad to help if you have any specific questions.<br />
<br />
Arun (2010-08-23T17:04:53.618Z):<br />
Churn is a concept very similar to survival theory or lifetime value analysis.<br />
Predicting the probability of churn with a logistic model may not be very robust since, as you say, it's tough to flag someone before they churn. If you had more powerful data, data that assesses churn itself, you could probably get better predictive power from a logistic model.<br />
<br />
If you're not able to use a logistic model, it's better to predict the lifetime value of the customer, in other words, to understand how long he or she is going to stay with you as a customer.<br />
There are many ways to do this, and survival probability is one of them. PROC PHREG does this in SAS, but I've never used it myself.<br />
Customer Lifetime Value (CLV) is another method in itself: it tells you what the customer is worth by computing his or her <u>'discounted future' value</u> using the concept of the time value of money. That lets you find the customers most likely to churn even before the churn happens.<br />
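To make the 'discounted future' value idea concrete, here is a minimal sketch of one common textbook form of CLV. It assumes a constant margin per period, a constant retention rate and a constant discount rate; all of these, and the function name, are illustrative assumptions rather than anything specified in this thread.

```python
def customer_lifetime_value(margin_per_period, retention_rate, discount_rate, periods):
    """Sum of expected margins, each discounted back to today."""
    clv = 0.0
    for t in range(1, periods + 1):
        # Chance the customer is still active in period t.
        survival_prob = retention_rate ** t
        # Time value of money: a margin earned in period t is worth less today.
        clv += margin_per_period * survival_prob / (1 + discount_rate) ** t
    return clv

# Hypothetical inputs: 10 units of margin per month over a 12-month horizon.
loyal = customer_lifetime_value(10.0, retention_rate=0.95, discount_rate=0.01, periods=12)
shaky = customer_lifetime_value(10.0, retention_rate=0.70, discount_rate=0.01, periods=12)
```

A customer with a low retention rate gets a much lower CLV, which is how the measure can be used to rank customers by churn risk and value at once.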
<br />
I'd be glad if you share details if you do anything like this.<br />
<br />
Thanks,<br />
Arun<br />
<br />
Ranit Sinha (2010-08-23T09:09:39.606Z):<br />
Hi Talha,<br />
<br />
I am also trying to build a similar model for a telecom operator.<br />
I would be glad if we could exchange some thoughts on the process.<br />
If you don't mind, please share your email ID; mine is ranit_20012001@yahoo.co.uk<br />
I am developing the steps below:<br />
1. Calculation of LTV (lifetime value) of each customer.<br />
2. Assign a churn score based on a logit model.<br />
3. Segmentation based on LTV, RFM and usage.<br />
4. Develop campaigns to stop churn.<br />
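The RFM part of step 3 could be sketched roughly as below. This is only an illustration under assumptions: the tercile scoring, the field names and the three sample customers are hypothetical, and a real segmentation would also fold in LTV and usage.

```python
def tercile_scores(values, higher_is_better=True):
    """Rank values and map them to scores 1 (worst) to 3 (best)."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, idx in enumerate(order):
        scores[idx] = 1 + (3 * rank) // len(values)
    return scores

# Hypothetical customers: days since last activity, number of top-ups, spend.
customers = {
    "A": {"recency_days": 2, "frequency": 30, "monetary": 120.0},
    "B": {"recency_days": 12, "frequency": 8, "monetary": 35.0},
    "C": {"recency_days": 25, "frequency": 2, "monetary": 5.0},
}
ids = list(customers)
# For recency, a smaller number of days is better, so the ranking is reversed.
r = tercile_scores([customers[i]["recency_days"] for i in ids], higher_is_better=False)
f = tercile_scores([customers[i]["frequency"] for i in ids])
m = tercile_scores([customers[i]["monetary"] for i in ids])
rfm = {cid: (r[k], f[k], m[k]) for k, cid in enumerate(ids)}
```

Each customer ends up with an (R, F, M) triple that can be combined with the churn score from step 2 to decide who gets a campaign.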
<br />
Jozo Kovac (2010-07-06T01:16:20.844Z):<br />
Paul,<br />
because P(already churned to churn)=1<br />
<br />
paul d (2010-07-06T01:11:12.196Z):<br />
Talha,<br />
<br />
I guess in theory a multinomial logit model would be OK, because what we are talking about here is states of potential (probability of outcome) for churn stemming from a precondition of dormancy.<br />
<br />
Hidden Markov models spring to mind, but they are not that effective in practice.<br />
<br />
Since you cannot predict dormancy from the variables you have, to some degree this variable can be thought of as stochastic, i.e. random (it is essentially a state of mind before any action is undertaken). That does not mean you cannot apply some basic analysis to dormancy, such as modelling it with a simple Weibull curve, examining the distribution, etc.<br />
<br />
Possible experiment<br />
Could you instead run an experiment on the efficacy of your existing churn strategy? Each time a customer enters the dormant stage under your definition, randomly assign them to one of two conditions (retention strategy or no retention strategy), so that the retention strategy is deployed on 50% of the sample, and examine the impact on churn rates.<br />
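A minimal sketch of that experiment, assuming a simple coin-flip assignment and a pooled two-proportion z-statistic to compare churn rates between the two arms. The function names and the counts in the example are hypothetical.

```python
import math
import random

def assign_arm(rng):
    """Coin-flip assignment for a customer who has just become dormant."""
    return "retention" if rng.random() < 0.5 else "control"

def two_proportion_z(churned_a, n_a, churned_b, n_b):
    """z-statistic for the difference in churn rates between two arms."""
    p_a, p_b = churned_a / n_a, churned_b / n_b
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("churn rate is 0% or 100% in both arms; z is undefined")
    return (p_a - p_b) / se

rng = random.Random(42)  # fixed seed so the assignment can be reproduced
arm = assign_arm(rng)

# Hypothetical outcome: 40 of 100 churn with retention, 60 of 100 without.
z = two_proportion_z(40, 100, 60, 100)
```

A clearly negative z here would suggest the retention strategy lowers churn among dormant customers; an absolute value near zero would suggest it makes little difference.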
<br />
HTH Paul<br />
<br />
Jozo,<br />
<br />
Just out of curiosity, why not use churners in the modelling process? Given that churn is the outcome you are trying to predict, and each of the variables in logistic, random forest, SVM models can be developed to attempt to predict this outcome .... What other outcome variable would you train your model on if not churn?<br />
<br />
cheers Paul<br />
<br />
Jozo Kovac (2010-07-05T20:51:49.388Z):<br />
Talha,<br />
I'm really interested in what you do to retain your customers.<br />
... if they throw the SIM card away, what's your plan to prevent them from doing so?<br />
<br />
Talha Nur Omer (2010-07-01T16:57:38.076Z):<br />
@Jozo<br />
<b><u>You need real-time scoring that predicts: will this customer visit us again in the next 20 days?</u></b><br />
Well maybe :)<br />
<b><u>Or just send an email to (all) inactive customers after 15 days of inactivity.</u></b><br />
We cannot do that, because most people who have been dormant for 15 days never return. After 15 days it is very likely that the prepaid customer has thrown away his SIM card and switched to another operator, so we need to predict this 15-day inactivity in advance :)<br />
<b><u>Simple solution may be the most effective.</u></b><br />
I agree, but so far no simple solution could be found.<br />
<br />
@Paul<br />
Thanks Paul for your feedback and for referring me to this website :) I'll look into it, BUT I actually did use survival analysis. Unfortunately, initial results from the rough-cut model were not very encouraging :( How about multinomial regression? Maybe I can make 4 categories of the target variable:<br />
1) dormant in marketing gap, dormant in churn period<br />
2) active in marketing gap, active in churn period<br />
3) active in marketing gap, dormant in churn period<br />
4) dormant in marketing gap, active in churn period<br />
What do you think about that? Thanks :)<br />
<br />
paul d (2010-07-01T15:01:20.209Z):<br />
Are you not predicting two things here, who and also when? Since the dependent variable is in part time to event (churn), could you not run survival analysis on this?<br />
<br />
Time-to-event modelling is very common in telecoms churn.<br />
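For readers who want to see what time-to-event modelling looks like at its simplest, here is a sketch of the Kaplan-Meier estimator in plain Python. It is not the Zelig/R approach referred to in this post; the function name and the toy data are hypothetical, and customers who have not churned by the end of observation are treated as censored.

```python
def kaplan_meier(durations, churned):
    """Return [(t, S(t))] at each time where at least one churn was observed.

    durations: days each customer was observed
    churned:   True if the customer churned at that time, False if censored
    """
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Group all customers whose observation ends at the same time t.
        while i < len(events) and events[i][0] == t:
            deaths += 1 if events[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Churned and censored customers both leave the risk set after t.
        at_risk -= removed
    return curve

# Toy data: four customers, one still active (censored) at day 20.
curve = kaplan_meier([10, 20, 20, 30], [True, True, False, True])
```

The resulting curve answers the "when" part of the question; covariates for the "who" part would need a regression-style survival model on top of this.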
<br />
If you have open-source R available, here is a starting point.<br />
<br />
It can easily accommodate 100,000 records via MLE estimation.<br />
<br />
<a href="http://gking.harvard.edu/zelig/docs/index.html" target="_blank">http://gking.harvard.edu/zelig/docs/index.html</a><br />
<br />
"Models for Continuous Bounded Dependent Variables" is what you need.<br />
<br />
hth paul d<br />
<br />
Jozo Kovac (2010-06-29T10:02:16.264Z):<br />
You need real-time scoring that predicts: will this customer visit us again in the next 20 days?<br />
<br />
Or just send an email to (all) inactive customers after 15 days of inactivity.<br />
<br />
Simple solution may be the most effective.<br />
<br />
Talha Nur Omer (2010-06-29T05:32:47.926Z):<br />
Hi Talha,<br />
<u><b><br />
So, your churners are customers who are active in the 10 days marketing period, and become inactive in the next 20 days.</b></u><br />
Yes that is correct :)<br />
<u><b>And according to you,<br />
<br />
"Only very very few subscribers who churn are active in the Marketing."<br />
<br />
That means the churners have already left before your churn window definition of inactivity for 20 days. That also means, you don't have transaction data for these churners for the 10 days.</b></u><br />
Yes, we do not have the transaction data available for these 10 days, neither in the training data set nor, of course, in the scoring data set.<br />
<br />
<u><b>Therein lies the problem :-) Why don't you try defining churn as customers who are inactive for 30 days? Any churn model should predict when customers are about to leave. And from your explanation, my understanding is that most of your customers have already decided to leave about 10 days before you define them as churners. Try changing the definition.</b></u><br />
<br />
So do you mean that I should remove the marketing gap and include it in the churn period, meaning the churn period runs from the 60th to the 90th day? OR should I leave the marketing gap as is and define my churn window from the 70th to the 100th day? I did try the former a few days back, BUT the problem remains the same: a lot of misses (around 60%), and of the 40% of churners correctly predicted, 34% are already dormant :(
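<br />
The two window definitions in this question can be compared with a small labelling sketch. It assumes the observation period ends on day 60 and that activity is available as a list of day numbers; the function name and the sample activity are hypothetical.

```python
def is_churner(activity_days, observation_end, churn_window):
    """Churner = active before observation_end but silent in churn_window.

    churn_window is (start, end) and covers days start <= d < end.
    """
    start, end = churn_window
    was_active = any(d < observation_end for d in activity_days)
    dormant_in_window = not any(start <= d < end for d in activity_days)
    return was_active and dormant_in_window

# Hypothetical customer whose last usage falls inside the marketing gap.
activity = [3, 18, 41, 58, 65]

# Option A: gap folded into the churn period (days 60 to 90).
option_a = is_churner(activity, observation_end=60, churn_window=(60, 90))
# Option B: gap kept as days 60 to 70, churn window moved to days 70 to 100.
option_b = is_churner(activity, observation_end=60, churn_window=(70, 100))
```

The same customer is a non-churner under option A and a churner under option B, so the choice of window changes the target variable itself and not only the model fit.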