I am working on a project to forecast CLTV (customer lifetime value) and would like suggestions on a possible approach. Below is a description of my project:

  1. Our client is an e-retailer specializing in flash sales of designer apparel. A person has to register on their website to make a purchase, or even to browse the sales taking place.
  2. We have defined 2 types of people:
    1. Member: someone who has registered on the site but has not made any purchase so far.
    2. Customer: someone who has registered and made at least 1 purchase.
  3. Our client wants us to forecast CLTV for both customers and members.
  4. We have data from 2008 onwards.
  5. For customers we have demographic data, purchase data and website visitation data.
  6. For members we have only demographic and website visitation data.

Here is an approach I have in mind:

  1. Define CLTV as the net present value of expected revenue: CLTV = sum over future points in time of Probability(alive at that point in time) * Forecast revenue at that point in time, discounted at a rate of 15%.
  2. The first part is to forecast Probability(alive), i.e. the probability of a profitable lifetime to the company. I am thinking of a survival analysis (Cox proportional hazards model) using PROC PHREG in SAS.
    1. In order to implement PROC PHREG, we have defined churn as follows: a customer has churned if he does not make a purchase for 12 months after his last purchase.
    2. My modeling dataset would contain data from 2008 through Feb’2011.
    3. For example, suppose customer A registers on the website in Feb’2008, makes his 1st purchase in Apr’2008, his 2nd in Sep’2008 and his 3rd in Feb’2009, and never makes a purchase after that. For A, the dependent variable ‘t’ will be (last purchase date - first purchase date), in essence describing the time during which he was profitable to the company.
    4. Some customers have placed only 1 order. For example, suppose customer B registers in Jan’2009, makes a purchase in Feb’2009 and never makes another. For such customers ‘t’ will be set to a very small value, say 1 day. The problem is that for this type of customer I can’t define the “Average Order Gap” variable, which I believe will be an important independent variable.
    5. If a customer has not churned as of Feb’2011, he will be censored in my model.
    6. This is the overall rough approach I have in mind for the first part of the model. I have read many papers that use the Pareto/NBD model and modifications of it, but these are just based on a customer’s last purchase, and my client is particular about using a traditional survival approach.
  3. For the 2nd part of CLTV, forecasting future revenue, I was thinking of using an average value defined on the basis of customer cohorts. But again, my client wants us to try a model that predicts future revenue at each point in time, and I am somewhat lost as to how to accomplish this.
  4. One more major problem is how to give a CLTV value to a member who has never made a purchase. The above models can only be built on customers, who have purchase information, whereas members have only demographic and visitation information. I am thinking of a rough segmentation/look-alike approach that maps members to customer groups based on demographic and visitation data only.

I would appreciate it if members could share their suggestions on the above approach and also suggest alternatives. Thanks, Hari
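
To make the definitions above concrete, here is a minimal Python sketch (not SAS, and not a fitted model) of how the survival inputs and the final CLTV sum could be computed. The function names, the day-of-month values and the yearly time step are assumptions made for illustration. The post only defines ‘t’ for churned customers, so measuring a censored customer's duration up to the cutoff date is also an assumption.

```python
from datetime import date

CUTOFF = date(2011, 2, 28)  # end of the modeling window (Feb 2011); day of month is assumed
CHURN_DAYS = 365            # "12 months without a purchase" approximated as 365 days


def survival_record(purchase_dates, cutoff=CUTOFF, churn_days=CHURN_DAYS):
    """Return (duration_days, event) for one customer.

    event = 1 if the customer churned (no purchase for churn_days before the
    cutoff), 0 if censored. Duration is floored at 1 day so that
    single-purchase customers get the "very small value" from the post.
    """
    first, last = min(purchase_dates), max(purchase_dates)
    churned = (cutoff - last).days >= churn_days
    # Assumption: censored customers are observed up to the cutoff date.
    end = last if churned else cutoff
    duration = max((end - first).days, 1)
    return duration, int(churned)


def cltv(survival_probs, revenues, annual_rate=0.15):
    """Sum of P(alive at t) * forecast revenue at t, discounted at annual_rate.

    Period t is counted in years starting at 1 (an assumption; the post does
    not fix the time step).
    """
    return sum(
        p * r / (1 + annual_rate) ** t
        for t, (p, r) in enumerate(zip(survival_probs, revenues), start=1)
    )


# Customer A from the example: purchases in Apr 2008, Sep 2008 and Feb 2009.
customer_a = [date(2008, 4, 1), date(2008, 9, 1), date(2009, 2, 1)]
print(survival_record(customer_a))        # (306, 1): churned after 306 days
print(survival_record([date(2009, 2, 1)]))  # (1, 1): customer B, single order
```

The (duration, event) pairs are the kind of input a Cox model expects, and the survival probabilities fed to `cltv` would come from that fitted model rather than being typed in by hand.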


Replies to This Discussion

Good methodology. For new customers, I think you have no alternative other than to model 'look alike' customers. Revise as you get more history. For revenue, I would use the forecasted revenue at that point combined with a weighted average for each defined cohort, taking into account the sparsity of the data.
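
One possible reading of "a weighted average for each cohort taking into account the sparsity of the data" is to shrink each cohort's mean revenue toward the overall mean, with less weight on cohorts that have few observations. The sketch below is only an illustration of that idea. The function name and the shrinkage constant `k` are hypothetical, and the n / (n + k) weighting is an assumed scheme, not something specified in this thread.

```python
def shrunk_cohort_means(cohort_revenues, k=20):
    """Shrink each cohort's mean revenue toward the overall mean.

    cohort_revenues maps a cohort label to a list of observed revenues.
    A cohort with n observations gets weight n / (n + k) on its own mean,
    so sparse cohorts lean more heavily on the overall mean.
    k is a hypothetical tuning constant.
    """
    all_values = [v for values in cohort_revenues.values() for v in values]
    if not all_values:
        raise ValueError("no revenue observations supplied")
    overall = sum(all_values) / len(all_values)

    result = {}
    for cohort, values in cohort_revenues.items():
        n = len(values)
        weight = n / (n + k)
        cohort_mean = sum(values) / n if n else overall
        result[cohort] = weight * cohort_mean + (1 - weight) * overall
    return result


# A large cohort keeps most of its own mean; a small one is pulled harder.
example = {"2008-Q1": [100.0] * 60, "2010-Q4": [200.0] * 20}
print(shrunk_cohort_means(example))
```

In this example the overall mean is 125, so the 60-customer cohort moves from 100 to 106.25 while the 20-customer cohort moves from 200 to 162.5.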


I guess you wrote this some time back, but I am doing a project that is almost exactly the same as this one. It would be great if you could share the methodology you finally adopted, along with any suggestions or tips.

Hope to get a reply!!

Thanks in advance



On Data Science Central
