A Data Science Central Community
I'm re-evaluating the segments I have applied to my client population. Does anyone have any good resources for me to read about how I should start (or re-start in this case)?
My goal with these revamped segments is to monitor profitability, attrition, etc.
Are there some articles or books that would be a good starting point?
Here is a good one:
I fully support Jason's recommendation of the book. Another book to consider even if you do use SAS is "CRM Segmentation and Clustering Using SAS Enterprise Miner" by Randall S. Collica.
You said you are re-evaluating segments. Does that mean you have already built predictive models and now new data has arrived, so you need to update the models?
Segmentation. Technically, sorting people by hair color is segmentation.
Most of the best segmentation solutions contain multiple dimensions:
1. Value dimension (historical and predicted) - how much should I invest in each customer?
2. Customer status (prospect, new, ongoing, lapsed, lost) - where are they in their lifecycle?
3. Geodemographic/interests/attitudes - who are they and how should you talk to them?
4. Product ownership (single product, multi-product) - should I cross-sell/up-sell?
5. Short term likelihood to purchase (overall, and by product) - this can be used to measure attrition risk and attractiveness
6. Purchase patterns - average time between purchases for customers buying 2x or more.
Cross the dimensions and collapse into a manageable number of segments. Align marketing strategies against each group.
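To make "cross and collapse" concrete, here is a minimal sketch in Python that crosses just three of the dimensions above (value, customer status, product ownership) and collapses the combinations into six segments. The thresholds, field names, and segment labels are all made up for illustration; your own rules should come from your data and your marketing strategy.

```python
from collections import Counter

# Hypothetical cut-offs; real ones should come from your own value distribution.
def value_tier(predicted_value):
    if predicted_value >= 1000:
        return "high"
    if predicted_value >= 200:
        return "medium"
    return "low"

def assign_segment(customer):
    """Cross value tier, lifecycle status and product ownership,
    then collapse the combinations into a handful of named segments."""
    tier = value_tier(customer["predicted_value"])
    status = customer["status"]  # prospect, new, ongoing, lapsed, lost
    if status == "prospect":
        return "acquire"
    if status in ("lapsed", "lost"):
        return "win back" if tier == "high" else "let go"
    if tier == "high":
        return "protect"
    if customer["products_owned"] == 1:
        return "cross-sell"
    return "maintain"

# Made-up customers, one per rule.
customers = [
    {"id": 1, "predicted_value": 1500, "status": "ongoing", "products_owned": 3},
    {"id": 2, "predicted_value": 300, "status": "new", "products_owned": 1},
    {"id": 3, "predicted_value": 1200, "status": "lapsed", "products_owned": 2},
    {"id": 4, "predicted_value": 50, "status": "lost", "products_owned": 1},
    {"id": 5, "predicted_value": 0, "status": "prospect", "products_owned": 0},
    {"id": 6, "predicted_value": 400, "status": "ongoing", "products_owned": 2},
]

segments = {c["id"]: assign_segment(c) for c in customers}
segment_sizes = Counter(segments.values())
```

Because every rule is an explicit threshold, a customer only changes segment when one of the inputs crosses a cut-off, which makes each move easy to explain.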
The art is to set up rules that make sense and to track migration patterns over time. A customer should only migrate between segments if their status / demographic attributes have changed substantially and measurably.
Cluster analysis is one of my LEAST favorite methods since clusters are fuzzy (a cluster that profiling reveals to be "older" can have people of any age in it), and cluster solutions aren't necessarily stable (people at cluster boundaries can bounce between segments on repeated scorings, which is hard to defend).
If you consider clustering, make sure that you use multiple samples (5 or 8 or 10) to generate your cluster solutions and track the performance of the descriptor variables over each iteration. You'll end up generating a buttload of solutions, but you will eventually find the best attributes for clustering.
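A rough sketch of that repeated-sample check, in plain Python. It clusters several random subsamples with a bare-bones k-means and reports how far each cluster centre moves, per variable, from one sample to the next. The data, the 70% subsample size, and the deterministic starting points are my own simplifications for illustration; in practice you would use your stats package's clustering routine and standardize the inputs first.

```python
import random
import statistics

def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on tuples of floats; returns sorted centroids."""
    ordered = sorted(points)
    # Deterministic start: one point from the middle of each k-quantile slice.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            groups[dists.index(min(dists))].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(statistics.fmean(col) for col in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    return sorted(centroids)

def centroid_spread(data, k, n_samples=5, fraction=0.7, seed=42):
    """Cluster several random subsamples and measure centroid movement."""
    rng = random.Random(seed)
    size = max(k, round(len(data) * fraction))
    solutions = [kmeans(rng.sample(data, size), k) for _ in range(n_samples)]
    n_vars = len(data[0])
    # For cluster j and variable v: max minus min centroid value across samples.
    spread = [
        [max(s[j][v] for s in solutions) - min(s[j][v] for s in solutions)
         for v in range(n_vars)]
        for j in range(k)
    ]
    return solutions, spread

# Made-up (age, monthly_spend) data with two obvious groups.
data = [(25 + i, 20 + i) for i in range(10)] + [(60 + i, 70 + i) for i in range(10)]
solutions, spread = centroid_spread(data, k=2)
```

Variables whose centroid values barely move across samples are the ones worth keeping as clustering inputs. A large spread suggests the variable is not producing a stable solution.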
In ongoing scoring:
1. Track the distribution of the segmentation inputs - changes will change your cluster sizes over time - understand the main attributes used for segmentation and track them. If there are changes in your underlying data structure, then your segmentation might fall apart.
2. Understand that the marketing programs you put into place WILL impact your cluster distribution over time - what WAS high value (if you're doing things right) should be redefined if you've managed to increase the lifetime value of active customers through in-market action.
3. Track the migration of scored individuals over time. Segment migration or threatened migration are the places where marketing actions are most effective.
4. Choose who you lose. Attrition models may be accurate, but they don't differentiate between high-value customers whose purchasing patterns have started to slow and one-time buyers who aren't likely to come back. In addition to generating a risk model, also look at purchase patterns among repeat buyers and average time between purchases - as this time increases, you have real risk.
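Points 3 and 4 are easy to prototype. The sketch below counts segment-to-segment moves between two scoring runs and flags repeat buyers whose latest gap between purchases is much longer than their earlier average. The segment names, the dates, and the 2.0 cut-off are hypothetical and only there to show the mechanics.

```python
from collections import Counter
from datetime import date

def migration_counts(previous, current):
    """Count moves between segments for customers scored in both periods."""
    return Counter(
        (previous[cid], current[cid]) for cid in previous if cid in current
    )

def gap_ratio(purchase_dates):
    """Latest gap between purchases divided by the average of earlier gaps.

    Returns None for customers with fewer than three purchases, since they
    have no earlier gap to compare against.
    """
    dates = sorted(purchase_dates)
    gaps = [(b - a).days for a, b in zip(dates, dates[1:])]
    if len(gaps) < 2:
        return None
    baseline = sum(gaps[:-1]) / len(gaps[:-1])
    return gaps[-1] / baseline if baseline else None

# Made-up segment assignments from two scoring runs.
previous = {1: "protect", 2: "maintain", 3: "protect"}
current = {1: "protect", 2: "protect", 3: "maintain", 4: "acquire"}
moves = migration_counts(previous, current)

# Made-up purchase histories: steady, slowing, and a one-time buyer.
purchases = {
    1: [date(2023, 1, 1), date(2023, 1, 31), date(2023, 3, 2), date(2023, 4, 1)],
    3: [date(2023, 1, 1), date(2023, 1, 31), date(2023, 3, 2), date(2023, 5, 31)],
    5: [date(2023, 2, 14)],
}
ratios = {cid: gap_ratio(d) for cid, d in purchases.items()}

SLOWDOWN_THRESHOLD = 2.0  # hypothetical cut-off
slowing = [cid for cid, r in ratios.items() if r is not None and r >= SLOWDOWN_THRESHOLD]
```

Here customer 3 shows up both as a downward migration and as a slowing repeat buyer, which is the kind of customer worth acting on. The one-time buyer (customer 5) never enters the slowdown list.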
I'm not sure that I'd recommend ANY book out there right now. Although the guides that are out there can give you insight into the proper statistical method / approach to consider, they often fall apart when it comes to the ability to act on the segmentation in market or the practical considerations associated with ongoing scoring and use of a segmentation.