One of the most perplexing data mining dilemmas has to do with building tools that effectively rank order net lift. Essentially, marketers will promote to one group but also keep a holdout group that receives no marketing treatment. The modelling dilemma is that, because models are typically built against the marketed group only, the net lift (the difference in response rate between the marketed and non-marketed groups) ends up the same across all the deciles. There is software offered by some vendors that purports to solve this dilemma, but I'd be interested in hearing thoughts and ideas from the user community that can actually be used by the data miner, as opposed to simply buying expensive software.
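To make the decile problem concrete, here is a minimal sketch of how net lift per decile might be tabulated once both the marketed group and the holdout group have been scored with the same model. The `Customer` record and the `net_lift_by_decile` function are hypothetical names chosen for illustration, not part of any particular tool.

```python
from typing import List, NamedTuple, Optional


class Customer(NamedTuple):
    score: float      # model score (hypothetical field name)
    treated: bool     # True if marketed to, False if in the holdout group
    responded: bool   # True if the customer responded


def net_lift_by_decile(customers: List[Customer],
                       n_bins: int = 10) -> List[Optional[float]]:
    """Return treated-minus-holdout response rate for each score bin.

    Bins are formed by ranking everyone (treated and holdout together)
    on score, highest first. A bin missing either group yields None.
    """
    ranked = sorted(customers, key=lambda c: c.score, reverse=True)
    n = len(ranked)
    lifts: List[Optional[float]] = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        treated = [c for c in chunk if c.treated]
        holdout = [c for c in chunk if not c.treated]
        if not treated or not holdout:
            lifts.append(None)
            continue
        rate_treated = sum(c.responded for c in treated) / len(treated)
        rate_holdout = sum(c.responded for c in holdout) / len(holdout)
        lifts.append(rate_treated - rate_holdout)
    return lifts
```

If the values returned are roughly flat from the top bin to the bottom bin, the model is ranking on response but not on net lift, which is the situation described above.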


Richard Boire




Comment by Steffen Springer on October 27, 2009 at 3:18pm
Here is a paper which helped me to get a better feeling for the challenges in this area:

Note: I studied this paper a year ago and have not done any research since then. There may be some "fresher" work available.
Comment by Richard Boire on October 26, 2009 at 1:22pm
Thanks Patrick for the info and I am looking at the various links that you have given me.
Comment by Patrick Surry on October 26, 2009 at 12:37pm
Oh, forgot to mention: there were a couple of good talks about uplift at the recent Predictive Analytics World event, including Eric Siegel's keynote, Andrew Pole's "Challenges of Incremental Sales Modeling in Direct Marketing", and Mike Grundhoefer's case study from US Bank.
Comment by Patrick Surry on October 26, 2009 at 12:34pm
[Disclaimer: I work for one of those vendors, Portrait Software, offering uplift solutions]

We've been working on uplift (aka net lift, incremental response, differential response) modeling for many years now. From my perspective, the most interesting feature of the problem is that you can't directly measure the incremental response of an individual. In a traditional response model you can make a prediction for probability of response, then measure the actual response (yes or no), and finally calculate an error for that individual. But in the net lift scenario you make a prediction of "change in response rate if targeted": since you can't both target and not target a single individual, there's no concept of a pointwise error estimate. This makes it non-trivial to even define the quality of an uplift model (and to compare two models) - there's no easy analogue to R^2 for example.
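Since no pointwise error exists, model quality has to be judged on groups of customers. One commonly used group-level approach, sketched here as an illustration and not as a description of any vendor's method, is to walk down the ranked list and track cumulative incremental responders: treated responders minus holdout responders scaled up to the treated count. The function name and the `(score, treated, responded)` tuple layout are assumptions made for this example.

```python
from typing import List, Tuple


def cumulative_incremental_responses(
        customers: List[Tuple[float, bool, bool]]) -> List[float]:
    """customers: (score, treated, responded) tuples.

    Returns, for each position in the score ranking (highest first),
    treated responders so far minus holdout responders so far,
    with the holdout count rescaled to the treated group size.
    """
    ranked = sorted(customers, key=lambda c: c[0], reverse=True)
    n_treated = n_holdout = 0
    resp_treated = resp_holdout = 0
    curve: List[float] = []
    for _score, treated, responded in ranked:
        if treated:
            n_treated += 1
            resp_treated += int(responded)
        else:
            n_holdout += 1
            resp_holdout += int(responded)
        # Rescale holdout responders so the two groups are comparable;
        # until a holdout customer has been seen there is nothing to subtract.
        scaled_holdout = resp_holdout * n_treated / n_holdout if n_holdout else 0.0
        curve.append(resp_treated - scaled_holdout)
    return curve
```

Two models can then be compared by scoring the same customers with each and seeing which curve rises faster near the top of the ranking. This gives a group-level comparison only; it still says nothing about the error for any one individual.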

We've seen lots of approaches to building net lift models, but most are indirect: they estimate the response rate for targeted and untargeted customers separately, then take the difference between the predicted response rate if targeted and if not. These models don't seem to work very well in practice. My intuition for why not is that they rely on the error terms from the two response models behaving nicely, which isn't typically the case. Response rates are commonly ten times the incremental response rate, so the lift prediction gets completely washed away in the noise of the response predictions. We've been most successful with a direct modeling technique based on recursive segmentation.
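The indirect approach and its noise problem can be sketched in a few lines. This is an illustration only: the "model" here is just the response rate per segment, standing in for whatever response model would really be used, and the function names and the figures in the usage note are made up for the example. The standard error formula is the usual one for a difference of two independent proportions, so it captures sampling noise only, not every source of model error.

```python
from collections import defaultdict
from math import sqrt
from typing import Dict, Iterable, Tuple


def fit_rate_model(rows: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """rows: (segment, responded) pairs. Returns response rate per segment."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [responders, total]
    for segment, responded in rows:
        counts[segment][0] += int(responded)
        counts[segment][1] += 1
    return {seg: resp / total for seg, (resp, total) in counts.items()}


def two_model_uplift(treated_rows: Iterable[Tuple[str, bool]],
                     holdout_rows: Iterable[Tuple[str, bool]]) -> Dict[str, float]:
    """Fit one rate model per group and subtract the predictions."""
    model_treated = fit_rate_model(treated_rows)
    model_holdout = fit_rate_model(holdout_rows)
    return {seg: model_treated[seg] - model_holdout[seg]
            for seg in model_treated if seg in model_holdout}


def uplift_standard_error(p_treated: float, n_treated: int,
                          p_holdout: float, n_holdout: int) -> float:
    """Standard error of a difference of two independent proportions."""
    return sqrt(p_treated * (1 - p_treated) / n_treated
                + p_holdout * (1 - p_holdout) / n_holdout)
```

As a made-up example, take a 5% response rate among 10,000 treated customers and 4.5% among 10,000 holdouts, so the response rate is ten times the 0.5 point lift. `uplift_standard_error(0.05, 10000, 0.045, 10000)` comes to about 0.003, which puts the lift at only about 1.7 standard errors from zero even with 20,000 customers in a single segment. Split the same customers into deciles and the noise per decile grows further.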

There is some good discussion of uplift issues at and of course you can find out more about our solutions at
