A Data Science Central Community
I've recently come across Attribution Modeling and it seems to be the talk of the town in many areas. I tried googling it and figured out the basic skeleton of what it is. But what I'd like to know is how it differs from the Marketing Mix Modeling (MMM) I'm used to.
MMM can be viewed as one type of attribution modeling. It started from, and focuses more on, traditional (offline) media channels, where no granular user-level data is available (this could change in the future). It can tell you, at a high level, what portion of your total sales was driven by each media channel. It normally requires a very long history of data to gather enough information for modeling.
The "new" Attribution Modeling started more from digital media and focuses on leveraging user-level (cookie-level) data to figure out the conversion credit deserved by different media channels/campaigns/placements/keywords (groups)/etc. At the user level, one can construct the exact sequence of touch points leading up to a conversion. By analyzing all such sequences, one can derive the fractional conversion credit for each touch point in the conversion sequence. It has gotten popular recently as more and more (digital) advertisers and agencies realize that the previous de facto last-click attribution method (used by all ad servers) cannot accurately capture the contribution of different media channels/campaigns/placements, especially for upper-funnel channels such as regular display. The display channel can help drive users down the conversion funnel to convert through the search channel, but search would normally take most of the credit under the last-click method.
MMM is needed when one does not have user-level data. For example, in order to consider digital channels and offline TV media together, we have to join the data at some aggregate level and use an MMM-type approach. Traditionally another main use of MMM (sometimes also called MMO, marketing mix optimization) is for forecasting future sales.
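The aggregate-level approach described above can be made concrete with a toy regression. Everything below is synthetic: the channel names, spend ranges and effect sizes are invented for illustration, and a real MMM would add adstock, saturation and seasonality terms on top of a much longer history.

```python
import numpy as np

# Illustrative weekly aggregates (52 weeks): spend per channel and total sales.
# All numbers are made up; a real MMM would use years of actual history.
rng = np.random.default_rng(0)
weeks = 52
tv = rng.uniform(50, 150, weeks)     # weekly TV spend ($k)
radio = rng.uniform(10, 40, weeks)   # weekly radio spend ($k)
base = 200                           # baseline sales with no media
sales = base + 2.0 * tv + 3.5 * radio + rng.normal(0, 5, weeks)

# MMM at its simplest: regress sales on channel spend to split total sales
# into a baseline plus a contribution per channel.
X = np.column_stack([np.ones(weeks), tv, radio])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
intercept, beta_tv, beta_radio = coef
print(f"baseline~{intercept:.0f}, TV effect~{beta_tv:.2f}, radio effect~{beta_radio:.2f}")
```

The per-channel coefficients, multiplied by spend, are what get reported as each channel's contribution to sales.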
User-level Attribution Modeling is preferred when you have user-level data as models based on user-level sequences can be more accurate and more causal.
I think Shi Zhong has given a very good account of each. The only thing I would add is that I would consider MMM a tool for measuring 'push' marketing (such as TV or radio ads). The models work far, far better when they are measuring marketing tied to an event (e.g. a scheduled release) from which customer reactions are spread over a relatively short period of time. Attribution modelling is more effective on 'pull' marketing, where the event is triggered by the customer (e.g. clicking on a banner or PPC ad). MMM is often sub-optimal at measuring marketing like this that is not connected to a pre-designated time period.
However, attribution modelling is not without issues. I have yet to see a completely convincing approach to how credit is fractionally apportioned. As Shi Zhong notes, last-click has often been shown to be inaccurate, but most other methods are also flawed. First-click over-favours marketing at the top of the funnel; time-decay methods overly favour lower-funnel marketing; and custom models have too much subjectivity in them and are often fudged based on the analyst's preferences or particular goals.
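For concreteness, here is a minimal sketch of how those rule-based schemes apportion credit on a single made-up conversion path. The half-life in the time-decay rule is an arbitrary choice, which is exactly the subjectivity problem described above.

```python
# Compare common rule-based attribution schemes on one conversion path.
# The path and the half-life are invented for illustration.
path = ["display", "display", "search"]  # touchpoints in time order

def last_click(path):
    return {path[-1]: 1.0}

def first_click(path):
    return {path[0]: 1.0}

def linear(path):
    credit = {}
    for ch in path:
        credit[ch] = credit.get(ch, 0.0) + 1.0 / len(path)
    return credit

def time_decay(path, half_life=1.0):
    # Weight each touchpoint by 2^(-steps_before_conversion / half_life).
    weights = [2 ** (-(len(path) - 1 - i) / half_life) for i in range(len(path))]
    total = sum(weights)
    credit = {}
    for ch, w in zip(path, weights):
        credit[ch] = credit.get(ch, 0.0) + w / total
    return credit

for rule in (last_click, first_click, linear, time_decay):
    print(rule.__name__, rule(path))
```

The same path gets four very different answers, which is why the choice of rule matters so much.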
Secondly, attribution models do not necessarily measure users - they measure cookies. If a customer switches between devices (e.g. laptop, desktop, smartphone, tablet) then they are counted as a different user each time. Again, this means the models can miss key touchpoints.
Both approaches are very valuable tools for measuring marketing impact, but both have flaws. I would consider them methods for creating insight rather than the "whole empirical truth".
Thanks Shi Zhong and Michael for your explanations.
At a very high level, I figured that this was primarily for user-level digital media. I've never worked at that level, so I'm guessing that's one reason I'm not able to visualize it.
I did check out the various attribution model theories, like last click, first click, decay etc. In the end they're just heuristics - they're not something deduced from the data. So, here are some more specific questions around the whole thing -
1. Can we arrive at the attribution through data? For example, suppose we knew display -> click -> direct happens 50% of the time, search -> click -> direct happens 30%, and the remaining 20% is display -> search -> click -> direct. Why can't we regress on those patterns to arrive at a possible attribution percentage, rather than assuming a linear, exponential or whatever theory?
2. Why is attribution so much more important in the digital world than in the 'push' marketing world? It happens so often that we hear something on the radio, then see a TV ad, and finally an outdoor ad pushes us into the store! I understand that the user has no control, unlike in digital, but there IS interaction between media. Yet the traditional marketing mix employs good old regression to delineate the effects!
This is exactly how I would like to work on the attribution model too: without pre-specifying the percentage contributions of different models, and instead letting the regression tell us the most probable percentage distribution! What am I missing here?
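For what it's worth, the idea of deriving credit from observed path frequencies rather than assuming a rule does exist; one simple version is the "removal effect" used in Markov-chain attribution. A minimal sketch, using the hypothetical path mix from question 1 (the counts are illustrative, not real data):

```python
# A minimal "removal effect" calculation: derive each channel's share of
# credit from observed converting paths instead of assuming a rule.
# The path counts below are the hypothetical mix from the question.
converting_paths = {
    ("display",): 50,            # display -> click -> conversion
    ("search",): 30,             # search -> click -> conversion
    ("display", "search"): 20,   # display -> search -> click -> conversion
}
total = sum(converting_paths.values())
channels = {ch for path in converting_paths for ch in path}

# Removal effect: what fraction of conversions would be lost if the channel
# (and every path that passes through it) disappeared?
removal = {
    ch: (total - sum(n for p, n in converting_paths.items() if ch not in p)) / total
    for ch in channels
}
# Normalize removal effects into fractional credit.
credit = {ch: r / sum(removal.values()) for ch, r in removal.items()}
print(credit)
```

Under this mix, display ends up with more credit than last-click would ever give it, because removing it would also break the display -> search paths.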
3. Finally, at the aggregated level, all of the digital, display and search data are used alongside TV, Radio, OOH etc. as media-channel variables in our MMM models! Which simply means attribution models are not here to replace MMM. Instead, I did read someplace about attribution models helping marketing & advertising folks with "planning": once an attribution model is chosen, the strategy for communicating with the consumer is planned around it. For example, in display -> search -> click -> conversion, display aims to convert as many bystanders as possible into site visitors, so the primary function of display would be to attract and redirect!
Is this true? If it is, I can understand this to be more from the planner's point of view rather than modeling point of view!
Thanks again for your views! Much appreciated.
1. Yes, there is work in this direction, but I think the problem with much digital data is its diversity. Unlike "push", where channels are finite and access is limited (e.g. if your TV isn't switched on you don't see the advert), the "always on" culture of the internet means a myriad of possible paths. There is also a psychological aspect: people online have lower attention spans and a propensity to click around a lot more.
2. Partly answered above, but I would also add that I'm not completely convinced MMM does a great job of attributing 100% effectively in the radio > TV > outdoor example. The problem is it's very hard! If you were that customer, I doubt you could say accurately which of these had what effect on your purchase decision, or whether one or more needn't have been there. So the model may seem to spit out acceptable numbers, but in reality there isn't any real way to validate them.
To a certain extent this can be countered by experimentation. So in New York you run just radio & TV; in LA radio & posters; and so on. However, as above, online means that it isn't as easy to put up these barriers as the people in New York are using the same Google search engine and the same website as the folks in LA.
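The readout of such a geo test is usually a difference-in-differences. A toy sketch with invented figures (real tests need carefully matched markets and longer pre-periods):

```python
# A toy geo-test readout: two matched markets, one gets the extra channel.
# All figures are invented for illustration.
pre = {"NY": 1000, "LA": 1000}    # weekly sales before the test
post = {"NY": 1150, "LA": 1050}   # NY ran TV on top of radio; LA did not

# Difference-in-differences: NY's change minus LA's change isolates the
# incremental effect of TV, under the usual parallel-trends assumption.
tv_lift = (post["NY"] - pre["NY"]) - (post["LA"] - pre["LA"])
print(f"estimated incremental weekly sales from TV: {tv_lift}")
```

Subtracting LA's change nets out whatever moved both markets (seasonality, the shared radio campaign), which is exactly the control the online world makes hard to set up.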
3. No, I agree attribution models will never replace MMM. There are many "digital automation" tools out there which use attribution data to plan media choice, bid amount, etc. I expect these could be quite successful and improve the planning of ads, but, IMHO, they are just slightly better guesses than something far more random. Whether they are truly "optimisation" is another matter.
There are two other areas/approaches you may be interested in. As companies increasingly seek to create multiplatform campaigns (e.g. TV and Youtube ads, microsites, etc.) around specific themes, an approach can be to combine these within an SEM model (or combined SEM/MMM) where the overall campaign is the latent variable.
The other (digital only) is combining attribution modelling with behavioural targeting. In other words, we can counter some of the issues of the diversity of digital media by segmenting our users based on their onsite behaviour - i.e. if they like this blog post they'll probably like our Facebook page; if they like our deals page then they'll probably like our deals newsletter; and so on. With these segments in place, the sort of attribution percentages you discuss are much more feasible.
The next step for me is to then combine qualitative data with the models so that we can understand what these segments of customers want from each channel: are they on the Facebook page because they want to chat about the product, or because they want to see how we deal with complaints? This, for me, is the difference between delivering base-level automation and delivering highly personalised "journeys".
Thanks for your reply. I'm still not convinced on some things.
1. It's exactly because there are a million possible permutations and combinations of these channel patterns that I feel the attribution model doesn't make sense at all. For example, consider exponential attribution for every sequence. In one sequence, display occurs right at the beginning of the exposure cycle, while in another it occurs near the conversion - it ends up getting very different weights, despite the fact that display, as a medium, could have a strong conversion percentage overall!
2. In the case of MMM, we generally get the TOTAL ROI/effect of each channel irrespective of halo effects. We validate by how well the model fits our data after adstock transformations! Now, every model is only as good as the modeler/hypothesis, so, to your point about there being no way to prove it: it's the best we can get. But as long as we can be sure it isn't violating some assumptions, we can derive directional insights!
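For readers unfamiliar with the adstock transformation mentioned here, a minimal geometric-decay sketch (the 0.5 carryover rate is illustrative; in practice it is fitted per channel):

```python
# Geometric adstock: each period carries over a fraction of last period's
# accumulated exposure, so a media burst decays rather than vanishing.
def adstock(spend, carryover=0.5):
    out, stock = [], 0.0
    for s in spend:
        stock = s + carryover * stock
        out.append(stock)
    return out

# A one-week burst of 100 keeps influencing sales in later weeks.
print(adstock([100, 0, 0, 0]))  # [100.0, 50.0, 25.0, 12.5]
```

The regression is then run on the adstocked series instead of raw spend, which is what lets MMM capture delayed media effects.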
So, whether it's Radio -> TV -> Sales or TV -> OOH -> Sales, we'll get ROIs for all three media! And isn't that all that matters?! Why would it matter to me whether TV drove Radio or vice versa? They all drove sales. Period.
Like you mention, as a customer if I can't tell what influenced me more, are we trying to do the impossible by modeling the unknown/non-existent?
3. I just don't get where attribution models help?! MMM gives the ROIs for each medium - isn't that enough for media planning? Do I really need to know whether Display drove Search, which drove my website clicks? Who asks for this?
I've never had the chance to do an SEM so far, but yes, if complicated is what they want, I believe this gets better. I prefer to keep it simple with second/third stage regressions, with their baggage of pros and cons.
I'm just so stuck on why people are interested in looking at people's journeys! And how it helps with the whole modeling/decision-making thing!
Sorry if I'm asking too many stupid questions, I'm just not yet clear on what the deal is with it!!
Good questions Arun. Some of my personal opinions:
Hi Shi Zhong,
1. Do you know any paper/method for doing this? Well, to answer your question of how we can validate - there are essentially two options: one, we accept these as data-specific results, OR two, we hold out a period to test whether the fractional conversion credit holds true. It's definitely tough, but I'm guessing that's OK. I can tell you that in MMM we sometimes don't even hold out validation samples; it's just data-specific insights! That's what's so screwed up sometimes - what you're saying may again simply not be true.
2. Why are people paranoid about journey tracking? Especially when they're willing to give up the complexity just to get any information on it? For example, since there are millions of permutations, we simplify attribution to merely linear, exponential or last click! What is that worth?
I see what you're talking about effects being difficult to capture in MMM, but I'm trying to see why it needs to be done in the first place.
It seems to me that this whole attribution model aims at getting to the 'true' effect of a medium, especially since we have such granular data for digital. For example, if I didn't care about the journey, I'd get, say, $4.3 ROI for Search and $2.3 ROI for Display, while in reality 50% of Search was due to Display. But in any case, we can never confirm whether the values we arrive at are the true ones, right?
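The reallocation in that example can be worked through with assumed spend figures. The $100 spends below are invented purely to make the arithmetic concrete; only the two ROIs come from the post.

```python
# Assumed spends (not in the post): say both channels spent $100.
# Last-click ROIs of 4.3 and 2.3 then imply attributed revenue of
# $430 (search) and $230 (display).
spend = {"search": 100.0, "display": 100.0}
revenue = {"search": 430.0, "display": 230.0}

# If 50% of search-attributed conversions were actually driven by display,
# move that revenue across and recompute ROI per channel.
moved = 0.5 * revenue["search"]
revenue["search"] -= moved
revenue["display"] += moved
roi = {ch: revenue[ch] / spend[ch] for ch in spend}
print(roi)  # search drops to 2.15, display rises to 4.45
```

The point of attribution is exactly this flip: under these assumptions, display goes from the weakest-looking channel to the strongest, which would change where the next budget dollar goes.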
3. I like the way you've defined them. Let me give it some thought and a little more research. It's out there in the market, and people use them. There's obviously a lot of demand and hype about it, so I guess I'm just missing the fun.
Anyway, thanks for your replies. If you do have any links/docs that might help, please feel free to forward. Especially with the implementation part of it. Thanks!
Hey guys, this is a great discussion on a topic I haven't seen discussed too often. First off, the big difference I have seen between MMM and Response Attribution is the granularity of the data you have to work with. I recently implemented an attribution model for a client that uses up to 10 channels (both online and offline) for marketing. We used a model-based approach to measure the influence each of the touch points had on a customer throughout their buying "journey". Our approach handles a lot of the points you raise, including consolidating customers across devices and creating a very accurate timeline of touch points leading up to purchase, and we also let the data do the talking without introducing business rules.
In my opinion, Response Attribution is better than a fractional allocation model. A company can have a certain fractional mix in its advertising, but that truly doesn't give you insight into how effective some campaigns really are. Are your millions of emails just as effective as the small amount of social media advertising you do? Just because you do a ton of advertising in a channel doesn't mean it deserves all that credit for driving sales or conversions. I have seen that a model-based approach captures this and handles this situation nicely when compared to the real fractional mix of the channels. We can take this a step further and do it at the channel level, campaign level, business level, etc.
A nice addition to this story is that you can really drill down into the customer journey and see what certain segments of people are doing before they purchase, and also see which channels are working together. I have fully automated my process, and if you would like to read more, I have written an e-book about it. You can check it out on my LinkedIn page or reach out to me directly and I will send it to you. Otherwise, I would like to hear your thoughts!
GREAT!! I'm really glad I caught your attention with this post! Can I get some help with the implementation piece? I'd love to grab your e-book; let me shoot you a mail real quick.
Frankly, what you call Response Attribution is what I see as good old MMM. It's just going to throw out the effectiveness of each medium, right? You're adding a new dimension, which is touch points! Anyway, I'd love to see an example. Reaching out for the same, and I'll post here any explanation I get from anywhere. Thanks!
What is the ability to map users across multiple devices worth in terms of this approach? Why can't you measure, statistically, what impact advertising has on product sales? Does this change if you have the ability to map the sales back to who has seen which type of advertising (may not apply except for online)?
Here is the link to the e-book I wrote if you have not found it yet:
Let me know what you guys think!
I read your e-book. I appreciate your analysis of how some of the existing attribution models come up short. Your e-book talks about creating an approach that considers all channels and potential touch points along the consumer's path. The end result is a more accurate view of the causative impact of the various channels on the end sale.
My question is how you understand the ROI of a specific marketing campaign. If I'm running 5 different marketing campaigns within the same channel, how does Intelligent Response Attribution provide insights into performance at the campaign level?