By Gary Angel
My post on “Numbers it’s better NOT to know” got me thinking more closely about the relationship between a theory of error and the types of web analytic process organizations should adopt. That led to a more considered post “Defending the Indefensible” where I laid out some of the most common causes of error and talked a little bit about how these errors should influence our thinking about organization and process. Jacques Warren, whose comments certainly triggered some of this, responded with the question:
“How can one bring a company that far [using analytics appropriately], while fighting subjectivity, politics, and vested interests?”
Now I’m not really a process guy. I tend to be one of those people who just want to get their hands on the data and start churning out analysis. There are people, like our own Phil Kemelor of Semphonic and Eric Peterson of Web Analytics Demystified, who have thought longer and harder about appropriate analytics processes than I have or want to. But you can’t do measurement for 20+ years without seeing just about every kind of error that can be made – and getting a sense for the kinds of errors that happen over and over again.
And it seems to me that while good process consultants have always implicitly incorporated a certain amount of error theory (why analysis goes bad) into their thinking, it hasn’t always been first and foremost. Part of the reason is that error theory is just one piece of the process puzzle. As Jacques has pointed out several times in his comments, protecting people from error is not the first priority for the most common sort of organization: the one where nobody uses the data at all!
On the other hand, if your organization is using data, then building processes without careful attention to a theory of error is just asking for trouble.
So what would a theory of error about web analytics look like?
Here are a few basic causes of error that I think drive the vast majority of problems in web analytics:
1. Self-Interested Measurement. This problem is hardly unique to web analytics. It is the single most pervasive problem in any truth-seeking activity. Many of the institutional practices of scientists and academics are designed to protect against the force of self-influence. Though this is sometimes portrayed as venal, it is even more commonly encountered as simple self-delusion – we all have the strong desire to find that whatever we currently believe is true.
2. Lack of Statistical Significance. Statisticians are generally, widely and rightly considered to be a royal pain in the rear. They are like gatekeepers who are constantly slamming the door in your face – usually with a snide remark to go along. They are only necessary because the rest of us are constantly and helplessly fooling ourselves into believing that a pattern is real because it “looks” real. As a football fan, I hear this kind of stuff all the time from professional analysts – stuff like “In the last nine home games after an east to west trip, the home team has only covered the spread twice.” Ohhh - that’s significant! Except it’s not. Because the funny thing about randomness is that it almost never looks quite random. Flip a coin ten times and there’s roughly a three-in-four chance you won’t get exactly five heads and five tails. Just as detailed analysis of all these obscure variables (like east to west trips) turns up lots of opportunities for bad analysis, web analytics reporting will do the same. You are suddenly putting lots of information into everyone’s hands. If they aren’t protected from misusing it, I guarantee you that your company will soon be betting money on numbers that don’t mean a darn thing.
3. Unreliable data and what to do about it. Nothing can create a statistically significant finding faster than bad data. As every analyst knows, the first analysis pass is usually good for little more than identifying all of the interesting “facts” that turn out to be measurement artifacts. While my first two principles are completely common to every truth-seeking endeavor, number 3 is more pronounced in web analytics than in most disciplines. God knows that this isn’t because most disciplines have clean data to work with – ours is just unusually bad. The problem has been compounded by the prevalent and thoroughly misguided idea that “trending the data” somehow protects against data quality issues.
4. Siloed Optimization. Large organizations tend to create a special class of measurement issues by creating silos of measurement that focus on single issues like organic search optimization or multivariate testing. This inevitably leads to siloed optimization, where the incentive to optimize locally cannibalizes success in other parts of the organization. This is a shockingly common problem and it’s an unusual one because it tends to be worst in the most sophisticated companies.
5. Metric Monomania. We see a metric move and we know it’s supposed to be actionable. So we want to do something about it. But, as I’ve argued for years now, the movement in a single metric is pretty much NEVER actionable. It doesn’t matter whether it’s a KPI or even a really good KPI. In the real world, KPIs are nearly always interrelated into systems – meaning that changes in one variable are nearly always driven by changes in other variables. Unless you understand the system you don’t understand the true significance of any given change in a metric.
6. Tactical Focus. For most analysts, tactics come much easier than strategy. Analysis of data nearly always drives plenty of micro-changes that might make a web site better. But the best uses of data are often in completely unrelated problems and contexts that have nothing to do with immediate tactical problems. You can try forty different variations of a drive to registration, tweaking everything from button color to offer text. But registration rates will always be crappy if you don’t give your customers a really good reason to register. You can micro-analyze your data with powerful statistical tools, but the biggest learnings may require nothing more than looking at your overall traffic numbers.
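To put rough numbers on the significance point in item 2, here is a minimal Python sketch. It rests on assumptions that are not in the post: covering the spread is treated as a fair coin flip (p = 0.5), games are treated as independent, and the figure of 20 "obscure variables" is purely hypothetical. The function names are made up for illustration.

```python
from math import comb


def binom_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p=0.5):
    """Probability of k or fewer successes in n independent trials."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


def chance_of_false_positive(num_patterns, alpha=0.05):
    """Chance that at least one of num_patterns independent, purely random
    patterns looks 'significant' at level alpha."""
    return 1 - (1 - alpha) ** num_patterns


# Ten coin flips: how often do we NOT see an exact 5/5 split?
p_exact_five = binom_pmf(5, 10)
p_not_even_split = 1 - p_exact_five

# "Covered the spread only twice in nine games", assuming a fair coin.
p_two_or_fewer = binom_cdf(2, 9)
# Two-sided version; doubling is valid because the p = 0.5 case is symmetric.
p_two_sided = 2 * p_two_or_fewer

# Hypothetical: an analyst scans 20 unrelated variables for a pattern.
p_some_fluke = chance_of_false_positive(20)

print(f"P(not exactly 5 heads in 10 flips): {p_not_even_split:.3f}")
print(f"P(2 or fewer covers in 9 games):    {p_two_or_fewer:.3f}")
print(f"Two-sided p-value:                  {p_two_sided:.3f}")
print(f"P(at least one fluke in 20 looks):  {p_some_fluke:.3f}")
```

Under those assumptions, an uneven split in ten flips turns up about 75% of the time, a two-in-nine record happens by chance about 9% of the time (18% two-sided), and scanning 20 unrelated variables produces at least one "significant" fluke roughly 64% of the time. None of that proves the football stat is noise, but it shows why a pattern that "looks" real is not enough on its own.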
Full article at semphonic.blogs.com