A Data Science Central Community

A/B testing is widely used in online marketing, management of Internet ads, and other everyday web analytics. In general, people use it to look for "golden features" (metrics) that are vital for growth hacking. To validate an A/B test, statistical hypothesis tests such as the t-test are applied, and analysts look for any metric with a significant effect across conditions. If you find a metric with a significant difference between designs A and B of a click button, you'll be happy: such a metric provides a rule-based predictor for a KGI / KPI. For example, a landing page with button A increases conversion rate by 2%.
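The hypothesis test behind a typical A/B comparison can be sketched in a few lines. The original analysis is in R (using `t.test`); below is a minimal Python sketch of the Welch two-sample t-test, with made-up per-user metrics for conditions A and B, purely for illustration.

```python
# Minimal Welch two-sample t-test (unequal variances), in plain Python.
# The data below are hypothetical per-user metrics, not from the original post.
import math
from statistics import mean, variance

def welch_t_test(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(a), len(b)
    va, vb = variance(a), variance(b)   # sample variances (n - 1 denominator)
    se2 = va / na + vb / nb             # squared standard error of the mean difference
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, df

# Hypothetical session metrics under designs A and B
a = [2.1, 2.5, 1.9, 2.3, 2.8, 2.0]
b = [2.2, 2.4, 2.0, 2.1, 2.6, 2.2]
t, df = welch_t_test(a, b)
print(t, df)
```

The t statistic is then compared against the t distribution with `df` degrees of freedom to get a p-value; a small t (as here) means the metric shows no significant difference between conditions.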

But unfortunately you may encounter a much worse situation: no metric shows any significant difference between conditions at all. In that case, do you give up on finding golden features and their rule-based predictors?

Even then, you don't have to give up. Multivariate modeling, such as (generalized) linear models or machine learning classifiers, can build a good model to predict a KGI / KPI without any "golden features". In the latest post on my blog, I discuss exactly such a case: no individual feature shows a significant difference, yet multivariate modeling works.

It is a long article. Click here for details (data sets, R source code, and statistical tests and models, including L1-penalized logistic regression with the *glmnet* package and the Welch two-sample t-test, in R).

The result shows that univariate statistics, and the rule-based predictors derived from hypothesis tests on them, sometimes fail, while multivariate models such as (generalized) linear models or machine learning classifiers work well.
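To illustrate the multivariate side, here is a small sketch of L1-penalized (lasso) logistic regression, the model the post fits with R's *glmnet*. This Python version uses proximal gradient descent on synthetic data; the data, step size, and penalty strength are all illustrative assumptions, not the post's actual setup.

```python
# Sketch of L1-penalized logistic regression via proximal gradient descent.
# The original post uses R's glmnet; this is a from-scratch illustration only.
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(w, t):
    """Proximal operator of the L1 norm (shrinks small weights to exactly zero)."""
    if w > t:  return w - t
    if w < -t: return w + t
    return 0.0

def fit_l1_logistic(X, y, lam=0.05, lr=0.1, iters=2000):
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        gw, gb = [0.0] * p, 0.0
        # gradient of the average logistic loss
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(p):
                gw[j] += err * xi[j] / n
            gb += err / n
        # gradient step, then soft-thresholding for the L1 penalty
        w = [soft_threshold(wj - lr * gj, lr * lam) for wj, gj in zip(w, gw)]
        b -= lr * gb   # the intercept is conventionally left unpenalized
    return w, b

# Synthetic data: the label depends on x0 and x1 jointly; x2 is pure noise.
random.seed(0)
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(400)]
y = [1 if x[0] + x[1] + random.gauss(0, 0.5) > 0 else 0 for x in X]
w, b = fit_l1_logistic(X, y)
print([round(v, 2) for v in w])   # the noise feature's weight should shrink toward zero
```

The point of the L1 penalty is the same as in *glmnet* with `alpha = 1`: uninformative features are driven to zero, while the model still exploits the joint signal that no single feature carries on its own.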

In general, multi-dimensional, multivariate features capture more of a dataset's complexity and internal structure than univariate features do. Yet in many marketing settings, quite a few people neglect the importance of multivariate information and persist in running univariate A/B tests, hunting for "golden features" or metrics.

Univariate A/B testing can also mislead when features are partially correlated: a partial correlation easily distorts the ordinary univariate correlation (and therefore the univariate test) between a feature and the outcome.
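A quick synthetic example makes the danger concrete: two variables driven by a common third variable look strongly correlated on their own, but the partial correlation controlling for the confounder is near zero. All data here are made up for illustration.

```python
# Marginal vs. partial correlation under a confounder, in plain Python.
# x and y are both driven by z, so they correlate marginally; given z they don't.
import random

def pearson(u, v):
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv)

random.seed(1)
z = [random.gauss(0, 1) for _ in range(2000)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [zi + random.gauss(0, 0.5) for zi in z]

r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
# First-order partial correlation of x and y, controlling for z
partial = (r_xy - r_xz * r_yz) / ((1 - r_xz**2) ** 0.5 * (1 - r_yz**2) ** 0.5)
print(round(r_xy, 2), round(partial, 2))
```

A univariate test on x vs. y would flag a strong "effect" here, yet once z is accounted for the relationship evaporates; a multivariate model that includes z sees this, a univariate A/B test does not.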

If you have multivariate datasets, please try multivariate modeling rather than relying solely on univariate A/B testing.


