I was reading the article Gambling versus Probability: Predictive Analytics Requires Advanced... published by Thomas Rathburn in the B-Eyed network. Here is the section that I found to be filled with wrong information:
Traditional statistical analysis is often of limited value. It is not that these tools are somehow flawed. Rather, it is that they are overly simplistic and, in many cases inappropriate for the task of modeling human behavior.
Traditional statistical techniques are overly simplistic as they are suitable for only the most basic support of our decision making. They typically assume that the interactions in our decision variables are independent of each other, when, in fact, we are bombarded with multiple inputs that are highly interrelated.
Additionally, these simple modeling techniques generally attempt to build linear relationships between the inputs and the desired output. It is often the case that the basic recognition of the non-linear aspects of a solution space will generate improved decision making.
Traditional statistical analysis is often an inappropriate choice because we are attempting to model human behavior. Human behavior is typically not normally distributed, it rarely has a stable mean and standard deviation and it never has inputs into a model that cause a particular type of behavior – conditions that are necessary for the correct application of traditional statistical tools.
My rebuttal:
- Most statistical models DO NOT assume a normal distribution. None of my models relies on normality; they deal with multimodal or highly skewed distributions (e.g. in the context of fraud detection).
- Most modern models do not assume that decision variables are independent. See, for example, my hidden decision tree technology, which handles interactions, as well as the many other models that include interaction terms.
- Models with linear relationships are just a very small subset of all models. Hierarchical Bayesian models and stochastic processes are examples of nonlinear models.
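
To make the interaction point concrete, here is a minimal sketch on toy data. The data set and the helper names `ols` and `r_squared` are hypothetical, chosen for illustration only; this is not the hidden decision tree technique mentioned above. The response depends only on the product of two inputs, so a purely additive fit explains nothing, while the same least-squares machinery with one interaction column fits the data exactly.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


def r_squared(X, y, b):
    """Share of the variance in y explained by the fitted model."""
    pred = [sum(bi * xi for bi, xi in zip(b, row)) for row in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


# Toy data (assumed for illustration): y is the product of the two inputs.
points = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
y = [x1 * x2 for x1, x2 in points]

# Additive model: intercept + x1 + x2.
X_add = [[1, x1, x2] for x1, x2 in points]
# Same model plus one interaction column x1 * x2.
X_int = [[1, x1, x2, x1 * x2] for x1, x2 in points]

b_add = ols(X_add, y)
b_int = ols(X_int, y)
r2_add = r_squared(X_add, y, b_add)
r2_int = r_squared(X_int, y, b_int)

print("additive    coefficients:", b_add, "R^2:", r2_add)
print("interaction coefficients:", b_int, "R^2:", r2_int)
```

The estimation method is identical in both fits; only the design matrix changes. That is the sense in which ordinary statistical models can accommodate interrelated inputs without any independence assumption on them.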
The author seems to believe that statistics is just about linear regression and basic tests of hypotheses. That is what you study during the first 30 hours of any basic statistics curriculum, but there is much, much more to the field than that.