A Data Science Central Community
Here's my rebuttal to this article published in Wired in 2008.
A lot can be done with black-box pattern detection, where patterns are found but not understood. For many applications (e.g. high frequency trading) it's fine as long as your algorithm works. But in other contexts (e.g. root cause analysis for cancer eradication), deeper investigation is needed for higher success. And in all contexts, identifying and weighting true factors that explain the cause, usually allows for better forecasts, especially if good model selection, model fitting and cross-validation is performed. But if advanced modeling requires paying a high salary to a statistician for 12 months, maybe the ROI becomes negative and black-box brute force performs better, ROI-wise. In both cases, whether caring about cause or not, it is still science. Indeed it is actually data science - and it includes an analysis to figure out when/whether deeper statistical science is really required. And ill all cases, it always involves cross-validation and design of experiment. Only the statistical theoretical modeling aspect can be ignored. Other aspects, such as scalability and speed, must be considered, and this is science too: data and computer science.