A Data Science Central Community
We all know that time is money, especially when you're paying a data scientist. But the New York Times reports that...
"Data scientists, according to interviews and expert estimates, spend 50 percent to 80 percent of their time mired in [the] mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets."
- Steve Lohr, NYT
Much of the value in data comes from combining data sets, but each source tends to arrive in its own format. According to a co-founder of Trifacta, even the most powerful algorithms can't derive insights from raw data. As a result, data scientists are forced to act more like data janitors than actual scientists. This unification process, commonly referred to as "data wrangling", takes up a shockingly large part of a data scientist's daily work.
“It’s something that is not appreciated by data civilians. At times, it feels like everything we do.”
- Monica Rogati, VP for Data Science at Jawbone
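To give a feel for what that wrangling looks like in practice, here is a minimal Python sketch. Everything in it is made up for illustration: the two sources, their field names, and the date format are assumptions, not data from the article. It only shows the kind of cleanup needed before two small data sets can even be joined.

```python
import csv
import io
import json

# Hypothetical source 1: a CSV export with US-style dates and mixed-case country codes.
CSV_EXPORT = """customer_id,signup_date,country
1,03/15/2014,us
2,07/01/2014,DE
"""

# Hypothetical source 2: a JSON feed that names the key differently and
# stores every value as a string.
JSON_FEED = '[{"id": "2", "spend": "19.50"}, {"id": "1", "spend": "7"}]'


def load_customers(text):
    """Parse the CSV export, normalising ids, dates, and country codes."""
    customers = {}
    for row in csv.DictReader(io.StringIO(text)):
        month, day, year = row["signup_date"].split("/")
        customers[int(row["customer_id"])] = {
            "signup_date": f"{year}-{month}-{day}",  # rewrite as ISO 8601
            "country": row["country"].upper(),
        }
    return customers


def load_spend(text):
    """Parse the JSON feed into a mapping of customer id to spend."""
    return {int(item["id"]): float(item["spend"]) for item in json.loads(text)}


def combine(customers, spend):
    """Join the two cleaned sources on customer id."""
    return [
        {"customer_id": cid, **info, "spend": spend.get(cid)}
        for cid, info in sorted(customers.items())
    ]


unified = combine(load_customers(CSV_EXPORT), load_spend(JSON_FEED))
for record in unified:
    print(record)
```

Almost none of these lines analyse anything. They rename keys, convert types, and reformat dates, which is exactly the janitorial work described above.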
The time spent on wrangling is a major issue for the industry: by the estimates above, at least half of a data scientist's working time goes into preparing data rather than analysing it. If Big Data is ever going to deliver on its promise of smarter, data-driven decision-making in every field, there has to be a better, faster way of getting process-ready data.