What Question Should I Ask? The Case Against Search for Big Data Analytics

When it comes to Big Data, one of the largest data sets in the world is the Internet. Thanks to Google, the Internet is essentially indexed as though it were a massive database. Consequently, the world’s 1.9 billion Internet users are conditioned to search, relying on Google or other search engines to find the answers they are looking for. The problem with the Google approach to Big Data is: What if you don’t know what question to ask?

That is the real challenge (and opportunity) of Big Data Analytics.

Every day I receive phone calls and emails from data analysts, IT departments, and others tasked with Big Data projects in organizations across sectors, all asking the same thing: What can we do? How can we extract gold from the mine that is Big Data? While the technological advances that have made it possible to collect, store and analyze Big Data are tremendous, organizations have hit a wall where the amount of data available far exceeds the human capacity to process it.

Most of these teams are querying their way through the data. It is important to understand that while the number of records is big, the real challenge is the breadth of the data, and as more data is aggregated the problem keeps growing. For example, take a database with 100 columns and 6 choices per column. If each column can either be set to one of its values or left out of the query, there are 7^100 (about 3 × 10^84) possible queries, more than the roughly 10^80 atoms commonly estimated for the observable universe. The sheer magnitude of potential queries exceeds the capability of mainstream data mining methods, making them mere data-shovels.
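The arithmetic behind that example can be checked in a few lines. This is only a sketch under stated assumptions: a "query" here means choosing, for each column, either one of its values or no filter at all, and the figure of 10^80 atoms is a commonly cited order-of-magnitude estimate rather than a number from this article. The function names are made up for illustration.

```python
# Table shape taken from the example in the text.
NUM_COLUMNS = 100
CHOICES_PER_COLUMN = 6

# Assumption: commonly cited order-of-magnitude estimate for the
# observable universe, not a figure given in the article.
ATOMS_IN_OBSERVABLE_UNIVERSE = 10**80


def count_exact_queries(columns: int, choices: int) -> int:
    """Count queries that fix every column to one of its values."""
    return choices ** columns


def count_filter_queries(columns: int, choices: int) -> int:
    """Count queries where each column is fixed to a value or left unconstrained."""
    return (choices + 1) ** columns


exact = count_exact_queries(NUM_COLUMNS, CHOICES_PER_COLUMN)
filters = count_filter_queries(NUM_COLUMNS, CHOICES_PER_COLUMN)

# The number of decimal digits minus one gives the order of magnitude.
print(f"fully specified queries: about 10^{len(str(exact)) - 1}")
print(f"queries with optional columns: about 10^{len(str(filters)) - 1}")
print("exceeds atom estimate:", filters > ATOMS_IN_OBSERVABLE_UNIVERSE)
```

Under these assumptions, the count clears the atom estimate only when columns may be left unconstrained. Either way, the number of queries grows exponentially with the number of columns, which is why breadth matters more than row count.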

Consider the case of network security. We spoke to an organization that uses query-based tools to detect network intrusions. But network attacks are continuous, and a query can only find a pattern someone already thought to look for, so the analysts cannot stay ahead of the curve: they cannot detect new intrusion patterns in real time.

What if they could?

When evaluating new approaches to Big Data Analytics, here are some key considerations:

  • Does this approach reduce the effect of noise in the data?
  • Does it speed up processing time?
  • Does it reduce storage requirements?
  • Does it automate the pursuit of needles in haystacks?
  • Does it discover unexpected connections across data sources?
  • Does it automatically generate questions?

Google may have made search easy for information seekers, but search and query-based tools simply cannot deliver automatic, mission-critical insights from an organization’s Big Data, and throwing more data scientists at the problem is cost-prohibitive and, well, manual. To truly leverage Big Data for competitive advantage, organizations need a new approach – one that automatically surfaces the information you need when you need it.

Read more about Big Data Analytics on the Emcien blog!

Comment by Tony Agresta on December 27, 2012 at 1:02pm

I would agree. The analytics world has evolved from the days when reporting and dashboards could tell the story to more sophisticated technology that combines pattern recognition with graph analysis. What a powerful combination. Data discovery methods that use network graphs and interactive analysis to see the data in different forms can be useful. Interacting with graphs to identify important network connections also helps, and there are many ways to manipulate network graphs to reveal insights. But combining these methods with pattern recognition that analyzes the data for you and then points the analyst to a specific location is the way to go. So is using those insights to quickly detect similar patterns in new data sets, at a time when big data is only getting bigger.

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC