
Machine data is the fastest growing, most pervasive part of “big data”. It is also the most valuable, containing a critical record of user behavior, security risks, service levels, fraudulent activity, customer experience and more.

But machine data remains largely untapped. Why? Because this data is difficult to process and analyze in a timely manner using traditional approaches. In fact, 70 percent of organizations spend more time collecting and preparing their data than they do analyzing it.

Can you search, browse and analyze your real-time and historical data from one place?
Are you spending more time collecting and preparing your data than analyzing it?
Can you analyze your events as they occur and ask new questions in the moment?

Ventana Report

Free Ventana Research report on strategies and approaches to use, analyze and derive real-time insights from machine data.

Splunk

Learn why over 3,700 enterprise customers and over half of the Fortune 100 use Splunk to make sense of their machine data.


Replies to This Discussion

Machine data is a separate analytics discipline because the focus is on 'data in motion', not 'data at rest'.

Smart grids, smart sensors, smart switches and ad hoc networks (military, space and commercial) continuously stream data in a push model to federated or standalone data-acquisition systems, which act as both filters and funnels in front of an analytics solution. (See the uploaded document Cloud Service Smart Grid Solution.pdf.)
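The filter-and-funnel pattern described above can be sketched in miniature. This is an illustrative toy, not any particular acquisition system: the sensor fields, thresholds and statistics are all assumptions.

```python
import random
import statistics

def sensor_stream(n):
    """Simulate a push-model stream of smart-sensor readings."""
    for i in range(n):
        yield {"sensor_id": i % 10, "value": random.gauss(50, 5)}

def filter_stage(stream, lo=30, hi=70):
    """Filter: drop out-of-range readings before they reach analytics."""
    return (r for r in stream if lo <= r["value"] <= hi)

def funnel_stage(stream):
    """Funnel: collapse per-sensor readings into one summary per sensor."""
    buckets = {}
    for r in stream:
        buckets.setdefault(r["sensor_id"], []).append(r["value"])
    return {sid: statistics.mean(vals) for sid, vals in buckets.items()}

# 10,000 raw readings funnel down to 10 per-sensor summaries.
summary = funnel_stage(filter_stage(sensor_stream(10_000)))
print(summary)
```

The point of the sketch is the shape of the pipeline: the filter discards noise at the edge, and the funnel reduces volume before anything reaches the downstream analytics tier.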

The current generation of data storage and relational database solutions is not designed to process this in real time or to handle the growing volume of unstructured data types.

That's why hardware and OS virtualization has emerged as a cloud solution.

That's why EMC, Teradata, Oracle and IBM are embracing Hadoop, the open-source data model inspired by Google's MapReduce and GFS papers.

That's why in-memory analytics is increasingly being used to handle SQL-like queries on data in transit.
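To make the idea concrete, here is a minimal sketch using Python's built-in sqlite3 with a ":memory:" database: the rows never touch disk, and a SQL query runs over the in-flight batch. The hosts, latencies and schema are invented for illustration.

```python
import sqlite3

# An in-memory SQL engine querying data "in transit": rows are loaded
# and queried without ever being written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (host TEXT, latency_ms REAL)")

# In a real pipeline these rows would arrive from a stream; here they
# are hard-coded for illustration.
rows = [("web01", 12.0), ("web01", 48.0), ("web02", 7.5), ("web02", 9.0)]
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# A SQL-like query over the in-flight batch.
for host, avg in conn.execute(
    "SELECT host, AVG(latency_ms) FROM events GROUP BY host ORDER BY host"
):
    print(host, avg)
```

Production in-memory engines add distribution, columnar layouts and continuous-query semantics, but the core trade is the same: keep the working set in RAM so queries return while the data is still in motion.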

That's why columnar and MPP data appliances (Netezza, Aster Data, Vertica...) are growing rapidly.

To see the future, check out ucirrus.com, which processes SQL queries and data visualization over hundreds of thousands of data points per second. They proved it for call-record analysis in telecoms and replaced Oracle at eBay for the same reasons.

But machine learning implies a data-modeling and inference process that, with minimal human input after the learning cycle completes, can steadily improve results over time.

This requires a continuous data-forensics process running in parallel. If input data formats change or certain data types become volatile (sparse data, wrong values), the machine may not be "smart" enough to detect and warn, let alone adapt and modify its algorithms.
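A minimal sketch of such a forensics check, flagging schema drift and sparse fields in a batch of records; the field names, batch shape and null-rate threshold are all illustrative assumptions:

```python
def forensics_check(batch, expected_fields, max_null_rate=0.2):
    """Flag schema drift and sparse (mostly-null) fields in a batch."""
    warnings = []
    seen = set().union(*(r.keys() for r in batch))
    missing = expected_fields - seen
    extra = seen - expected_fields
    if missing:
        warnings.append(f"missing fields: {sorted(missing)}")
    if extra:
        warnings.append(f"unexpected fields: {sorted(extra)}")
    for f in expected_fields & seen:
        nulls = sum(1 for r in batch if r.get(f) is None)
        if nulls / len(batch) > max_null_rate:
            warnings.append(f"field '{f}' is sparse ({nulls}/{len(batch)} null)")
    return warnings

batch = [{"ts": 1, "temp": 21.5}, {"ts": 2, "temp": None}, {"ts": 3, "temp": None}]
print(forensics_check(batch, {"ts", "temp", "site"}))
```

Running this kind of check on every batch gives the pipeline at least the "detect and warn" half of the problem; adapting the model automatically is the harder half.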

Machine learning is excellent in industrial processes (geospatial, robotics) but weak in human behavior processes (fraud detection, building ontologies from human speech). 

Really, Big Data just means rapidly growing, semi-structured, multi-modal, multi-point transactional data.

You don't have to store it in a database in relational form and then apply BI tools for analysis. These exabytes of raw data end up as gigabytes once analyzed, reduced and archived.
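That reduction can be shown in miniature, assuming a toy log of repeated request lines (the request strings are hypothetical stand-ins for machine data):

```python
from collections import Counter

# A large raw log collapses to a tiny summary once analyzed.
raw_events = [f"GET /page{i % 3}" for i in range(100_000)]

counts = Counter(raw_events)  # analysis reduces 100,000 rows to 3 counters
print(counts)
```

The raw events can then be archived or discarded; only the reduced summary needs to live in fast storage.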

Google knows that with its crawled data. So should BI users.



© 2020 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC