Big Data Analytics
The first steps toward achieving a lasting competitive edge with Big Data analytics.
It’s no secret that the number of options available for storing and analyzing large amounts of data has been increasingly rapidly. The challenge going forward is figuring out what data to store where and what tools should be analyzed.
Sybase, an SAP company, moved to address that issue today with the release of version 15.4 of the Sybase IQ columnar database. The new release adds integration with the open source Hadoop data management framework along with a native MapReduce API, Predictive Model Markup Language (PMML) support, and a library of statistical and data mining algorithms that are tightly coupled with Sybase IQ’s PlexQ massively parallel processing (MPP) engine.
According to Joydeep Das, director of analytics product management at Sybase, the real end goal is to give customers the ability to quickly analyze data stored inside or outside of the IQ database. In some instances Das says that may mean transferring data from, for example, Hadoop into the IQ database. In other instances it may simply mean taking advantage of Sybase IQ’s ability to federate the processing of a request across Hadoop and Sybase IQ without having to move any data.
Das says there are multiple approaches to integrating data analytics across multiple data structures. With the release of 15.4 of Sybase IQ the company is embedding support for many of the interfaces these applications rely on, such as MapReduce, to improve the processing performance of the applications.
The best thing about Hadoop is that it provides a framework for managing large amounts of data on industry-standard x86-class servers. But Hadoop is not known for the speed at which that data can be processed. Going forward, IT organizations will be leveraging Hadoop to store more data than ever. But when it comes time to analyze the subsets of data that have the most value, Das says there is still going to be a need for powerful engines to run complex analytic queries.
So while Hadoop represents a major advancement in terms of reducing the cost of managing Big Data, it really represents the introduction of a critical new layer in the hierarchy of data management that still needs to be augmented by high-performance analytics systems.