A Data Science Central Community
Organizations around the globe and across industries have learned that the smartest business decisions are based on fact, not gut feel. That means they're based on analysis of data, and that data goes way beyond the historical information held in internal transaction systems. Internet clickstreams, sensor data, log files, mobile data rich with geospatial information, and social-network comments are among the many forms of data now pushing information stores into the big-data league above 10 terabytes.
Trouble is, conventional data warehousing deployments can't scale to crunch terabytes of data or support advanced in-database analytics. Over the last decade, massively parallel processing (MPP) platforms and column-store databases have started a revolution in data analysis. But technology keeps moving, and we're starting to see upgrades that are blurring the boundaries of known architectures. What's more, a whole movement has emerged around NoSQL (not only SQL) platforms that take on semi-structured and unstructured information.
This image gallery presents a 2011 update on what's available, with options including EMC's Greenplum appliance, Hadoop and MapReduce, HP's recently acquired Vertica platform, IBM's separate DB2-based Smart Analytic System and Netezza offerings, and Microsoft's Parallel Data Warehouse. Smaller, niche database players include Infobright, Kognitio and ParAccel. Teradata reigns at the top of the market, picking off high-end defectors from industry giant Oracle. SAP's Sybase unit continues to evolve Sybase IQ, the original column-store database. In short, there's a platform for every scale level and analytic focus, so click on to see and read more about your options.
Read full article (14 pages) at http://www.informationweek.com/news/galleries/software/bi/231900870