A Data Science Central Community
The ten predictive analytics offerings listed below vary enormously in functionality and applicability. Where a product is exceptional it is given a five-star rating, although this obviously does not mean it is the best solution for your organisation. Tibco has been added because it embraces R for predictive analytics but is also capable of BI and visualization, which makes for an interesting mix.
FICO provides a broad range of technologies and services to support business optimisation and the embedding of intelligence into applications of various kinds. Under the hood there is a lot going on, from linear and non-linear programming through to predictive analytics and other powerful methods of supporting business decisions. FICO's experience and know-how in the industries it serves is probably unique. Financial services, retail, government and healthcare are its main markets, but the technologies and methods it employs have broad applicability.
Fraud detection and customer creditworthiness are two of the primary themes in the application of FICO's technology, but the portfolio is so broad that most business problems will be addressable. Perhaps most interesting is the capability to combine different model types into a cohesive whole. Business rules (which are deterministic) can be combined with predictive analytics (which is usually probabilistic) to create very accurate models for dealing with customers and detecting anomalies.
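To make the idea of combining deterministic rules with a probabilistic model concrete, here is a minimal Python sketch. It is not FICO's method or API: the `Transaction` fields, the rule thresholds and the logistic coefficients are all invented for illustration, and `fraud_probability` merely stands in for whatever trained model would be used in practice.

```python
import math
from dataclasses import dataclass


@dataclass
class Transaction:
    amount: float           # transaction value (hypothetical field)
    country_matches: bool   # card country equals merchant country (hypothetical field)
    prior_chargebacks: int  # earlier chargebacks for this customer (hypothetical field)


def fraud_probability(tx: Transaction) -> float:
    """Stand-in for a trained predictive model; coefficients are made up."""
    mismatch = 0.0 if tx.country_matches else 1.5
    z = -4.0 + 0.002 * tx.amount + mismatch + 0.8 * tx.prior_chargebacks
    return 1.0 / (1.0 + math.exp(-z))  # logistic function maps z to (0, 1)


def decide(tx: Transaction, threshold: float = 0.5) -> str:
    """Apply deterministic business rules first, then fall back to the model."""
    if tx.amount > 10_000:         # rule: very large amounts always go to a person
        return "review"
    if tx.prior_chargebacks >= 3:  # rule: repeat offenders are always declined
        return "decline"
    # No rule fired, so the probabilistic score decides.
    return "decline" if fraud_probability(tx) >= threshold else "approve"
```

The rules give predictable, auditable outcomes for the cases the business has already decided on, while the model handles everything the rules do not cover.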
For IBM, predictive analytics is largely a data management and infrastructure issue. In my conversations with them they stress the data management aspect particularly, and with good reason. The application of algorithms to data and the building of models, which is primarily accomplished with SPSS, is really just a small part of the story. The management of large data volumes and the deployment of models into the production environment are the more challenging aspects of analytics, and they are something IBM does very well.
The IBM analytics solution will primarily be of interest to large organisations looking for more than a point solution and wanting to create a viable, long-term analytics infrastructure and capability. To this end IBM offers its InfoSphere data management and infrastructure products, and the SPSS suite of analytical tools for both analysts and end users. The combination represents the premier analytical solution currently available, and IBM also has a number of vertical solutions to offer. It is a fairly expensive solution, but in many ways it is unchallenged.
Data Collection Family
This suite of products from IBM is primarily aimed at the design, creation, deployment, analysis and reporting of surveys. They provide a top-to-tail capability that supports various means of survey distribution (web, paper, phone, in-person) and the supporting technology to capture the results, including scanning of documents and text processing.
The SamplePower utility provides a means of establishing survey sample size, something that would normally require a skilled statistician. This sets the tone for the whole Data Collection product set, since virtually all elements of the process can be handled by users. This does not, however, include the analytics used to draw conclusions from the data, which is the domain of the Statistics and Modeler packages.
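For a sense of the calculation such a utility spares the user, the sketch below applies the standard textbook sample-size formula for estimating a proportion, with an optional finite population correction. It is an illustration only and is not how SamplePower works internally; the function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def survey_sample_size(margin_of_error, confidence=0.95, proportion=0.5, population=None):
    """Respondents needed to estimate a proportion within +/- margin_of_error.

    proportion=0.5 is the most conservative guess (it maximises the variance).
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an effectively infinite population.
    n = z ** 2 * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Finite population correction: small populations need fewer responses.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


print(survey_sample_size(0.05))                   # 385
print(survey_sample_size(0.05, population=2000))  # 323
```

Halving the margin of error roughly quadruples the required sample, which is why the choice of precision matters more than the choice of confidence level in most survey budgets.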
IBM SPSS Statistics
This is perhaps the most widely used set of statistical products in the world. The capability ranges from end-user marketing tools through to specialised statistical analysis and, of course, the very well respected SPSS analyst workbench. There isn’t much utility in detailing the features of the statistics capability because it does pretty well everything. A few things are also available that are not really statistical in nature, such as neural networks.
IBM SPSS Modeler
This employs data mining techniques to find relationships within data. The professional version supports the creation of predictive models using classification, association and segmentation techniques. Modeler Premium adds the ability to process unstructured data from the web, text, email, social data and so on. Again, there is little point listing all the techniques supported by Modeler since most conceivable options are present (Bayes, SVM, K-means, etc.).
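As an illustration of what a segmentation technique such as K-means does, here is a small pure-Python sketch of the standard algorithm. It is not Modeler's implementation, and the customer data (spend and visits per month) is invented for the example.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain k-means (Lloyd's algorithm) on a small list of numeric tuples."""
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, assignments


# Invented data: (spend in thousands, visits per month) for six customers.
customers = [(1.0, 2.0), (1.5, 1.0), (2.0, 2.5), (9.0, 10.0), (10.0, 9.0), (11.0, 11.0)]
centres, segments = kmeans(customers, centroids=[customers[0], customers[3]])
print(segments)  # [0, 0, 0, 1, 1, 1]
```

The starting centroids are fixed here so the result is repeatable; real tools typically pick them at random and run the algorithm several times.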
IBM SPSS Decision Management allows predictive models to be integrated with business rules for deployment into production systems. The Collaboration and Deployment option supports the sharing of analytical assets and provides an environment to automate the analytical process.
InfoSphere addresses more than predictive analytics requirements and is covered in full in a separate paper. However, the broad capability of the product suite includes InfoSphere Warehouse for traditional data warehousing; InfoSphere Information Server, DataStage and Data Replication to support integration and data staging; Master Data Management; and Big Data analytics, which is based on Apache Hadoop.
Big Data analytics not only supports large data sets, but also provides sufficient performance for real-time analytics and the accommodation of very high volume streaming data. This will become more important as information sources such as sensors (e.g. RFID) and real-time market information become more widely used.
KXEN is one of the leaders in the world of predictive analytics, and with good reason. Although a recent marketing makeover makes it harder for prospects to learn what is under the hood, there are some heavy-duty algorithms working to make sure that predictive models are valid (Structural Risk Minimisation techniques are used). This is a product suitable mainly for large organisations, although recent cloud-based offerings make it accessible to smaller businesses.
There are six elements to the product range: ...
I'm curious to know why the author picked these 10 platforms but did not mention Teradata, goPivotal, Informatica, Matlab, Lavastorm, Splunk, HortonWorks, MarkLogic, Yottamine and many others. Here is a list with more than 40 companies.