
10 Enterprise Predictive Analytics Platforms Compared

Originally posted on Butler Analytics on July 17, by Martin Butler.


The ten predictive analytics offerings listed below vary enormously in functionality and applicability. An exceptional product is given a five-star rating, although this obviously does not mean it is the best solution for your organization. Tibco has been added because it embraces R for predictive analytics but is also capable of BI and visualization, which makes for an interesting mix.

FICO (4 stars)

FICO provides a broad range of technologies and services to support business optimisation and the embedding of intelligence into applications of various kinds. Under the hood there is a lot going on, from linear (and non-linear) programming through to predictive analytics and other powerful methods of supporting business decisions. FICO's experience and know-how in the industries it serves is probably unique. Financial services, retail, government and healthcare are its main markets, but the technologies and methods it employs have broad applicability.

Fraud detection and customer credit worthiness are two of the primary themes in the application of its technology, but the portfolio is so broad that most business problems will be addressable. Perhaps most interesting is the capability to combine different model types into a cohesive whole. Business rules (which are deterministic) can be combined with predictive analytics (which is usually probabilistic) to create very accurate models for dealing with customers and detecting anomalies.
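The idea of layering deterministic rules over a probabilistic score can be sketched in a few lines of Python. This is only an illustration of the general pattern, not FICO's method or API: the function names, the blocked-country rule, the logistic coefficients and the 0.5 threshold are all made up for the example.

```python
import math


def fraud_probability(amount, txns_last_hour):
    """Toy logistic score. The coefficients are invented for illustration."""
    z = -4.0 + 0.002 * amount + 0.8 * txns_last_hour
    return 1.0 / (1.0 + math.exp(-z))


def decide(amount, txns_last_hour, country_blocked, threshold=0.5):
    """Combine a hard business rule with a probabilistic risk score."""
    # Deterministic business rule: it always wins, whatever the model says.
    if country_blocked:
        return "decline"
    # Probabilistic model: refer the transaction when estimated risk is high.
    p = fraud_probability(amount, txns_last_hour)
    return "review" if p >= threshold else "approve"


if __name__ == "__main__":
    print(decide(50, 0, country_blocked=False))
    print(decide(2000, 3, country_blocked=False))
    print(decide(50, 0, country_blocked=True))
```

The rule is checked first so that a policy decision can never be overridden by a low model score; the model only arbitrates the cases the rules leave open.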

IBM (5 stars)

For IBM, predictive analytics is largely a data management and infrastructure issue. In my conversations with them they stress the data management aspect particularly, and with good reason. The application of algorithms to data and the building of models, which is primarily accomplished with SPSS, is really just a small part of the story. Managing large data volumes and deploying models into the production environment are the more challenging aspects of analytics, and they are something IBM does very well.

The IBM analytics solution will primarily be of interest to large organisations looking for more than a point solution and wanting to create a viable, long-term analytics infrastructure and capability. To this end IBM offers its InfoSphere data management and infrastructure products, and the SPSS suite of analytical tools for both analysts and end users. The combination represents the premier analytical solution currently available, and IBM also has a number of vertical solutions to offer. It is a fairly expensive solution, but in many ways it is unchallenged.

SPSS

Data Collection Family

This suite of products from IBM is primarily aimed at the design, creation, deployment, analysis and reporting of surveys. They provide a top-to-tail capability that supports various means of survey distribution (web, paper, phone, in-person) and the supporting technology to capture the results, including scanning of documents and text processing.

The SamplePower utility provides a means of establishing survey sample size, something that would normally require a skilled statistician. This sets the tone for the whole Data Collection product set, since virtually all elements of the process can be handled by users. This does not, however, include the analytics used to draw conclusions from the data, which are the domain of the Statistics and Modeler packages.
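To give a sense of the kind of calculation involved, here is a minimal sketch of a textbook sample-size estimate for a survey proportion (Cochran's formula with an optional finite population correction). It is not a description of how SamplePower works internally; the function name and its parameters are assumptions chosen for the example.

```python
import math
from statistics import NormalDist


def sample_size(margin, confidence=0.95, p=0.5, population=None):
    """Respondents needed to estimate a proportion within +/- margin.

    p=0.5 is the most conservative guess for the true proportion.
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    n = z * z * p * (1 - p) / (margin * margin)
    if population is not None:
        # Finite population correction for small, known populations.
        n = n / (1 + (n - 1) / population)
    return math.ceil(n)


if __name__ == "__main__":
    print(sample_size(0.05))                    # 385
    print(sample_size(0.05, population=10000))  # 370
```

With a 5% margin of error at 95% confidence the familiar figure of 385 respondents appears; a known population of 10,000 lowers that to 370.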

IBM SPSS Statistics

This is perhaps the most widely used set of statistical products in the world. The capability ranges from end-user marketing tools through to specialised statistical analysis and, of course, the very well respected SPSS analyst workbench. There isn’t much utility in detailing the features of the statistics capability because it does pretty well everything. A few things are also available that are not really statistical in nature, such as neural networks.

IBM SPSS Modeler

This employs data mining techniques to find relationships within data. The professional version supports the creation of predictive models using classification, association and segmentation techniques. Modeler Premium adds the ability to process unstructured data from the web, text, email, social data and so on. Again there is little point listing all the techniques supported by Modeler since most conceivable options are present (Bayes, SVM, K-means etc).

Deployment Family

IBM SPSS Decision Management allows predictive models to be integrated with business rules for deployment into production systems. The Collaboration and Deployment option supports the sharing of analytical assets and provides an environment to automate the analytical process.

InfoSphere

InfoSphere addresses more than predictive analytics requirements and is fully addressed in a separate paper. However the broad capability of the product suite includes InfoSphere Warehouse for traditional data warehousing, InfoSphere Information Server, DataStage and Data Replication to support integration and data staging, Master Data Management and Big Data analytics, which is based on the Apache Hadoop technology.

Big Data analytics not only supports large data sets, but provides sufficient performance for real-time analytics and accommodation of very high volume streaming data. This will become more important as information sources from various sensors (e.g. RFID) and real-time market information become more widely used.

KXEN (4 stars)

KXEN is one of the leaders in the world of predictive analytics, and with good reason. Although a recent marketing makeover seems to have deprived prospects of learning what is under the hood, we can tell you that there are some heavy duty algorithms working to make sure that predictive models are valid (Structural Risk Minimisation techniques are used). This is a heavy duty product suitable for large organisations in the main, although recent cloud based offerings make it accessible to smaller businesses.

There are six elements to the product range: ...

Read full article.

My Comment:

I'm curious to know why the author picked these 10 platforms but did not mention Teradata, goPivotal, Informatica, Matlab, Lavastorm, Splunk, HortonWorks, MarkLogic, Yottamine and many others. Here is a list with more than 40 companies.


Comment by Martin Butler on August 12, 2013 at 5:10am

Please read my first comment about the dynamics behind technology selection in the Enterprise. FICO for example is widely used in the financial services industry, retail and others. There are definitely others that could have been included, but as I stated in my comment there is nothing sacred about this list - just ten widely used products in the Enterprise (as opposed to being used by analysts)  - it could easily have been 20 products - but it wasn't.

Comment by Ralf Klinkenberg on August 12, 2013 at 5:04am

Sorry for missing the short paragraph about StatSoft in your article, which indeed covers Statistica. Other vendors like IBM and Tibco are covered at so much greater length that I missed this paragraph.

Is there a particular reason why you did not cover RapidMiner and RapidAnalytics in your comparison?

And back to my original question: What were the selection criteria for this list of frameworks?

Comment by Martin Butler on August 12, 2013 at 4:54am

Please read the article before commenting - Statistica is covered. RapidMiner is covered elsewhere on my web site.

Comment by Ralf Klinkenberg on August 12, 2013 at 4:48am

If you take a look at the annual KDnuggets data mining tool poll

http://www.kdnuggets.com/2013/06/kdnuggets-annual-software-poll-rap...

you will note that many of the most widely used predictive analytics platforms are not covered in Martin Butler's comparison, for example RapidMiner, RapidAnalytics and Statistica, while he covers some tools that have only a very small user base and very little market adoption. The selection of tools in this comparison seems quite arbitrary. What were the selection criteria?

Market share?

Size of the user base and/or user community?

Sales volume?

Or just names the author was most familiar with?

A bit more transparency would be nice.

Comment by Martin Butler on August 11, 2013 at 1:54am

There is nothing sacred about this list - content or size. But - some things to bear in mind:

Nearly all technology markets become dominated by around 3 suppliers. Many of the entrants into the analytics market will fail or be swallowed up.

Corporate management likes big names.

Putt's law dictates that the suppliers management feel comfortable with are the ones that get selected - not the suppliers with the best technology.

I tried to accommodate these dynamics in my list.
