Interesting discussion on LinkedIn:

Hi, we've been using SPSS Clementine for data mining and text mining but are interested in seeing what else is out there. Which SAS tools, if any, compete with Clementine?

Hello Tom
I don't know Clementine, but I can give you a brief introduction to SAS.

SAS datasets are actually tables in a relational system. It is, however, not a good idea to use a SAS library as a transactional database. On the other hand, you have very practical ways to create and replace them.

Base SAS offers a huge library of procedures for data processing, reporting and statistics. It includes SQL and, of course, the classical Data Step.

The Data Step offers quite a few data merging possibilities, combined with a decent structured programming language in a well-integrated unit.

Further, SAS interfaces with virtually anything. You can even create or read Excel files on a mainframe or let a spreadsheet run SAS code in the background.

Then you have the new-style graphical products. Enterprise Guide, for instance, encapsulates Base SAS behind an intuitive user interface. It is included in a standard licence.

If you work in a large team, think about a metadata server and Data Integration Studio, and if you need sophisticated prediction, think about Enterprise Miner.

Hope you find someone who knows both too.

What's the pricing structure for SAS? Are there a lot of add-ons, like with SPSS? We only have one person here who has been a SAS user previously. We have mainly been using SPSS products and a few other tools.

You can check prices on the SAS web page.

I have used both, and if you compare Enterprise Miner and Clementine, the statistical procedures provided as standard are approximately the same and the user interfaces of both follow the same philosophy: you link nodes together.

The main difference comes from what you describe:

- Clementine is only one piece of the SPSS package: a suite of individual tools (add-ons) which are more or less correctly interfaced.
Explanation: the SPSS package is composed of individual software products that SPSS has purchased. Clementine was an independent package that was not developed by SPSS; SPSS simply developed the Windows version (Clementine was originally developed to run in a UNIX environment). But the last time I had a demo (around a year ago), the features had not really evolved since the original version.

- SAS Enterprise Miner (EM) gives you full access to all SAS procedures in a single environment (this is its main advantage), and you can use your own nodes in the EM interface.
The standard nodes are sufficient and very simple to use, but if you want to create your own nodes, you can use the SAS language (not really difficult to handle, and very powerful). As SAS is widely used in the enterprise and research world, it is very easy to find SAS programs on the Internet.
Note that the package also includes Enterprise Guide, a reporting tool.
However, if you want to connect SAS to external data sources, you will have to purchase additional add-ons (ODBC connectors, for example).

I personally prefer SAS, because I think it offers more possibilities and does not require you to be familiar with different user interfaces (it is all in one).

In terms of price, SAS EM is usually more expensive than Clementine (in my experience), but it depends on how you negotiate (we recently purchased SAS Pro for a very competitive price). I guess that the pricing policy depends on the region, but here in Europe (BENELUX), an EM purchase has to include at least 5 licenses. You can also negotiate this point.

If price is an issue for you, another similar package exists: Statistica Data Miner from StatSoft.

It is comprehensive in terms of statistical and data mining methodology and gives you access to all of Statistica, it has a similar user interface, and it is less expensive than the two previous ones. Other advantages of Statistica: its user interface looks like that of SPSS (no programming is necessary) and it is fully integrated (a single user interface). You can also use two programming languages (VBA or R).

Note that SAS (and also StatSoft) can offer trial versions.

Good luck with your choice.

Tags: clementine, sas, sas enterprise miner, text mining

Replies to This Discussion

I have used them both, and find that SAS gives you better control over your analysis and allows multiple users to see processes very clearly. I've also found, as someone else said, that the SAS company is very helpful at getting the right package of tools to you once you tell them your needs.

SPSS does seem to be more intuitive for new users, though.
I like SAS EM because it's easy to use (just connecting nodes, as described earlier) and the results are presented graphically with linked plots, histograms, etc., which contain a lot of information. The presentation of the results is what I like best, having used some data mining techniques such as ANNs, SVMs and SOMs on a computer running Linux (i.e. with no GUI and poor result presentation) when I did my PhD.

The "Statistica Data Miner of Statsoft" package sounds interesting, will take a look at it.

Thanks for a good post!

I'd say the main reason to choose one over the other is often your existing infrastructure.
SAS has a much bigger share of the traditional statistical market and mainframe legacy businesses (banks, insurance, etc.). Clementine is more common in places where there is an existing relational warehouse, or no established SAS data sets (telcos, marketing departments, etc.).

Not to say one is better than the other, but it's worth noting that the easy user interface was first developed in Clementine and others (SAS, StatSoft, etc.) have since copied the design.

