I was at the Semantic Web Meetup @ the Hearst Building in NYC (amazing venue, the first green building completed in NYC) yesterday and someone asked about open source tools available for data mining, specifically for clustering. Unfortunately I had to run out after the meetup and couldn’t provide these to him. The one mentioned by the presenter was Weka, which is also the first free open source tool I came across.
Anyway, here are the ones I have found that are worth checking out. I’m sure there are others, with more to come.
These 4 have GUIs for us non-programmers.
Use at your own risk! I cannot speak to the accuracy of the algorithms, although many of them seem to be well established in the field. There are really 2 issues here: you have to be concerned not only about the underlying algorithm being used but also about how effectively and accurately it was translated into code. Nevertheless, these seem like solid apps.
Java based, open source, with GUI
From the University of Waikato in New Zealand (data mining in New Zealand sounds like lots of fun)
Python based, open source, GUI
From the AI Laboratory in Ljubljana, Slovenia
Based on R, open source, GUI
I just downloaded this one but haven’t had a chance to look at it yet. I thought it should be considered due to the popularity of R.
Java based, open source, GUI
This implements the full WEKA catalog as well as their own library.