
I am looking for test data, preferably real-life, for my (prototype) data mining/machine learning tool Emping-0.6. This tool discovers heuristic rules in a table of nominal data, as well as the relationships between these rules. So far it has been tested on the mushroom data set and the audiology data set, both from the UCI Machine Learning Repository.

For more info see Muitovar. Don't forget to have a look at the 3635 rules for poisonous/edible mushrooms and to test them in a real-time database query of the original table.

The tool, Emping, is logical in nature, not statistical. So, if 100000 ravens are black and 1 is white, the rule "all ravens are black" is invalidated. In many cases this is not what you want, but in others it is. Rare side effects in medicine and pharmacy, and the relevant circumstances in which they occur, come to mind.
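To make the raven example concrete, here is a minimal Python sketch of the difference between a strictly logical rule check and a statistical one. It is not Emping's algorithm and not Emping code (Emping is written in Haskell); the names `matches`, `rule_holds_strictly`, `rule_confidence` and the `ravens` table are hypothetical and exist only for this illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]  # one table row: attribute name -> nominal value


def matches(row: Row, conditions: Row) -> bool:
    """True if the row has every attribute/value pair listed in conditions."""
    return all(row.get(attr) == value for attr, value in conditions.items())


def rule_holds_strictly(rows: List[Row], antecedent: Row, consequent: Row) -> bool:
    """Logical view: a single counterexample invalidates the rule.

    Note: if no row matches the antecedent, the rule holds vacuously.
    """
    return all(matches(row, consequent) for row in rows if matches(row, antecedent))


def rule_confidence(rows: List[Row], antecedent: Row, consequent: Row) -> Optional[float]:
    """Statistical view: fraction of antecedent rows that satisfy the consequent."""
    covered = [row for row in rows if matches(row, antecedent)]
    if not covered:
        return None  # the rule says nothing about this table
    return sum(matches(row, consequent) for row in covered) / len(covered)


# Illustrative table (made-up data): 100000 black ravens and 1 white raven.
ravens: List[Row] = [{"species": "raven", "colour": "black"}] * 100000
ravens = ravens + [{"species": "raven", "colour": "white"}]

is_raven = {"species": "raven"}
is_black = {"colour": "black"}

strict = rule_holds_strictly(ravens, is_raven, is_black)
confidence = rule_confidence(ravens, is_raven, is_black)
print(strict)      # the single white raven breaks the rule
print(confidence)  # very close to 1, so a statistical tool would likely keep the rule
```

A statistical tool would typically accept the rule above because its confidence is almost 1, whereas a logical check rejects it outright. That rejection is exactly what you want when the one exception is the interesting case, such as a rare side effect.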

Emping is written in the functional programming language Haskell, and the current version, 0.6, is open source under the GPL. I discovered the underlying algorithm myself and, as far as I know, it is not used in any other tool.

I'd be interested in any nominal data table but, in view of the above, especially an (anonymized) medical data set. Please contact me if you can help. Many thanks in advance.


