A Data Science Central Community
We are students building a churn prediction model for a telco company, and we are looking for the best tools for the job (free or commercial). Our constraints are:
It's a graduation project
We are working in a telco company
Our database is on Oracle
The model has to support ten million customers
We will build an interface connected to the model
After our research, we came up with this ranking:
Free tools:
Commercial tools:
We would welcome your criticism of this ranking; feel free to adjust it.
We are grateful for your help.
Your data matters more than the tool.
You will need many attributes about your customers, plus a target variable (will / won't churn in the next X days).
That's the biggest challenge in front of you.
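To make the target variable concrete, here is a minimal Python sketch of how a "churns in the next X days" label could be derived. The function name `churn_label`, the snapshot date, the 30-day horizon, and the toy customer records are all assumptions for illustration; your real definition of churn will depend on what the telco records (contract end, last recharge, and so on).

```python
from datetime import date, timedelta

def churn_label(snapshot, churn_date, horizon_days):
    """Return 1 if the customer churned within horizon_days after snapshot, else 0.

    churn_date is None for customers who have not churned (hypothetical convention).
    """
    if churn_date is None:
        return 0
    return int(snapshot < churn_date <= snapshot + timedelta(days=horizon_days))

# Toy data (made up): customer id -> date the customer left, or None.
snapshot = date(2013, 1, 1)
customers = {
    "A": date(2013, 1, 20),  # left 19 days after the snapshot
    "B": None,               # still active
    "C": date(2013, 6, 1),   # left, but outside the 30-day window
}

labels = {cid: churn_label(snapshot, d, 30) for cid, d in customers.items()}
print(labels)
```

Note that the attributes should be computed only from data available at the snapshot date, otherwise the model sees information from the future.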
10M customers? You can always use sampling to reduce that number significantly, at least until you find the set of best predictors.
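As a rough illustration of the sampling idea, the Python sketch below draws a stratified sample so that the churn rate in the sample matches the full population. The helper `stratified_sample`, the 1% fraction, and the synthetic population with a 5% churn rate are assumptions made up for the example; in practice the rows would come from your Oracle tables.

```python
import random
from collections import defaultdict

def stratified_sample(rows, label_of, fraction, seed=0):
    """Sample the same fraction from each label group (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)
    sample = []
    for group in by_label.values():
        # Keep at least one row per group so rare classes are not lost.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample

# Synthetic population: (customer_id, churned), with 1 in 20 customers churning.
population = [(i, 1 if i % 20 == 0 else 0) for i in range(100_000)]

sample = stratified_sample(population, label_of=lambda r: r[1], fraction=0.01)
churn_rate = sum(r[1] for r in sample) / len(sample)
print(len(sample), churn_rate)
```

Stratifying on the target keeps the rare churners represented; a plain random sample would usually be fine too, but the churn rate would vary a little from draw to draw.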
All of those tools can handle such a job, and connecting to Oracle isn't a problem. But they are all very different.
KXEN and SPSS Modeler can be learned very quickly.
SAS is great and handles large data very well, but it may be a little more complicated.
I would choose R: it has great features, but it is the hardest to master, since you have to write code and there is no nice GUI.
Weka isn't a real option. RapidMiner, which can use Weka's learning algorithms, may be an alternative with a GUI.
I have heard good things about Orange, but I have never tried it. Try it and share your experience.