
Hello everyone,

I am currently starting some research on human behavior modeling and prediction. While searching for the best statistics and data mining software, I ran into a very big $$ issue :) As I am doing this as part of my PhD and the institute/university currently cannot provide a license for such software, I decided to go with R.

I am particularly enthusiastic because R can be integrated with Java and can therefore handle things in real time. Based on your expertise, do you think this software will limit my results? What would be the major drawbacks of R?

Best Regards,
Jose Simoes


Replies to This Discussion

Hi Jose,
The only drawback of R is that it cannot handle data larger than 1.5 GB when using the R console. R can be connected to a database using all types of database connections. If you are interested in Java-based software, I would recommend KNIME, which is open source. It can be easily integrated with R and Weka using plugins.
It has good database connectivity and can handle large volumes of data. I hope this helps.
Please mail me if you need more help.
Regards
Nikesh Srivastava
I do not doubt your words, but could you please point to a reference regarding the memory limitation (especially regarding the size 1.5 GB) ?
Apparently the limit is 3 GB:
http://cran.r-project.org/bin/windows/base/rw-FAQ.html#There-seems-...

That said, you can definitely use distributed computing with R. I believe there are open-source solutions, but there is also a commercial one: REvolution Computing
http://www.revolution-computing.com/
Thank you !
I am not sure about the 3 GB limit. The limitation is mostly due to RAM. I don't know how big the dataset Jose is going to work on is, but I have seen R perform very well under 64-bit Linux.
Oh, seems that you are correct. I performed a little search and discovered this:
http://ubuntuforums.org/showthread.php?t=764143

So there is no memory limit for R under Linux ? Any remarks, Mr Winters ? :)
Hi Steffen,
The 1.5 GB figure is based on my own experiments with R. I wanted to measure the scalability of R in terms of the amount of data it can take. If you want to know the memory limit of R, just type memory.limit() in the R console. It will give you the initial memory figure. If you want to improve the memory, type memory.size(4000); it would increase memory to 4 GB, but you definitely cannot stretch more than this. In fact, when I was running an SVM classification on data with 42k records, I got an error message saying that R cannot handle a vector of size 1.5 GB. These are some of my observations from working with R. Still, R is rich in algorithms.
Quote: If you want to improve the memory, type memory.size(4000); it would increase memory to 4 GB, but you definitely cannot stretch more than this.

Under Windows or Linux? More references and fewer opinions, please :). No offense!
memory.size() only applies on Windows.
Some tips for diagnosis and treatment of memory problems:
http://www.ats.ucla.edu/stat/R/faq/memory_usage_pc.htm
http://www.r-bloggers.com/memory-management-in-r-a-few-tips-and-tri...
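As a rough sketch of the diagnosis step, the size of a numeric vector can be estimated before it is allocated and compared with what R itself reports. This avoids the Windows-only memory.limit() and memory.size() and relies only on object.size() and gc(). The helper name estimate_vector_mb is hypothetical, and the 8-bytes-per-double figure ignores the small object header, so treat the numbers as approximate.

```r
# Estimate the memory a numeric (double) vector needs before allocating it.
# Assumption: 8 bytes per double, ignoring the small fixed object header.
# estimate_vector_mb is a hypothetical helper, not part of R.
estimate_vector_mb <- function(n_elements, bytes_per_element = 8) {
  n_elements * bytes_per_element / 1024^2
}

# Measure a small vector to check the estimate against R's own accounting.
x <- numeric(1e6)
measured_mb <- as.numeric(object.size(x)) / 1024^2
estimated_mb <- estimate_vector_mb(length(x))

cat(sprintf("estimated: %.2f MB, measured: %.2f MB\n", estimated_mb, measured_mb))

# Number of doubles that would fill a single 1.5 GB vector,
# the size mentioned in the error message discussed above.
doubles_in_1_5_gb <- 1.5 * 1024^3 / 8

# Release the vector and trigger a garbage collection.
rm(x)
invisible(gc())
```

A single 1.5 GB vector holds roughly 200 million doubles, so an error about a vector of that size may point to an intermediate object created by the algorithm rather than to the raw dataset itself.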
Thanks Robin !
Hi Kumar,
It seems that you are also familiar with PASW. I am working with SPSS (now called PASW). Could you please help me with how to deploy a model in PASW Deployment Services?

Any feedback would help.

Thanks in Advance,
Ashok B.
Thank you all for your answers.

However, there is another issue which I think I did not make clear. I intend to use this Java code in real time (and online); in other words, the code is supposed to be deployed in a Java application server (servlet container) such as JBoss or SailFin. So my question is: if I develop something in KNIME or another Java environment, will it still run in these environments?

Best Regards,
Jose Simoes


© 2020 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
