
Hello everyone,

I am starting some research on human behavior modeling and prediction. While searching for the best statistics and data mining software, I ran into a very big $$ issue :) Since I am doing this as part of my PhD and my institute/university cannot currently provide a license for commercial software, I decided to go with R.

I am particularly enthusiastic because it can be plugged together with Java and can therefore handle things in real time. Based on your expertise, do you think this software will limit my results? What would be the major drawbacks of R?

Best Regards,
Jose Simoes


Replies to This Discussion

Hi Jose,
The only drawback of R is that it cannot handle data larger than about 1.5 GB from the R console. R can connect to databases through all the usual types of database connection. If you are interested in Java-based software, I would recommend KNIME, which is open source. It integrates easily with R and Weka through plugins.
It has good database connectivity and can handle large volumes of data. I hope this helps.
Please mail me if you need more help.
Nikesh Srivastava
I do not doubt your words, but could you please point to a reference for the memory limitation (especially the 1.5 GB figure)?
Apparently the limit is 3 GB:

That said, you can definitely use distributed computing with R. I believe there are open source solutions, and there is also a commercial one: REvolution Computing
Thank you !
I am not sure about the 3 GB limit. The limitation is mostly due to the available RAM. I don't know how big the dataset Jose is going to work on is, but I have seen R perform very well on 64-bit Linux.
Oh, seems that you are correct. I performed a little search and discovered this:

So there is no memory limit for R under Linux ? Any remarks, Mr Winters ? :)
Hi Steffen,
The 1.5 GB figure comes from my own experiments with R. I wanted to measure how well R scales in terms of the amount of data it can take. If you want to know R's memory limit, just type memory.limit() in the R console; it gives you the initial memory figure. If you want to raise it, type memory.size(4000), which increases the memory to 4 GB, but you definitely cannot stretch it beyond that. In fact, when I was running an SVM classification on data with 42k records, I got an error message saying that R cannot handle a vector of size 1.5 GB. These are some of my observations from working with R. Still, R is rich in algorithms.
Quote: If you want to raise it, type memory.size(4000), which increases the memory to 4 GB, but you definitely cannot stretch it beyond that

Under Windows or Linux? More references and fewer opinions, please :). No offense!
memory.size() only applies on Windows.
Some tips for diagnosis and treatment of memory problems:
Thanks Robin !
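For anyone who wants to check the memory points above on their own machine, here is a minimal R sketch. It is an illustration under assumptions, not a reference: the helper name estimate_numeric_bytes and the 42,000 x 100 example dimensions are made up for this post, and the estimate only covers dense numeric (double) data at 8 bytes per value. As far as I know, memory.limit(size = ...) is the call that changes the limit, while memory.size() reports current usage; both are Windows-only and, to my understanding, became stubs in R 4.2.0, so the sketch only calls them on older Windows builds.

```r
# Rough size of a dense numeric matrix: 8 bytes per double,
# ignoring the small per-object header. Hypothetical helper name.
estimate_numeric_bytes <- function(n_rows, n_cols) {
  as.numeric(n_rows) * as.numeric(n_cols) * 8
}

# Hypothetical example: 42,000 records with 100 numeric columns.
est <- estimate_numeric_bytes(42000, 100)
cat("Estimated size:", round(est / 1024^2, 1), "MB\n")

# Measure a real object instead of estimating.
x <- numeric(1e6)
print(object.size(x), units = "MB")

# gc() runs a garbage collection and returns a table of memory in use.
print(gc())

# The Windows-only helpers discussed in this thread. They are guarded
# because they are assumed to be inactive on Unix-alikes and on newer R.
if (.Platform$OS.type == "windows" && getRversion() < "4.2.0") {
  cat("Current limit (MB):", memory.limit(), "\n")
  cat("Currently in use (MB):", memory.size(), "\n")
  # memory.limit(size = 4000) would request a 4000 MB limit.
}
```

On 64-bit Linux there is no such R-level setting to adjust; the practical ceiling is the RAM (plus swap) and any limits the operating system imposes on the process.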
Hi Kumar,
It seems that you are also familiar with PASW. I am working with SPSS (now called PASW). Could you please help me with how to deploy a model in PASW Deployment Services?

Any feedback would help.

Thanks in Advance,
Ashok B.
Thank you all for your answers.

However, there is another issue that I think I did not make clear. I intend to use this Java code in real time (and online). In other words, the code is supposed to be deployed in a Java application server (servlet container) such as JBoss or SailFin. So my question is: if I develop something in KNIME or another Java environment, will it still run in these environments?

Best Regards,
Jose Simoes

