A Data Science Central Community
I understand that RapidMiner and R have limitations on dataset size (number of rows or records) since they are community software. However, is there any estimate of how many rows these packages can handle? Millions of rows? 1 million, 2 million?
I need this information. Thanks in advance.
Their status as open source, community software doesn't have anything to do with dataset size limitations.
Both work with data in memory, so the limit is the available memory: on a notebook with 2 GB of RAM the limit is lower than on a 64-bit server with 16 GB of RAM.
The limit doesn't depend on the number of rows alone, but also on what is in those rows: one row with 10 string fields can take more memory than 100 rows holding a few integer values.
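You can see this effect yourself. The sketch below is not RapidMiner- or R-specific; it just uses Python with NumPy to compare the in-memory footprint of a million integers against a million short strings (the column names and string contents are made up for illustration):

```python
import sys
import numpy as np

n_rows = 1_000_000

# A single integer column: fixed 8 bytes per value.
int_col = np.zeros(n_rows, dtype=np.int64)
print("integers:", int_col.nbytes / 1e6, "MB")  # 8.0 MB

# The same number of rows holding short strings costs far more,
# because each string is a separate object plus a pointer to it.
str_col = ["customer record %d" % i for i in range(n_rows)]
str_bytes = sum(sys.getsizeof(s) for s in str_col) + n_rows * 8
print("strings: ", str_bytes / 1e6, "MB")
```

On a typical 64-bit Python build the string column weighs in at several times the integer column, which is why "how many rows fit" has no single answer.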
So you can say: both packages can easily cope with millions of rows, as long as the data fits into the computer's RAM.