A Data Science Central Community

Both R and Python should be judged by how effective they are for advanced analytics and data science. As newcomers to the field, we spend a good amount of time trying to understand the pros and cons of the two. I carried out this comparison purely for myself, to decide which tool I should pick to get deeper into data science. Eventually I realized that each of them (R and Python) has its own area of mastery, along with broad support for data science. Here is my understanding of “when to use what”.

- R is very rich when you get into descriptive statistics, inference, and statistical modeling, and when you start plotting your data as bar charts, pie charts, and histograms. It works best when your data is already well shaped and easy to consume for statistical modeling using vectors, matrices, and so on.
- First-time learners who have some knowledge of statistics can use R to dig into graphs and visualizations of their data: spotting trends, identifying correlations, and so on. I observed that you don’t need to practice R as a separate programming language first. You can start sculling your boat into the depths of statistics with R in the other hand.
- R plays a vital role for analysts who love to see the data distributions before drawing conclusions. It also helps analysts visualize outliers and the data density of a given data set.
- As you get further into probability problems and probability distributions, R eases data manipulation with vectors and matrices. The same applies to linear regression problems.
- Because R supports statistics-heavy problems so well, you don’t need to get into the complexities of Python, OOP, and the nitty-gritty of data types.
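
To make the idea of "seeing the distribution before drawing conclusions" concrete, here is a minimal sketch of a descriptive summary and an outlier check. It is written in Python using only the standard library so that it can sit beside the Python examples discussed later. The sample data and the helper names `describe` and `iqr_outliers` are hypothetical, and the 1.5 × IQR cutoff is just one common rule of thumb, not something the original post prescribes.

```python
import statistics

# Hypothetical sample: daily page views, with one suspicious spike.
views = [120, 132, 128, 141, 119, 135, 127, 480, 130, 125]


def describe(values):
    """Return a few descriptive statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "q1": q1,
        "q3": q3,
    }


def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


summary = describe(views)
outliers = iqr_outliers(views)
print(summary)
print("outliers:", outliers)
```

In this sample the mean sits well above the median, which is the kind of hint that a single extreme value is skewing the data; a histogram or box plot in R would show the same thing visually.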

Now, when you move into predictive modeling, machine learning, and mathematical modeling, Python can lend an easy hand. Mathematical functions and algorithmic problems find good support in Python libraries for k-means and hierarchical clustering, multivariate regression, SVM, and so on. Beyond that, Python also has good support from data processing and data munging libraries such as pandas and NumPy. Here are my two cents on Python:

- We know! Python is a full-fledged “scripting language”, and that statement says it all. Most importantly, over the years Python has developed an ecosystem for end-to-end analytics.
- You are no longer confined to data processing and formalization; you can also handle data sourcing and data parsing using a programming model. This opens up opportunities to analyze semi-structured data (JSON, XML) easily.
- With Python you are free to consume data from unstructured sources too. Hadoop's streaming support extends the possibilities of using Python on unstructured data stored on HDFS, and on data from HBase for graph and networked data processing.
- With rich libraries like scikit-learn you can do text mining, vectorize text data, and identify similarities between posts and texts.
- With an object-oriented language in hand, your programs for complex mathematical calculations will be more structured and modular than in R. I would simply call it easier to read.
- There is a lot of ready-to-use material on machine learning and predictive modeling with Python. Read these two in combination: Machine learning with Python + Building ML with python.
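
Two of the points above, parsing semi-structured data and finding similar posts, can be sketched in a few lines. This is only an illustration: it uses Python's standard library (`json`, `collections.Counter`) rather than scikit-learn, the three posts are made-up sample data, and `vectorize` and `cosine_similarity` are hypothetical helper names. In practice scikit-learn provides vectorizers such as `TfidfVectorizer` that do this job more thoroughly.

```python
import json
import math
import re
from collections import Counter

# Hypothetical semi-structured input: a JSON array of posts.
raw = """
[
  {"id": 1, "text": "R is great for statistical modeling and plotting"},
  {"id": 2, "text": "Python is great for machine learning and text mining"},
  {"id": 3, "text": "Plotting and statistical modeling in R"}
]
"""


def vectorize(text):
    """Turn text into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


posts = json.loads(raw)  # parse the JSON text into Python lists and dicts
vectors = {p["id"]: vectorize(p["text"]) for p in posts}

sim_1_3 = cosine_similarity(vectors[1], vectors[3])
sim_1_2 = cosine_similarity(vectors[1], vectors[2])
print(f"post 1 vs 3: {sim_1_3:.2f}")
print(f"post 1 vs 2: {sim_1_2:.2f}")
```

Posts 1 and 3 share most of their vocabulary, so they score higher than posts 1 and 2. Raw word counts let common words such as "is" and "and" inflate the score, which is one reason a TF-IDF weighting is usually preferred for real text.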

So, in summary, we can bet on R when we start with statistical analysis and then eventually turn to Python to take the problem to a predictive end.

This write-up isn't meant to highlight the limitations of R or Python. R has evolved good support for ML and does integrate with Hadoop through RHadoop. Python, likewise, has good support for statistics and a rich library (matplotlib) for visualization. But, as I mentioned earlier in this write-up, the points above are based solely on ease of use while you learn data science. I suppose that, once matured, we can develop expertise in either one of them as the job role demands.

Original post: http://datumengineering.wordpress.com/2014/02/08/r-python/


