A Data Science Central Community
New featured content for data scientists:
Data Science in Python: Pandas Cheat Sheet -- This cheat sheet, along with explanations, was first published on DataCamp. Click on the picture to zoom in. To view other cheat sheets (Python, R, Machine Learning, Probability, Visualizations, Deep Learning, Data Science, and so on) click here. To read the article,…Continue
Added by Vincent Granville on January 31, 2017 at 10:30pm — No Comments
The main focus of this article is computing the point that minimizes the sum of the "distances" to n points in a d-dimensional space, called the centroid or center, in the presence of outliers.
This long article has several sections.
1. A related physics problem
2. Algorithm to find the centroid
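To make the idea concrete, here is a minimal sketch of one well-known way to find a point minimizing the sum of distances: Weiszfeld's iteration for the geometric median. It is not necessarily the algorithm described in the article. It assumes plain Euclidean distance, and the function name `geometric_median` and the sample data are hypothetical, chosen only for illustration.

```python
import numpy as np


def geometric_median(points, tol=1e-7, max_iter=1000):
    """Approximate the point minimizing the sum of Euclidean distances.

    Uses Weiszfeld's iteration (an assumption for this sketch, not
    necessarily the article's method).
    """
    pts = np.asarray(points, dtype=float)
    estimate = pts.mean(axis=0)  # start from the ordinary mean
    for _ in range(max_iter):
        dists = np.linalg.norm(pts - estimate, axis=1)
        # Avoid division by zero if the estimate lands on a data point.
        dists = np.maximum(dists, 1e-12)
        weights = 1.0 / dists
        # Re-weighted average: far-away points get small weights.
        new_estimate = (weights[:, None] * pts).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate


# Four corners of a unit square plus one distant outlier (made-up data).
data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [100.0, 100.0]])
print("mean:", data.mean(axis=0))
print("geometric median:", geometric_median(data))
```

On this sample the ordinary mean is dragged far toward the outlier, while the sum-of-distances minimizer stays inside the unit square, which is the robustness property the article's problem statement is concerned with.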
Added by Vincent Granville on January 30, 2017 at 2:30pm — No Comments
Here is our updated selection of featured articles and resources posted over the weekend:
Added by Vincent Granville on January 15, 2017 at 7:49pm — No Comments
I published a post about the current status of "Data Scientist" in Japan, a periodic follow-up to an analysis I started two years ago. The trend continues, but it has gone beyond what I anticipated at that time.
Indeed, the growth of "Artificial Intelligence" in Japan is steeper than in English, and "Data Scientist" is now getting to be…Continue
This article, written by Kass RE, Caffo BS, Davidian M, Meng X-L, Yu B, and Reid N, contains the following rules:
Added by Vincent Granville on January 10, 2017 at 11:16am — No Comments
This post is the fourth part of a multi-part series on how to build a search engine –
Added by Vivek Kalyanarangan on January 10, 2017 at 1:00am — No Comments
Randomness is all around us. Its existence strikes fear into the hearts of predictive analytics specialists everywhere -- if a process is truly random, then it is not predictable, in the analytic sense of that term. Randomness refers to the absence of patterns, order, coherence, and predictability in a system.
Below is my personal list of statistical and machine learning methods that every data scientist should know in 2016.