# AnalyticBridge

A Data Science Central Community

# January 2017 Blog Posts (9)

### How to Handle Outliers in Regression Problems

New featured content for data scientists:

Data Science in Python: Pandas Cheat Sheet -- This cheat sheet, along with explanations, was first published on DataCamp. Click on the picture to zoom in. To view other cheat sheets (Python, R, Machine Learning, Probability, Visualizations, Deep Learning, Data Science, and so on) click here. To read the article,…


Added by Vincent Granville on January 31, 2017 at 10:30pm — No Comments

### Tutorial: Neutralizing Outliers in Any Dimension

The main focus of this article is on computing, in the presence of outliers, the point that minimizes the sum of the "distances" to n points in a d-dimensional space. This point is called the centroid or center.
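To make the idea concrete, here is a minimal sketch of one standard way to approximate such a point when "distance" means Euclidean distance: Weiszfeld's iteration for the geometric median. This is not the article's own source code, which is not reproduced in this excerpt; the function names `euclidean` and `geometric_median`, the tolerance, and the sample points are all assumptions chosen for illustration.

```python
import math

def euclidean(p, q):
    """Euclidean distance between two points given as equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def geometric_median(points, tol=1e-9, max_iter=1000):
    """Approximate the point minimizing the sum of Euclidean distances
    to `points` using Weiszfeld's iteration (an illustrative choice,
    not necessarily the method used in the article)."""
    dim = len(points[0])
    # Start from the arithmetic mean, which outliers can pull far away.
    current = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    for _ in range(max_iter):
        weight_sum = 0.0
        weighted = [0.0] * dim
        for p in points:
            dist = euclidean(p, current)
            if dist < 1e-12:
                # Simplification: skip a data point that coincides with the
                # current estimate to avoid dividing by zero.
                continue
            w = 1.0 / dist
            weight_sum += w
            for k in range(dim):
                weighted[k] += w * p[k]
        if weight_sum == 0.0:
            break  # every point coincides with the current estimate
        updated = [c / weight_sum for c in weighted]
        if euclidean(updated, current) < tol:
            return updated
        current = updated
    return current

# Four points on a square plus one distant outlier (hypothetical data).
sample = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (100.0, 100.0)]
mean = [sum(p[k] for p in sample) / len(sample) for k in range(2)]
center = geometric_median(sample)
print("mean:", mean)
print("geometric median:", center)
```

Each data point is weighted by the inverse of its distance to the current estimate, so a far-away outlier contributes little to the update. In the sample above the arithmetic mean is dragged toward the outlier, while the estimate from this sketch stays inside the square.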

This long article has several sections.

Content

1. A related physics problem

2. Algorithm to find the centroid

• Source code to generate points and compute centroid, using Monte…

Added by Vincent Granville on January 30, 2017 at 2:30pm — No Comments

### 46 SQL Job Interview Questions for Data Scientists

Here is our updated selection of featured articles and resources posted over the weekend:


Added by Vincent Granville on January 15, 2017 at 7:49pm — No Comments

### In Japan, "Artificial Intelligence" comes to be a super star while "Data Scientist" is fading away

I published a post about the current status of "Data Scientist" in Japan, as a periodic follow-up to an analysis I started two years ago. The trend I saw then continues, but it has gone beyond what I anticipated at that time.

Indeed, the growth of "Artificial Intelligence" in Japan is steeper than in English, and "Data Scientist" is now getting to be…


Added by Takashi J. OZAKI on January 13, 2017 at 6:30am — 1 Comment

### Ten Simple Rules for Effective Statistical Practice

This article, written by Kass RE, Caffo BS, Davidian M, Meng X-L, Yu B, and Reid N, contains the following rules:

• Statistical Methods Should Enable Data to Answer Scientific Questions
• Signals Always Come with Noise
• Statistical Analysis Is More Than a Set of Computations
• Keep it Simple
• Provide Assessments of Variability
• When Possible,…

Added by Vincent Granville on January 10, 2017 at 11:16am — No Comments

### How to build a search engine: Part 4

This post is the fourth part of a multi-part series on how to build a search engine.


Added by Vivek Kalyanarangan on January 10, 2017 at 1:00am — No Comments

### 7 Traps to Avoid Being Fooled by Statistical Randomness

Randomness is all around us. Its existence sends fear into the hearts of predictive analytics specialists everywhere -- if a process is truly random, then it is not predictable, in the analytic sense of that term. Randomness refers to the absence of patterns, order, coherence, and predictability in a system.

Unfortunately, we…


Added by Kirk Borne on January 9, 2017 at 6:00pm — 5 Comments

### 12 Statistical and Machine Learning Methods that Every Data Scientist Should Know

Below is my personal list of statistical and machine learning methods that every data scientist should know in 2016.

1. Statistical Hypothesis Testing (t-test, chi-squared test & ANOVA)
2. Multiple Regression (Linear Models)
3. Generalized Linear Models (GLM: Logistic Regression, Poisson Regression)
4. Random Forest
5. XGBoost (eXtreme Gradient Boosted Trees)
6. Deep Learning
7. Bayesian Modeling with…

Added by Takashi J. OZAKI on January 8, 2017 at 6:30am — 1 Comment

### Data science jobs not requiring human interactions

Added by Mirko Krivanek on January 6, 2017 at 1:30am — 2 Comments
