A Data Science Central Community

Takashi J. OZAKI updated their profile

Jun 16, 2016

Takashi J. OZAKI posted a blog post

### 12 Statistical and Machine Learning Methods that Every Data Scientist Should Know

Below is my personal list of statistical and machine learning methods that every data scientist should know in 2016.

- Statistical Hypothesis Testing (t-test, chi-squared test & ANOVA)
- Multiple Regression (Linear Models)
- Generalized Linear Models (GLM: Logistic Regression, Poisson Regression)
- Random Forest
- Xgboost (eXtreme Gradient Boosted Trees)
- Deep Learning
- Bayesian Modeling with MCMC
- word2vec
- K-means Clustering
- Graph Theory & Network Analysis
- (A1) Latent Dirichlet Allocation & Topic Modeling
- (A2)…

Apr 20, 2016

Takashi J. OZAKI's blog post was featured

### Overview and simple trial of Convolutional Neural Network with MXnet

Actually, I've known about MXnet for weeks as one of the most popular libraries/packages among Kagglers, but only recently did I hear that the bug fixes are almost done, and some friends say the latest version looks stable, so at last I installed it.

MXnet: https://github.com/dmlc/mxnet

I…

Mar 30, 2016

Takashi J. OZAKI's blog post was featured

### Multivariate modeling vs. univariate modeling along human intuition: predicting taste of wine

I wrote a blog post inspired by Jamie Goode's book "Wine Science: The Application of Science in Winemaking".

In this book, Goode argued that a reductionist approach cannot explain the relationship between the chemical ingredients and the taste of wine. Indeed, we know that not all high-alcohol wines are excellent, although in general high-alcohol wines are believed to be good. Usually the taste of wine is affected by a complicated balance of many components such as sweetness, acidity, tannin, density, or others that are given…

Nov 28, 2015

Takashi J. OZAKI's blog post was featured

### R and Stan: introduction to Bayesian modeling

I wrote a series of blog posts on Bayesian modeling with R and Stan.

- Bayesian modeling with R and Stan (1): Overview
- Bayesian modeling with R and Stan (2): Installation and an easy example
- Bayesian modeling…

Aug 18, 2015

Takashi J. OZAKI's blog post was featured

### Even without any "golden feature", multivariate modeling can work

A/B testing is widely used for online marketing, management of Internet ads, and other routine analytics. In general, people use it to look for "golden features (metrics)" that are vital for growth hacking. To validate A/B tests, statistical hypothesis tests such as the t-test are used, and people try to find a metric with a significant effect across conditions. If you successfully find a metric with a significant difference between designs A and B of a click button, you'll…

Jun 20, 2015
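As a rough illustration of the kind of A/B-test check described in the teaser above, the sketch below compares click rates for two button designs. Note that it uses a two-proportion z-test from the Python standard library rather than the t-test the post mentions, simply because it needs no third-party packages. The function name `two_proportion_z_test` and the click counts are hypothetical and do not come from the original post.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for a difference in click rates between designs A and B."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    # Pooled click rate under the null hypothesis that A and B perform the same.
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: A got 120 clicks in 2400 views, B got 165 clicks in 2400 views.
z, p = two_proportion_z_test(120, 2400, 165, 2400)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value here only says that this single metric differs between the two designs; it does not say anything about how several metrics act together, which is the gap the post's title points at.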

Takashi J. OZAKI posted a blog post

### Overfitting or generalized? Comparison of ML classifiers - a series of articles

In my own blog I wrote a series of articles about how major machine learning classifiers work, with some visualization of their decision boundaries on various datasets.

- Machine learning for package users with R (0): Prologue
- Machine learning for package users with R (1): Decision Tree…

Jun 5, 2015

Takashi J. OZAKI's blog post was featured

### Overfitting or generalized? Comparison of ML classifiers - a series of articles

Jun 5, 2015

Takashi J. OZAKI's blog post was featured

### Decision tree vs. linearly separable or non-separable pattern

As part of a series of posts discussing how machine learning classifiers work, I ran a decision tree to classify an XY-plane, trained with XOR patterns or linearly separable patterns.

1. Simple (non-overlapped) XOR pattern

It worked well. Its decision boundary was drawn almost perfectly parallel to the assumed true…

Mar 25, 2015

Takashi J. OZAKI posted a blog post

### Decision tree vs. linearly separable or non-separable pattern

Mar 23, 2015

Takashi J. OZAKI's blog post was featured

### Overheating of "Artificial Intelligence" boom in Japan, while "Data Scientist" is fading out

I'm currently concerned about the incredible overheating of the "Artificial Intelligence" boom in Japan, while "Data Scientist" has faded away.

Google Trends shows that Japanese people are becoming less interested in statistics, which is believed to be the expertise of data scientists, and are now enthusiastic about Artificial Intelligence. I find this situation quite puzzling.

So, what's going on in 2015? I think quite a few data science experts in Japan would agree that "Artificial Intelligence"…

Mar 20, 2015

Takashi J. OZAKI's blog post was featured

### Learn how each ML classifier works: decision boundary vs. assumed true boundary

In the latest post on my own blog, I discussed how to learn visually how each machine learning classifier works. My idea is to first prepare training samples and show their assumed true boundary, and then compare that assumed boundary with the decision boundary estimated by the classifier, using a dense grid covering the space as the test dataset.

In the case below, the assumed true boundary of the space is a set of 3 parallel lines; I think everybody will guess so intuitively,…

Mar 14, 2015
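The procedure in the teaser above (train on samples, predict on a dense grid, compare with the assumed true boundary) can be sketched in a few lines. This is only a minimal illustration under stated assumptions: the assumed true boundary is a single line y = x rather than the 3 parallel lines of the original post, the classifier is a plain k-nearest-neighbour vote standing in for whichever classifier is being studied, and all function names are hypothetical.

```python
import random


def true_label(x, y):
    """Assumed true boundary (hypothetical): the line y = x splits the plane."""
    return 1 if y > x else 0


def knn_predict(train, x, y, k=5):
    """Classify (x, y) by majority vote among the k nearest training samples."""
    nearest = sorted(train, key=lambda s: (s[0] - x) ** 2 + (s[1] - y) ** 2)[:k]
    votes = sum(label for _, _, label in nearest)
    return 1 if 2 * votes > k else 0


def grid_agreement(train, steps=40, k=5):
    """Fraction of dense-grid points where the classifier matches the assumed truth."""
    matches = 0
    for i in range(steps):
        for j in range(steps):
            # Grid points cover the unit square [0, 1] x [0, 1].
            gx, gy = i / (steps - 1), j / (steps - 1)
            if knn_predict(train, gx, gy, k) == true_label(gx, gy):
                matches += 1
    return matches / (steps * steps)


rng = random.Random(0)
# Training samples drawn uniformly from the unit square, labelled by the assumed boundary.
train = []
for _ in range(200):
    x, y = rng.random(), rng.random()
    train.append((x, y, true_label(x, y)))

print(f"agreement on the grid: {grid_agreement(train):.3f}")
```

Plotting the grid predictions instead of just counting matches gives the visual comparison the post is about; the agreement score is merely a compact numeric summary of the same comparison.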

Takashi J. OZAKI posted a blog post

### Learn how each ML classifier works: decision boundary vs. assumed true boundary

Mar 13, 2015

Takashi J. OZAKI's blog post was featured

### Deep Belief Net with {h2o} on MNIST and its Kaggle competition

In order to evaluate how the Deep Belief Net (Deep Learning) of {h2o} works on actual datasets, I applied it to the MNIST dataset; but I got the dataset from a Kaggle competition on MNIST, so consequently I joined the competition. :P

As is well known, classification tasks such as MNIST should rather be done by a Convolutional NN…

Feb 26, 2015

Takashi J. OZAKI's blog post was featured

### Experiments of Deep Learning with {h2o} package on R

Below is the latest post on my blog (and the first in 10 months...).

What kind of decision boundaries does Deep Learning (Deep Belief Net) draw? Practice with R and {h2o} package

I once wrote a post about the relationship between the features of machine learning classifiers and their decision boundaries on the same dataset. The result was very interesting, and many people seemed to enjoy it and even debated it.

Actually, I've been looking for similar attempts with Deep Learning but I…

Feb 18, 2015

Takashi J. OZAKI's blog post was featured

### Answers to "10 questions about big data and data science" from Japan

I find the "10 questions about big data and data science" (http://www.datasciencecentral.com/profiles/blogs/participate-in-our-big-data-survey-interview-questions) raised by Dr. Vincent Granville very interesting. I have to say I'm by no means a leader of big data or data science in Japan, but I'm afraid nobody else will answer. So, as a personal opinion, I wrote my answers as a blog post.…

Feb 20, 2014

- Short Bio:
- Ph. D. Data Scientist

*Expertise*

Data Science: mainly for digital marketing

Statistics: Multivariate modeling, including Bayesian modeling

Machine Learning: including Deep Learning

*Proficiency*

Programming language: R, Stan, Python, SQL, C, Matlab

Natural language: Japanese, English (business level)

*CV*

Jan 2016 - present: Data Scientist, Google

Jul 2013 - Dec 2015: Data Scientist, Recruit Communications Co., Ltd.

Jun 2012 - Jun 2013: Data Scientist, CyberAgent, Inc.

Apr 2006 - May 2012: Academic Researcher in Cognitive Neuroscience, in some universities or national research institutes

Mar 2006: Ph. D. in Frontier Sciences, The University of Tokyo

- My Website or LinkedIn Profile (URL):
- http://tjo-en.hatenablog.com/

- Field of Expertise:
- Business Analytics, Predictive Modeling, Data Mining, Econometrics, Web Analytics, Statistical Consulting, Artificial Intelligence

- Years of Experience in Analytical Role:
- 4 yr

- Professional Status:
- Technical

- Interests:
- Networking

- What is your Favorite Data Mining or Analytical Website?
- http://www.kdnuggets.com/

- Your Company:

- Industry:
- Technology

- Your Job Title:
- Data Scientist

- How did you find out about AnalyticBridge?
- I saw an article by Dr. Vincent Granville via Google search.

Posted on January 13, 2017 at 6:30am 1 Comment 0 Likes

I published a post about the current status of "Data Scientist" in Japan, as a periodic follow-up to my analysis from two years ago. The trend still continues, but it has gone beyond what I anticipated at that time.

Indeed, the growth of "Artificial Intelligence" in Japan is steeper than…

Posted on January 8, 2017 at 6:30am 1 Comment 1 Like

Below is my personal list of statistical and machine learning methods that every data scientist should know in 2016.

- **Statistical Hypothesis Testing (t-test, chi-squared test & ANOVA)**
- **Multiple Regression (Linear Models)**
- **Generalized Linear Models (GLM: Logistic Regression, Poisson Regression)**
- **Random Forest**
- **Xgboost (eXtreme Gradient Boosted Trees)**
- **Deep Learning**
- **Bayesian Modeling with…**

Posted on March 30, 2016 at 8:30am 0 Comments 0 Likes

Actually, I've known about MXnet for weeks as one of the most popular libraries/packages among Kagglers, but only recently did I hear that the bug fixes are almost done, and some friends say the latest version looks stable, so at last I installed it.

MXnet: https://github.com/dmlc/mxnet…

Posted on November 26, 2015 at 8:43am 0 Comments 0 Likes

I wrote a blog post inspired by Jamie Goode's book "Wine Science: The Application of Science in Winemaking".

In this book, Goode argued that a reductionist approach cannot explain the relationship between the chemical ingredients and the taste of wine. Indeed, we know that not all high-alcohol wines are excellent, although in general high-alcohol wines are believed to be good. Usually the taste of wine is affected by a complicated balance of many components such as sweetness, acidity, tannin,…

## Comment Wall (1 comment)

Hi Paul (id: paulagyiri),

I'd be happy if you contact me via LinkedIn (http://jp.linkedin.com/in/tjozaki).

Thanks,

-TJO