A Data Science Central Community

While many programming libraries encapsulate the inner workings of graph and other algorithms, as a data scientist it helps a lot to be reasonably familiar with such details. A solid understanding of the intuition behind these algorithms not only helps in appreciating their logic but also helps in making conscious decisions about their applicability to real-life cases. There are several graph-based algorithms, and the most notable are the shortest path…
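To give a feel for what a library hides, here is a minimal sketch of one well-known shortest-path method, Dijkstra's algorithm, written with only the Python standard library. The teaser does not say which algorithms the article covers, so the choice of Dijkstra, the function name `shortest_path_lengths`, and the toy graph are all assumptions made for illustration.

```python
import heapq


def shortest_path_lengths(graph, source):
    """Dijkstra's algorithm: shortest distance from source to each reachable node.

    graph maps every node to a dict {neighbor: non-negative edge weight}.
    """
    dist = {source: 0}
    queue = [(0, source)]  # min-heap of (distance so far, node)
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry; a shorter route was already found
        for neighbor, weight in graph.get(node, {}).items():
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return dist


# Hypothetical toy graph, not taken from the article.
toy_graph = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}
distances = shortest_path_lengths(toy_graph, "A")
print(distances)
```

Knowing that the method assumes non-negative edge weights is exactly the kind of detail that informs whether it applies to a real-life case.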

Added by Vincent Granville on January 21, 2020 at 10:12am

In 2019, Google announced TensorFlow 2.0, a major leap from TensorFlow 1.x. The key differences are as follows:

**Ease of use:** Many old libraries (for example, tf.contrib) were removed, and some were consolidated. In TensorFlow 1.x, a model could be built using Contrib, layers, Keras, or Estimators, and having so many options for the same task confused many new users. TensorFlow 2.0 promotes TensorFlow Keras for model experimentation and Estimators…

Added by Vincent Granville on January 9, 2020 at 9:49am

***Summary:** AI/ML itself is the next big thing for many fields if you’re on the outside looking in. But if you’re a data scientist it’s possible to see those advancements that will propel AI/ML to its next phase of utility.*

“The Next Big Thing in AI/ML is…” as the lead to an article is probably the most…

Added by Vincent Granville on January 7, 2020 at 7:41am

*Another good article by Ajit Jaokar.*

**Correlation does not equal causation** is a mantra drilled into data scientists from an early age.

That’s fine, but very few talk about the follow-on question:

*How exactly do you determine causation?*

This problem is further compounded because most books and examples are based on standard datasets (e.g., Boston, Iris). These examples do not discuss…
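A small simulation can make the mantra concrete. The sketch below is not from the article; it assumes a hypothetical hidden variable `z` that drives both `x` and `y`, so the two end up strongly correlated even though neither causes the other. All names and noise levels are illustrative assumptions.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 2000
z = [rng.gauss(0, 1) for _ in range(n)]       # hidden common cause (assumed)
x = [zi + 0.5 * rng.gauss(0, 1) for zi in z]  # x depends on z only
y = [zi + 0.5 * rng.gauss(0, 1) for zi in z]  # y depends on z only

raw_corr = pearson(x, y)

# Subtract the confounder's contribution. This is possible here only because
# the data is simulated and z is known; real data rarely offers that luxury.
adjusted_corr = pearson(
    [xi - zi for xi, zi in zip(x, z)],
    [yi - zi for yi, zi in zip(y, z)],
)
print(raw_corr, adjusted_corr)
```

Once the common cause is accounted for, the apparent relationship between `x` and `y` largely disappears, which is why a raw correlation alone says little about causation.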

Added by Vincent Granville on December 17, 2019 at 2:30pm

*Written by Ajit Jaokar.*

Firstly, there are three broad categories of algorithms:

- **Supervised learning:** You know how to classify the input data and the type of behavior you want to predict, but you need the algorithm to calculate it for you on new data.
- **Unsupervised learning:** You do not know how to classify the data, and you want the algorithm to find patterns and classify the data for…
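The contrast between the first two categories can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the article's own example: the data points, the labels, and the helper names `predict_nearest_neighbor` and `two_means` are all hypothetical.

```python
# Hypothetical 1-D measurements forming two well-separated groups.
points = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["low", "low", "low", "high", "high", "high"]  # used only when supervised


def predict_nearest_neighbor(x, train_points, train_labels):
    """Supervised: return the label of the closest labeled training point."""
    nearest = min(range(len(train_points)), key=lambda i: abs(train_points[i] - x))
    return train_labels[nearest]


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2)."""
    centers = [min(values), max(values)]  # simple deterministic starting centers
    assignment = []
    for _ in range(iterations):
        # Assign each value to its nearest center.
        assignment = [
            0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1 for v in values
        ]
        # Move each center to the mean of the values assigned to it.
        for k in (0, 1):
            members = [v for v, a in zip(values, assignment) if a == k]
            if members:
                centers[k] = sum(members) / len(members)
    return assignment, centers


predicted = predict_nearest_neighbor(4.5, points, labels)  # uses the known labels
clusters, centers = two_means(points)                      # never sees the labels
print(predicted, clusters, centers)
```

The supervised routine needs the labels to answer anything, while the clustering routine recovers the same two groups from the raw values alone but cannot name them.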

Added by Vincent Granville on December 17, 2019 at 9:00am

There's no doubt about it, probability and statistics is an enormous field, encompassing topics from the familiar (like the average) to the complex (regression analysis, correlation coefficients and hypothesis testing to name but a few). If you want to be a great data scientist, you have to know some basic statistics. The following picture shows which statistics topics you must know if you're going to excel in data science.…

Added by Vincent Granville on December 12, 2019 at 6:30pm

At the time of writing, I'm a 52-year-old working in the fields of mathematics and data science. In mathematics, that makes me well-seasoned (and probably well-tenured, if I had chosen to continue in academia). In data science, some would consider me a dinosaur. In fact, many older people considering a career in data science might be put off by the thought that data science is tough to break into at a later age. But is that statement true? Should the over-50 crowd put down their textbooks…

Added by Vincent Granville on December 10, 2019 at 11:51am

We study the properties of a typical chaotic system to derive general insights that apply to a large class of unusual statistical distributions. The purpose is to create a unified theory of these systems. These systems can be deterministic or random, yet due to their gentle chaotic nature, they exhibit the same behavior in both cases. They lead to new models with numerous applications in Fintech, cryptography, simulation and benchmarking tests of statistical hypotheses. They are also…

Added by Vincent Granville on November 29, 2019 at 2:30am

***Summary:** 99% of our application of NLP has to do with chatbots or translation. This is a very interesting story about expanding the bounds of NLP and feature creation to predict bestselling novels. The authors created over 20,000 NLP features, about 2,700 of which proved to be predictive with a 90% accuracy rate in predicting NYT bestsellers.…*

Added by Vincent Granville on November 28, 2019 at 10:00pm

In this article, we explore a new class of generalized univariate normal distributions that satisfy useful statistical properties, with interesting applications. This class is defined by its characteristic function, and applications are discussed in the last section. These distributions are semi-stable (we define what this means below). In short, this is a much wider class than the stable distributions (the only stable distribution with a finite variance…

Added by Vincent Granville on November 27, 2019 at 11:14pm

Machine learning is a hot topic in research and industry, with new methodologies developed all the time. The speed and complexity of the field make keeping up with new techniques difficult even for experts, and potentially overwhelming for beginners.

To demystify machine learning and to offer a learning path for those who are new to the core…

Added by Vincent Granville on November 27, 2019 at 10:58am

*This article is by Jorge Castañón, Ph.D., Senior Data Scientist at the IBM Machine Learning Hub.*

Data visualization plays two key roles:

1. *Communicating results clearly to a general audience.*

2. …

Added by Vincent Granville on November 12, 2019 at 10:00am

Some original and very interesting material is presented here, with possible applications in Fintech. No need for a PhD in math to understand this article: I tried to make the presentation as simple as possible, focusing on high-level results rather than technicalities. Yet, professional statisticians and mathematicians, even academic researchers, will find some deep and fascinating results worth further exploring.…

Added by Vincent Granville on October 26, 2019 at 6:00pm

By Bill Vorhies.

***Summary:** Here’s a proposal for real ‘zero touch’, ‘set-em-and-forget-em’ machine learning from the researchers at Amazon. If you have an environment as fast changing as e-retail and a huge number of models matching buyers and products, you could achieve real cost savings and revenue increases by making the refresh cycle faster and more accurate with automation. This capability likely will be coming soon to your favorite AML…*

Added by Vincent Granville on October 22, 2019 at 2:30pm

This list of lists contains books, notebooks, presentations, cheat sheets, and tutorials covering all aspects of data science, machine learning, deep learning, statistics, math, and more, with most documents featuring Python or R code and numerous illustrations or case studies. All this material is available for free, and consists of content mostly created in 2019 and 2018, by various top experts in their respective fields. A few of these documents are available on LinkedIn: see last…

Added by Vincent Granville on October 13, 2019 at 11:00am

I have used synthetic data sets many times for simulation purposes, most recently in my articles Six degrees of Separations between any two Datasets and How to Lie with p-values. Many…

Added by Vincent Granville on October 2, 2019 at 5:00pm

This is an interesting data science conjecture, inspired by the well-known six degrees of separation problem, which states that there is a link involving no more than 6 connections between any two people on Earth, for instance between you and anyone living in, say, North Korea.

Here the link is between any two univariate data sets…

Added by Vincent Granville on September 9, 2019 at 10:30am

The material discussed here is also of interest to machine learning, AI, big data, and data science practitioners, as much of the work is based on heavy data processing, algorithms, efficient coding, testing, and experimentation. Also, it's not just two new conjectures, but paths and suggestions to solve these problems. The last section contains a few new, original exercises, some with solutions, and may be useful to students, researchers, and instructors offering math and statistics classes…

Added by Vincent Granville on September 8, 2019 at 4:09am

Machine learning is a hot topic in research and industry, with new methodologies developed all the time. The speed and complexity of the field make keeping up with new techniques difficult even for experts, and potentially overwhelming for beginners.

To demystify machine learning and to offer a learning path for those who are new to the core…

Added by Vincent Granville on August 30, 2019 at 11:08am

I introduce here a family of very peculiar statistical distributions governed by two parameters: *p*, a real number in [0, 1], and *b*, an integer > 1.

Potential applications are found in cryptography, Fintech (stock market modeling), Bitcoin, number theory, random number…

Added by Vincent Granville on August 30, 2019 at 10:11am


© 2020 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
