A Data Science Central Community
Many data scientists have a passion for mathematics, and many modern math problems can be explored using data science. Below is a selection of interesting articles, many about challenging, deep mathematical problems, by a data scientist who developed math-free algorithms. Some of these articles cover statistical theory and thus belong to…Continue
Added by Vincent Granville on May 28, 2018 at 11:30am — No Comments
Mathematical Olympiads are popular among high school students. However, there is nothing similar for college students, except maybe IMC. Even IMC is not popular. It focuses mostly on the same kind of problems as high school Olympiads, and you cannot participate if you are over 23 years old. In addition, it is organized by country, as opposed to globally, thus favoring countries with a large population. Topics such as…Continue
Added by Vincent Granville on May 25, 2018 at 9:00am — No Comments
The list below is a (non-comprehensive) selection of what I believe should be taught first in data science classes, based on 30 years of business experience. This is a follow-up to my article Why logistic regression should be taught last.
I am not sure whether the topics below are even discussed in data camps or college…Continue
Added by Vincent Granville on May 24, 2018 at 9:15pm — No Comments
It is crucial to ask the right questions and/or understand the problem, prior to beginning data analysis. Below is a list of 20 questions you need to ask before delving into analysis:
Added by Cynthia Clare on May 23, 2018 at 8:30pm — No Comments
Predictive analytics uses current and historical data to determine the probability of a particular outcome. This is a particularly powerful approach when it is applied to medical diagnosis. In an effort to reduce misdiagnosis, historical data on former patients' symptoms may be applied to the assessment of a new patient.
While doctors are the ultimate experts and decision-makers, using predictive analytics as a means of establishing precedent for…Continue
Added by Goli Tajadod on May 22, 2018 at 2:30am — No Comments
Here is our selection of featured articles and resources posted in the last few days.
Added by Vincent Granville on May 19, 2018 at 3:22pm — No Comments
With a data analysis plan, you know what you’re going to do when you actually sit down to do the analysis of the data you’ve gathered. It’s a vitally important thing for you to have, as it will guide how you’re going to collect your data. After all, it’s very difficult to add in new variables afterward.
For that reason, you want to make sure you’ve created your plan beforehand so that you can be sure that you’re asking all the questions you need to and you know what you’re going to…Continue
Added by Vincent Granville on May 13, 2018 at 5:30pm — No Comments
Here is our selection of featured articles and resources posted in the last few days:
Added by Vincent Granville on May 13, 2018 at 3:00pm — No Comments
These articles are between 3 and 5 years old, but are still valuable today. The methodology they use is still state-of-the-art. Some discuss immense data sets that are still available to the public and that led to the design of new machine learning techniques to handle them.
I am in the process of organizing these articles (which I wrote) to eventually self-publish data science tutorials, in a few separate booklets, that are easy to understand for the…Continue
Added by Vincent Granville on May 12, 2018 at 8:30pm — No Comments
In this article, we show that the issue with polynomial regression is not over-fitting, but numerical precision. Even when the regression is done right, numerical precision remains an insurmountable challenge. We focus here on step-wise polynomial regression, which is supposed to be more stable than the traditional model. In step-wise regression, we estimate one coefficient at a time, using the classic least squares technique. …Continue
Added by Vincent Granville on May 9, 2018 at 9:30pm — No Comments
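As a rough illustration of the step-wise idea in the polynomial regression post above (one coefficient at a time, each by ordinary least squares), here is a minimal sketch. The excerpt is truncated, so this is only one plausible reading, not the author's actual procedure: the function name `stepwise_poly_fit`, the single greedy pass over powers 0 to `degree`, and the choice to fit each power against the running residual are all assumptions made for illustration.

```python
def stepwise_poly_fit(xs, ys, degree):
    """Greedy one-pass fit: estimate a_0, a_1, ..., a_degree one at a time.

    Hypothetical helper for illustration only. Each coefficient is the
    one-variable least squares solution against the current residual.
    """
    residual = list(ys)
    coeffs = []
    for k in range(degree + 1):
        basis = [x ** k for x in xs]
        denom = sum(b * b for b in basis)
        # Minimise sum((residual - a * basis)^2) over the single unknown a.
        a = sum(r * b for r, b in zip(residual, basis)) / denom if denom else 0.0
        coeffs.append(a)
        # Remove the part of the residual explained by this power of x.
        residual = [r - a * b for r, b in zip(residual, basis)]
    return coeffs, residual


if __name__ == "__main__":
    coeffs, residual = stepwise_poly_fit([-1, 0, 1], [1, 0, 1], 2)
    print(coeffs, residual)
```

Because the powers of x are not orthogonal, a single greedy pass generally does not reproduce the joint least squares solution. In the usage example the data follow y = x^2 exactly, yet the pass leaves a nonzero residual.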
Guest blog post by David Enríquez Arriano. For more information or to get higher-resolution pictures, contact the author (see contact information at the bottom of this article).
This is a different approach to solving the AI problem. It is a cognitive math based on pyramids built with self-programming logic gates through learning.
A Boolean polynomial associated with a given truth table can be implemented with electronic…Continue
Added by Vincent Granville on May 8, 2018 at 4:07pm — No Comments
Figure 1. Scatter plot of word embedding coordinates (coordinate #3 vs. coordinate #10). You can see that semantically related words are close to each other.
This blog post is an extract from chapter 6 of the book “From Words to Wisdom. An Introduction to Text Mining…Continue
Added by Rosaria Silipo on May 7, 2018 at 12:00am — No Comments
Here is our selection of featured resources and articles published in the last few days. Enjoy the reading!
Added by Vincent Granville on May 5, 2018 at 9:57am — No Comments
Summary: Our starting assumption that sequence problems (language, speech, and others) are the natural domain of RNNs is being challenged. Temporal Convolutional Nets (TCNs), which are our workhorse CNNs with a few new features, are outperforming RNNs on major applications today. Looks like RNNs may well be history.
Added by Vincent Granville on May 2, 2018 at 7:59am — No Comments
Summary: Not everyone wants to invest the time and money to become a data scientist, and if you’re mid-career the barriers are even higher. If you still want to be deeply involved in the new data-driven economy and well paid, the growth rate and opportunities as a data engineer or business analyst need to be on your radar screen.
Added by Vincent Granville on May 1, 2018 at 8:17am — No Comments