Vincent Granville's Blog (770)

How the Mathematics of Fractals Can Help Predict Stock Market Shifts

In financial markets, two of the most common trading strategies are momentum and mean reversion. If a stock exhibits momentum (or trending behavior as shown in the figure below), its price in the current period is more likely to increase (decrease) if it already increased (decreased) in the previous period.

When the return of a stock at time t depends in some way on the return at the previous time t-1, the returns are said to be autocorrelated. In…
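
As a rough illustration of the autocorrelation idea in this excerpt, here is a minimal Python sketch that is not taken from the linked article. It estimates the lag-1 autocorrelation of a return series. The helper names `lag1_autocorrelation` and `simulate_ar1`, the AR(1) simulation, and the parameter values are all assumptions made for the example. Reading a negative value as mean reversion is a common convention, not something stated in the excerpt.

```python
import random


def lag1_autocorrelation(returns):
    """Sample lag-1 autocorrelation of a sequence of returns."""
    n = len(returns)
    mean = sum(returns) / n
    # Denominator: total squared deviation from the mean.
    total = sum((r - mean) ** 2 for r in returns)
    # Numerator: co-movement of each return with the previous one.
    cross = sum((returns[t] - mean) * (returns[t - 1] - mean) for t in range(1, n))
    return cross / total


def simulate_ar1(phi, n, seed=0):
    """Hypothetical return series: r[t] = phi * r[t-1] + Gaussian noise."""
    rng = random.Random(seed)
    series = [0.0]
    for _ in range(n - 1):
        series.append(phi * series[-1] + rng.gauss(0.0, 0.01))
    return series


# phi > 0 mimics momentum; phi < 0 mimics mean reversion (assumed toy values).
momentum = simulate_ar1(0.5, 5000)
reverting = simulate_ar1(-0.5, 5000)

print("momentum  lag-1 autocorrelation:", lag1_autocorrelation(momentum))
print("reverting lag-1 autocorrelation:", lag1_autocorrelation(reverting))
```

With real data, `returns` would be the period-over-period returns of the stock; a value near zero would suggest no dependence on the previous period.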

Added by Vincent Granville on July 8, 2019 at 10:25am — No Comments

Where’s the Love – Trends in Data Science Career Opportunities

Summary: The annual Burtch Works salary survey tells us a lot about which industries are using the most data scientists and the difference between higher and lower skilled data scientists. Salary increases show us whether demand is increasing, and finally we take a shot at determining which skills are most in demand.

What a difference a few years can make. We used to say that everyone loves a data scientist – and wants to be one. …

Added by Vincent Granville on July 8, 2019 at 10:18am — No Comments

How to learn the maths of Data Science using your high school maths knowledge

By Ajit Jaokar. This post is a part of my forthcoming book on Mathematical foundations of Data Science. In this post, we use the Perceptron algorithm to bridge the gap between high school maths and deep learning. 

Background

As part of my role as course director of the Artificial Intelligence: Cloud and Edge Computing course at the University of Oxford, I see more students who are familiar with programming than with mathematics.

They have last learnt maths…

Added by Vincent Granville on June 27, 2019 at 12:22pm — No Comments

Machine Learning and Data Science Cheat Sheet

Originally published in 2014 and viewed more than 200,000 times, this is the oldest data science cheat sheet - the mother of all the numerous cheat sheets that are so popular nowadays. I decided to update it in June 2019. While the first half, dealing with installing components on your laptop and learning UNIX, regular expressions, and file management hasn't changed much, the second half, dealing with machine learning, was rewritten entirely from scratch. It is amazing how things changed in…

Added by Vincent Granville on June 6, 2019 at 8:27pm — No Comments

7 Simple Tricks to Handle Complex Machine Learning Issues

We propose simple solutions to important problems that all data scientists face almost every day. In short, a toolbox for the handyman, useful to busy professionals in any field.

1. Eliminating sample size effects

Many statistics, such as correlations or R-squared, depend on the sample size, making it difficult to…

Added by Vincent Granville on June 4, 2019 at 12:00pm — No Comments

Gentle Approach to Linear Algebra, with Machine Learning Applications

This simple introduction to matrix theory offers a refreshing perspective on the subject. Using a basic concept that leads to a simple formula for the power of a matrix, we see how it can solve time series, Markov chains, linear regression, data reduction, principal components analysis (PCA) and other machine learning problems. These problems are usually solved with more advanced matrix calculus, including eigenvalues, diagonalization, generalized inverse matrices, and other types of…

Added by Vincent Granville on May 28, 2019 at 9:00pm — No Comments

New Book: Classification and Regression In a Weekend (in Python)

We have added a new free book in our selection exclusively for DSC members. See the first entry below, to get started with machine learning with Python.

1. Book: Classification and Regression In a Weekend

This tutorial began as a series of weekend workshops created by Ajit Jaokar and Dan Howarth. The idea was to work with a specific (longish) program such that we explore as much of it as possible in one weekend. This book is an attempt to take this idea online.…

Added by Vincent Granville on May 16, 2019 at 6:24pm — No Comments

Confidence Intervals Without Pain, with Excel

We propose a simple model-free solution to compute any confidence interval and to extrapolate these intervals beyond the observations available in your data set. In addition, we propose a mechanism to sharpen the confidence intervals, reducing their width by an order of magnitude. The methodology works with any estimator (mean, median, variance, quantile, correlation and so on) even when the data set violates the classical requirements necessary to make traditional statistical techniques…

Added by Vincent Granville on May 9, 2019 at 11:30am — No Comments

Re-sampling: Amazing Results and Applications

This crash course features a new fundamental statistics theorem -- even more important than the central limit theorem -- and a new set of statistical rules and recipes. We discuss concepts related to determining the optimum sample size, the optimum k in k-fold cross-validation, bootstrapping, new re-sampling techniques, simulations, tests of hypotheses, confidence intervals, and statistical inference using a unified, robust, simple…

Added by Vincent Granville on May 4, 2019 at 12:30pm — No Comments

Some Fun with Gentle Chaos, the Golden Ratio, and Stochastic Number Theory

So many fascinating and deep results have been written about the number (1 + SQRT(5)) / 2 and its related sequence - the Fibonacci numbers - that it would take years to read all of them. This number has been studied both for its applications (population growth, architecture) and its mathematical properties, for over 2,000 years. It is still a topic of active research.…
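
To make the link between the Fibonacci numbers and (1 + SQRT(5)) / 2 concrete, here is a small Python sketch that is not taken from the linked article. It checks numerically that ratios of consecutive Fibonacci numbers approach the golden ratio. The function name `fibonacci_ratios` and the choice of 30 terms are assumptions made for the example.

```python
import math

# The number discussed in the excerpt: (1 + SQRT(5)) / 2.
GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def fibonacci_ratios(count):
    """Return the first `count` ratios F(k+1) / F(k), starting from 2 / 1."""
    a, b = 1, 1
    ratios = []
    for _ in range(count):
        a, b = b, a + b  # advance one step along the Fibonacci sequence
        ratios.append(b / a)
    return ratios


ratios = fibonacci_ratios(30)
for k in (1, 2, 5, 10, 30):
    gap = abs(ratios[k - 1] - GOLDEN_RATIO)
    print(f"ratio #{k}: {ratios[k - 1]:.12f}  gap to golden ratio: {gap:.2e}")
```

The gap shrinks quickly and alternates in sign, since successive ratios fall on either side of the limit.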

Added by Vincent Granville on April 25, 2019 at 7:30am — No Comments

Causality – The Next Most Important Thing in AI/ML

Summary: Finally there are tools that let us transcend ‘correlation is not causation’ and identify true causal factors and their relative strengths in our models. This is what prescriptive analytics was meant to be.

Just when I thought we’d figured it all out,…

Added by Vincent Granville on April 24, 2019 at 7:30pm — No Comments

New Stock Trading and Lottery Game Rooted in Deep Math

I describe here the ultimate number guessing game, played with real money. It is a new trading and gaming system, based on state-of-the-art mathematical engineering, robust architecture, and patent-pending technology. It offers an alternative to the stock market and traditional gaming. This system is also far more transparent than the stock market, and cannot be manipulated, as formulas to win the biggest returns (with real money) are made public. Also, it simulates a neutral,…

Added by Vincent Granville on April 15, 2019 at 10:00am — No Comments

A Radical AI Strategy - Platformication

Summary: A new business model strategy based around intermediary platforms powered by AI/ML is promising the most direct path to fastest growth, profitability, and competitive success. Adopting this new approach requires a deep change in mindset and is quite different from just adopting AI/ML to optimize your current operations.…

Added by Vincent Granville on April 8, 2019 at 11:00pm — No Comments

Long-range Correlations in Time Series: Modeling, Testing, Case Study

We investigate a large class of auto-correlated, stationary time series, proposing a new statistical test to measure departure from the base model, known as Brownian motion. We also discuss a methodology to deconstruct these time series, in order to identify the root mechanism that generates the observations. The time series studied here can be discrete or continuous in time; they can have various degrees of smoothness (typically measured using the Hurst exponent) as well as long-range or…

Added by Vincent Granville on April 1, 2019 at 1:00pm — No Comments

Fascinating Developments in the Theory of Randomness

I present here some innovative results from my most recent research on stochastic processes, chaos modeling, and dynamical systems, with applications to Fintech, cryptography, number theory, and random number generators. While covering advanced topics, this article is accessible to professionals with limited knowledge in statistical or mathematical theory. It introduces new material not covered in my recent book (available …

Added by Vincent Granville on March 21, 2019 at 7:30am — No Comments

How to Automatically Determine the Number of Clusters in your Data - and more

Determining the number of clusters when performing unsupervised clustering is a tricky problem. Many data sets don't exhibit well-separated clusters, and two people asked to count the clusters by looking at a chart are likely to give two different answers. Sometimes clusters overlap with each other, and large clusters contain sub-clusters, which makes the decision difficult.

For instance, how many clusters do you see in the picture below? What is the optimum number…

Added by Vincent Granville on March 13, 2019 at 6:00pm — No Comments

Deep Analytical Thinking and Data Science Wizardry

Many times, complex models are not enough (or too heavy), or not necessary, to get great, robust, sustainable insights out of data. Deep analytical thinking may prove more useful, and can be done by people not necessarily trained in data science, even by people with limited coding experience. Here we explore what we mean by deep analytical thinking, using a case study, and how it works: combining craftsmanship, business acumen, the use and creation of tricks and rules of thumb, to provide…

Added by Vincent Granville on March 7, 2019 at 1:46pm — No Comments

New Perspectives on Statistical Distributions and Deep Learning

In this data science article, emphasis is placed on science, not just on data. State-of-the-art material is presented in simple English, from multiple perspectives: applications, theoretical research asking more questions than it answers, scientific computing, machine learning, and algorithms. I attempt here to lay the foundations of a new statistical technology, hoping that it will plant the seeds for further research on a topic with a broad range of potential…

Added by Vincent Granville on February 23, 2019 at 11:00am — No Comments

A Plethora of Original, Not Well-Known Statistical Tests

Many of the following statistical tests are rarely discussed in textbooks or in college classes, much less in data camps. Yet they help answer a lot of different and interesting questions. I used most of them without even computing the underlying distribution under the null hypothesis, instead using simulations to check whether my assumptions were plausible or not. In short, my approach to statistical testing is model-free and data-driven. Some are easy to implement even in Excel. Some…

Added by Vincent Granville on February 13, 2019 at 7:00pm — No Comments

Machine Learning Glossary

For background to this post, please see Learn Machine Learning Coding Basics in a weekend. Here, we present the glossary that we use for the coding and the mindmap attached to these classes and upcoming book. About 80 terms are included in the glossary, covering Ensembles, Regression, Classification,…

Added by Vincent Granville on February 12, 2019 at 12:31pm — No Comments

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC
