This list of lists contains books, notebooks, presentations, cheat sheets, and tutorials covering all aspects of data science, machine learning, deep learning, statistics, math, and more, with most documents featuring Python or R code and numerous illustrations or case studies. All this material is available for free, and consists of content mostly created in 2019 and 2018, by various top experts in their respective fields. A few of these documents are available on LinkedIn: see last section on how to download them. Below are the first two sections.

General References
- Free Deep Learning Book (639 pages) by Prof. Gilles Louppe
- Python Crash Course (562 pages) by Eric Matthes
- Free Book: Applied Data Science (141 pages) - Columbia University
- Data Science in Practice
- Machine Learning 101 - By Jason Mayes, Google
- The Ultimate guide to AI, Data Science & Machine Learning
- Free Handbooks for Data Science Professionals
- Free Book: Natural Language Processing with Python
- Data Visualization Resources
- Textbook: Probability Course - Harvard University
- Textbook: The Math of Machine Learning - Berkeley University
- Comprehensive Guide to Machine Learning - Berkeley University
- Free Book: Foundations of Data Science - by Microsoft Research
- Comprehensive Guide on Machine Learning - by J.P. Morgan
- Gentle Approach to Linear Algebra - by Vincent Granville

Data Science Central Books, Booklets and References
- Statistics: New Foundations, Toolbox, and Machine Learning Recipes
- Deep Learning and Computer Vision with CNNs
- Getting Started with TensorFlow 2.0
- Classification and Regression in a Weekend
- Online Encyclopedia of Statistical Science
- Azure Machine Learning in a Weekend
- Enterprise AI - An Application Perspective
- Applied Stochastic Processes
- Comprehensive Repository of Data Science and ML Resources
- Foundations of ML and Data Science for Developers
- Elegant Representation of Forward/Back Propagation in Neural Networks
- Learning the Math of Data Science

To access all these documents and more, follow this link.

I have used synthetic data sets many times for simulation purposes, most recently in my articles Six degrees of Separations between any two Datasets and How to Lie with p-values. Many applications (including the data sets themselves) can be found in my books Applied Stochastic Processes and New Foundations of Statistical Science. For instance, these data sets can be used to benchmark statistical hypothesis tests (with the null hypothesis known in advance to be true or false) and to assess the power of such tests or the accuracy of confidence intervals. In other cases, they are used to simulate clusters and to test cluster detection and pattern detection algorithms (see here). I also used such data sets to discover two new deep conjectures in number theory (see here), to design new Fintech models such as bounded Brownian motions, and to find new families of statistical distributions (see here).

Figure: Goldbach's comet

In this article, I focus on peculiar random data sets to prove, heuristically, two of the most famous conjectures in number theory, both related to prime numbers: the Twin Prime conjecture and the Goldbach conjecture. The methodology is at the intersection of probability theory, experimental math, and probabilistic number theory. It involves working with infinite data sets, dwarfing any data set found in any business context.

Read the full article here.
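The benchmarking idea mentioned above (synthetic data for which the null hypothesis is known to be true or false) can be illustrated with a small simulation. The following is only a minimal sketch under stated assumptions, not code from the articles or books cited: it assumes normally distributed data with known standard deviation 1, a two-sided z-test of the mean, and hypothetical helper names (`z_test_p_value`, `rejection_rate`). When the true mean equals the hypothesized one, the rejection rate estimates the type I error; when it differs, the rejection rate estimates the power.

```python
import random
from statistics import NormalDist, fmean


def z_test_p_value(sample, mu0=0.0, sigma=1.0):
    """Two-sided p-value of a z-test for H0: mean == mu0 (sigma assumed known)."""
    n = len(sample)
    z = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, n=30, trials=2000, alpha=0.05, seed=0):
    """Fraction of synthetic samples for which H0: mean == 0 is rejected.

    Each sample is drawn from N(true_mean, 1), so we know in advance
    whether H0 is true (true_mean == 0) or false (true_mean != 0).
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample) < alpha:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # H0 true: the rejection rate should be close to alpha.
    print("estimated type I error:", rejection_rate(true_mean=0.0))
    # H0 false: the rejection rate estimates the power of the test.
    print("estimated power:", rejection_rate(true_mean=0.5))
```

The same loop works for any other test: swap in a different p-value function and a different synthetic data generator, and compare the observed rejection rates with the nominal level.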

This is an interesting data science conjecture, inspired by the well-known six degrees of separation problem, which states that any two people on Earth (say, you and anyone living in North Korea) are linked by a chain of no more than 6 connections. Here the link is between any two univariate data sets of the same size, say Data A and Data B. The claim is that there is a chain involving no more than 6 intermediary data sets, each highly correlated to the previous one (with a correlation above 0.8), between Data A and Data B. The concept is illustrated in the example below, where only 4 intermediary data sets (labeled Degree 1, Degree 2, Degree 3, and Degree 4) are actually needed.

Figure: Correlation table for the 6 data sets

To view the (random) data sets, understand how the chain of intermediary data sets was built, and access the spreadsheets to reproduce the results or test on different data, follow this link. It makes for an interesting theoretical data science research project, for people with too much free time on their hands.
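One way to experiment with such a chain is geometric. The sketch below is an assumption made here for illustration, not necessarily the construction used in the linked article or its spreadsheets: after centering and normalizing, two data sets become unit vectors, their correlation is the cosine of the angle between them, and intermediary data sets can be placed at equal angular steps between them (spherical interpolation). The function names are hypothetical. With 4 intermediaries, each step covers one fifth of the angle between Data A and Data B, so consecutive correlations equal the cosine of that fifth.

```python
import math
import random


def center_and_normalize(x):
    """Subtract the mean and scale to unit length."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    norm = math.sqrt(sum(v * v for v in centered))
    return [v / norm for v in centered]


def pearson(x, y):
    """Pearson correlation, computed as a dot product of normalized vectors."""
    u, v = center_and_normalize(x), center_and_normalize(y)
    return sum(p * q for p, q in zip(u, v))


def build_chain(a, b, n_intermediate=4):
    """Return intermediary data sets equally spaced in angle between a and b."""
    u, v = center_and_normalize(a), center_and_normalize(b)
    cos_theta = max(-1.0, min(1.0, sum(p * q for p, q in zip(u, v))))
    theta = math.acos(cos_theta)
    sin_theta = math.sin(theta)
    if sin_theta < 1e-12:
        # Perfectly (anti-)correlated inputs: the interpolation is undefined.
        raise ValueError("a and b are collinear after centering")
    steps = n_intermediate + 1
    chain = []
    for k in range(1, steps):
        t = k / steps
        w_a = math.sin((1.0 - t) * theta) / sin_theta
        w_b = math.sin(t * theta) / sin_theta
        chain.append([w_a * p + w_b * q for p, q in zip(u, v)])
    return chain


if __name__ == "__main__":
    rng = random.Random(1)
    data_a = [rng.random() for _ in range(50)]
    data_b = [rng.random() for _ in range(50)]
    full_chain = [data_a] + build_chain(data_a, data_b) + [data_b]
    for left, right in zip(full_chain, full_chain[1:]):
        print(round(pearson(left, right), 4))
```

Because the angle between two centered data sets is at most 180 degrees, each of the 5 steps in this particular construction spans at most 36 degrees, and cos(36°) is about 0.809, just above the 0.8 threshold. The intermediaries here are on a standardized scale; any positive rescaling or shift leaves the correlations unchanged.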

The material discussed here is also of interest to machine learning, AI, big data, and data science practitioners, as much of the work is based on heavy data processing, algorithms, efficient coding, testing, and experimentation. Also, it's not just two new conjectures, but paths and suggestions to solve these problems. The last section contains a few new, original exercises, some with solutions, and may be useful to students, researchers, and instructors offering math and statistics classes at the college level: they range from easy to very difficult. Some great probability theorems are also discussed, in layman's terms: see section 1.2. The two deep conjectures highlighted in this article (conjectures B and C) are related to the digit distribution of well known math constants such as Pi or log 2, with an emphasis on binary digits of SQRT(2). This is an old problem, one of the most famous ones in mathematics, still unsolved today.

Content of this article
- A Strange Recursive Formula
- Conjecture A
- A deeper result
- Conjecture B
- Connection to the Berry-Esseen theorem
- Potential path to solving this problem
- Potential Solution Based on Special Rational Number Sequences
- Interesting statistical result
- Conjecture C
- Another curious statistical result
- Exercises

Read the full article here.
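For readers who want to look at the binary digits of SQRT(2) themselves, here is a minimal sketch. It is not the recursive formula from the article (which is not reproduced in this summary); it simply uses exact integer arithmetic, relying on the fact that floor(SQRT(2) * 2^n) equals the integer square root of 2 * 4^n. The function names are hypothetical.

```python
from math import isqrt


def sqrt2_binary_digits(n):
    """Return the first n binary digits of sqrt(2) after the binary point."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # isqrt(2 * 4**n) == floor(sqrt(2) * 2**n), computed exactly on integers.
    scaled = isqrt(2 << (2 * n))
    bits = bin(scaled)[2:]  # n + 1 bits; the leading bit is the integer part (1)
    return [int(ch) for ch in bits[1:]]


def proportion_of_ones(digits):
    """Fraction of digits equal to 1."""
    return sum(digits) / len(digits)


if __name__ == "__main__":
    for n in (100, 1000, 10000):
        digits = sqrt2_binary_digits(n)
        print(n, proportion_of_ones(digits))
```

Printing the proportion of ones for growing n gives an empirical view of the digit distribution. It is evidence only, not a proof of anything about the limiting behavior.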

We are in the process of writing and adding new material (compact eBooks) exclusively available to our members, and written in simple English, by world-leading experts in AI, data science, and machine learning. In the upcoming months, the following will be added:
- The Machine Learning Coding Book
- Off-the-beaten-path Statistics and Machine Learning Techniques
- Encyclopedia of Statistical Science
- Original Math, Stat and Probability Problems - with Solutions
- Computational Number Theory for Data Scientists
- Randomness, Pattern Recognition, Simulations, Signal Processing - New developments

We invite you to sign up here to not miss these free books. Previous material (also for members only) can be found here.

Currently, the following content is available:

1. Book: Enterprise AI - An Application Perspective

Enterprise AI: An applications perspective takes a use-case-driven approach to understanding the deployment of AI in the Enterprise. Designed for strategists and developers, the book provides a practical and straightforward roadmap based on application use cases for AI in Enterprises. The authors (Ajit Jaokar and Cheuk Ting Ho) are data scientists and AI researchers who have deployed AI applications for Enterprise domains. The book is used as a reference for Ajit and Cheuk's new course on Implementing Enterprise AI.

The table of contents is available here. The book can be accessed here (members only).

2. Book: Applied Stochastic Processes

Full title: Applied Stochastic Processes, Chaos Modeling, and Probabilistic Properties of Numeration Systems. Published June 2, 2018. Author: Vincent Granville, PhD. (104 pages, 16 chapters.)

This book is intended for professionals in data science, computer science, operations research, statistics, machine learning, big data, and mathematics. In 100 pages, it covers many new topics, offering a fresh perspective on the subject. It is accessible to practitioners with a two-year college-level exposure to statistics and probability. The compact and tutorial style, featuring many applications (Blockchain, quantum algorithms, HPC, random number generation, cryptography, Fintech, web crawling, statistical testing) with numerous illustrations, is aimed at practitioners, researchers and executives in various quantitative fields.

New ideas, advanced topics, and state-of-the-art research are discussed in simple English, without using jargon or arcane theory. The book unifies topics that are usually part of different fields (data science, operations research, dynamical systems, computer science, number theory, probability), broadening the knowledge and interest of the reader in ways that are not found in any other book. This short book contains a large amount of condensed material that would typically be covered in 500 pages in traditional publications. Thanks to cross-references and redundancy, the chapters can be read independently, in random order.

The table of contents is available here. The book can be accessed here (members only).

DSC Resources
- Comprehensive Repository of Data Science and ML Resources
- Advanced Machine Learning with Basic Excel
- Difference between ML, Data Science, AI, Deep Learning, and Statistics
- Selected Business Analytics, Data Science and ML articles
- Hire a Data Scientist | Search DSC | Find a Job
- Post a Blog | Forum Questions
