We are in the process of writing and adding new material (compact eBooks) exclusively available to our members, written in simple English by world-leading experts in AI, data science, and machine learning. In the upcoming months, the following will be added:

The Machine Learning Coding Book
Off-the-beaten-path Statistics and Machine Learning Techniques
Encyclopedia of Statistical Science
Original Math, Stat and Probability Problems - with Solutions
Computational Number Theory for Data Scientists
Randomness, Pattern Recognition, Simulations, Signal Processing - New Developments

We invite you to sign up here so as not to miss these free books. Previous material (also for members only) can be found here. Currently, the following content is available:

1. Book: Enterprise AI - An Application Perspective

Enterprise AI: An Applications Perspective takes a use-case-driven approach to understanding the deployment of AI in the enterprise. Designed for strategists and developers, the book provides a practical and straightforward roadmap, based on application use cases, for AI in enterprises. The authors (Ajit Jaokar and Cheuk Ting Ho) are data scientists and AI researchers who have deployed AI applications in enterprise domains. The book is used as a reference for Ajit and Cheuk's new course on Implementing Enterprise AI. The table of contents is available here. The book can be accessed here (members only).

2. Book: Applied Stochastic Processes

Full title: Applied Stochastic Processes, Chaos Modeling, and Probabilistic Properties of Numeration Systems. Published June 2, 2018. Author: Vincent Granville, PhD. (104 pages, 16 chapters.) This book is intended for professionals in data science, computer science, operations research, statistics, machine learning, big data, and mathematics. In 100 pages, it covers many new topics, offering a fresh perspective on the subject. It is accessible to practitioners with a two-year college-level exposure to statistics and probability.
The compact, tutorial style, featuring many applications (blockchain, quantum algorithms, HPC, random number generation, cryptography, Fintech, web crawling, statistical testing) with numerous illustrations, is aimed at practitioners, researchers, and executives in various quantitative fields. New ideas, advanced topics, and state-of-the-art research are discussed in simple English, without jargon or arcane theory. The book unifies topics that are usually spread across different fields (data science, operations research, dynamical systems, computer science, number theory, probability), broadening the knowledge and interests of the reader in ways not found in any other book. This short book condenses material that would typically fill 500 pages in traditional publications. Thanks to cross-references and some redundancy, the chapters can be read independently, in any order. The table of contents is available here. The book can be accessed here (members only).

Given n observations x1, ..., xn, the generalized mean (also called power mean) is defined as

m_p = ((x1^p + ... + xn^p) / n)^(1/p).

The case p = 1 corresponds to the traditional arithmetic mean, while p = 0 (as a limit) yields the geometric mean, and p = -1 yields the harmonic mean. See here for details. This metric is favored by statisticians; it is a particular case of the quasi-arithmetic mean. Here I introduce another kind of mean, called the exponential mean, also based on a parameter p, that may appeal to data scientists and machine learning professionals. It is also a special case of the quasi-arithmetic mean. Though the concept is basic, there is very little (if any) literature about it. It is related to LogSumExp and the Log semiring. It is defined as follows:

m_p = log_p ((p^x1 + ... + p^xn) / n),

where the logarithm is in base p, with p positive. When p tends to 0, m_p tends to the minimum of the observations. When p tends to 1, it yields the classic arithmetic mean, and as p tends to infinity, it tends to the maximum of the observations.

Content of this article
Advantages of the exponential mean
Illustration on a test data set
Important inequality
Doubly exponential mean

Read the full article here.
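Both means are easy to compute; a minimal sketch in Python (the function names are my own, and the p = 0 branch uses the geometric-mean limit):

```python
import math

def power_mean(xs, p):
    """Generalized (power) mean: p=1 arithmetic, p=-1 harmonic, p->0 geometric."""
    if p == 0:  # limiting case: geometric mean
        return math.exp(sum(math.log(x) for x in xs) / len(xs))
    return (sum(x ** p for x in xs) / len(xs)) ** (1.0 / p)

def exponential_mean(xs, p):
    """Exponential mean: log base p of the average of p**x (p > 0, p != 1)."""
    return math.log(sum(p ** x for x in xs) / len(xs), p)

xs = [1.0, 2.0, 3.0]
# Power means are non-decreasing in p: harmonic <= geometric <= arithmetic
print(power_mean(xs, -1), power_mean(xs, 0), power_mean(xs, 1))
# For p > 1 the exponential mean lies between the arithmetic mean and the maximum
print(exponential_mean(xs, 2.0))
```

For p > 1 the exponential mean weights large observations more heavily, which is why it interpolates between the arithmetic mean (p near 1) and the maximum (large p).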



The product of two large primes is at the core of many encryption algorithms, as factoring the product is very hard for numbers with a few hundred digits. The two prime factors are associated with the encryption keys (public and private). Here we describe a new approach to factoring a big number that is the product of two primes of roughly the same size. It is designed especially to handle this problem and to identify flaws in encryption algorithms.

[Figure: Riemann zeta function in the complex plane]

While at first glance it appears to substantially reduce the computational complexity of traditional factoring, at this stage a lot of progress is still needed to make the new algorithm efficient. An interesting feature is that success depends on the probability that two numbers are co-prime, given that they do not share the first few primes (say 2, 3, 5, 7, 11, 13) as common divisors. This probability can be computed explicitly and is about 99%. The methodology relies heavily on solving systems of congruences, the Chinese Remainder Theorem, and the modular multiplicative inverse of some carefully chosen integers. We also discuss computational complexity issues. Finally, the off-the-beaten-path material presented here leads to many original exercises or exam questions for students learning probability, computer science, or number theory: proving the various simple statements made in my article.

Content
Some Number Theory Explained in Simple English
  Co-primes and pairwise co-primes
  Probability of being co-prime
  Modular multiplicative inverse
  Chinese remainder theorem, version A
  Chinese remainder theorem, version B
The New Factoring Algorithm
  Improving computational complexity
  Five-step algorithm
  Probabilistic optimization
Compact Formulation of the Problem

Read the full article here.
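The quoted probability can be checked numerically. The chance that two random integers are co-prime is 6/pi^2, and conditioning on the pair sharing none of the primes up to 13 divides out the corresponding factors (1 - 1/p^2). A quick sketch (the truncation limit of 10^5 is an arbitrary choice; the tail beyond it is negligible):

```python
def primes_up_to(n):
    """Simple sieve of Eratosthenes."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            sieve[i * i::i] = [False] * len(sieve[i * i::i])
    return [i for i, is_p in enumerate(sieve) if is_p]

# P(co-prime | no common factor among 2, 3, 5, 7, 11, 13)
# = product over primes p > 13 of (1 - 1/p^2), truncated here at 10^5
prob = 1.0
for p in primes_up_to(100_000):
    if p > 13:
        prob *= 1.0 - 1.0 / (p * p)
print(prob)  # roughly 0.98, i.e. the "about 99%" figure quoted above
```

Multiplying back the factors for the six excluded primes recovers the unconditional co-primality probability 6/pi^2, which is a quick sanity check on the computation.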
Other Math Articles by the Same Author

Here is a selection of articles pertaining to experimental math and probabilistic number theory:

Statistics: New Foundations, Toolbox, and Machine Learning Recipes
Applied Stochastic Processes
Variance, Attractors and Behavior of Chaotic Statistical Systems
New Family of Generalized Gaussian Distributions
A Beautiful Result in Probability Theory
Two New Deep Conjectures in Probabilistic Number Theory
Extreme Events Modeling Using Continued Fractions
A Strange Family of Statistical Distributions
Some Fun with Gentle Chaos, the Golden Ratio, and Stochastic Number...
Fascinating New Results in the Theory of Randomness
Two Beautiful Mathematical Results - Part 2
Two Beautiful Mathematical Results
Number Theory: Nice Generalization of the Waring Conjecture
Fascinating Chaotic Sequences with Cool Applications
Simple Proof of the Prime Number Theorem
Factoring Massive Numbers: Machine Learning Approach

We discuss a simple trick to significantly accelerate the convergence of an algorithm when the error term decreases in absolute value over successive iterations while oscillating (not necessarily periodically) between positive and negative values. We first illustrate the technique on a well-known and simple case: the computation of log 2 using its well-known, slowly converging series. We then discuss a very interesting and more complex case, before finally focusing on a more challenging example in the context of probabilistic number theory and experimental math.

The technique must be tested for each specific case to assess the improvement in convergence speed. There is no general theoretical rule to measure the gain, and if the error term does not oscillate in a balanced way between positive and negative values, the technique does not produce any gain. However, in the examples below, the gain was dramatic.

Let's say you run an algorithm, for instance gradient descent. The input (model parameters) is x, and the output is f(x), for instance a local optimum. We consider f(x) to be univariate, but the method easily generalizes to the multivariate case by applying the technique separately to each component. At iteration k, you obtain an approximation f(k, x) of f(x), and the error is E(k, x) = f(x) - f(k, x). The total number of iterations is N, starting with the first iteration k = 1. The idea consists in first running the algorithm as is, and then computing the "smoothed" approximations using the following m steps.

Read the full article here.

Content
General framework and simple illustration
A strange function
Even stranger functions
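The log 2 case gives a feel for why smoothing helps. The partial sums of log 2 = 1 - 1/2 + 1/3 - ... alternately overshoot and undershoot the limit, so averaging two consecutive partial sums cancels most of the oscillating error. A minimal sketch (this is the simplest one-step version of the smoothing idea; the article's full m-step scheme is behind the link):

```python
import math

def partial_sums(n_terms):
    """Partial sums of the alternating series log 2 = 1 - 1/2 + 1/3 - ..."""
    sums, s = [], 0.0
    for k in range(1, n_terms + 1):
        s += (-1) ** (k + 1) / k
        sums.append(s)
    return sums

target = math.log(2)
s = partial_sums(1000)
raw_error = abs(target - s[-1])          # error ~ 1/(2N): slow convergence
# One smoothing step: average consecutive partial sums (their errors have opposite signs)
smoothed = 0.5 * (s[-1] + s[-2])
smoothed_error = abs(target - smoothed)  # error ~ 1/(4N^2): much faster
print(raw_error, smoothed_error)
```

With N = 1000 terms, the single averaging step already shrinks the error by several orders of magnitude, for essentially no extra computation.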


The methodology described here has broad applications, leading to new statistical tests, new types of ANOVA (analysis of variance), improved design of experiments, interesting fractional factorial designs, a better understanding of irrational numbers with applications in cryptography, gaming, and Fintech, and high-quality random number generators (and when you really need them). It also features exact arithmetic / high performance computing, and distributed algorithms to compute millions of binary digits for an infinite family of real numbers, including detection of auto- and cross-correlations (or lack thereof) in the digit distributions.

The data processed in my experiment, consisting of raw irrational numbers (described by a new class of elementary recurrences), led to the discovery of unexpected apparent patterns in their digit distribution: in particular, the fact that a few of these numbers, contrary to popular belief, do not have 50% of their binary digits equal to 1. It turned out that perfectly random digits simulated in large numbers, with a good enough pseudo-random generator, also exhibit the same strange behavior, pointing to the fact that pure randomness may not be as random as we imagine it to be. Ironically, failure to exhibit these patterns would indicate a real departure from pure randomness in the digits in question.

In addition to new statistical / mathematical methods, discoveries, and interesting applications, you will learn in my article how to avoid the type of statistical trap that leads to erroneous conclusions when performing a large number of statistical tests, and how not to be misled by false appearances. I call these statistical hallucinations and false outliers.

This article has two main sections: section 1, with deep research in number theory, and section 2, with deep research in statistics, with applications. You may skip one of the two sections depending on your interests and how much time you have.
Both sections, despite covering state-of-the-art material in their respective fields, are written in simple English. It is my wish that with this article I can get data scientists interested in math, and the other way around: the topics in both cases have been chosen to be exciting and modern. I also hope that this article will give you new, powerful tools to add to your arsenal of tricks and techniques. The two topics are related, the statistical analysis being based on the numbers discussed in the math section. One of the interesting new topics discussed here for the first time is the cross-correlation between the digits of two irrational numbers; these digit sequences are treated as multivariate time series. I believe this is the first time this subject is not only investigated in detail, but also comes with a deep, spectacular probabilistic number theory result about the distributions in question, with important implications for security and cryptography systems. Another related topic discussed here is a generalized version of the Collatz conjecture, with some insights on how to potentially solve it.

Read the full article here.

Content
1. On the Digit Distribution of Quadratic Irrational Numbers
Properties of the recursion
Reverse recursion
Properties of the reverse recursion
Connection to the Collatz conjecture
Source code
New deep probabilistic number theory results
Spectacular new result about cross-correlations
Applications
2. New Statistical Techniques Used in Our Analysis
Data, features, and preliminary analysis
Doing it the right way
Are the patterns found a statistical illusion, caused by errors, or real?
Pattern #1: Non-Gaussian behavior
Pattern #2: Illusionary outliers
Pattern #3: Weird distribution for block counts
Related articles and books
Appendix
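The basic digit-counting experiment is easy to reproduce. The article's numbers come from a new class of recurrences (behind the link), so as a stand-in this sketch counts the 1s among the first 10,000 binary digits of sqrt(2), using exact integer arithmetic rather than floating point so that every digit is correct:

```python
import math

def ones_fraction_sqrt2(n_digits):
    """Fraction of 1s among the leading binary digits of sqrt(2).
    Uses exact integer arithmetic: floor(sqrt(2) * 2^n) = isqrt(2 * 4^n)."""
    approx = math.isqrt(2 << (2 * n_digits))  # integer part of sqrt(2) * 2^n
    bits = bin(approx)[2:]                    # exact leading binary digits
    return bits.count("1") / len(bits)

frac = ones_fraction_sqrt2(10_000)
print(frac)  # close to 0.5, as expected for a (conjecturally) normal number
```

Note that a proportion slightly off 0.5 is expected from random fluctuation alone (of order 1/sqrt(n) here), which is exactly the kind of "statistical hallucination" the article warns about when many such tests are run.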

Summary: The Gartner Magic Quadrant for Data Science and Machine Learning Platforms is just out, and the big news is how much more capable all the platforms have become. Of course, there are also some interesting winner and loser stories.

The Gartner Magic Quadrant for Data Science and Machine Learning Platforms is just out for 2020. The really big news is how many excellent choices are now available. In a remarkable move, the whole field of competitors has moved strongly up and to the right, offering more Leaders or near-Leader Visionaries than ever before.

It's a mark of maturity in our industry that so many platforms offer fully capable model development, operationalization, and management features. The list of requirements defined by Gartner grows longer every year, and earning a better rating requires increasing capability and increasing customer satisfaction.

What Are the Major Changes?

As in previous years, we've charted the major changes in position using green arrows for improvement and red arrows for a reduced rating. The blue dots are current ratings and the gray dots are from a year ago.

Read the full article here, with the 2020 version of the above chart and comments.

In this notebook, we try to predict the positive (label 1) or negative (label 0) sentiment of a sentence, using the UCI Sentiment Labelled Sentences Data Set.

Sentiment analysis is useful in many areas. For example, it can be used to moderate internet conversations. It is also possible to predict the ratings that users assign to a product (food, household appliances, hotels, films, etc.) based on their reviews.

In this notebook we use two families of machine learning algorithms: Naive Bayes (NB) and long short-term memory (LSTM) neural networks.

References:
AYLIEN
Deeplearning4j
Understanding LSTM Networks
Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
The Unreasonable Effectiveness of Recurrent Neural Networks

We will use pandas and numpy for data manipulation, nltk for natural language processing, matplotlib, seaborn and plotly for data visualization, and sklearn and keras for learning the models.

Read the full article with source code and illustrations here.
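A minimal sketch of the Naive Bayes half of the pipeline, using sklearn as the notebook does. The tiny toy corpus below is a stand-in for the UCI data set, which the full notebook loads and preprocesses separately:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy stand-in for the UCI Sentiment Labelled Sentences data (label 1 = positive)
sentences = [
    "great product, works perfectly", "I love it", "excellent quality",
    "terrible service", "I hate it", "awful experience, broke quickly",
]
labels = [1, 1, 1, 0, 0, 0]

# Bag-of-words features + multinomial Naive Bayes classifier
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(sentences, labels)

print(model.predict(["great quality"]))  # expected: positive (1)
print(model.predict(["awful service"]))  # expected: negative (0)
```

The same `fit`/`predict` interface applies once the real data set is loaded; the LSTM model in the notebook replaces the bag-of-words representation with learned word embeddings and sequential processing in keras.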