A Data Science Central Community
This list of lists contains books, notebooks, presentations, cheat sheets, and tutorials covering all aspects of data science, machine learning, deep learning, statistics, math, and more, with most documents featuring Python or R code and numerous illustrations or case studies. All this material is available for free, and consists of content mostly created in 2019 and 2018, by various top experts in their respective fields. A few of these documents are available on LinkedIn: see last…Continue
Added by Vincent Granville on October 13, 2019 at 11:00am — No Comments
We are in the process of writing and adding new material (compact eBooks), available exclusively to our members and written in simple English by world-leading experts in AI, data science, and machine learning. In the coming months, the following will be added:
Added by Vincent Granville on December 1, 2018 at 6:26pm — No Comments
As an extremely versatile, general-purpose professional programming language, Python has a wide range of applications. It is user-friendly and easy to learn, which has made it popular throughout the world. Python skills also play a critical role in helping data scientists find lucrative job opportunities.
Today, Python has become the most in-demand programming language in the data science world. Python offers an extensive range…Continue
Added by Divyesh Aegis on September 5, 2019 at 12:00am — No Comments
In data-driven enterprise systems, Spark has become a popular name because it is easy to use, fast, and versatile. Data can be analyzed quickly, allowing faster decisions. Big Data workloads benefit greatly from Spark's faster data processing. Spark is an open source framework for cluster computing that helps analyze large datasets. It is written in Scala, which has made data processing easier and gives a certain boost to the…Continue
Added by Divyesh Aegis on August 13, 2019 at 12:51am — No Comments
Properly implemented Machine Learning (ML) models can have a positive effect on organizational efficiency. It is first necessary to understand how these models are created, how they function, and how they are put into production.
The Definition of a Machine Learning Model
When a computer is presented with questions within a particular domain, a machine learning model will run an algorithm that will enable it to resolve those questions. These algorithms are not…Continue
Added by Arash Aghlara on August 7, 2019 at 3:30am — No Comments
Python was introduced in 1991 by Guido van Rossum as a high-level, general-purpose language. It supports multiple programming paradigms, including procedural, object-oriented, and functional. It soon became one of the most popular languages in the industry, and in fact it influenced both Ruby and Swift. The TIOBE Index even ranks Python as the third most popular…Continue
Added by Divyesh Aegis on July 16, 2019 at 12:55am — No Comments
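As a brief illustration of the three paradigms named in the Python teaser above (procedural, object-oriented, and functional), here is a minimal sketch that computes the same sum of squares three ways. The function and class names are hypothetical, chosen only for this example.

```python
from functools import reduce


# Procedural: an explicit loop with a mutable accumulator.
def sum_squares_procedural(values):
    total = 0
    for v in values:
        total += v * v
    return total


# Object-oriented: data and behavior bundled together in a class.
class SquareSummer:
    def __init__(self, values):
        self.values = list(values)

    def total(self):
        return sum(v * v for v in self.values)


# Functional: map and reduce composed together, with no mutation.
def sum_squares_functional(values):
    return reduce(lambda acc, x: acc + x, map(lambda v: v * v, values), 0)


data = [1, 2, 3, 4]
results = (
    sum_squares_procedural(data),
    SquareSummer(data).total(),
    sum_squares_functional(data),
)
print(results)  # (30, 30, 30)
```

All three versions return the same value; they differ only in how the computation is organized.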
It would be unwise to expect a lot of sales simply because your site has a significant amount of web traffic. Traffic alone cannot help much here. You need to track your website metrics properly in order to take the measures necessary to convert that traffic into business prospects. You also need to analyze your website from time to time to ensure that it is not only accessible to users but also provides all the guidance needed to show them the right way to make a…Continue
Added by Jenny Richards on June 6, 2019 at 1:30am — No Comments
We propose simple solutions to important problems that all data scientists face almost every day. In short, a toolbox for the handyman, useful to busy professionals in any field.
1. Eliminating sample size effects. Many statistics, such as correlations or R-squared, depend on the sample size, making it difficult to…Continue
Added by Vincent Granville on June 4, 2019 at 12:00pm — No Comments
This crash course features a new fundamental statistics theorem -- even more important than the central limit theorem -- and a new set of statistical rules and recipes. We discuss concepts related to determining the optimum sample size, the optimum k in k-fold cross-validation, bootstrapping, new re-sampling techniques, simulations, tests of hypotheses, confidence intervals, and statistical inference using a unified, robust, simple…Continue
Added by Vincent Granville on May 4, 2019 at 12:30pm — No Comments
Graphs are meant to be seen
The third layer of graph technology that we discuss in this article is the front-end layer: graph visualization. Information visualization has supported many types of analysis, including Social Network Analysis. For decades, visual representations have helped researchers,…
Added by Elise Devaux on April 9, 2019 at 4:00am — No Comments
The emergence of alternative data as a key enabler in expanding credit delivery and financial inclusion is unmistakable.
The saying that the only constant is change is attributed to Heraclitus, the Greek philosopher. It is very relevant today to the way lenders use technology and scoring solutions to understand the creditworthiness of applicants. Credit risk management has come a long way from the days when banks used just one credit score cutoff to…Continue
Added by Naagesh Padmanaban on March 25, 2019 at 11:15pm — No Comments
I present here some innovative results from my most recent research on stochastic processes, chaos modeling, and dynamical systems, with applications to Fintech, cryptography, number theory, and random number generators. While covering advanced topics, this article is accessible to professionals with limited knowledge in statistical or mathematical theory. It introduces new material not covered in my recent book (available …Continue
Added by Vincent Granville on March 21, 2019 at 7:30am — No Comments
Graph analytics frameworks consist of a set of tools and methods developed to extract knowledge…Continue
Added by Elise Devaux on February 27, 2019 at 5:00am — No Comments
This is another interesting, off-the-beaten-path problem. It ends up with a formula to compute the integral of a function based solely on its derivatives.
For simplicity, I'll start with some notation from matrix theory, familiar to everyone: T(f) = g, where f and g are vectors, and T is a square matrix. The notation T(f) represents the product of the matrix T and the vector f. Now, imagine that the…Continue
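To make the notation T(f) = g concrete, here is a minimal sketch of the matrix-vector product in plain Python. The helper name `apply_matrix`, the 2×2 matrix, and the vector values are arbitrary assumptions for illustration and do not come from the article.

```python
def apply_matrix(T, f):
    """Return g = T(f): the product of the square matrix T and the vector f."""
    n = len(T)
    if any(len(row) != n for row in T) or len(f) != n:
        raise ValueError("T must be n x n and f must have length n")
    # Entry i of g is the dot product of row i of T with f.
    return [sum(T[i][j] * f[j] for j in range(n)) for i in range(n)]


# Hypothetical example values.
T = [[2, 0],
     [1, 3]]
f = [4, 5]
g = apply_matrix(T, f)
print(g)  # [8, 19]
```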
Organizations across industries are adopting graph analytics to reinforce their anti-fraud programs. In this post, we examine three types of fraud that graph analytics can help investigators combat: insurance fraud, credit card fraud, and VAT fraud.
In many areas, fraud investigators have at their disposal large datasets in which clues are hidden. These clues are left behind by…
Added by Elise Devaux on January 22, 2019 at 12:30am — No Comments
A passionate customer always provides feedback about a favorite product if it strikes an emotional chord.
Product reviews contain a wealth of information. Analyzing the review texts can unearth many hidden data points about the customer and the product. Such insights can help grow the business and increase revenue.
Let's look into a specific example. …Continue
Added by Kaniska Mandal on January 24, 2019 at 3:30pm — No Comments
Why is graph visualization so important? How can it help businesses sift through large amounts of complex data? We explore the answer in this post through 5 advantages of graph visualization and different use cases.
Also called a network, a graph is a collection of nodes (or vertices) and edges (or links). Each node represents a single data point (a person, a phone number, a transaction) and each edge represents how two nodes…Continue
Added by Elise Devaux on January 11, 2019 at 9:25am — No Comments
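The graph definition in the teaser above (nodes connected by edges) maps naturally onto an adjacency list. Here is a minimal sketch, using hypothetical nodes (two people, a phone number, and a transaction) and hypothetical edge labels that are not taken from the post.

```python
from collections import defaultdict

# Each edge links two nodes and carries a label describing the relationship.
edges = [
    ("alice", "555-0100", "owns_phone"),
    ("bob", "555-0100", "owns_phone"),
    ("alice", "txn-42", "made_transaction"),
]

# Build an undirected adjacency list: node -> set of neighbouring nodes.
adjacency = defaultdict(set)
for source, target, _label in edges:
    adjacency[source].add(target)
    adjacency[target].add(source)


def neighbours(node):
    """Return the nodes directly connected to `node`, sorted for stable output."""
    return sorted(adjacency[node])


print(neighbours("555-0100"))  # ['alice', 'bob']
print(neighbours("alice"))     # ['555-0100', 'txn-42']
```

In this toy data, the shared phone number is the only path between the two people, which is the kind of indirect connection a graph view makes visible.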
Figure 1. Scatter plot of word embedding coordinates (coordinate #3 vs. coordinate #10). You can see that semantically related words are close to each other.
This blog post is an extract from chapter 6 of the book “From Words to Wisdom. An Introduction to Text Mining…Continue
Added by Rosaria Silipo on May 7, 2018 at 12:00am — No Comments
From detecting anomalies to understanding the key elements in a network or highlighting communities, graph analytics reveal information that would otherwise remain hidden in your data. We will see how to integrate your graph analytics with Linkurious Enterprise to detect and investigate insights in your connected data.
Added by Elise Devaux on October 4, 2018 at 9:30am — No Comments
Full title: Applied Stochastic Processes, Chaos Modeling, and Probabilistic Properties of Numeration Systems. Published June 2, 2018. Author: Vincent Granville, PhD. (104 pages, 16 chapters.)
This book is intended for professionals in data science, computer science, operations research, statistics, machine learning, big data, and mathematics. In 100 pages, it covers many new topics, offering a fresh perspective on the subject. It is accessible to…Continue
Added by Vincent Granville on September 8, 2018 at 11:16am — No Comments