Featured Blog Posts (1,486)

20 Questions to Ask Prior to Starting Data Analysis

It is crucial to ask the right questions and understand the problem before beginning data analysis. Below is a list of 20 questions to ask before delving into analysis:

  1. Who is the…

Added by Cynthia Clare on May 23, 2018 at 8:30pm — No Comments

Mathematical Olympiads for Undergrad Students

Mathematical Olympiads are popular among high school students. However, there is nothing similar for college students, except maybe IMC. Even IMC is not popular. It focuses mostly on the same kind of problems as high school Olympiads, and you cannot participate if you are over 23 years old. In addition, it is organized by country, as opposed to globally, thus favoring countries with a large population. Topics such as…

Added by Vincent Granville on May 25, 2018 at 9:00am — No Comments

The Role of Predictive Analytics in Medical Diagnosis

Predictive analytics uses current and historical data to determine the probability of a particular outcome. This is a particularly powerful approach when applied to medical diagnosis. In an effort to reduce misdiagnosis, historical data on former patients' symptoms may be applied to the assessment of a new patient.

While doctors are the ultimate experts and decision-makers, using predictive analytics as a means of establishing precedent for…

Added by Goli Tajadod on May 22, 2018 at 2:30am — No Comments

Machine Learning with Signal Processing Techniques

Stochastic Signal Analysis is a field of science concerned with the processing, modification and analysis of (stochastic) signals.

Anyone with a background in Physics or Engineering knows to some degree about signal analysis techniques, what these techniques are and how they can be used to analyze, model and classify signals.

Data Scientists coming from different fields, like Computer Science or Statistics, might not be aware of the analytical power these techniques bring with…

Added by ahmet taspinar on April 29, 2018 at 9:00am — No Comments
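
As a rough illustration of the kind of signal analysis the excerpt above refers to, here is a minimal Python sketch that finds the dominant frequency of a sampled signal with a naive discrete Fourier transform. It is not taken from the post itself: the function names, the 64 Hz sample rate, and the synthetic two-tone test signal are all assumptions made for the example, and a real pipeline would normally use an optimized FFT library instead of this O(n^2) loop.

```python
import cmath
import math


def dft_magnitudes(samples):
    """Return |X[k]| for k = 0 .. n/2 using a naive O(n^2) DFT."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        magnitudes.append(abs(acc))
    return magnitudes


def dominant_frequency(samples, sample_rate):
    """Frequency (in Hz) of the strongest non-DC component."""
    magnitudes = dft_magnitudes(samples)
    # Skip bin 0 (the constant offset) when looking for the peak.
    peak_bin = max(range(1, len(magnitudes)), key=lambda i: magnitudes[i])
    return peak_bin * sample_rate / len(samples)


# Hypothetical test signal: a strong 5 Hz tone plus a weaker 12 Hz tone,
# sampled for one second at 64 Hz (both values chosen for illustration).
SAMPLE_RATE = 64
signal = [
    math.sin(2 * math.pi * 5 * t / SAMPLE_RATE)
    + 0.5 * math.sin(2 * math.pi * 12 * t / SAMPLE_RATE)
    for t in range(SAMPLE_RATE)
]

if __name__ == "__main__":
    print(dominant_frequency(signal, SAMPLE_RATE))
```

Features such as the dominant frequency or the magnitudes of the strongest bins are one plausible input to a classifier, which is presumably where the machine learning part of the post comes in.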

I Analyzed 10 MM digits of SQRT(2) - Look at My Findings

This article is intended for practitioners who might not necessarily be statisticians or statistically savvy. The mathematical level is kept as simple as possible, yet I present an original, simple approach to test for randomness, with an interesting application to illustrate the methodology. This material is not something usually discussed in textbooks or classrooms (even for statistics students), offering a fresh perspective, and out-of-the-box tools that are useful in many contexts, as…

Added by Vincent Granville on March 31, 2018 at 10:30pm — 2 Comments
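
The excerpt does not describe the author's own randomness test, so the sketch below swaps in a standard technique instead: a chi-square test of whether the decimal digits of SQRT(2) are uniformly distributed. The function names and the choice of 1,000 digits are assumptions for illustration. The digits come from Python's standard `decimal` module, and 16.919 is the usual 5% critical value for a chi-square statistic with 9 degrees of freedom.

```python
from collections import Counter
from decimal import Decimal, localcontext


def sqrt2_digits(n):
    """Return the first n decimal digits of sqrt(2) as a string."""
    with localcontext() as ctx:
        ctx.prec = n + 10  # a few guard digits beyond what we keep
        root = Decimal(2).sqrt()
    text = str(root)  # "1.4142..."
    return text[2:2 + n]


def chi_square_uniform(digits):
    """Chi-square statistic for digit counts against a uniform law."""
    counts = Counter(digits)
    expected = len(digits) / 10
    return sum(
        (counts[str(d)] - expected) ** 2 / expected for d in range(10)
    )


CRITICAL_5_PERCENT_DF9 = 16.919  # chi-square, 9 degrees of freedom

if __name__ == "__main__":
    digits = sqrt2_digits(1000)
    stat = chi_square_uniform(digits)
    verdict = "reject" if stat > CRITICAL_5_PERCENT_DF9 else "do not reject"
    print(f"chi-square = {stat:.3f}: {verdict} uniformity at the 5% level")
```

A digit-frequency test only checks one aspect of randomness. It says nothing about correlations between successive digits, which is why more elaborate tests exist.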

What is an Analytics Translator and Why is the Role Important to Your Organization?

Today, enterprises recognize the critical value of advanced analytics within the organization and they are implementing data democratization initiatives. As these initiatives evolve, new roles emerge in the organization. The newest of these analysis-related roles is the 'analytics translator'. As the enterprise considers the relevance of this new role within the business, it is important to understand the responsibilities of an Analytics…

Added by Kartik Patel on February 23, 2018 at 2:30am — No Comments

What is Clickless Analysis? Can it Simplify Adoption of Augmented Analytics? (Part 1 of 3 articles)

The concept of Clickless Analytics is one that will be happily embraced by business users and by the business enterprise. The reason is simple! Clickless Analytics allows users to find and analyze information without specialized skills, by using natural language.

In this, the first of a three-part series, we discuss Clickless Analytics and how it can simplify user adoption of augmented analytics.

What is Clickless Analytics?

Clickless Analytics…

Added by Kartik Patel on January 25, 2018 at 5:30am — No Comments

Supervised learning in disguise: the truth about unsupervised learning

One of the first lessons you’ll receive in machine learning is that there are two broad categories: supervised and unsupervised learning. Supervised learning is usually explained as the kind in which you provide the correct answers as training data, and the machine learns the patterns to apply to new data. Unsupervised learning is (apparently) where the machine figures out the correct answer on its own.

Supposedly, unsupervised learning can discover something new that has not been found…

Added by Danko Nikolic on February 14, 2018 at 1:00pm — No Comments

Easy Dashboards for Everyone Using Google Data Studio

No matter the job, most professionals do some level of analysis on their computer. There are always some data sets that live outside the walls, or some analyses that we know could be performed better in a not-easily-sharable tool such as Excel, R, Python, SPSS, SAS and so on.

So how do you share your personal analysis with others? Often people export…

Added by Laura Ellis on January 11, 2018 at 4:30pm — No Comments

Beautiful Number Theory Problem and Sandbox for Data Scientists

The Waring conjecture - actually a problem associated with a number of conjectures, many now being solved - is one of the most fascinating mathematical problems. This article covers new aspects of this problem, with a generalization and new conjectures, some with a tentative solution, and a new framework to tackle the problem. Yet it is written in simple English and accessible to the layman.

I also review a number of famous related mathematical conjectures, including one with a $1…

Added by Vincent Granville on January 10, 2018 at 6:00pm — No Comments

6 Predictions about Data Science, Machine Learning, and AI for 2018

Summary: Here are our 6 predictions for data science, machine learning, and AI for 2018. Some are fast track and potentially disruptive, some take the hype off overblown claims and set realistic expectations for the coming year.

It’s that time of year again when we look back in order to offer a look forward. What trends will speed up, what things will actually happen, and what things won’t in the coming year for data science, machine…

Added by Vincent Granville on December 14, 2017 at 3:00pm — No Comments

High Precision Computing in Python or R

Here we discuss an application of HPC (not high performance computing, but high precision computing, which is a special case of HPC) to dynamical systems such as the logistic map in chaos theory, defined as X(k) = 4 X(k-1) (1 - X(k-1)).

For all these systems, the loss of precision propagates exponentially, to the point that after 50 iterations, all generated values are completely wrong. Tons of articles have been written on this subject - none of them acknowledging the…

Added by Vincent Granville on November 13, 2017 at 7:00pm — No Comments
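
To make the precision issue in the excerpt above concrete, here is a minimal Python sketch that iterates the logistic map X(k) = 4 X(k-1) (1 - X(k-1)) twice: once in ordinary double-precision floats and once with the standard `decimal` module at a chosen number of significant digits. The seed 0.3, the 200-digit setting, and the function names are assumptions for illustration and do not come from the post. The post's title mentions Python or R; only Python is shown here.

```python
from decimal import Decimal, localcontext


def logistic_float(x0, iterations):
    """Iterate x -> 4x(1 - x) in standard double precision."""
    x = x0
    for _ in range(iterations):
        x = 4.0 * x * (1.0 - x)
    return x


def logistic_decimal(x0, iterations, digits):
    """Iterate the same map with `digits` significant decimal digits.

    x0 is passed as a string so the seed is represented exactly.
    """
    with localcontext() as ctx:
        ctx.prec = digits
        x = Decimal(x0)
        for _ in range(iterations):
            x = 4 * x * (1 - x)
    return x


if __name__ == "__main__":
    # Compare the two trajectories; they drift apart as k grows because
    # each iteration roughly doubles the accumulated rounding error.
    for k in (10, 30, 50, 70):
        low = logistic_float(0.3, k)
        high = logistic_decimal("0.3", k, digits=200)
        print(k, low, f"{high:.15f}")
```

Since the error grows by at most a factor of 4 per step, carrying a couple of hundred digits leaves plenty of correct digits after 100 iterations, whereas the roughly 16 digits of a double are exhausted much earlier.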

Linear Models Don’t have to Fit Exactly for P-Values To Be Accurate, Right, and Useful

The general linear model should not be confused with multiple linear regression, the generalized linear model, or general linear methods. The general linear model, or multivariate regression model, is a statistical linear model written as Y = XB + U.

Usually, a linear model includes a number of different statistical models such as ANOVA, ANCOVA, MANOVA, MANCOVA, ordinary linear regression, t-test and F-test. The GLM is a generalization of multiple…

Added by Chirag Shivalker on November 2, 2017 at 11:30pm — 1 Comment

Information Retrieval Document Search Engine in R

Introduction:

In this post, we learn how to build a basic search engine, or document retrieval system, using the vector space model. This approach is widely used in information retrieval systems. Given a set of documents and a search query of one or more terms, we need to retrieve the documents most similar to the query.

Problem statement:

The problem statement above is illustrated in the image below. …

Added by suresh kumar Gorakala on November 7, 2017 at 6:30am — No Comments
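
The post itself works in R, and its code is not part of this excerpt. As a rough sketch of the vector space model it describes, the Python example below represents each document and the query as a TF-IDF vector and ranks documents by cosine similarity. The whitespace tokenizer, the smoothed IDF formula, the function names, and the three sample documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple choice)."""
    return text.lower().split()


def tfidf_vector(counts, idf):
    """Weight raw term counts by IDF; terms unseen in the corpus are dropped."""
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs, top_k=3):
    """Return up to top_k (doc_index, score) pairs, best match first."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(t for tokens in tokenized for t in set(tokens))
    # Smoothed IDF so that terms present in every document keep some weight.
    idf = {t: math.log(len(docs) / df) + 1.0 for t, df in doc_freq.items()}
    doc_vectors = [tfidf_vector(Counter(tokens), idf) for tokens in tokenized]
    query_vector = tfidf_vector(Counter(tokenize(query)), idf)
    scored = [(i, cosine(query_vector, v)) for i, v in enumerate(doc_vectors)]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return [(i, s) for i, s in scored[:top_k] if s > 0.0]


# Hypothetical three-document corpus, used only to exercise the functions.
documents = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]

if __name__ == "__main__":
    for index, score in search("stock markets", documents):
        print(f"{score:.3f}  {documents[index]}")
```

Documents that share no term with the query score zero and are filtered out, so a query made only of unseen words returns an empty list.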

Fascinating Time Series with Cool Applications

Here we describe well-known chaotic sequences, including new generalizations, with application to random number generation, highly non-linear auto-regressive models for time series, simulation, random permutations, and the use of big numbers (libraries available in programming languages to work with numbers with hundreds of decimals), as standard computer precision almost always produces completely erroneous results after a few iterations, a fact rarely if ever mentioned in the scientific…

Added by Vincent Granville on November 6, 2017 at 8:30pm — No Comments

Interesting Problem: Self-correcting Random Walks

Section 3 was added on October 11. Section 4 was added on October 19. A $2,000 award is offered for solving any of the open questions.

This is another off-the-beaten-path problem, one that you won't find in textbooks. You can solve it using data science methods (my approach) but the…

Added by Vincent Granville on October 4, 2017 at 2:00pm — 5 Comments

Book on Computer Programming

Data scientists use a range of tools in their work, and some of these eventually require programming. This book, titled The Art and Craft of Computer Programming, is a guide to computer programming. It does not focus on a specific programming language, but instead contains the essential material from a first-year Computer Science course. The book is available from Amazon.com.…

Added by Mark McIlroy on October 19, 2017 at 8:00pm — 1 Comment

What Kind of OLAP Do We Really Need?

The narrow-sensed OLAP

OLAP is part and parcel of a BI application. As the name suggests, the word is an acronym for online analytical processing. Users, frontline employees, to be precise, are responsible for performing various types of data processing online.  

But the concept of OLAP tends to be used in a very narrow sense. It has almost become a synonym for multidimensional analysis. Based on a prebuilt data cube, the analysis performs summarization…

Added by JIANG Buxing on February 1, 2018 at 4:00am — No Comments

9 Off-the-beaten-path Statistical Science Topics with Interesting Applications

You will find here nine interesting topics that you won't learn in college classes. Most have interesting applications in business and elsewhere. They are not especially difficult, and I explain them in simple English. Yet they are not part of the traditional statistical curriculum, and even many data scientists with a PhD degree have not heard about some of these concepts.…

Added by Vincent Granville on October 2, 2017 at 1:43pm — No Comments

Audience is The Future Business Model, Data Analytics Can Improve It

Companies and enterprises face a daily grind: they must keep customers happy and satisfied, operations efficient, and employees satisfied, all of which makes running the business a real challenge. “Audience is the new business model”, and if an organization is struggling to communicate with its customers or audience, there is certainly a negative impact across the business plan,…

Added by Chirag Shivalker on September 26, 2017 at 12:30pm — No Comments

© 2018   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC