Featured Blog Posts (1,472)

Free book on Computer Programming

Data scientists use a range of tools in their work, and some of these eventually require programming. This book, titled The Art and Craft of Computer Programming, is a guide to computer programming. It does not focus on a specific programming language, but instead contains the essential material from a first-year Computer Science course. The book is available for free download from the author's personal website www.markmcilroy.com and…

Added by Mark McIlroy on October 19, 2017 at 8:00pm — No Comments

Interesting Problem: Self-correcting Random Walks

Section 3 was added on October 11. Section 4 was added on October 19. An award will be offered for solving any of the open questions (announcement coming soon).

This is another off-the-beaten-path problem, one that you won't find in textbooks. You can solve it using data science methods (my approach), but a mathematician with some spare time could find an elegant solution. Share it with your colleagues to see how math-savvy they are, or with your students. I was able to make…

Added by Vincent Granville on October 4, 2017 at 2:00pm — 5 Comments

What Kind of OLAP Do We Really Need?

The narrow-sensed OLAP

OLAP is part and parcel of a BI application. As the name suggests, the word is an acronym for online analytical processing. Users, frontline employees to be precise, are responsible for performing various types of data processing online.

But the concept of OLAP tends to be used in a very narrow sense. It has almost become equivalent to multidimensional analysis. Based on a prebuilt data cube, the analysis performs summarization…

Added by JIANG Buxing on October 9, 2017 at 4:00am — No Comments

9 Off-the-beaten-path Statistical Science Topics with Interesting Applications

You will find here nine interesting topics that you won't learn in college classes. Most have interesting applications in business and elsewhere. They are not especially difficult, and I explain them in simple English. Yet they are not part of the traditional statistical curriculum, and even many data scientists with a PhD degree have not heard about some of these concepts.…

Added by Vincent Granville on October 2, 2017 at 1:43pm — No Comments

Audience is The Future Business Model, Data Analytics Can Improve It

Companies and enterprises face a daily grind while also having to ensure that their customers are happy and satisfied, operations are efficient and employees are satisfied; all of this makes running the business a real challenge. “Audience is the new business model”, and if any organization is struggling miserably to communicate with customers or their audience, there certainly is a negative impact of it across the business plan,…

Added by Chirag Shivalker on September 26, 2017 at 12:30pm — No Comments

Why Analytics Projects Fail – And It’s Not The Analytics!

Being in a highly technical, complex field, it is easy to sometimes lose the ‘human aspect’ of the solutions we are developing. We focus on applying edge computing concepts, or on whether a seasonality model works better for our predictive accuracy than some other approach. Don't get me wrong, these are all important activities. However, in working with many firms in developing, deploying and supporting advanced analytics solutions, particularly in the domain of the Industrial IoT space, it’s often…

Added by Ed Crowley on September 26, 2017 at 3:00pm — No Comments

Can you solve these mathematical / statistical problems?

I recently posted an article featuring a nontraditional approach to finding large prime numbers. The research section of this article offers interesting challenges, both for data scientists interested in mathematics, and for mathematicians interested in data science and big data. My approach is data, pattern recognition, and machine learning heavy. Here is the introduction:

Large prime numbers have been a topic of considerable research, for their own mathematical beauty, as well as to…

Added by Vincent Granville on September 21, 2017 at 9:00pm — No Comments

Building Convolutional Neural Networks with TensorFlow

In the past year I have also worked with Deep Learning techniques, and I would like to share with you how to make and train a Convolutional Neural Network from scratch, using TensorFlow. Later on we can use this knowledge as a building block to make interesting Deep Learning applications.

The pictures here are from the full article. Source code is also provided.…

Added by ahmet taspinar on September 7, 2017 at 7:30am — No Comments

A Day in the life of an Analyst

A typical day in the life of an Analyst

An Analyst works on varied projects with multiple deliverables and varied duties depending on the business objectives.

However, there are some tasks that can be easily classified as “common everyday duties” in a “typical work day of a business analyst”.

Clarification and…

Added by Ivy Pro School on September 3, 2017 at 2:00pm — No Comments

Overpromising and Underperforming: Understanding and Evaluating Self-service BI Tools

From the OLAP concept in earlier years to agile BI over the last few years, BI vendors have never stopped advertising the self-service capability, claiming that business users will be able to perform analytics by themselves. Since there are strong self-service needs among users, the two really hit it off and it is very likely that a quick deal is made. The question is: does a BI product’s self-service functionality enable truly flexible data analytics by business users?

There isn’t a…

Added by JIANG Buxing on August 31, 2017 at 12:00am — 1 Comment

Curious Mathematical Object: Hyperlogarithms

Logarithms turn a product of numbers into a sum of numbers: log(xy) = log(x) + log(y). Hyperlogarithms generalize the concept as follows: Hlog(XY) = Hlog(X) + Hlog(Y), where X and Y are any kind of objects, and the product and sum are replaced by operators in some arbitrary space. …
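To make the identity concrete, here is a minimal sketch with two toy examples that are not taken from the post. The names `hlog_len`, `matmul2` and `hlog_det` are hypothetical and chosen only for illustration. In the first example the "product" is string concatenation, and in the second it is 2x2 matrix multiplication. In both cases the "sum" is ordinary addition.

```python
import math

# Ordinary logarithm: the log of a product equals the sum of the logs.
x, y = 3.0, 7.0
assert math.isclose(math.log(x * y), math.log(x) + math.log(y))


def hlog_len(s):
    """Toy hyperlogarithm on strings (hypothetical name): the string length."""
    return len(s)


def matmul2(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def hlog_det(a):
    """Toy hyperlogarithm on invertible 2x2 matrices: log of |det|."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return math.log(abs(det))


# "Product" = concatenation, "sum" = integer addition.
X, Y = "data", "science"
assert hlog_len(X + Y) == hlog_len(X) + hlog_len(Y)

# "Product" = matrix multiplication, "sum" = addition of real numbers.
A = [[2, 1], [0, 3]]
B = [[1, 4], [2, 1]]
assert math.isclose(hlog_det(matmul2(A, B)), hlog_det(A) + hlog_det(B))
```

Both toy functions satisfy Hlog(XY) = Hlog(X) + Hlog(Y) for the chosen operators. Whether the post has these particular constructions in mind is an assumption.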

Added by Vincent Granville on August 16, 2017 at 12:00pm — 1 Comment

Fighting eCommerce fraud with graph technology



ECommerce fraud is growing quickly, creating new challenges in terms of prevention and detection. As merchants gather more and more information about customers and their behaviors, the key element in the fight against fraud is now to draw on the connections within the data collected to uncover fraudulent behaviors. In this post we explain why and how graph technologies are crucial in the detection of eCommerce fraud.…

Added by Elise Devaux on August 9, 2017 at 9:30am — No Comments

Type I and Type II Errors in One Picture

This picture speaks more than words. It explains the concept of false positive and false negative, that is, what is referred to by statisticians as Type I and Type II errors.
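As a rough illustration of the two error types, the sketch below counts them on a small made-up sample. The function name `error_counts` and the data are assumptions for illustration only. Here "positive" means the test flags an effect, so a Type I error is a false positive and a Type II error is a false negative.

```python
def error_counts(actual, predicted):
    """Return (type_1, type_2) counts for paired boolean outcomes.

    type_1: false positives (predicted True, actually False)
    type_2: false negatives (predicted False, actually True)
    """
    type_1 = sum(1 for a, p in zip(actual, predicted) if p and not a)
    type_2 = sum(1 for a, p in zip(actual, predicted) if a and not p)
    return type_1, type_2


# Made-up example data, not from the post.
actual = [True, False, False, True, False, True, False]
predicted = [True, True, False, False, False, True, True]

type_1, type_2 = error_counts(actual, predicted)
print(f"Type I errors (false positives): {type_1}")
print(f"Type II errors (false negatives): {type_2}")
```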

Other great pictures summarizing data science and statistical concepts can be found…

Added by Vincent Granville on August 10, 2017 at 5:17pm — No Comments

Capturing Low-Probability, High-Impact Events 'Black Swans' in Economic and Financial Models

Jamilu Auwalu Adamu, Lecturer, Nigeria

Incorporation of Fat-Tailed Effects of the Underlying Assets' Probability Distribution using Advanced Stressed Methods.



Capturing the effects of Low-Probability, High-Impact "Black Swans" in the existing stochastic and deterministic models is tremendously…

Added by Jamilu Auwalu Adamu on July 31, 2017 at 8:30am — No Comments

Introducing User Behavioral Analysis in the Risk Process

Many years ago when I was entering the intelligence community, I attended a class in Virginia where the instructor opened the session with a test that I will never forget and that I have applied to almost every analytic task in my career. At the beginning of the class we were shown a ten-minute video of Grand Central Station at rush hour with tens of thousands of people and were asked if we could find a single pickpocket in the crowd by the end of the video. At the end of ten minutes no…

Added by Andrew Marane on July 31, 2017 at 11:30am — No Comments

How to Detect if Numbers are Random or Not

In this article, you will learn some modern techniques to detect whether a sequence appears random or not, whether it satisfies the central limit theorem (CLT) or not -- and what the limiting distribution is if CLT does not apply -- as well as some tricks to detect abnormalities. Detecting lack of randomness is also referred to as signal versus noise detection, or pattern recognition.

It leads to the exploration of time series with massive, large-scale (long term) auto-correlation…

Added by Vincent Granville on July 10, 2017 at 12:00am — 5 Comments

Open sourcing 'spot the difference'

Capital One UK’s Data Science team has been focused on moving from proprietary (paid-for) software to open source for some time now.

There are several key benefits to making this change. Open source software is prevalent in academia which makes it much easier for our new starters to hit the ground running, building models and analysing data on day one with the company (the switch has also been a terrific development opportunity for my team to learn new skills). Our team now has greater…

Added by Dan Kellett on July 21, 2017 at 1:30am — No Comments

Text Clustering: Get quick insights from Unstructured Data

In this two-part series, we will explore text clustering and how to get insights from unstructured data. It will be quite powerful and industrial strength. The first part will focus on the motivation. The second part will be about implementation.

This post is the first part of the two-part series on how to get insights from unstructured data using text clustering. We will build this in a very modular way so that it can be applied to any dataset. Moreover, we will also focus…

Added by Vivek Kalyanarangan on July 5, 2017 at 9:30pm — No Comments

For Companies, Data Analytics is a Pain; But Why?

Businesses across the globe are bearing the brunt of three pressures: a huge data influx, increasing data complexity and, of course, market volatility. To address these challenges, companies and all their verticals are turning to data-driven analytics and insights as a means to better understand their customer bases, grow their businesses, and manage the increasing uncertainty to a certain extent.

The shift from conventional…

Added by Chirag Shivalker on June 22, 2017 at 1:30pm — No Comments

© 2017 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC