A Data Science Central Community
DFS, "the next big thing", is taking North America by storm and slowly knocking on Europe's doors. The way it works is simple: sports lovers select a team of real-world athletes…Continue
Added by Jure Rejec on July 28, 2015 at 7:00am — No Comments
I’m going to keep this tutorial light on math, because the goal is just to give a general understanding.
The idea of Monte Carlo methods is this: generate random samples of some random variable of interest, then use these samples to compute the values you’re interested in.
I know, super broad. The truth is Monte Carlo has a ton of different applications. It’s…Continue
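As a minimal sketch of that idea (not taken from the original tutorial), the snippet below draws samples of a random variable X that is uniform on [0, 1] and averages X² over them to estimate E[X²], whose exact value is 1/3. The function name `monte_carlo_mean`, the choice of X, and the sample count are all assumptions made for illustration.

```python
import random


def monte_carlo_mean(sample_fn, value_fn, n_samples, seed=0):
    """Estimate E[value_fn(X)] by averaging over random samples of X.

    sample_fn draws one sample of X from the given random generator;
    value_fn maps a sample to the quantity we want to average.
    (Both names are hypothetical helpers for this sketch.)
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_samples):
        total += value_fn(sample_fn(rng))
    return total / n_samples


# X ~ Uniform(0, 1); the quantity of interest is E[X^2] = 1/3.
estimate = monte_carlo_mean(
    sample_fn=lambda rng: rng.random(),
    value_fn=lambda x: x * x,
    n_samples=100_000,
)
print(estimate)
```

More samples generally give a tighter estimate; the error shrinks roughly with the square root of the sample count.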
There are many advantages of using cloud computing solutions. It helps streamline workflow, makes it easier for the employees to report for duty, collaborate online, and enjoy access to key information. Cloud hosting also helps cut down operational costs. The biggest benefit of using cloud computing is that it allows businesses to concentrate on core business responsibilities rather than worrying about security, maintenance, back-up, support, and…Continue
Added by Fred Valentine on July 22, 2015 at 10:32pm — No Comments
Currently I am working on a number of projects for scaling and developing BI solutions in Qlikview for our clients. Here are some lessons I have learned:
Scaling: when it comes to Qlikview, this is the most difficult question I have been asked. The reason is that part of the answer (and, dare I say, the most crucial part) lies with the person asking it, i.e. the end customer: the amount of data they expect the system to hold. Once we have this answer and keeping in consideration…Continue
Added by Vishal Sharma on July 21, 2015 at 6:17am — No Comments
There is a lot of noise about data visualization in the BI space these days. Simple charts, complex charts (a few of which many business users cannot digest :) ), tons of properties and configurations, and much more.
Yes, great work has been done in this area by most of the vendors in the BI marketplace today, and all decision makers are enthusiastic…Continue
Added by Kartik Patel on July 20, 2015 at 12:00am — No Comments
Linear regression is one of the first things you should try if you’re modeling a linear relationship (actually, non-linear relationships too!). It’s fairly simple, and probably the first thing to learn when tackling machine learning.
At first, linear regression shows up just as a simple equation for a line. In machine learning, the weights are usually represented by a vector θ (in statistics they’re often represented…Continue
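To make the "equation for a line" concrete, here is a small sketch (not the original post's code) that fits a one-variable line h(x) = θ₀ + θ₁x by ordinary least squares using the closed-form formulas. The function names `fit_line` and `predict` and the toy data are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Fit y ≈ theta[0] + theta[1] * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    theta1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    theta0 = mean_y - theta1 * mean_x
    return [theta0, theta1]


def predict(theta, x):
    """Evaluate the fitted line at x."""
    return theta[0] + theta[1] * x


# Toy data generated from y = 1 + 2x (made up for this example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
theta = fit_line(xs, ys)
print(theta, predict(theta, 5.0))
```

With more than one input feature, θ grows to one weight per feature plus an intercept, and the same least-squares idea is usually solved with linear algebra rather than these scalar formulas.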
The job of a data analyst nowadays has become very extensive, in its need to cover a number of different and ever-changing tasks.
A data analyst must query a variety of internal and external data sources, each with a different access protocol and format; integrate these data with results from REST and web services queried over the Internet, such as Google API or any social media channel; exchange information with business analysts, who, while lacking the deep mathematical background,…Continue
Added by Rosaria Silipo on July 14, 2015 at 2:27am — No Comments
It’s important to know what goes on inside a machine learning algorithm. But it’s hard. There is some pretty intense math happening, much of which is linear algebra. When I took Andrew Ng’s course on machine learning, I found the hardest part was the linear…Continue
Added by Alex Woods on July 10, 2015 at 10:30pm — No Comments
R has become a massively popular language for data mining and predictive model building, with over two million users worldwide. The wide adoption of R has to do with the fact that it is available as open source, runs on most technology platforms, and is commonly taught in academic institutions in courses with significant components of data science, machine learning and statistics. A recent study found that R is now cited in academic papers more often than SAS and SPSS, a change from previous…Continue
Added by Mark Rabkin on July 9, 2015 at 5:59pm — No Comments
Good Morning and Welcome to this edition of the Morning Analytic Coffee Blog.
Sorry for the recent delays, between holidays (Happy Canada Day and 4th of July, everyone!) and some personal items I’ve attended to.
For a while, I’ve been generalizing about Analytics. Today, I am getting…Continue
Added by Richard D. Quodomine on July 8, 2015 at 6:04am — No Comments
Random Forest is a machine learning algorithm used for classification, regression, and feature selection. It's an ensemble technique, meaning it combines the outputs of many instances of a weaker technique in order to get a stronger result.
The weaker technique in this case is a decision tree. Decision trees work by splitting and re-splitting the data by…Continue
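The "combine weak outputs" part of an ensemble can be sketched with plain majority voting. This is not a random forest: the three threshold rules below are hypothetical stand-ins for trained decision trees, and the feature meanings and cutoffs are invented for illustration.

```python
from collections import Counter


def majority_vote(classifiers, x):
    """Return the label predicted by the most classifiers for input x."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Three hypothetical weak rules on a (height_cm, weight_kg) tuple.
# In a real random forest each of these would be a decision tree
# trained on a random sample of the data.
weak_rules = [
    lambda p: "large" if p[0] > 170 else "small",
    lambda p: "large" if p[1] > 70 else "small",
    lambda p: "large" if p[0] + p[1] > 240 else "small",
]

# Two of the three rules say "large", so the ensemble does too.
prediction = majority_vote(weak_rules, (180, 65))
print(prediction)
```

Using an odd number of voters for a two-class problem avoids ties in the vote.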
Added by Alex Woods on July 4, 2015 at 8:30am — No Comments
When you're cleaning up data, you usually end up using 5-8 functions a ton of times, and then a few more once or twice. Here are those 5-8 functions I find myself using again and again.
Here is a quick overview:
names() - returns the column names of a dataset…Continue
Added by Alex Woods on July 4, 2015 at 8:00am — No Comments