April 2013 Blog Posts (25)

How to install MapR M3 on Ubuntu through Ubuntu Partner Archive.

In a recent post I mentioned the partnership between MapR and Canonical, an initiative to make Hadoop natively available on Ubuntu through the Ubuntu Partner Archive. Since the package has now been released, I thought I would show how to get it done. Trust me, it's really cool to…


Added by Mohammad Tariq Iqbal on April 30, 2013 at 7:17pm — No Comments

Shooting stars

This is a follow-up to our video series From chaos to clusters, made with data points moving over time to form clusters, and produced with open source and home-made data science algorithms.…


Added by Vincent Granville on April 29, 2013 at 7:00pm — 1 Comment

Vincent, one way bumpiness might be interpreted...

from a practitioner's perspective, is that it is a measure of noise, a detector of outliers that may show up as unaccounted-for noise, from the way, say, a process is producing the data, even a data-entry process, or some other force or process/system giving rise to that (those) particular noise(s). Yes?

So the thing would be to try various tactics for reducing bumpiness, maybe by screening those outliers, etc., and even running a TSA on the residuals after factoring or…


Added by Bill Luker Jr on April 29, 2013 at 3:33pm — 1 Comment

Predicting the reality show winner using big data

I wonder: using big data and predictive analytics, can we predict the winner of The X Factor or American Idol from their audition performance alone? I think we might have a good chance of predicting the winner right away.

If we only had the information from their first performance, which variables should the predictive model use? Here's what I could think of:

  • The voice: quantified timbre, energy and rhythm
  • Song…

Added by Eka Aulia on April 29, 2013 at 9:25am — No Comments

Hadoop Herd : When to use What...

Eight years ago, not even Doug Cutting would have thought that the tool he was naming after his kid's soft toy would so soon become all the rage and change the way people and organizations look at their data. Today Hadoop and BigData have almost become synonymous. But Hadoop is not just Hadoop now. Over time it has evolved into one big…


Added by Mohammad Tariq Iqbal on April 25, 2013 at 6:55pm — No Comments

From chaos to clusters - statistical modeling without models

Here I provide the mathematics, explanations and source code to produce the data and moving clusters in the From chaos to clusters video series.…


Added by Vincent Granville on April 24, 2013 at 11:00pm — 8 Comments

Hadoop+Ubuntu : The Big Fat Wedding.

Now, here is a treat for all you Hadoop and Ubuntu lovers. Last month, Canonical, the organization behind the Ubuntu operating system, partnered with MapR, one of the Hadoop heavyweights, in an effort to make Hadoop available as an integrated part of Ubuntu through its repositories. The partnership announced that…


Added by Mohammad Tariq Iqbal on April 20, 2013 at 8:10pm — No Comments

Take-aways from IE's first Predictive Analytics Summit in the Asia-Pacific

Having obtained special approval for the budget, I was of course going to take in whatever the event offered during two short but fruitful days in Hong Kong, 18th to 19th April 2013. My overall feedback is positive, and I would recommend that companies spend their training budget on sending analytics people here instead of keeping them in the classroom to learn Stat 101. This event is meant for practitioners.

The Analytics market in…


Added by Jeffrey Ng on April 20, 2013 at 7:00pm — No Comments

Weekly Digest - April 22

Sponsored Announcements


Added by Vincent Granville on April 20, 2013 at 6:00pm — No Comments

Should business and engineering schools develop joint programs?

Businesses are increasingly using data-driven methods to make business decisions. Hence, there is a need for people with both good business skills and programming/quant skills. Finance/Accounting PhDs and other business PhDs do have such skills, but they are few in number and costly to hire, and most prefer academia anyway. This limits businesses mainly to hiring bachelor's- or master's-level candidates.

However, a majority of the…


Added by Srinivasan Krishnamurthy on April 19, 2013 at 10:30am — 2 Comments

Selected articles posted this week


Staff Postings…


Added by Vincent Granville on April 17, 2013 at 10:30pm — No Comments

The amateur data scientist and her projects

With so much data available for free everywhere, and so many open tools, I would expect to see the emergence of a new kind of analytic practitioner: the amateur data scientist.

Just like the amateur astronomer, the amateur data scientist will significantly contribute to the art and science, and will eventually solve…


Added by Vincent Granville on April 17, 2013 at 5:00pm — No Comments

How do you solve your "Pre-Etl" Source to target mapping problems?

It's one of the integration problems that most of the big players in the industry have pretty much left untouched. Anyone working in the data integration / data warehousing industry understands that when you build a data warehouse, you have to create complex pre-ETL source mappings before the ETL developers start work. Most organizations do this with spreadsheets, and every organization has an exorbitant number of spreadsheets that it uses to document this stuff. Once…


Added by Mohammad Azad on April 15, 2013 at 2:30pm — 1 Comment

How can I predict when my customers will churn, and could Big Data help?

Attracting thousands of new customers is worthless if an equal number leave. Minimizing customer churn is surely a smart objective. But how can I predict when my customers will churn, and could Big Data help?


To address this topic I did some personal research and put together a synthesis, which helped me clarify some ideas. The attached presentation is not intended to be exhaustive, but could perhaps bring you some useful…


Added by Michel Bruley on April 15, 2013 at 6:36am — No Comments

Intellipaat provides Hadoop online Training.

Hi, we will start a new Hadoop Developer batch on 20th April ’13. Certification will be provided after successful completion of training.

Interested candidates, please drop an email for registration at [email protected] or give us a call.


Sales Intellipaat Team

Mob: 91-9019368913  

Visit us at…


Added by soniya on April 15, 2013 at 5:06am — No Comments

Unleashing Intelligence through natural language (Part 2 - Autonomously generated conclusions)

In this series I reveal rules of intelligence contained within grammar, and explain how they can be utilized to unleash intelligence in software. These rules are extremely simple, but still undiscovered by scientists.

Under certain conditions, three types of conclusions can be generated autonomously:

1) Specification substitution conclusion:

• Given "John is a father" and "A father is a man";

• Because of the common word…


Added by Menno Mafait on April 15, 2013 at 2:48am — No Comments

Is your data really Big(Data)??

The advent of so many notable tools and technologies for handling BigData problems has made life easier for a lot of people and organizations. Many of these are open source, well supported, and backed by good, active communities. But there is another side to it. When things become easy, free, well supported, and abundant, we often start to over-utilize them. Having said that, I would like to share one incident.

We organize …


Added by Mohammad Tariq Iqbal on April 14, 2013 at 4:05pm — No Comments

Three classes of metrics: centrality, volatility, and bumpiness

All statistical textbooks focus on centrality (median, average or mean) and volatility (variance). None mention the third fundamental class of metrics: bumpiness.

Here we introduce the concept of bumpiness and…


Added by Vincent Granville on April 14, 2013 at 2:30pm — 7 Comments

How to better compete with other data scientists

In two weeks, you can greatly improve your resume by learning new stuff, at no cost, without attending any classes. All the links below contain actual material to get you started.

Check out the following resources:…


Added by Vincent Granville on April 12, 2013 at 10:30am — No Comments

BI reporting tools: To Pay or Not To Pay.

Free VS Non-Free

Our company recently implemented a BI Reporting solution intended for a retail company with both an offline retail chain and an online web shop. This was a demo project, so during the planning phase our team decided to use two different platforms, including different back-end and front-end software.

In general, we planned to compare the free and non-free software available for similar BI Reporting implementation.

We chose the following platforms for…


Added by Alexey on April 8, 2013 at 4:58am — No Comments

On Data Science Central

© 2021   TechTarget, Inc.   Powered by
