October 2012 Blog Posts (44)

Traditional BI vs Data Analytics Approach

See: …


Added by Michael Walker on October 31, 2012 at 10:37am — No Comments

Simple technique to improve poor predictive models

This technique does not exploit the original data used to produce the model, but just the predicted and observed values, and nothing else. It was initially designed in the context of time series, to improve daily weather forecasts or daily stock trading signals.

The enhanced model in the chart below is an example of…


Added by Vincent Granville on October 30, 2012 at 10:00pm — 2 Comments

De-duplicating, merging customer records with clustering

Frustrated with multiple records for the same customer that differ only by a typo, an abbreviation, or a different representation of the same address?

Duplicate customer records can be very tricky. They suffer from problems such as abbreviated addresses, typos, and multiple possible representations of the same name and address.

read more @...…


Added by Venkatesh Umaashankar on October 30, 2012 at 8:50am — 1 Comment

SAS Stored Process: Using Alternating Background Color with PROC REPORT

I remember when I began learning how to program using SAS. One of my first tasks was to create a set of reports that listed the trouble tickets from the customer. This was before it was easy to get information from SAS to Excel, and the goal was to have the reports on the website so they could be reviewed easily.

All Hail My SAS Heroes!

PROC REPORT results were ugly and hard to read - the manager did not complain but he wasn't impressed.  What a breakthrough when I found…


Added by Tricia Aanderud on October 30, 2012 at 6:33am — No Comments

Variable Reduction and significance

Hi everyone! Greetings from South America! At the moment I'm working with 70 weather variables, and many of them show multicollinearity. I'd like to know if there is a way of creating one variable (from every 12) that summarizes the information contained in the rest. I've done PCA, clustering, and GWR (in GIS), and I've tried GLM (SAS) and stepwise selection (forward and backward), but I can't find any variables that are relevant to the dependent variable.
Thanks a lot!

Added by Mauricio G. on October 30, 2012 at 6:15am — 2 Comments

Optimization plugin for RapidMiner

Optimization, in general, means selecting the best choice among various alternatives, typically the one that minimizes the cost or disadvantage measured by an objective. Optimization problems are very common in fields such as economics, finance, and logistics.


more... @ …


Added by Venkatesh Umaashankar on October 29, 2012 at 4:12am — No Comments

Application of Analytics

Author: Rahul Nawab, Co-founder, IQR Analytics and Promoter, ADSA (Academy for Decision Science and Analytics).


In this article we will talk about some successful applications of analytics. We will start with two examples that I came across recently in an Accenture podcast, and then move on to other industry examples.

1.     Harrah’s Entertainment


Gary Loveman, CEO of…


Added by AcademyForDecisionScience&Analyt on October 29, 2012 at 12:59am — No Comments

Reverse word of mouth advertising to find a new position

This simple computational marketing technique, called reverse endorsement, will let you (indirectly) sell yourself to hiring managers, without having to write anything or talk to anyone.

Reverse endorsement leverages the "Skills & Expertise" feature now found on most LinkedIn profiles. I believe that when LinkedIn…


Added by Vincent Granville on October 28, 2012 at 8:30pm — No Comments

Top 16 articles for week ending October 24

From Data Science Central and AnalyticBridge

  1. Splunk for Hadoop – Real Time Analytics…

Added by Vincent Granville on October 25, 2012 at 10:00pm — No Comments

How to reverse-engineer Google?

Or blending data science with the art of search engine optimization (SEO). Here we propose a statistical methodology to increase the amount of organic traffic that a web site receives from Google for specific keywords, leveraging SEO principles to make it a real science, not just an art.

Traditionally, SEO (when implemented by…


Added by Vincent Granville on October 25, 2012 at 9:30pm — 3 Comments

Row vs Columnar vs NoSQL Databases




Added by Michael Walker on October 24, 2012 at 5:43pm — No Comments

10 Tips to Create Useful & Beautiful Visualizations


Added by Vincent Granville on October 24, 2012 at 1:00pm — No Comments

Key words through graph entropy Hierarchical clustering


In the last post I showed how to extract key words from a text through a principle called graph entropy.

Today I'm going to show another application of graph entropy: extracting clusters of key words.


The key words of a document depict the main topic of the content, but if the document is large, there are often many different subtopics related to the…


Added by Cristian Mesiano on October 24, 2012 at 11:34am — No Comments

CrowdANALYTIX launches series of ideation and data mining contests in collaboration with Indian School of Business

The Indian School of Business, one of the premier management institutions in India and rated among the Top 20 B-Schools in the world for its postgraduate programs, has collaborated with CrowdANALYTIX to initiate a contest that will help students collaborate with a large community of data scientists on an ideation problem. This is the first in a sequence of 4 including ideation and data mining…


Added by Aravind on October 24, 2012 at 12:52am — No Comments

36 articles from top news outlets

Here is our selection for this week:

  1. Data Science: Neither Elementary Nor Magic…

Added by Vincent Granville on October 24, 2012 at 12:30am — No Comments

Simplifying Big Data Analytics

Many analytics and data projects are now considering big data initiatives. With so much buzz about big data, organizations have started investing, or are thinking of investing, in Hadoop. While it is great to stay on top of trends, this often ends up being another investment whose full benefit and potential are simply not realized. The learning curve is too steep and the time to implement too long. Current analytics resources lack the strong programming skills required to…


Added by Rahul Deshmukh on October 23, 2012 at 1:35pm — No Comments

Greenplum & Kaggle partnership, and OpenChorus

Greenplum is very excited to announce the availability of the Greenplum Chorus source code and continue its goal of enabling organizations to derive greater insight and economic value from Big Data through a partnership with Kaggle, a platform for data science competitions. 

Greenplum’s …

Added by Vincent Granville on October 23, 2012 at 11:28am — No Comments

Want to receive a Complimentary Executive Summary of Our Big Data Analytics Study?

Big Boom in Big Data Analytics Research!

400+ Survey Respondents from Fortune 1000 Companies…


Added by Leslie Ament on October 20, 2012 at 10:34am — No Comments

Top 27 articles for week ending October 18

From Data Science Central and AnalyticBridge

  1. Sponsored: Free 30-day Vertica trial (HP's analytic and data base platform)…

Added by Vincent Granville on October 18, 2012 at 12:00pm — No Comments

Big Data & Text Mining: Finding Nuggets in Mountains of Textual Data


A large amount of information is available in textual form in databases or online sources, and for many enterprise functions (marketing, maintenance, finance, etc.) it represents a huge opportunity to improve their business knowledge. For example, text mining is starting to be used in marketing, more specifically in analytical customer relationship management, in order to achieve the holy 360° view of the customer (integrating…


Added by Michel Bruley on October 18, 2012 at 7:16am — No Comments
