A Data Science Central Community
This is an example where data science and statistical analysis are superior to intuition. Here, intuition misleads you into the wrong conclusions.
By twin data points, I mean observations that are almost identical. In any 2- or 3-dimensional data set with 300+ rows, if the data is quantitative and evenly distributed in a…Continue
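The excerpt above is cut off, so the following is only a rough sketch of the effect it seems to describe, not the author's own experiment. It assumes "evenly distributed" means uniform on the unit square and uses the post's figure of 300 rows in 2 dimensions. The seed, the helper name `closest_pair_distance`, and the grid-spacing yardstick are all illustrative choices.

```python
import itertools
import math
import random


def closest_pair_distance(points):
    """Smallest Euclidean distance between any two points (brute force, O(n^2))."""
    return min(math.dist(p, q) for p, q in itertools.combinations(points, 2))


rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
n_rows, n_dims = 300, 2

# Assumption: "evenly distributed" is modeled as uniform on the unit square.
points = [tuple(rng.random() for _ in range(n_dims)) for _ in range(n_rows)]

closest = closest_pair_distance(points)

# Yardstick: the spacing the same number of points would have on a regular grid.
grid_spacing = (1 / n_rows) ** (1 / n_dims)

print(f"closest pair: {closest:.4f}, regular-grid spacing: {grid_spacing:.4f}")
```

Intuition tends to expect random points to be spaced about as far apart as a regular grid would place them. Comparing the two printed numbers shows how much closer the nearest pair actually is.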
My new blog post on how to do the equivalent of SQL's "CREATE TABLE" in the Pandas Python Data Analysis Library. Sounds simple, but I wasn't able to find such an example anywhere on the web.
Added by Michael Malak on June 26, 2013 at 1:29pm — No Comments
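The linked post is not reproduced in this listing, so the snippet below is only one common way to get a "CREATE TABLE"-like result in pandas and may differ from the approach in the post. It builds an empty DataFrame whose columns already carry dtypes. The column names and types in `schema` are made up for illustration.

```python
import pandas as pd

# Hypothetical schema: column name -> dtype, playing the role of the
# column list in a SQL CREATE TABLE statement.
schema = {
    "id": "int64",
    "name": "object",
    "price": "float64",
}

# One empty, typed Series per column gives an empty table with a fixed schema.
table = pd.DataFrame({col: pd.Series(dtype=dtype) for col, dtype in schema.items()})

print(table.dtypes)
```

Declaring the dtypes up front is what separates this from a bare `pd.DataFrame(columns=[...])`, which leaves every column as `object`.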
Intellipaat will start a new Hadoop Developer batch on June 29, 2013. Interested candidates can send an email to sales(@)intellipaat(dot)com.
Sales Team, Intellipaat
Here is a crisp infographic presenting the top 5 data products that can have a significant dollar impact in the airline industry:
1. Property recommender
2. Word of mouth modeler
3. Funnel friction Spotter
4. Traveller churn scorer
5. Sentiment Analyzer
In this series I reveal and explain rules of intelligence contained within grammar, that can be utilized to unleash intelligence in software. These rules are extremely simple, but still undiscovered by scientists.
To explain how assumptions are made, we first need to understand how that differs from drawing conclusions:
• Conclusions are drawn straight ahead - top-down - like in: Given "John is a father." and "…
Added by Menno Mafait on June 19, 2013 at 10:39pm — No Comments
This is a review committee working group for practitioners and academics to establish a formal definition and set of classification criteria regarding business analytics model risk.
This group has been established based upon interest and feedback concerning a recent set of posts regarding business analytics model risk: …Continue
Added by Scott Mongeau on June 19, 2013 at 6:53am — No Comments
Added by Scott Mongeau on June 17, 2013 at 10:00am — No Comments
Seven Questions on Adopting Analytics Culture
Seven questions are posed and addressed in turn. The theme: ‘how can organizations adopt an analytics-based decision-making culture?’
In particular, the questions address the use of change management to adopt evidence-based decision making, associated organizational challenges, and how analytics can be used to manage organizational change itself:…Continue
Business analytics model risk (part 0 of 5): framing model risk - the complexity genie and the challenge of deciding on decision models
Introduction to a series of five articles on model risk
Here we introduce a series of five articles seeking to frame, define, and categorize business analytics model risk. The intention is to propose processes and practices for strengthening organizational decision model risk mitigation. The series of five…Continue
Added by Scott Mongeau on June 13, 2013 at 3:54pm — No Comments
My new blog post on querying Hive from iPython Notebook with pandas, the Python alternative to R:
Added by Michael Malak on June 13, 2013 at 9:44am — No Comments
In my recent article on a new, robust coefficient of correlation and R Squared, I mentioned an algorithm to generate random permutations:
Added by Vincent Granville on June 10, 2013 at 9:30pm — No Comments
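The random-permutation post above does not show its algorithm in this excerpt. As a point of reference only, here is the standard Fisher-Yates shuffle, which may or may not be the algorithm the article uses. The function name and the optional `rng` parameter are illustrative.

```python
import random


def random_permutation(n, rng=None):
    """Return a uniformly random permutation of 0..n-1 (Fisher-Yates shuffle)."""
    rng = rng or random.Random()
    perm = list(range(n))
    # Walk from the last position down to 1, swapping each element with a
    # randomly chosen element at or before it.
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm


print(random_permutation(10, random.Random(42)))
```

Each of the n! orderings is equally likely, and the shuffle runs in linear time, which matters when many permutations are needed.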
I attended AngelHack Sydney recently, in May.
AngelHack is a hackathon where developers and entrepreneurs come together to prototype a viable business idea within 24 hours.
The project that I worked on was called "DropQuery". The basic concept is this.
* You have some data files - CSV, XLS, XML
* You want to quickly query them.
I talked to a few people at the…Continue
Added by Eric Bae on June 6, 2013 at 6:28pm — No Comments
The next day's wind speed is forecast using different weather parameters. The forecast wind speed is converted into electricity supply forecasts using wind turbine power curves. All forecasts have errors. Hence, most wind energy generators supply their electricity directly into the real-time markets.
This causes them to leave a lot of upside on the table.
This is because electricity cannot be stored. All that is produced is either consumed or grounded. Shortfall…Continue
Added by Parag Patil on June 6, 2013 at 5:52pm — No Comments
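The step in the wind-energy post above, turning a forecast wind speed into a supply forecast through a power curve, can be sketched as a lookup with linear interpolation. The curve points and cut-out speed below are invented for illustration and do not describe any real turbine. `power_kw` is a hypothetical helper name.

```python
from bisect import bisect_right

# Hypothetical power curve: (wind speed in m/s, output in kW).
# These values are made up for illustration only.
POWER_CURVE = [(0.0, 0.0), (3.0, 0.0), (6.0, 300.0), (9.0, 1100.0),
               (12.0, 2000.0), (25.0, 2000.0)]
CUT_OUT_SPEED = 25.0  # assumed speed above which the turbine shuts down


def power_kw(wind_speed):
    """Linearly interpolate the power curve at a forecast wind speed."""
    if wind_speed < 0 or wind_speed > CUT_OUT_SPEED:
        return 0.0
    speeds = [s for s, _ in POWER_CURVE]
    i = bisect_right(speeds, wind_speed)
    if i == len(speeds):  # exactly at the last curve point
        return POWER_CURVE[-1][1]
    (s0, p0), (s1, p1) = POWER_CURVE[i - 1], POWER_CURVE[i]
    return p0 + (p1 - p0) * (wind_speed - s0) / (s1 - s0)


# Hypothetical hourly wind speed forecast (m/s) mapped to output (kW).
hourly_forecast = [4.5, 7.5, 13.0]
supply_forecast = [power_kw(v) for v in hourly_forecast]
print(supply_forecast)
```

Because the curve is steep between cut-in and rated speed, a small error in the wind forecast can become a large error in the supply forecast, which is the difficulty the post points to.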
Marketers are seeing the number of communication channels available to them increasing, while their budgets remain the same. The answer is not necessarily to do more marketing but to integrate the use of these channels effectively in order to find the right set of actions which bring the highest conversion rates, maximum profits and the most satisfied customers. For that, brands can rely on the analysis of consumer behavior, highlighting the customer lifecycle before purchasing and…Continue
Added by Michel Bruley on June 5, 2013 at 1:22am — No Comments
With big data, one sometimes has to compute correlations involving thousands of buckets of paired observations or time series. For instance, a data bucket might correspond to a node in a decision tree, a customer segment, or a subset of observations having the same multivariate feature. Specific contexts of interest include multivariate feature selection (a combinatorial problem) or identification of best predictive set…Continue
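As a rough sketch of the bookkeeping involved, the snippet below groups paired observations by bucket and computes one correlation per bucket. It uses the ordinary Pearson coefficient as a stand-in, since the excerpt does not spell out which measure the article uses. The `(bucket, x, y)` row layout and both function names are assumptions.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences, or None if undefined."""
    n = len(xs)
    if n < 2:
        return None
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:  # a constant series has no defined correlation
        return None
    return sxy / sqrt(sxx * syy)


def correlation_by_bucket(rows):
    """rows: iterable of (bucket, x, y). Returns {bucket: correlation or None}."""
    buckets = defaultdict(lambda: ([], []))
    for bucket, x, y in rows:
        xs, ys = buckets[bucket]
        xs.append(x)
        ys.append(y)
    return {b: pearson(xs, ys) for b, (xs, ys) in buckets.items()}


# Tiny made-up data set: three buckets of paired observations.
rows = [("a", 1, 2), ("a", 2, 4), ("a", 3, 6),
        ("b", 1, 3), ("b", 2, 2), ("b", 3, 1),
        ("c", 1, 1)]
print(correlation_by_bucket(rows))
```

A single pass over the rows is enough to fill every bucket, so the cost grows with the number of observations rather than with the number of buckets.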
The Big Data age is dawning. Just like every major emerging opportunity that presents an unprecedented competitive advantage across sectors, we might not know the ultimate outcome with this journey at this stage. But everyone wants to know: who will race to the finish line and come out on top? Whether you're the turtle or the hare, it's important to observe and manage the signs presented to you on the road to victory. Here's a collection of differentiators that will lead to…Continue
Added by Radhika Subramanian on June 4, 2013 at 11:02am — No Comments
Ever find yourself waiting for your data to appear and you start wondering if you are paying for sins from a past life? Dante must have been thinking of this situation as he created his Circles of Hell. There is no way the agony of waiting for data to appear with a looming deadline did not make his list!
Often it seems like the closer the deadline, the slower the connection. When working with data locally, the data appears…Continue
Added by Tricia Aanderud on June 4, 2013 at 5:23am — No Comments