A Data Science Central Community
This is an example where data science and statistical analysis are superior to intuition. Here, intuition misleads you into the wrong conclusions.
By twin data points, I mean observations that are almost identical. In any 2- or 3-dimensional data set with 300+ rows, if the data is quantitative and evenly distributed in a…
In my recent article on a new, robust coefficient of correlation and R Squared, I mentioned an algorithm to generate random permutations:
Added by Vincent Granville on June 10, 2013 at 9:30pm
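The teaser above does not reproduce the permutation algorithm it refers to, so the following is only a minimal sketch of one standard way to generate a uniformly random permutation, the Fisher-Yates shuffle. It is not necessarily the algorithm from the referenced article; the function name `random_permutation` and its `rng` parameter are illustrative assumptions.

```python
import random


def random_permutation(n, rng=None):
    """Return a uniformly random permutation of 0..n-1.

    Uses the Fisher-Yates shuffle. This is a stand-in technique,
    not necessarily the algorithm from the referenced article.
    """
    if rng is None:
        rng = random.Random()
    perm = list(range(n))
    # Walk from the last position down to the second one, swapping each
    # element with a randomly chosen element at or before its position.
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm
```

Passing a seeded generator, for example `random_permutation(10, random.Random(42))`, makes the result reproducible, which can help when comparing correlation estimates across runs.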
With big data, one sometimes has to compute correlations involving thousands of buckets of paired observations or time series. For instance, a data bucket may correspond to a node in a decision tree, a customer segment, or a subset of observations sharing the same multivariate feature. Specific contexts of interest include multivariate feature selection (a combinatorial problem) or identification of best predictive set…
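As a rough illustration of computing one correlation per bucket, the sketch below groups paired observations by a bucket label and computes the ordinary Pearson correlation within each group. It uses plain Pearson correlation as a stand-in, not the robust coefficient discussed elsewhere on this page; the `(bucket_id, x, y)` row layout and the function names are assumptions made for the example.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Returns None when it is undefined (fewer than two points,
    or zero variance in either variable).
    """
    n = len(xs)
    if n < 2:
        return None
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def correlations_by_bucket(rows):
    """Map each bucket id to the correlation of its (x, y) pairs.

    `rows` is an iterable of (bucket_id, x, y) tuples; this layout
    is a hypothetical one chosen for the example.
    """
    buckets = defaultdict(lambda: ([], []))
    for bucket_id, x, y in rows:
        xs, ys = buckets[bucket_id]
        xs.append(x)
        ys.append(y)
    return {b: pearson(xs, ys) for b, (xs, ys) in buckets.items()}
```

A bucket could be a decision-tree node or a customer segment; buckets with too few observations come back as `None` rather than as a misleading number.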