A Data Science Central Community
Randomness is all around us. Its existence sends fear into the hearts of predictive analytics specialists everywhere -- if a process is truly random, then it is not predictable, in the analytic sense of that term. Randomness refers to the absence of patterns, order, coherence, and predictability in a system.
When you're cleaning up data, you usually end up using 5-8 functions a ton of times, and then a few more once or twice. Here are the 5-8 functions I find myself using again and again.
Here is a quick overview:
names() - returns the column names of a dataset…Continue
Added by Alex Woods on July 4, 2015 at 8:00am — No Comments
We all love features... lots of features! ...in our new cars, in our gadgets, in our smart phones, in our toys, and in our data sets!
Consider this toy that we found at the thrift store for under $6.00:
This toy house delivers numerous musical and other sound effects that are triggered whenever one of the features in the house is pressed, or…Continue
Added by Kirk Borne on January 9, 2015 at 2:30pm — No Comments
In a previous article, we defined data characterization as a “methodology for generating descriptive parameters that describe the behavior and characteristics of a data item, for use in any unsupervised learning algorithm to find features, clusters, patterns, and trends in the data without the bias of incorporating class…Continue
Added by Kirk Borne on May 31, 2014 at 11:30am — No Comments
The extended annotated version of the "Big Data A to Z Glossary of my Favorite Data Science Things" is now live at: http://bit.ly/1g5NcBt
However, the original…Continue
Added by Kirk Borne on March 20, 2014 at 3:00pm — No Comments
Check out the new post on DataScienceCentral:
Added by Kirk Borne on February 28, 2014 at 5:16pm — No Comments
The application of analytics and data science methods to people is becoming an increasingly common use case for Big Data in the workplace. For example:
Added by Kirk Borne on February 15, 2014 at 11:09am — No Comments
Added by Kirk Borne on February 11, 2014 at 6:48pm — No Comments
Do you know which skill was the most sought-after among HR professionals in 2014?
It is analytics.
More than 85% of HR professionals feel they will be able to do their job better if they pick up data analysis skills. (Source: http://jigsawacademy.com/em/2014/01/01/)
Jigsaw Academy is a leader in…Continue
Added by Kirk Borne on January 21, 2014 at 7:59am — No Comments
Here are 13 books on Machine Learning and Data Mining that are great resources, references, and refreshers for Data Scientists. (This is definitely a small selective subsample of the many excellent books available.)
Added by Kirk Borne on January 15, 2014 at 1:30pm — No Comments
Here are 13 informative and inspirational books on Big Data and Data Science. This is definitely not intended to be a comprehensive list (since a complete list of such readings would itself be a form of "Big Data", and consequently the number of possibilities is nearly uncountable! NOTE: "uncountable" = an infinite set that contains too many elements to be countable.)…Continue
When we devote so much time and energy talking about Big Data, are we neglecting the important things that you can do with Small Data?
Maybe, but... probably not. Continue