*Guest blog post by Mic Farris. Mic is a Decision Science & Analytics Leader at CenturyLink.*

Two of the biggest buzzwords in our industry are “big data” and “data science”. Big data attracts a lot of interest right now, but data science is fast becoming a very hot topic.

I think there’s room to really define the **science** of data science: what are the fundamentals needed to make data science *truly* a science we can build upon?

What follows is such a set of fundamentals:

**Fundamentals of Data Science**

**Introduction**

The easiest thing for people within the big data / analytics / data science disciplines to do is to say “I do data science”. However, when it comes to data science fundamentals, we need to ask the following critical questions: What really is “data”, what are we trying to do with data, and how do we apply scientific principles to achieve our goals with data?

- What is Data?
- The Goal of Data Science
- The Scientific Method

**Probability and Statistics**

The world is a probabilistic one, so we work with data that is probabilistic, meaning that, given a certain set of preconditions, data will appear to you in a specific way *only part of the time*. To apply data science properly, one *must* become familiar and comfortable with probability and statistics.

- The Two Characteristics of Data
- Examples of Statistical Data
- Introduction to Probability
- Probability Distributions
- Connection with Statistical Distributions
- Statistical Properties (Mean, Mode, Median, Moments, Standard Deviation, etc.)
- Common Probability Distributions (Discrete, Binomial, Normal)
- Other Probability Distributions (Chi-Square, Poisson)
- Joint and Conditional Probabilities
- Bayes’ Rule
- Bayesian Inference
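
To make the last few topics concrete, here is a minimal sketch of Bayes’ rule for a single yes/no hypothesis. The function name `posterior` and the numbers (a 1% base rate, a 95% detection rate, a 5% false-alarm rate) are made up for illustration and do not come from the full article.

```python
def posterior(prior, p_data_given_h, p_data_given_not_h):
    """Bayes' rule for a binary hypothesis H given observed data D.

    prior              -- P(H) before seeing the data
    p_data_given_h     -- P(D | H)
    p_data_given_not_h -- P(D | not H)
    """
    # Total probability of the data under both hypotheses.
    evidence = p_data_given_h * prior + p_data_given_not_h * (1.0 - prior)
    return p_data_given_h * prior / evidence


# Hypothetical numbers: 1% base rate, 95% detection rate, 5% false-alarm rate.
p = posterior(prior=0.01, p_data_given_h=0.95, p_data_given_not_h=0.05)
print(f"P(H | D) = {p:.3f}")  # prints "P(H | D) = 0.161"
```

Even with a fairly accurate observation, the posterior stays low because the prior is small. Feeding the result back in as the next prior is the simplest form of Bayesian inference over repeated observations.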

**Decision Theory**

This section is one of the key fundamentals of data science. Whether in scientific, engineering, or business fields, we are trying to make decisions using data. Data itself isn’t useful unless it’s telling us something, which means **we’re making a decision about what it is telling us**. How do we come up with those decisions? What are the factors that go into this decision-making process? What is the best method for making decisions with data? This section tells us…

- Hypothesis Testing
- Binary Hypothesis Test
- Likelihood Ratio and Log Likelihood Ratio
- Bayes Risk
- Neyman-Pearson Criterion
- Receiver Operating Characteristic (ROC) Curve
- M-ary Hypothesis Test
- Optimal Decision Making
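
As a rough illustration of how several of these topics fit together, the sketch below runs a binary hypothesis test on one measurement using the log likelihood ratio, then computes a few points on the ROC curve. It assumes both hypotheses are Gaussian with the same standard deviation and that the mean under H1 is larger than under H0; the function names and parameter values are illustrative choices, not taken from the full article.

```python
import math


def log_normal_pdf(x, mean, std):
    """Log density of a normal distribution at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def log_likelihood_ratio(x, mean0, mean1, std):
    """log p(x | H1) - log p(x | H0) for two equal-variance Gaussians."""
    return log_normal_pdf(x, mean1, std) - log_normal_pdf(x, mean0, std)


def decide(x, mean0, mean1, std, log_threshold=0.0):
    """Return 1 (choose H1) if the LLR exceeds the threshold, else 0."""
    llr = log_likelihood_ratio(x, mean0, mean1, std)
    return 1 if llr > log_threshold else 0


def upper_tail(x, mean, std):
    """P(X > x) for X ~ Normal(mean, std)."""
    return 0.5 * math.erfc((x - mean) / (std * math.sqrt(2.0)))


def roc_point(x_cut, mean0, mean1, std):
    """(false-alarm rate, detection rate) when deciding H1 for x > x_cut.

    Assumes mean1 > mean0, so thresholding the LLR is equivalent to
    thresholding x itself.
    """
    return upper_tail(x_cut, mean0, std), upper_tail(x_cut, mean1, std)


# Hypothetical setup: H0 has mean 0, H1 has mean 2, both with std 1.
for x_cut in (0.0, 1.0, 2.0):
    p_fa, p_d = roc_point(x_cut, mean0=0.0, mean1=2.0, std=1.0)
    print(f"cut={x_cut:.1f}  P_FA={p_fa:.3f}  P_D={p_d:.3f}")
```

Sweeping the cut point trades false alarms against detections, which is what the ROC curve plots. The Bayes risk and Neyman-Pearson criteria are two different ways of picking one threshold along that curve.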

**Content**

Beyond the introduction, the full article has the following sections, each with many interesting topics:

- Probability and Statistics
- Decision Theory
- Estimation Theory
- Coordinate Systems
- Linear Transformations
- Effects of Computation on Data
- Prototype Coding / Programming
- Graph Theory
- Algorithms
- Machine Learning
