Subscribe to DSC Newsletter

The alphabet of data science...

  • Artificial Intelligence:: AI is the capability of a machine to imitate intelligent human behavior. BMW, Tesla, Google are using AI for self-driving cars. AI should be used to solve real world tough problems like climate modeling to disease analysis and betterment of humanity.
  • Boosting and Bagging: it is the technique used to generate more accurate models by ensembling multiple models together
  • Crisp-DM: is the cross industry standard process for data mining.  It was developed by a consortium of companies like SPSS, Teradata, Daimler and NCR Corporation in 1997 to bring the order in developing analytics models. Major 6 steps involved are business understanding, data understanding, data preparation, modeling, evaluation and deployment.
  • Data preparation: In analytics deployments more than 60% time is spent on data preparation. As a normal rule is garbage in garbage out. Hence it is important to cleanse and normalize the data and make it available for consumption by model.
  • Ensembling: is the technique of combining two or more algorithms to get more robust predictions. It is like combining all the marks we obtain in exams to arrive at final overall score. Random Forest is one such example combining multiple decision trees.
  • Feature selection: Simply put this means selecting only those feature or variables from the data which really makes sense and remove non relevant variables. This uplifts the model accuracy.
  • Gini Coefficient: it is used to measure the predictive power of the model typically used in credit 

Read the whole alphabet here

Views: 128

Comment

You need to be a member of AnalyticBridge to add comments!

Join AnalyticBridge

On Data Science Central

© 2019   AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC   Powered by

Badges  |  Report an Issue  |  Privacy Policy  |  Terms of Service