A Data Science Central Community
New featured content for data scientists:
Data Science in Python: Pandas Cheat Sheet -- This cheat sheet, along with explanations, was first published on DataCamp. Click on the picture to zoom in. To view other cheat sheets (Python, R, Machine Learning, Probability, Visualizations, Deep Learning, Data Science, and so on) click here. To read the article, click here.
Will Trump Kill Statistician's Jobs? -- Today Trump met with leaders of pharmaceutical companies to discuss “astronomical” drug prices and reducing regulations, so that drug companies can still make hefty profits while charging less for drugs. The motivation could be to keep healthcare costs down in order to facilitate the elimination of Obamacare. But how do you achieve such a goal? Someone somewhere has to be the loser in that game. Read article.
How to Handle Outliers in Regression Problems -- In this article, we discuss a general framework to drastically reduce the influence of outliers in most contexts. It applies to problems such as clustering (finding centroids), regression, measuring correlation or R-Squared, and many more. Read article.
26 Great Articles and Tutorials about Regression Analysis -- This resource is part of a series on specific topics related to data science: regression, clustering, neural networks, deep learning, decision trees, ensembles, correlation, outliers, regression, Python, R, TensorFlow, SVM, data reduction, feature selection, experimental design, cross-validation, model fitting, and many more. To access this resource, click here.
How to Keep Your R Code Simple While Tackling Big Datasets -- Upcoming DSC Webinar. R, TERR, Spark and Python are tools that benefit from larger systems. Software-Defined Servers enable data scientists to size their processing system to the size of a particular data problem. In this Data Science Central webinar you will learn how Software-Defined Servers work in practice for several common data science tools and will explore how removing core and memory constraints has multiple, profound and positive implications for application developers tackling big data problems of all kinds. Register here.