A Data Science Central Community

As the R programming language grows more popular among data scientists, researchers, and companies, I will be writing a series of posts on learning data science using R. The tutorial course will cover R data types, data handling in R, probability theory, machine learning, supervised and unsupervised learning, data visualization using R, etc. Before going further, let’s just see some stats and tidbits on data science and…

Added by suresh kumar Gorakala on December 29, 2015 at 9:30am — 1 Comment

Continuing the series on implementing recommendation engines: in my previous blog post about recommendation systems in R, I explained…

Added by suresh kumar Gorakala on November 23, 2015 at 7:04pm — No Comments

Recently I came across the term CRISP-DM, a data mining standard. Although this process is not new, I feel every analyst should know about commonly used industry-wide processes. In this post I will explain the different phases involved in creating a data mining solution. **CRISP-DM**, an acronym for **Cross Industry Standard Process for Data Mining**, is a data mining process model that includes commonly used approaches that data…

Added by suresh kumar Gorakala on October 22, 2015 at 10:59am — 2 Comments

In my previous blog post I explained linear regression. In today’s post I will explain logistic regression.

Consider a scenario where we need to predict a patient’s medical condition (HBP), either HAVE HIGH BP or NO HIGH BP, based on some observed symptoms: age, weight, Issmoking, systolic value, diastolic value, race, etc. In this scenario we have to…

Added by suresh kumar Gorakala on October 7, 2015 at 9:33pm — No Comments

R is becoming a popular programming language in the area of data science. Integrating R scripts with web UI pages is a challenge many application developers face. In this blog post I will explain how to expose an R script as an API using rApache and the Apache web server.

rApache is a project supporting web application…

Added by suresh kumar Gorakala on April 20, 2015 at 3:30am — No Comments

A prediction problem is a business problem that involves predicting future events by extracting patterns from historical data. Prediction problems are solved using statistical techniques, mathematical models, or machine learning techniques.

For example: forecasting a stock price for the next week, predicting which football team will win the World Cup, etc.

Added by suresh kumar Gorakala on December 26, 2014 at 12:43pm — 4 Comments

In my last post I explained MSE. Today I will explain the bias-variance trade-off and the precision-recall trade-off in assessing model accuracy.

Variance refers to the amount by which the estimated output (f) would change if we estimated it using a different training dataset. Since the training data is used to fit the statistical learning method, different training sets will…

Added by suresh kumar Gorakala on August 5, 2014 at 6:24am — No Comments

Recently I started reading the book "An Introduction to Statistical Learning", which has a good introduction to assessing model accuracy. This post contains excerpts from the chapter:

Often we take different statistical approaches to build a solution for a data analytical problem. Why is it necessary to introduce so many…

Added by suresh kumar Gorakala on August 5, 2014 at 6:00am — No Comments

In our day-to-day lives we come across many recommendation engines, such as Facebook’s recommendation engine for friend suggestions and similar Pages, and YouTube’s recommendation engine, which suggests videos similar to our previous searches and preferences. In today’s blog post I will explain how to build a basic recommender system.…

Added by suresh kumar Gorakala on June 5, 2014 at 10:55pm — No Comments

In today’s blog post we shall look at time series analysis using the R package forecast. The objective of the post is to explain the different methods available in the forecast package that can be applied to time series analysis and forecasting.

A time series is a collection of observations of well-defined data items…

Added by suresh kumar Gorakala on May 9, 2014 at 2:23am — 1 Comment

Ever since I started working with R, I have wondered how I could present the results of my statistical models as web applications. After doing some research on the internet, I came across Shiny, a new package from RStudio that can be used to develop interactive web applications with R.

Before going into how to build web apps using R, let me give an overview of Shiny.

**Features:**

- No JavaScript/HTML knowledge…

Added by suresh kumar Gorakala on March 23, 2014 at 1:30am — No Comments

In my previous blog post I explained the steps needed to solve a data analysis problem. Going further, I will discuss each step of data analysis in detail. In this post we shall discuss exploratory analysis.

**What is Exploratory Analysis?**

*“Understanding data…*

Added by suresh kumar Gorakala on March 6, 2014 at 11:09pm — 2 Comments