In the series of implementing Recommendation engines, in my previous blog about recommendation system in R, I have explained about…
A Data Science Central Community
We all know that deep learning algorithms improve the accuracy of AI applications to great extent. But this accuracy comes with requiring heavy computational processing units such as GPU for developing deep learning models. Many of the machine learning developers cannot afford GPU as they are very costly and find this as a roadblock for learning and developing Deep learning applications. To help the AI, machine learning developers Google has released…
Added by suresh kumar Gorakala on October 1, 2018 at 9:07am — No Comments
In this post, we learn about building a basic search engine or document retrieval system using Vector space model. This use case is widely used in information retrieval systems. Given a set of documents and search term(s)/query we need to retrieve relevant documents that are similar to the search query.
The problem statement explained above is represented as in below image. …Continue
Added by suresh kumar Gorakala on November 7, 2017 at 6:30am — No Comments
As R programming language becoming popular more and more among data science group, industries, researchers, companies embracing R, going forward I will be writing posts on learning Data science using R. The tutorial course will include topics on data types of R, handling data using R, probability theory, Machine Learning, Supervised – unSupervised learning, Data Visualization using R, etc. Before going further, let’s just see some stats and tidbits on data science and…Continue
Added by suresh kumar Gorakala on November 23, 2015 at 7:04pm — No Comments
Recently I have come across a term, CRISP-DM - a data mining standard. Though this process is not a new one but I felt every analyst should know about commonly used Industry wide process. In this post I will explain about different phases involved in creating a data mining solution.
CRISP-DM, an acronym for Cross Industry Standard Process for Data Mining, is a data mining process model that includes commonly used approaches that data…
Added by suresh kumar Gorakala on October 7, 2015 at 9:33pm — No Comments
Added by suresh kumar Gorakala on April 20, 2015 at 3:30am — No Comments
A business problem which involves predicting future events by extracting patterns in the historical data. Prediction problems are solved using Statistical techniques, mathematical models or machine learning techniques.
For example: Forecasting stock price for the next week, predicting which football team wins the world cup, etc.
In my last post, I have explained about MSE, today I will explain the variance & bias trade-off, Precision recall trade-off while assessing the model accuracy.
Variance refers to the amount by which the estimated output (f) would change if we estimated it (f) using a different training dataset. Since the training data is used to fit the statistical learning method, different training sets will…
Added by suresh kumar Gorakala on August 5, 2014 at 6:24am — No Comments
Recently, I have started reading a book "Introduction to statistical Learning", which had good introduction for model accuracy assessing. This post contains excerpts of the chapter:
Often we take different statistical approaches to build a solution for a data analytical problem. Why is it necessary to introduce so many…
Added by suresh kumar Gorakala on August 5, 2014 at 6:00am — No Comments
In our day to day life, we come across a large number of Recommendation engines like Facebook Recommendation Engine for Friends’ suggestions, and suggestions of similar Like Pages, Youtube recommendation engine suggesting videos similar to our previous searches/preferences. In today’s blog post I will explain how to build a basic recommender System.…
Added by suresh kumar Gorakala on June 5, 2014 at 10:55pm — No Comments
In today’s blog post, we shall look into time series analysis using R package – forecast. Objective of the post will be explaining the different methods available in forecast package which can be applied while dealing with time series analysis/forecasting.
A time series is a collection of observations of well-defined data items…Continue
Ever since I’ve started working on R , I always wondered how I can present the results of my statistical models as web applications. After doing some research over the internet I’ve come across ShinyR – a new package
from RStudio which can be used to develop interactive web applications with R.
Before going into how to build web apps using R, let me give you some overview about ShinyR.
Added by suresh kumar Gorakala on March 23, 2014 at 1:30am — No Comments
In my previous blog post I have explained the steps needed to solve a data analysis problem. Going further, I will be discussing in-detail each and every step of Data Analysis. In this post, we shall discuss about exploratory Analysis.
What is Exploratory Analysis?