As part of my series on implementing recommendation engines, my previous post on recommendation systems in R explained…
A Data Science Central Community
I would like to know how important it is to understand the underlying data distributions in a dataset before applying any machine learning algorithm, whether for prediction or classification…
Need advice. Dear Community, I have a situation where I need to classify items into groups (let's say 6). When I ran k-means, 90% of my data fell into one group and the remaining 10% fell into the other groups. What's…
In this post, we learn how to build a basic search engine, or document retrieval system, using the vector space model. This approach is widely used in information retrieval systems. Given a set of documents and a search query of one or more terms, we need to retrieve the documents that are most similar to the query.
The problem statement above is represented in the image below. …
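As a rough illustration of the vector space idea described in this excerpt, here is a minimal Python sketch. It is not the code from the original post: the TF-IDF weighting (with a smoothed `log(N/df) + 1` term), the whitespace tokenizer, and the names `build_index`, `tfidf_vector`, `cosine`, and `search` are all assumptions chosen for illustration. Documents and the query are mapped to weighted term vectors, and documents are ranked by cosine similarity to the query.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is enough for a sketch.
    return text.lower().split()


def tfidf_vector(tokens, idf):
    # Sparse vector as {term: tf * idf}; terms unseen in the corpus are dropped.
    counts = Counter(tokens)
    return {t: c * idf[t] for t, c in counts.items() if t in idf}


def build_index(docs):
    # Compute document frequencies, then one TF-IDF vector per document.
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(n_docs / c) + 1.0 for t, c in df.items()}
    vectors = [tfidf_vector(toks, idf) for toks in tokenized]
    return idf, vectors


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs, idf, vectors):
    # Rank documents by similarity to the query; drop non-matching documents.
    q = tfidf_vector(tokenize(query), idf)
    scored = [(cosine(q, v), i) for i, v in enumerate(vectors)]
    ranked = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [(docs[i], score) for score, i in ranked if score > 0]


# Hypothetical toy corpus, used only to exercise the functions above.
DOCS = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell sharply",
]
IDF, VECTORS = build_index(DOCS)

for doc, score in search("cat mat", DOCS, IDF, VECTORS):
    print(f"{score:.3f}  {doc}")
```

A real system would typically add stop-word removal, stemming, and an inverted index so that only documents sharing a term with the query are scored.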
As the R programming language becomes more and more popular among data scientists, and as industries, researchers, and companies embrace it, I will be writing posts on learning data science using R. The tutorial course will cover R data types, handling data in R, probability theory, machine learning, supervised and unsupervised learning, data visualization using R, and more. Before going further, let's look at some stats and tidbits on data science and…
Recently I came across the term CRISP-DM, a data mining standard. Although this process is not new, I feel every analyst should know about commonly used industry-wide processes. In this post I will explain the different phases involved in creating a data mining solution.
CRISP-DM, an acronym for Cross Industry Standard Process for Data Mining, is a data mining process model that includes commonly used approaches that data…