A Data Science Central Community
In this post, we learn how to build a basic search engine, or document retrieval system, using the vector space model. This approach is widely used in information retrieval systems. Given a set of documents and a search query, we need to retrieve the documents that are most similar to the query.
The problem statement above is illustrated in the image below. …Continue
Added by suresh kumar Gorakala on November 7, 2017 at 6:30am — No Comments
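As a rough illustration of the vector space idea in the teaser above, here is a minimal sketch in plain Python. It assumes whitespace tokenization, a smoothed TF-IDF weighting, and cosine similarity for ranking; the function names (`tokenize`, `build_idf`, `search`) and the sample documents are hypothetical and are not taken from the original post.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split is enough for a toy example.
    return text.lower().split()


def build_idf(docs_tokens):
    # Smoothed inverse document frequency for every term in the corpus.
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf(tokens, idf):
    # Sparse TF-IDF vector as a dict; terms unknown to the corpus are dropped.
    tf = Counter(tokens)
    return {t: count * idf[t] for t, count in tf.items() if t in idf}


def cosine(a, b):
    # Cosine similarity between two sparse vectors.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, docs):
    # Returns (score, document index) pairs, best match first.
    docs_tokens = [tokenize(d) for d in docs]
    idf = build_idf(docs_tokens)
    doc_vecs = [tfidf(tokens, idf) for tokens in docs_tokens]
    query_vec = tfidf(tokenize(query), idf)
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(doc_vecs)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
results = search("cat mat", docs)
```

Each document and the query become a weighted term vector, and documents are ranked by the angle between their vector and the query vector. A real system would add stemming, stop-word removal, and an inverted index.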
Cross-row and group computations often involve calculating the link relative ratio and the year-on-year comparison. The link relative ratio compares the current period's data with the data of the previous period. The time interval is generally a month. For example, comparing the sales amount of April with that of March gives a growth rate, which is the link relative ratio of April. Hour, day, week and quarter can also be used as the time…Continue
Added by Jessica May on September 22, 2014 at 2:00am — No Comments
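The two comparisons described in the teaser above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a list of period totals in chronological order, the growth rate is taken as (current - base) / base, and the function names and the `periods_per_year` parameter are hypothetical.

```python
def link_relative_ratio(values):
    # Growth rate of each period against the immediately preceding period.
    # The first period has no predecessor, so its ratio is None.
    if not values:
        return []
    ratios = [None]
    for prev, cur in zip(values, values[1:]):
        ratios.append(None if prev == 0 else (cur - prev) / prev)
    return ratios


def year_on_year(values, periods_per_year=12):
    # Growth rate of each period against the same period one year earlier.
    ratios = []
    for i, cur in enumerate(values):
        if i < periods_per_year or values[i - periods_per_year] == 0:
            ratios.append(None)
        else:
            base = values[i - periods_per_year]
            ratios.append((cur - base) / base)
    return ratios


# Hypothetical monthly sales: March = 100, April = 110, May = 99.
monthly = link_relative_ratio([100, 110, 99])

# Hypothetical quarterly sales over five quarters.
quarterly = year_on_year([100, 100, 100, 100, 120], periods_per_year=4)
```

With monthly data, `link_relative_ratio` compares April with March, while `year_on_year` with `periods_per_year=12` compares April with April of the previous year.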
It is common to use the R language to group and summarize data from files. Sometimes the source file is comparatively big while the computed result is small, and the file cannot be loaded into memory as a whole for computation. The only workable solution is to import the data in batches, compute each batch, and then merge the results. We'll use the following example to illustrate how the R language groups and summarizes data from a big text file.
Here is a file,…Continue
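The batch import, compute, and merge pattern described above is not specific to R. The sketch below shows the same idea in Python rather than R, purely as an illustration; it is not the code from the original post. The tab-separated layout, the column names `client` and `amount`, and the function name are assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from itertools import islice


def grouped_sum_in_batches(lines, key_col, value_col, batch_size=2, delimiter="\t"):
    # Read rows lazily so only one batch is held in memory at a time.
    reader = csv.DictReader(lines, delimiter=delimiter)
    totals = defaultdict(float)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            break
        # Step 1: group and summarize the current batch.
        partial = defaultdict(float)
        for row in batch:
            partial[row[key_col]] += float(row[value_col])
        # Step 2: merge the batch result into the running result.
        for key, subtotal in partial.items():
            totals[key] += subtotal
    return dict(totals)


# Hypothetical sample standing in for a large text file.
sample = io.StringIO("client\tamount\nA\t10\nB\t5\nA\t2.5\nB\t1\nC\t7\n")
result = grouped_sum_in_batches(sample, "client", "amount", batch_size=2)
```

For a real file, an open file handle can be passed in place of the `io.StringIO` object. This works because a sum can be merged from partial sums; an aggregate such as a median would need a different merge step.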
Both esProc and R language are typical data processing and analysis languages with two-dimension…Continue
Added by Jessica May on July 21, 2014 at 1:36am — No Comments
Added by Joey Fitts on August 13, 2009 at 9:37pm — No Comments