A Data Science Central Community
Some text files are too big to be loaded into memory in their entirety. However, if the data has been sorted by a certain column and is imported in groups according to that column, each group can be loaded into memory for computing. Such files include the call detail records of a telecom company, visitor statistics for a website, member information for a shopping mall, and so on.
A great deal of complicated code, which is difficult to maintain, is…Continue
Added by Lynn Guo on December 15, 2014 at 6:24pm — No Comments
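The group-wise import described in the excerpt above can be sketched in a few lines. This is a minimal illustration in Python, not the esProc code the post goes on to discuss. It assumes a tab-delimited file already sorted by its first column; the function name `iter_groups` and the sample call records are hypothetical.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def iter_groups(lines, key_index=0, delimiter="\t"):
    """Yield (key, rows) for each run of consecutive rows that share a key.

    Only one group is held in memory at a time, so the input must already
    be sorted (or at least clustered) by the key column.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    for key, rows in groupby(reader, key=itemgetter(key_index)):
        yield key, list(rows)


# Hypothetical call detail records: caller, call duration in seconds.
# A real file would be opened with open(path, newline="") instead.
sample = io.StringIO(
    "alice\t30\n"
    "alice\t45\n"
    "bob\t10\n"
    "carol\t5\n"
    "carol\t20\n"
)

# Total duration per caller, computed one group at a time.
totals = {
    key: sum(int(row[1]) for row in rows)
    for key, rows in iter_groups(sample)
}
```

Because `groupby` only merges adjacent rows, an unsorted file would yield the same key more than once; the sort order the excerpt mentions is what makes the streaming approach valid.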
When developing database applications, we often need to perform computations on the grouped data within each group. For example: list the names of the students who have published papers in each of the past three years; compile statistics on the employees who have taken part in all previous training sessions; select the three days on which each client got the highest scores in a golf game; and the like. To perform these computations, SQL needs multi-layered nesting, which will…Continue
Added by Lynn Guo on December 3, 2014 at 6:30pm — No Comments
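Two of the intra-group computations listed in the excerpt above can be sketched outside SQL as follows. This is an illustrative Python sketch under assumed data shapes, not the approach the post itself presents; the function names, tuple layouts, and sample values are all hypothetical.

```python
import heapq
from collections import defaultdict


def published_every_year(papers, years):
    """Return students who have at least one paper in every given year.

    `papers` is an iterable of (student, year) pairs (assumed layout).
    """
    years_by_student = defaultdict(set)
    for student, year in papers:
        years_by_student[student].add(year)
    required = set(years)
    return sorted(s for s, ys in years_by_student.items() if required <= ys)


def top_days(scores, n=3):
    """Return, per client, the n days with the highest scores.

    `scores` is an iterable of (client, day, score) triples (assumed layout).
    """
    by_client = defaultdict(list)
    for client, day, score in scores:
        by_client[client].append((score, day))
    return {
        client: [day for _, day in heapq.nlargest(n, pairs)]
        for client, pairs in by_client.items()
    }


# Hypothetical sample data.
papers = [
    ("Ann", 2012), ("Ann", 2013), ("Ann", 2014),
    ("Ben", 2013), ("Ben", 2014),
    ("Cy", 2012), ("Cy", 2013), ("Cy", 2014), ("Cy", 2014),
]
every_year = published_every_year(papers, [2012, 2013, 2014])

scores = [("X", "d1", 70), ("X", "d2", 90), ("X", "d3", 80), ("X", "d4", 60)]
best = top_days(scores)
```

In both functions the data is first split into groups, and then an ordinary set or ranking operation runs inside each group, which is the step that forces nested subqueries in SQL.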
esProc can help Java handle various computations when processing structured text. But when a record spans more than one row, the data must be preprocessed before esProc can perform computations on it.
Let’s look at this through an example. The text file Social.txt contains the access records of a website, in which every three rows correspond to one record. The records must be rearranged before other computations can be performed. They should be imported in the form…Continue
Added by Lynn Guo on November 4, 2014 at 8:30pm — No Comments
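The preprocessing step in the excerpt above, where every three rows make up one record, can be illustrated with a short sketch. It is written in Python rather than esProc, and since the excerpt does not show the layout of Social.txt, the sample lines and the function name `records_of` are hypothetical.

```python
from itertools import islice


def records_of(lines, rows_per_record=3):
    """Group consecutive lines into fixed-size records.

    Yields one tuple per record. Raises ValueError if the input ends
    in the middle of a record.
    """
    stripped = (line.rstrip("\n") for line in lines)
    while True:
        chunk = list(islice(stripped, rows_per_record))
        if not chunk:
            return
        if len(chunk) < rows_per_record:
            raise ValueError(
                f"incomplete record: expected {rows_per_record} rows, "
                f"got {len(chunk)}"
            )
        yield tuple(chunk)


# Hypothetical stand-in for the file's contents: two three-row records.
# With a real file: records_of(open("Social.txt", encoding="utf-8"))
sample_lines = [
    "user1\n", "2014-11-01 08:00\n", "/home\n",
    "user2\n", "2014-11-01 08:05\n", "/search\n",
]
records = list(records_of(sample_lines))
```

Once each record is a single tuple, it can be filtered, sorted, or grouped like any single-row record.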