All Blog Posts Tagged 'text' (30)

Semantic Roles according to Word2Vec

Figure 1. Scatter plot of word embedding coordinates (coordinate #3 vs. coordinate #10). You can see that semantically related words are close to each other.

This blog post is an extract from chapter 6 of the book “From Words to Wisdom. An Introduction to Text Mining…

Added by Rosaria Silipo on May 7, 2018 at 12:00am — No Comments

Information Retrieval Document Search Engine in R

Introduction:

In this post, we learn how to build a basic search engine, or document retrieval system, using the vector space model. This approach is widely used in information retrieval systems. Given a set of documents and a search query of one or more terms, we need to retrieve the documents that are most similar to the query.
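As a rough illustration of the vector space model, here is a minimal Python sketch (not the R code from the original post). It assumes whitespace tokenization, a smoothed TF-IDF weighting, and cosine similarity for ranking; the function names `build_index` and `search` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase + whitespace split; real systems also stem and drop stop words.
    return text.lower().split()


def tfidf(tokens, idf):
    # Weight each known term by (term count) * (inverse document frequency).
    counts = Counter(tokens)
    return {term: n * idf[term] for term, n in counts.items() if term in idf}


def build_index(docs):
    # Returns the IDF table and one sparse TF-IDF vector per document.
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log(len(docs) / df) + 1.0 for t, df in doc_freq.items()}
    return idf, [tfidf(toks, idf) for toks in tokenized]


def cosine(a, b):
    # Cosine similarity between two sparse vectors stored as dicts.
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, docs, top_k=3):
    # Rank documents by similarity to the query; drop documents with no overlap.
    idf, vectors = build_index(docs)
    q = tfidf(tokenize(query), idf)
    scored = sorted(((cosine(q, v), i) for i, v in enumerate(vectors)), reverse=True)
    return [(i, score) for score, i in scored[:top_k] if score > 0]


# Hypothetical sample corpus for illustration only.
docs = [
    "the cat sat on the mat",
    "dogs chase cats in the park",
    "stock markets fell sharply today",
]
for doc_id, score in search("cat mat", docs):
    print(doc_id, round(score, 3), docs[doc_id])
```

Storing each vector as a dictionary keeps the sketch short; a larger corpus would normally call for a sparse matrix and an inverted index.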

Problem statement:

The problem statement described above is illustrated in the image below. …

Added by suresh kumar Gorakala on November 7, 2017 at 6:30am — No Comments

New book on social media analytics

I am pleased to announce my new book on social media analytics. The book offers concepts, tools, tutorials, and case studies to understand and analyze the seven layers of social media data, including text, actions, networks, apps, hyperlinks, search engine, and location…

Added by Dr. Gohar Khan on July 6, 2015 at 7:14pm — No Comments

esProc Helps Process Structured Text in Java - Import data into the database

When importing structured text files into a database using Java alone, we have to assemble the SQL statements manually and handle various troublesome situations: whether to update or insert when the data already exists in the table, whether certain fields are included in the file, and whether the fields in the file are consistent with those in the table.

As esProc participates in Java programming, these problems can be solved…

Added by Lynn Guo on December 22, 2014 at 7:04pm — No Comments

esProc Helps Process Structured Texts in Java – Handle Big Files in Groups

Some text files are too big to be loaded into memory in their entirety. But if the data has been sorted by a certain column and is imported in groups according to that column, each group can fit in memory for computing. Such files include the call detail records of a telecom company, visitor statistics for a website, member information for a shopping mall, etc.
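The group-wise reading described above can be sketched in Python with `itertools.groupby`, used here as a stand-in and not as a description of how esProc works. The sketch assumes a tab-delimited file that is already sorted by the grouping column; the column positions, function names, and sample data are hypothetical.

```python
import csv
import io
from itertools import groupby


def iter_groups(lines, key_index=0, delimiter="\t"):
    # Stream rows and yield one (key, rows) pair per run of equal keys.
    # Only the current group is held in memory, so the input must be sorted by key.
    reader = (row for row in csv.reader(lines, delimiter=delimiter) if row)
    for key, rows in groupby(reader, key=lambda row: row[key_index]):
        yield key, list(rows)


def total_per_group(lines, key_index=0, value_index=1, delimiter="\t"):
    # Example computation: sum a numeric column within each group.
    return {
        key: sum(float(row[value_index]) for row in rows)
        for key, rows in iter_groups(lines, key_index, delimiter)
    }


# Hypothetical call records: caller id, call duration in seconds.
sample = io.StringIO("alice\t30\nalice\t45\nbob\t10\n")
print(total_per_group(sample))
```

For a real file, pass an open file handle in place of the `io.StringIO` object, for example `open(path, newline="")`, so that rows are read lazily from disk.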

 

A great deal of complicated code, which is difficult to maintain, is…

Added by Lynn Guo on December 15, 2014 at 6:24pm — No Comments

esProc Helps Process Structured Texts in Java – Expression Computing

Because Java doesn’t directly support dynamic parsing of expressions in text files, the computation can only be realized by splitting strings manually and then writing a recursive program. The whole process requires a great amount of complicated code that is difficult to maintain. With the assistance of esProc, we can develop the computation in Java without writing that code manually. Let’s look at how esProc works through an example.

Here is a text…

Added by Lynn Guo on December 10, 2014 at 6:30pm — No Comments

Processing Structured Text in Java – Conditional Filtering

The following problems will arise if you perform conditional filtering on text files in Java alone:

1. The text file is not a database, so it cannot be accessed with SQL. The code needs to be modified whenever the filtering conditions change. Besides, if you want conditional filtering as flexible as that in SQL, you have to program the dynamic expression parsing and evaluation yourself, which results in a great amount of programming work.

2. Stepwise loading is required for the big files that…

Added by Lynn Guo on November 23, 2014 at 6:00pm — No Comments

esProc Helps Process Structured Texts in Java – Set Operations

Java doesn’t support set operations directly, so nested loops have to be used to realize operations such as intersection, union, and complement between text files. If there are many text files, or the file to be computed is too big to be loaded into memory, or set operations must be performed on multiple fields, the code will become even more complicated. However, with the assistance of esProc, which supports set operations…

Added by Lynn Guo on November 13, 2014 at 6:00pm — No Comments

esProc Helps Process Structured Texts in Java – Set Operations

Java doesn’t support set operations directly, so nested loops have to be used to realize operations such as intersection, union, and complement between text files. If there are many text files, or the file to be computed is too big to be loaded into memory, or set operations must be performed on multiple fields, the code will become even more complicated. However, with the assistance of esProc, which supports set operations directly, Java can realize these…

Added by Lynn Guo on November 11, 2014 at 12:00am — No Comments

Another way to Process structured text in Java - Non-Single row records

esProc can help Java deal with various computations in processing structured texts. But in the case of non-single row records, it is necessary to preprocess the data before esProc can perform computations on it.

Let’s look at this through an example. The text file Social.txt holds the access records of a website, in which every three rows correspond to one record. The records should be rearranged first before other computations can be performed. They should be imported in the form…

Added by Lynn Guo on November 4, 2014 at 8:30pm — No Comments

A use case to read and analyze Excel data in Java

Generally, Java programmers use POI or other open source packages to read and compute Excel data. These open source packages work at a low level, which increases the overall learning cost and complicates the operation. But with the help of esProc, Java can avoid these problems.…

Added by Jessica May on October 8, 2014 at 12:26am — No Comments

Data alignment join in Java for easier text analytics

The join statements of the database can be used conveniently to perform the operation of alignment join. But sometimes the data is stored in the text files, and to compute it in Java alone we need to write a large number of loop statements. This makes the code cumbersome. Using esProc to help with programming in Java can solve the problem easily and quickly. Let’s look at how this works…

Added by Jessica May on September 28, 2014 at 8:00am — No Comments

How to Process Text Files in Data Analytics

Text files often bring headaches for data analysts. Is there a more convenient way to process them? I have prepared a case showing how esProc deals with them, including importing various text files, processing big text files, accessing text files on HDFS, and general operations such as moving and deleting files and checking whether a file exists. The following examples illustrate these functions. …

Added by Jessica May on July 14, 2014 at 1:30am — No Comments

Video Clips Expound Upon Predictive Analytics

Since February's launch of my book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, I have participated in a number of video interviews that explore the topic and field of predictive analytics. Here is a sampling:

Bloomberg TV – Predictive Analytics in Four Minutes:

…

Added by Eric Siegel on August 21, 2013 at 1:16pm — No Comments

Financial Times Review of “Predictive Analytics”: “Ignores Dark Overtones”

The Financial Times reviewed my book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.

Click here to read the full Financial Times review (free membership required).

Excerpt from the book review:

Book…

Added by Eric Siegel on August 15, 2013 at 9:22am — 2 Comments

The New Predictive Profession – Odd Yet Newly Legitimate

Here's a review of my book Predictive Analytics from Robert Nisbet, Ph.D., a leading consultant, author, and predictive analytics instructor at University of California – Irvine (posted here with his permission).

Review of Predictive Analytics – The Power to Predict Who Will Click, Buy, Lie, or Die, by Eric Siegel.

Robert Nisbet, Ph.D.

March 21, 2013

Predictions have a problem. They are viewed…

Added by Eric Siegel on August 12, 2013 at 12:16pm — 1 Comment

Book FAQ: Is the Book “Predictive Analytics” for Experts?

When you invest the time to read a book, you're investing a lot more than the $17 to buy it.

Many ask whether my book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die, is at the right level for their needs. Is it too advanced? (Quick answer: definitely not.) Will it instruct me on how to execute on predictive analytics? (Not directly – it is an industry primer rather than a…

Added by Eric Siegel on July 17, 2013 at 11:45am — No Comments

Predicting Lying and Predicting Dying

Who benefits by predicting your behavior? Organizations do—companies, governments, hospitals, and political campaigns. They employ predictive analytics, technology that learns from data to render per-person predictions, one individual at a time.

 

People have been struck by the final words in the title of my new book on this subject, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die (…

Added by Eric Siegel on April 4, 2013 at 1:53pm — No Comments

WSJ: HP Piloted Program to Predict Which Workers Would Quit

Joel Schectman at the Wall Street Journal wrote about a story broken in my new book, Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die.

Wall Street Journal Article:

Book: HP Piloted Program to Predict Which Workers Would Quit

Joel Schectman, Wall Street Journal

Hewlett Packard Co. tested a predictive scoring system that attempted…

Added by Eric Siegel on March 20, 2013 at 2:19pm — No Comments

Get "Predictive Analytics" - the Book - and Enjoy Free Online Training

Predictive Analytics: GET THE BOOK AND RECEIVE FREE ONLINE TRAINING

April 3rd is "Predictive Analytics" Day - not the science, the book! To build awareness of Eric Siegel's new, acclaimed book, "Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die" (published by Wiley Feb. 19), we're providing an offer ya can't refuse.

ORDER THE BOOK ON APRIL 3 VIA AMAZON (under $15) FOR:

1. Free access to the first of four modules of the…

Added by Eric Siegel on March 12, 2013 at 11:04am — No Comments

© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC