Discovering Content by Mining the Entity Web

November 18th, 2009 by Deep Dhillon

I had a blast last night presenting to CS students at the University of Washington. For those who missed the talk, the video is embedded below.

Abstract: Unstructured natural language text found in blogs, news, and other web content is rich with semantic relations linking entities (people, places, and things). At Evri, we are building a system that automatically reads web content much as humans do. The system can be thought of as an army of 7th grade grammar students armed with a really large dictionary. The dictionary, or knowledge base, consists of relatively static information mined from structured and semi-structured public repositories such as Wikipedia, Crunchbase, and Amazon. A highly distributed search and indexing infrastructure uses this knowledge base to perform deep linguistic analysis of many millions of documents, producing a large set of semantic relations expressed as clause-level subject-verb-object (SVO) triples. The resulting index is expressive, precise, and scalable, and it makes possible a new generation of content discovery applications.
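To make the idea of an index over SVO relations concrete, here is a minimal sketch in Python. It is not Evri's implementation: the tiny `KNOWLEDGE_BASE`, the `resolve` helper, and the `RelationIndex` class are all hypothetical names invented for illustration, and the triples are assumed to have already been extracted by some upstream parser, which is the hard part the talk covers.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    verb: str
    obj: str
    doc_id: str


# Hypothetical stand-in for the knowledge base: maps a lowercased surface
# form to a canonical entity id. A real one would be mined from sources
# such as Wikipedia; these entries are made up for the example.
KNOWLEDGE_BASE = {
    "barack obama": "person/barack_obama",
    "obama": "person/barack_obama",
    "seattle": "place/seattle",
    "amazon": "organization/amazon",
    "kindle": "product/kindle",
}


def resolve(mention: str) -> str:
    """Map a mention to its canonical entity id, or keep the normalized text."""
    key = mention.strip().lower()
    return KNOWLEDGE_BASE.get(key, key)


class RelationIndex:
    """Inverted index over (subject, verb, object) triples."""

    def __init__(self) -> None:
        self._triples: list[Triple] = []
        # One posting map per slot: slot value -> positions in self._triples.
        self._postings = {
            "subject": defaultdict(set),
            "verb": defaultdict(set),
            "obj": defaultdict(set),
        }

    def add(self, subject: str, verb: str, obj: str, doc_id: str) -> None:
        triple = Triple(resolve(subject), verb.lower(), resolve(obj), doc_id)
        position = len(self._triples)
        self._triples.append(triple)
        for slot, postings in self._postings.items():
            postings[getattr(triple, slot)].add(position)

    def query(
        self,
        subject: Optional[str] = None,
        verb: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> list[Triple]:
        """Return triples matching every slot that was given."""
        wanted = {
            "subject": resolve(subject) if subject else None,
            "verb": verb.lower() if verb else None,
            "obj": resolve(obj) if obj else None,
        }
        matches = set(range(len(self._triples)))
        for slot, value in wanted.items():
            if value is not None:
                matches &= self._postings[slot].get(value, set())
        return [self._triples[i] for i in sorted(matches)]
```

With an index like this, a query such as `index.query(subject="Obama")` returns relations stored under either "Obama" or "Barack Obama", because both mentions resolve to the same assumed entity id before indexing.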

The slides used in the video are available HERE. It is best to download them and follow along, as they are difficult to see in the video. The talk is split into six parts; Part 1 is shown below. Links to all parts: Part 1, Part 2, Part 3, Part 4, Part 5, and Part 6.

Full story at: http://blog.evri.com
