A Data Science Central Community
Many companies spend a lot of time and money collecting data on customer preferences and behavior. At the same time, a huge amount of data can be extracted directly from the Internet. Usually, to learn about some objects from the Internet, we use a two-stage process:
GoogMeter is a free and open source web comparator that measures Internet proximity between any Objects (Obj) and Properties (Prop). It shows the number of pages that web search engines find for combinations of words, taking into account their proximity in the text. The most difficult and important tasks are asking good questions and analyzing and interpreting the answers. A couple of examples:
[Three example images]

How does it work?

GoogMeter gets the number of found pages N(Obj, Prop) and creates contingency tables where rows correspond to Objects and columns to Properties. From the tables we calculate totals by rows, Tot(Obj), by columns, Tot(Prop), and overall, Tot, and then the empirical probabilities p(Obj) = Tot(Obj) / Tot and p(Prop) = Tot(Prop) / Tot. After that we obtain the expected number of pages E(Obj, Prop) = Tot * p(Obj) * p(Prop) and the indexes Ind(Obj, Prop) = 100 * N(Obj, Prop) / E(Obj, Prop). GoogMeter prints the numbers of found pages N(Obj, Prop) and the indexes Ind(Obj, Prop), and visualizes the table by plotting horizontal bars or bubbles that are colored green if the actual numbers are greater than expected and red in the opposite case. There are two variants for the bars' widths (or the bubbles' volumes):
The volumes of blue bubbles are proportional to the numbers of pages found; the volumes of red and green bubbles are proportional to the deviations of the found numbers from the expected ones. Either way, green means "Yes" and red means "No". Click and enjoy: GoogMeter is free and open source!
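The index calculation described above can be sketched in a few lines of Python. This is only an illustration of the formulas, not GoogMeter's actual code: the function names `googmeter_indexes` and `bar_color` are hypothetical, and the page counts in the example are made up.

```python
def googmeter_indexes(counts):
    """Return (expected, indexes) for a table of page counts.

    counts[i][j] is N(Obj_i, Prop_j): rows are Objects, columns are Properties.
    Hypothetical helper that follows the formulas in the text.
    """
    row_totals = [sum(row) for row in counts]          # Tot(Obj)
    col_totals = [sum(col) for col in zip(*counts)]    # Tot(Prop)
    total = sum(row_totals)                            # Tot
    if total == 0:
        raise ValueError("no pages found: indexes are undefined")

    expected, indexes = [], []
    for row, tot_obj in zip(counts, row_totals):
        exp_row, ind_row = [], []
        for n, tot_prop in zip(row, col_totals):
            # E = Tot * p(Obj) * p(Prop) = Tot(Obj) * Tot(Prop) / Tot
            e = tot_obj * tot_prop / total
            exp_row.append(e)
            # Ind = 100 * N / E; left undefined (None) when E is zero
            ind_row.append(100 * n / e if e else None)
        expected.append(exp_row)
        indexes.append(ind_row)
    return expected, indexes


def bar_color(n, e):
    """Green if the actual count exceeds the expected one, red otherwise."""
    return "green" if n > e else "red"


# Made-up counts: two Objects (rows) by two Properties (columns).
counts = [[120, 30],
          [80, 170]]
expected, indexes = googmeter_indexes(counts)
for row, exp_row, ind_row in zip(counts, expected, indexes):
    print([(n, ind, bar_color(n, e)) for n, e, ind in zip(row, exp_row, ind_row)])
```

An index above 100 means the combination appears on more pages than independence of Object and Property would predict, which corresponds to a green bar; an index below 100 corresponds to a red one.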
© 2019 AnalyticBridge.com is a subsidiary and dedicated channel of Data Science Central LLC