- Titles from hundreds of Greek blogs with economic and social content.
For the analysis of the new taxation plan, the goal was to identify clusters of common content and thereby find the most frequently occurring requests and comments from citizens:
Text mining and information extraction techniques were used to annotate each phrase and identify its sentiment. For example, as shown above, Cluster 5 (which contains 329 citizen messages) is about requests for a fair tax plan, while Cluster 10 contains messages requesting that tax fraud be minimized. Pairwise correlations between concepts were also found:
In Greek, 'dikigoros' and 'iatros' mean lawyer and doctor respectively. This correlation shows that the two professions are mentioned together in the same phrase most of the time. It should be noted that both professions also appear frequently in phrases and messages that discuss tax fraud.
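One way to quantify such a pairwise correlation is sketched below. The text does not say which measure was used, so the phi coefficient (Pearson correlation on presence/absence) is an assumption here, and the tokenized messages and the function name `phi_coefficient` are hypothetical, made up purely for illustration.

```python
from math import sqrt

def phi_coefficient(messages, term_a, term_b):
    """Phi correlation between the presence of two terms across messages.

    `messages` is an iterable of token sets, one set per message.
    Returns a value in [-1, 1]; 0.0 if either term never or always occurs.
    """
    n11 = n10 = n01 = n00 = 0
    for tokens in messages:
        has_a, has_b = term_a in tokens, term_b in tokens
        if has_a and has_b:
            n11 += 1          # both terms present
        elif has_a:
            n10 += 1          # only term_a present
        elif has_b:
            n01 += 1          # only term_b present
        else:
            n00 += 1          # neither term present
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

# Hypothetical, already-tokenized citizen messages (not real data).
messages = [
    {"dikigoros", "iatros", "foros"},
    {"dikigoros", "iatros"},
    {"foros", "misthos"},
    {"misthos"},
    {"dikigoros", "iatros", "forodiafygi"},
    {"foros"},
]

print(phi_coefficient(messages, "dikigoros", "iatros"))
print(phi_coefficient(messages, "dikigoros", "foros"))
```

A value close to 1 means the two terms almost always appear in the same messages, which is the pattern described for 'dikigoros' and 'iatros'.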
A very interesting application was the analysis of unstructured information found in Greek blogs and websites. The text was annotated with specific keywords and concepts such as Economy and Politicians.
For example, we can identify why Giorgos Papandreou (PM of Greece) is characterized negatively in blog posts, that is, which other concepts appear in blog posts containing the keywords 'Giorgos Papandreou' AND Bad Characterizations:
(Note: PASOK = the governing political party)
Politics = 120
Economy = 72
Economy, Politics = 40
PASOK = 24
Politics, PASOK, Referendum = 8
Economy, Politics, PASOK, Referendum, Immigrants = 8
Economy, Politics, Society = 8
Society, PASOK = 4
In other words, Giorgos Papandreou is criticized mainly for his political decisions and the economy, followed by criticism of PASOK. Negative sentiment also exists because a percentage of Greek citizens demand a referendum on the Greek government's latest decision to grant Greek citizenship to a large proportion of immigrants.
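A breakdown like the one above can be produced by filtering annotated posts and counting the remaining concept combinations. The following is a minimal sketch under stated assumptions: each post is represented only by its set of annotation labels, the sample posts are invented, and `co_occurring_concepts` is a hypothetical helper, not the tool actually used for this analysis.

```python
from collections import Counter

def co_occurring_concepts(posts, target, qualifier):
    """Count concept combinations in posts annotated with both labels.

    `posts` is an iterable of annotation sets, one per blog post.
    Only posts containing both `target` and `qualifier` are considered;
    the two filter labels themselves are excluded from the combinations.
    """
    counts = Counter()
    for annotations in posts:
        if target in annotations and qualifier in annotations:
            others = sorted(annotations - {target, qualifier})
            if others:  # skip posts that carry no other concept
                counts[", ".join(others)] += 1
    return counts

# Hypothetical annotated posts (illustrative only, not the real corpus).
posts = [
    {"Giorgos Papandreou", "Bad Characterization", "Politics"},
    {"Giorgos Papandreou", "Bad Characterization", "Politics"},
    {"Giorgos Papandreou", "Bad Characterization", "Economy", "Politics"},
    {"Giorgos Papandreou", "Economy"},          # no bad characterization
    {"Bad Characterization", "Society"},        # different subject
    {"Giorgos Papandreou", "Bad Characterization", "Economy"},
]

profile = co_occurring_concepts(posts, "Giorgos Papandreou", "Bad Characterization")
for combination, n in profile.most_common():
    print(f"{combination} = {n}")
```

Counting whole combinations rather than single concepts keeps entries such as "Economy, Politics" distinct from "Economy" alone, matching the layout of the list above.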