This article aims to predict the outcome of the 2010 Hungarian parliamentary election. The research uses several text mining and artificial intelligence techniques to estimate the likely outcome of the election as accurately as possible.
The research consists of two main parts: (1) the collection of textual data, differentiated by the source in which the text appeared (e.g. online news, social conversations such as Twitter); and (2) the processing of the collected data with the angel framework, a network-based natural language processing tool.
To determine which kinds of data to collect, we created an effect analysis framework in which we gathered the effects that can influence a voter. We also examined voters' choice criteria, because it is still not known which criterion is the most common in parliamentary elections.
In the results section, the 'possible results' produced by the different approaches are collected so that we can compare them and see which prediction was the most effective.
2. Effect analysis
Who will win?
(1) The party that is the most prominent in the media
(2) The party that evokes the most positive emotion in us
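The two candidate criteria above can be made concrete with a minimal sketch. It assumes that each mention of a party has already been extracted and given a sentiment score between -1 and 1 by some upstream tool; the function name `rank_parties`, the score range, and the sample input are illustrative assumptions, not part of the study.

```python
from collections import defaultdict


def rank_parties(mentions):
    """Return (most mentioned party, party with highest mean sentiment).

    mentions: iterable of (party, sentiment) pairs, where sentiment is an
    assumed score in [-1, 1] supplied by an upstream analysis step.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for party, sentiment in mentions:
        counts[party] += 1
        totals[party] += sentiment
    if not counts:
        raise ValueError("no mentions given")
    # Criterion (1): the party that appears most often.
    most_mentioned = max(counts, key=counts.get)
    # Criterion (2): the party with the highest average sentiment.
    most_positive = max(counts, key=lambda p: totals[p] / counts[p])
    return most_mentioned, most_positive


# Made-up placeholder input, not data from the study.
sample = [
    ("Party A", 0.2), ("Party A", -0.4), ("Party A", 0.1),
    ("Party B", 0.6), ("Party B", 0.5),
]
winner_by_media, winner_by_emotion = rank_parties(sample)
```

With this placeholder input the two criteria disagree, which is why the results of the different approaches need to be compared.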
4. Collection of textual data
4.1 Online search engines
The first and roughest estimate came from online search engines.
In this research we used Google Insights for Search, which collects the search phrases entered in the Google search engine within a specific time range. We inspected two time ranges: first the year 2009, then 2010.
However, the data we received did not reflect the popularity of the parties: each party received nearly as many hits as the others.
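One way to quantify "nearly as many hits" is to convert the hit counts of a time range into shares and look at the gap between the largest and the smallest share. The sketch below assumes the counts are available as a plain dictionary; the function names and the numbers are hypothetical and do not reproduce the data of the study.

```python
def search_shares(hits):
    """Convert raw hit counts per party into shares that sum to 1."""
    total = sum(hits.values())
    if total == 0:
        raise ValueError("no hits recorded")
    return {party: count / total for party, count in hits.items()}


def share_spread(shares):
    """Gap between the largest and smallest share; near 0 means no clear leader."""
    return max(shares.values()) - min(shares.values())


# Made-up placeholder counts for one time range, not data from the study.
hits_example = {"Party A": 50, "Party B": 48, "Party C": 52}
shares = search_shares(hits_example)
spread = share_spread(shares)
```

A small spread, as in this placeholder case, means that the search counts alone cannot separate the parties.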
4.2 People's interaction (communication)
We can follow the interaction among people by reading their online posts and comments. Nowadays the most relevant sources in this field are social networking sites and Twitter; in Hungary, however, Twitter is used far less than forums. Another place where people share opinions is the blogosphere. We chose the BlogPulse tool to analyse these interactions. We also made some effort to collect Twitter messages, but unfortunately only the last three weeks of conversation are available when searching for the parties' names.
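Once posts have been collected, searching them for the parties' names can be sketched as follows. This is a simplified illustration that assumes the posts are already available as plain strings; it does not call BlogPulse or Twitter, and the function name, party names, and sample posts are hypothetical.

```python
from collections import Counter


def count_party_mentions(posts, party_names):
    """Count, for each party name, the posts that contain it (case-insensitive)."""
    counts = Counter({name: 0 for name in party_names})
    for post in posts:
        text = post.casefold()
        for name in party_names:
            # Plain substring match; each post counts at most once per party.
            if name.casefold() in text:
                counts[name] += 1
    return counts


# Made-up placeholder posts, not messages collected in the study.
posts_example = [
    "I like Party A",
    "party a and Party B had a debate",
    "nothing relevant here",
]
mention_counts = count_party_mentions(posts_example, ["Party A", "Party B"])
```

Substring matching is the simplest choice here; it counts a post once per party regardless of how often the name is repeated.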