A new piece of software, running on a supercomputer, appears able to predict revolutions with striking accuracy by analyzing news stories about the region in question. The software, developed by Kalev Leetaru of the University of Illinois’ Institute for Computing in the Humanities, Arts and Social Science, was able to retroactively predict the recent unrest in Egypt.
By collecting and analyzing news stories from the U.S. Open Source Center, Britain’s BBC Monitoring, New York Times articles archived all the way back to 1945 and a variety of other sources, the software was able to detect a souring in tone matched only by the bombing of Iraqi troops in Kuwait in 1991 and the invasion of Iraq in 2003. While this drop did not by itself predict a revolution, such a sharp fall in sentiment, with no extreme outside influence to explain it, certainly suggests one.
In a way, it’s surprising that simply mining news articles could produce such valuable information. The mining process itself, while computationally intensive, is also pretty simple. In the Egypt example, the software selected articles that referred to specific Egyptian cities (rather than Egypt as a whole, to avoid passing references), took an inventory of words like “good” and “nice,” and weighed them against words like “horrendous” and “awful.” Although that sounds easy, the whole collection included over 100 million articles and 100 trillion relationships that needed to be parsed. For that reason, an 8.2-teraflop SGI Altix supercomputer was enlisted.
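The word-counting step described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not Leetaru’s actual code: the word lists, the city list, the scoring formula (positive minus negative words, divided by article length) and all function names are assumptions made for the example, and the real system worked from far larger dictionaries.

```python
import re

# Tiny illustrative word lists (assumed; the real dictionaries are much larger).
POSITIVE = {"good", "nice"}
NEGATIVE = {"horrendous", "awful"}

# Hypothetical set of single-word city names used to select relevant articles.
CITIES = {"cairo", "alexandria", "suez"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def mentions_city(tokens, cities=CITIES):
    """True if any token is one of the city names (not the country as a whole)."""
    return any(token in cities for token in tokens)


def tone(tokens):
    """Assumed score: (positive words - negative words) / total words."""
    if not tokens:
        return 0.0
    positive = sum(token in POSITIVE for token in tokens)
    negative = sum(token in NEGATIVE for token in tokens)
    return (positive - negative) / len(tokens)


def average_tone(articles):
    """Mean tone of the articles that mention a listed city, or None if none do."""
    scores = []
    for article in articles:
        tokens = tokenize(article)
        if mentions_city(tokens):
            scores.append(tone(tokens))
    return sum(scores) / len(scores) if scores else None


if __name__ == "__main__":
    sample = [
        "Protests in Cairo turned awful today",
        "Egypt had a good and nice year",  # skipped: names no city
        "Markets in Alexandria look good",
    ]
    print(average_tone(sample))
```

Tracking this average month by month is what would reveal a souring in tone. The sketch matches only single-word city names and ignores negation, so it is a starting point rather than a usable classifier.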
Predicting and monitoring civil unrest isn’t the only potential use for this kind of program. In another test, a similar analysis of articles was found to provide the approximate location of Osama bin Laden, using only articles available before he was found and killed. While the results were not specific enough to have justified a raid, they did highlight the area in which he was ultimately found.
Although the main application of this software is journalistic data mining, it has also been put to work parsing thousands of books that have been digitized by Google Books. The scan produced all sorts of quantified data that can hopefully be of use in the in-depth study of the evolution of the written word over hundreds of years.
Considering that the software has really only been put to work performing retroactive predictions as a sort of proof of concept, it has definitely displayed its chops. While no prediction is really foolproof, the data provided by this software could offer valuable, collective insight on unrest that is only starting to bubble up now. What people decide to do with that data is anyone’s guess.
(via Singularity Hub)