Gizem Korkmaz, Jose Cadena, Chris Kuhlman, Achla Marathe, Anil Vullikanti, Naren Ramakrishnan
Civil unrest events (protests, strikes, and “occupy” events) range from small, nonviolent protests that address specific issues to events that turn into large-scale riots. Detecting and forecasting these events is of key interest to social scientists and policy makers because they can lead to significant societal and cultural changes. We forecast civil unrest events in six Latin American countries on a daily basis, from November 2012 through August 2014, using multiple data sources that capture the social, political, and economic contexts within which civil unrest occurs. The models contain predictors extracted from social media sites (Twitter and blogs) and news sources, in addition to the volume of requests to Tor, a widely used anonymity network. Two political event databases and country-specific exchange rates are also used. Our forecasting models are evaluated against a Gold Standard Report compiled by an independent group of social scientists and subject matter experts. We use logistic regression models with Lasso to select a sparse feature set from our diverse datasets. The experimental results, measured by F1-scores, are in the range 0.68–0.95 and demonstrate the efficacy of a multi-source approach for predicting civil unrest. Case studies illustrate the insights into unrest events that our method provides. An ablation study demonstrates the relative value of each data source for prediction. We find that social media and news are more informative than the other data sources, including the political event databases, and that they improve prediction performance. However, social media also increases the variation in the performance metrics.
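To make the modeling step concrete, the following is a minimal sketch of Lasso-penalized logistic regression scored with F1. It is not the authors' implementation: the abstract does not name a solver, so proximal gradient descent is assumed here, and the synthetic data, the function names (`fit_lasso_logistic`, `make_synthetic_data`, and the rest), and the values of `penalty`, `step`, and `iterations` are all hypothetical. Each row stands in for one country-day of predictors and each label for whether an unrest event occurred.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_lasso_logistic(X, y, penalty=0.05, step=0.5, iterations=300):
    """Minimise mean log-loss + penalty * ||weights||_1 (assumed solver:
    proximal gradient descent). The bias term is left unpenalised."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(iterations):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            error = sigmoid(bias + sum(w * x for w, x in zip(weights, row))) - label
            grad_b += error
            for j in range(d):
                grad_w[j] += error * row[j]
        bias -= step * grad_b / n
        # Gradient step on the smooth loss, then shrinkage for the L1 term.
        weights = [soft_threshold(w - step * g / n, step * penalty)
                   for w, g in zip(weights, grad_w)]
    return weights, bias


def predict(X, weights, bias, cutoff=0.5):
    """Label a row 1 when its predicted probability reaches the cutoff."""
    return [1 if sigmoid(bias + sum(w * x for w, x in zip(weights, row))) >= cutoff
            else 0 for row in X]


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def make_synthetic_data(n=120, d=6, seed=0):
    """Hypothetical data: only the first two of d features drive the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if row[0] + row[1] > 0 else 0 for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic_data()
    weights, bias = fit_lasso_logistic(X, y)
    print("weights:", [round(w, 3) for w in weights])
    print("training F1:", round(f1_score(y, predict(X, weights, bias)), 3))
```

The L1 penalty is what produces a sparse feature set: the shrinkage step can set a weight to exactly zero, which drops that predictor from the model. A larger `penalty` removes more predictors.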
- Date of publication: January 19, 2016
- Journal: Social Network Analysis and Mining