Arnold Boedihardjo, Feng Chen, Chang-Tien Lu, Naren Ramakrishnan
This paper proposes three methods of association analysis that address two challenges of Big Data: capturing relatedness among real-world events in high data volumes, and modeling similar events that are described disparately under high data variability. The proposed methods take as input a set of geotemporally encoded text streams about violent events, called “storylines”. These storylines are associated for two purposes: to investigate whether an event could occur again, and to measure influence, i.e., how one event could help explain the occurrence of another. The first proposed method, Distance-based Bayesian Inference, uses spatial distance to relate similar events that are described differently, addressing the challenge of high variability. The second and third methods, Spatial Association Index and Spatio-logical Inference, measure the influence of storylines in different locations, addressing the challenge of high volume. Extensive experiments on social unrest in Mexico and wars in the Middle East showed that these methods can achieve precision and recall as high as 80% in retrieval tasks that use both keywords and geospatial information as search criteria. In addition, the experiments demonstrated high effectiveness in uncovering real-world storylines for exploratory analysis.
- Date of publication: October 1, 2016