Annotating Twitter Data from Vulnerable Populations: Evaluating Disagreements Between Domain Experts and Graduate Student Annotators

Desmond U. Patton, Philipp Blandfort, William R. Frey, Michael B. Gaskell, Svebor Karaman

In: Proceedings of the 52nd Hawaii International Conference on System Sciences (HICSS-2019), January 8-11, Grand Wailea, Hawaii, United States. IEEE Computer Society, 2019.


Researchers in computer science have spent considerable time developing methods to increase the accuracy and richness of annotations. However, there is a dearth of research that examines the positionality of the annotator, how they are trained, and what we can learn from disagreements between different groups of annotators. In this study, we use qualitative analysis together with statistical and computational methods to compare annotations between Chicago-based domain experts and graduate students who annotated a total of 1,851 tweets with images. The tweets are part of a larger corpus associated with the Chicago Gang Intervention Study, which aims to develop a computational system that detects aggression and loss among gang-involved youth in Chicago. We found evidence to support the study of disagreement between annotators and underscore the need for domain expertise when reviewing Twitter data from vulnerable populations. Implications for annotation and content moderation are discussed.

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence