Garbage in, garbage out? Do Machine Learning Application Papers in Social Computing Report Where Human-Labeled Training Data Comes From?

R. Stuart Geiger, Kevin Yu, Yanlai Yang, Mindy Dai, Jie Qiu, Rebekah Tang, and Jenny Huang

January 1, 2020

Abstract

In this paper, we investigate to what extent a sample of machine learning application papers in social computing (specifically, papers from ArXiv and traditional publications that perform an ML classification task on Twitter data) give specific details about whether best practices for human labeling of training data were followed.
