WSDM 2021

On the impact of predicate complexity in crowdsourced classification tasks

Jorge Ramirez 1 Marcos Baez 2 Fabio Casati 3 Luca Cernuzzi 4 Boualem Benatallah 5 Ekaterina A. Taran 6 Veronika A. Malanina 6
1University of Trento, Italy
2Université Claude Bernard Lyon 1, France
3Tomsk Polytechnic University, Russia
4DEI - Universidad Católica, Paraguay
5The University of New South Wales, Australia

This paper explores and offers guidance on a specific and relevant problem in task design for crowdsourcing: how to formulate a complex question used to classify a set of items. In micro-task markets, classification is still among the most popular tasks. We situate our work in the context of information retrieval and multi-predicate classification, i.e., classifying a set of items based on a set of conditions.

Our experiments cover a wide range of tasks and domains, and consider crowd workers both alone and in tandem with machine learning classifiers.

We provide empirical evidence of how different predicate formulation strategies affect the resulting classification performance, emphasizing the importance of predicate formulation as a task design dimension in crowdsourcing.