Utilizing Online, Crowdsourced Platforms for Large-N Data Collection in Voice Science: Prices, Pitfalls, and Perks
Objective: Online crowdsourced research services such as Amazon’s Mechanical Turk and CloudResearch Connect allow researchers to collect data faster, more cheaply, and on a larger scale than traditional samples, but concerns about data quality persist. This study outlines the costs, pitfalls, and benefits of crowdsourced research, with the goal of enhancing data validity and reliability.
Methods: Survey responses on vocal fatigue, personality traits, and communicative quality of life were collected from 495 Mechanical Turk users and 99 Connect users over six months, and identical data were gathered from 47 undergraduate students. Response quality was assessed using IP address/geocode matching, attention checks, response congruence on personality measures, language proficiency and cultural screening, survey duration, and open-ended questions. Data quality and sample characteristics were compared across platforms.
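To illustrate how screening criteria of this kind might be applied programmatically, the sketch below flags duplicate participant IDs, repeated IP addresses, implausibly fast completions, and failed attention checks. It is a minimal example, not the authors' actual pipeline; the column names (worker_id, ip_address, duration_sec, attention_passed) and the duration threshold are assumptions for illustration only.

```python
# Minimal sketch of automated response screening, assuming a pandas DataFrame
# with hypothetical columns: worker_id, ip_address, duration_sec, attention_passed.
import pandas as pd


def flag_low_quality(responses: pd.DataFrame, min_duration_sec: float = 180.0) -> pd.DataFrame:
    """Return a copy of the responses with boolean flag columns for common exclusion criteria."""
    flagged = responses.copy()

    # Duplicate worker IDs: the same participant submitting more than once.
    flagged["dup_id"] = flagged.duplicated(subset="worker_id", keep=False)

    # Repeated IP addresses: possible duplicate respondents or automated submissions.
    flagged["dup_ip"] = flagged.duplicated(subset="ip_address", keep=False)

    # Implausibly fast completions relative to an assumed minimum-duration threshold.
    flagged["too_fast"] = flagged["duration_sec"] < min_duration_sec

    # Failed embedded attention checks (assumed to be pre-scored as a boolean).
    flagged["failed_attention"] = ~flagged["attention_passed"].astype(bool)

    # Mark a response for review if it meets any single criterion.
    flag_cols = ["dup_id", "dup_ip", "too_fast", "failed_attention"]
    flagged["exclude"] = flagged[flag_cols].any(axis=1)
    return flagged
```

Criteria that require human judgment, such as open-ended responses and congruence on personality measures, would still need manual review after any automated pass of this sort.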
Results: Of 641 total responses, 81 (12.6%) met exclusion criteria; exclusion rates were 6.4% among undergraduates, 14.9% among Mechanical Turk users, and 4.0% among Connect users. Few responses were flagged by failed attention checks (n = 4), duplicate IDs (n = 8), or suspicious IP addresses (n = 10); open-ended questions (n = 38), language and cultural screening (n = 19), and congruence on personality measures (n = 25) identified more low-quality responses.
Conclusions: Traditional screening tools such as attention checks and IP address checks were ineffective. We recommend language proficiency tests and open-ended questions to assess response quality. CloudResearch Connect showed superior demographic diversity, data quality, and completion times, supporting its use alongside traditional sampling.