#3903. Doing L2 speech research online: Why and how to collect online ratings data

Publication date: September 2026
Proposal available till 14-05-2025
Total number of authors per manuscript: 4

The title of the journal is available only for the authors who have already paid for
Journal’s subject area:
Language and Linguistics;
Linguistics and Language;
Education;
Places in the authors’ list:
Place 1 - free (for sale) - 2350 $ - Contract3903.1
Place 2 - free (for sale) - 1200 $ - Contract3903.2
Place 3 - free (for sale) - 1050 $ - Contract3903.3
Place 4 - free (for sale) - 900 $ - Contract3903.4

Abstract:
Listener-based ratings have become a prominent means of defining second language (L2) users' global speaking ability. However, in many teaching and research contexts, recruiting local listeners may not be possible or advisable. The goal of this study was to hone a reliable method of recruiting listeners to evaluate L2 speech samples online through Amazon Mechanical Turk (AMT) using a blocked rating design. Three groups of listeners were recruited: local laboratory raters and two AMT groups, one inclusive of the dialects to which the L2 speakers had been exposed and another inclusive of a variety of dialects. Reliability was assessed using intraclass correlation coefficients, Rasch models, and mixed-effects models. Results indicate that online ratings can be highly reliable as long as appropriate quality control measures are adopted. The method and results can guide future work with online samples.
Keywords:
L2 speech samples; mixed-effects models; Amazon Mechanical Turk (AMT); quality control measures
