The Perception of Covarying Strain and Roughness in Synthetic Female Singing Voices
Objective: Strain and roughness are psychoacoustic percepts commonly evaluated during auditory-perceptual voice evaluations. Previous multidimensional scaling research suggests that strain and roughness are perceptually separable from one another. However, multidimensional scaling studies do not provide listeners with a predetermined classification scheme, which raises the question: Are strain and roughness perceptually separable when listeners are given a predetermined classification scheme? The present study seeks to answer this question by incorporating a predetermined classification scheme, using the semantic labels “strain” and “roughness,” into the perceptual experimental task.
Methods/Design: This study utilizes a within-subjects factorial design in which listeners will be asked to rate their perception of strain and roughness in synthetic female singing voice stimuli with covarying levels of strain and roughness. Twenty-five synthetic voice stimuli will be constructed at each of the pitches A3 and F5 on the vowel /ɑ/, with glottal excitation source spectral slopes of -6, -9, -12, -15, and -18 dB/octave to simulate varying levels of strain. Sine-wave amplitude modulation will then be applied to each stimulus at a modulation frequency of 35 Hz, with modulation depths of 0, -5, -10, -15, and -20 dB to simulate varying levels of roughness. Stimuli will be presented binaurally to participants, who will rate their perception of strain and roughness on 100-point visual analog scales with endpoints labeled “no strain – very strained” and “no roughness – very rough,” respectively.
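To make the stimulus parameters concrete, the following Python sketch illustrates one way stimuli of this kind could be generated. It assumes a simple additive harmonic source whose partials fall off at the specified dB/octave slope, followed by 35 Hz sine-wave amplitude modulation whose dB depth is mapped to a linear modulation index m = 10^(depth_dB/20). The sampling rate, the dB-to-modulation-index mapping, and the omission of vocal-tract (formant) filtering for the /ɑ/ vowel are assumptions for illustration only; this is not the authors' synthesis pipeline.

```python
import numpy as np

def harmonic_source(f0, dur, fs, slope_db_per_oct):
    """Additive harmonic source whose partial amplitudes fall off at
    `slope_db_per_oct` dB per octave above f0 (a shallower slope is
    intended to approximate a more strained source spectrum)."""
    t = np.arange(int(dur * fs)) / fs
    sig = np.zeros_like(t)
    k = 1
    while k * f0 < fs / 2:  # stop below the Nyquist frequency
        # Attenuate the k-th harmonic according to the spectral slope.
        amp = 10 ** (slope_db_per_oct * np.log2(k) / 20)
        sig += amp * np.sin(2 * np.pi * k * f0 * t)
        k += 1
    return sig / np.max(np.abs(sig))

def apply_am(sig, fs, fm=35.0, depth_db=-10.0):
    """Apply sine-wave amplitude modulation at `fm` Hz.
    The dB depth is mapped to a linear modulation index
    m = 10**(depth_db / 20); this mapping is an assumption,
    not a definition taken from the study."""
    t = np.arange(len(sig)) / fs
    m = 10 ** (depth_db / 20)
    return sig * (1.0 + m * np.sin(2 * np.pi * fm * t))

# Example: an A3 (220 Hz) stimulus with a -12 dB/octave source slope
# and roughness-like modulation at a -10 dB depth.
fs = 44100
stim = apply_am(harmonic_source(220.0, 2.0, fs, -12.0), fs, fm=35.0, depth_db=-10.0)
```

Under this mapping, a 0 dB depth corresponds to full (100%) modulation and -20 dB to roughly 10% modulation; the A3 and F5 pitches correspond to fundamental frequencies of 220 Hz and approximately 698 Hz, respectively.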
Results: Based on previous multidimensional scaling research, it is anticipated that participants will perceive strain and roughness independently of one another when given a predetermined classification scheme.
Conclusions: Results from this study will improve our understanding of how listeners perceive pathological voices with covarying strain and roughness. Additionally, results may clarify the impact semantic labels have on auditory-perceptual voice evaluation ratings.