Perceptual and Computational Estimates of Pitch Height and Pitch Strength across Signal Types in Pediatric Dysphonic Voices


Background/Rationale: Conventional methods of acoustic analysis are severely constrained in voices that lack periodicity, particularly those of children with supraglottal vibratory sources. From a psychoacoustic perspective, pitch perception has two components: pitch height (on a scale from low to high) and pitch strength (pitch salience, from weak to strong). Prior research on adult dysphonia has shown that replacing fundamental frequency (f0)-based measurements with pitch-based measurements allows automated analysis of aperiodic voices, including automated signal typing and assessment of dysphonic voice quality.

Objective: This study investigated pitch-based metrics as correlates of voice signal type (Types 1, 2, and 3) in pediatric dysphonic voices. The computational estimates were validated against perceptual judgments of pitch height and pitch strength.

Methods: PRAAT and TF32 provided computational estimates of f0, while the Auditory Sawtooth Waveform Inspired Pitch Estimator Prime (Aud-SWIPE′) algorithm provided pitch height and pitch strength estimates for 42 dysphonic /a/ vowels (14 stimuli per signal type, as established through expert judgments). Ten listeners assessed pitch height through a single-variable matching task and pitch strength through an anchored magnitude estimation task. Analyses of variance were conducted to examine the effects of signal type on pitch height and pitch strength. The relationship between computational and perceptual estimates was analyzed using Pearson's correlation coefficients.
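As an illustration of the correlation step described above, the following is a minimal sketch of computing Pearson's r separately for each signal type. It is not the authors' analysis code: the function names `pearson_r` and `correlations_by_type` and the record layout (signal type, computational estimate, perceptual estimate) are assumptions made for this example, and no study data are included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def correlations_by_type(records):
    """Group (signal_type, computed, perceived) records and correlate each group.

    The record layout is hypothetical; it stands in for one row per stimulus.
    """
    groups = {}
    for signal_type, computed, perceived in records:
        xs, ys = groups.setdefault(signal_type, ([], []))
        xs.append(computed)
        ys.append(perceived)
    return {t: pearson_r(xs, ys) for t, (xs, ys) in groups.items()}
```

Computing r within each signal type, rather than pooling all stimuli, keeps any loss of agreement in one type from being masked by the others.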

Results: Both computational and perceptual pitch strength estimates differed significantly across signal types. Periodic (Type 1) signals exhibited greater pitch strength than Type 2 and Type 3 signals. Aud-SWIPE′ generated robust computational estimates of pitch height, even for Type 3 signals, outperforming the f0 algorithms. The correlations between the Aud-SWIPE′ computational estimates and the perceptual estimates of pitch height were r = 0.95 for Type 1 signals, r = 0.96 for Type 2 signals, and r = 0.62 for Type 3 signals. Listeners were able to assess pitch height appropriately in Type 2 and Type 3 signals, despite the absence of a clear fundamental frequency.

Conclusions: Pitch height and pitch strength can be successfully estimated in pediatric dysphonic voices for all signal types.

Lindsay Wilson, Yeonggwang Park, Supraja Anand, Shaheen Awan, Susan Brehm, Lisa Kelchner, Barbara Weinrich, Rahul Shrivastav, Alessandro de Alarcon, David Eddins