Establishing Cutoff Values of Spectral and Cepstral Measures in Cantonese-Speaking Patients with Voice Disorders


Objective: Voice disorders affect up to 30% of individuals during their lifetime, with significant impact on both their social interaction and occupational functioning. Objective, quantitative measures such as acoustic analysis give clinicians a valuable and cost-effective means of assessing vocal pathology. The present study aims to determine the diagnostic value of specific acoustic parameters in distinguishing between normal and pathological voices, with a focus on their clinical utility. Receiver operating characteristic (ROC) analysis was used to determine how well spectral and cepstral acoustic features discriminate voice disorders in a Cantonese-speaking population.
Methods: Sustained productions of the vowel /a/ were obtained from both healthy individuals and dysphonic patients diagnosed with vocal fold pathologies by an otorhinolaryngologist. Participants were instructed to sustain phonation at a comfortable pitch and loudness for as long as possible. Acoustic analysis was conducted on the central 80% of each recording, with spectral and cepstral features extracted via a Python-based program using the Parselmouth interface to Praat. Forty-eight acoustic parameters were analyzed, including prominent features such as the Cepstral Spectral Index of Dysphonia (CSID), Smoothed Cepstral Peak Prominence (CPPS) with and without voice detection, jitter, shimmer, low-to-high frequency energy ratio (L/H ratio), harmonics-to-noise ratio (HNR), Degree of Voicelessness, and fundamental frequency (F0) measures.
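As an illustration of the trimming step described in the Methods, the following minimal Python sketch computes the central 80% of a recording. It is not the study's actual program: the function names `central_window` and `trim_samples`, the rounding rule, and the toy values are assumptions made here for illustration, and the sketch uses only the standard library rather than Parselmouth.

```python
def central_window(duration_s, keep_fraction=0.8):
    """Return (start_s, end_s) of the central part of a recording.

    Hypothetical helper: an equal margin is dropped from the onset
    and the offset so that `keep_fraction` of the duration remains.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    margin = duration_s * (1.0 - keep_fraction) / 2.0
    return margin, duration_s - margin


def trim_samples(samples, keep_fraction=0.8):
    """Return the central part of a sequence of audio samples.

    Assumption: the number of samples dropped at each end is rounded
    to the nearest integer.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    n = len(samples)
    drop = int(round(n * (1.0 - keep_fraction) / 2.0))
    return samples[drop:n - drop]


if __name__ == "__main__":
    # Toy values only: a 5-second phonation and ten dummy samples.
    print(central_window(5.0))
    print(trim_samples(list(range(10))))
```

The time bounds returned by `central_window` could then be passed to whichever extraction routine is used to cut the sound before the acoustic measures are computed.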
Results and Conclusion: ROC analysis revealed discriminative cutoff values across several key acoustic parameters for distinguishing normal voices from pathological voices, the latter predominantly involving vocal cord palsy (mostly left-sided), nodules, cysts, polyps, thickening, and mobility issues. Metrics such as the area under the ROC curve (AUC ≥ 0.70), sensitivity, specificity, false positive rate, positive predictive value, and Youden's J were calculated for each parameter. The identified threshold values for the standard deviation of the L/H Ratio, CSID with and without voice detection, Degree of Voicelessness (%), maximum F0, standard deviation of F0, and Number of Unvoiced Segments provide critical reference points for the clinical assessment of Cantonese-speaking individuals with voice disorders based on sustained vowel phonation.
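To make the cutoff selection concrete, the sketch below shows one common way to derive a threshold from ROC-style metrics: every observed value is tried as a candidate cutoff, and the one that maximizes Youden's J (sensitivity + specificity - 1) is kept. This is an illustrative sketch, not the study's analysis code. The function names, the convention that higher values indicate pathology, and the toy scores are all assumptions, and the numbers are not study data.

```python
def auc_from_scores(scores, labels):
    """AUC as the share of (pathological, normal) pairs ranked correctly.

    Labels are assumed to be 1 = pathological, 0 = normal; ties count 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores, labels):
    """Return the candidate cutoff that maximizes Youden's J.

    Assumption: a voice is called pathological when score >= cutoff.
    For a measure that decreases with dysphonia, negate the scores first.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1.0
        if best is None or j > best["youden_j"]:
            best = {"cutoff": t, "sensitivity": sens,
                    "specificity": spec, "youden_j": j}
    return best


if __name__ == "__main__":
    # Hypothetical toy values for a single acoustic parameter.
    toy_scores = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
    toy_labels = [0, 0, 0, 1, 0, 1, 1, 1, 1]
    print(auc_from_scores(toy_scores, toy_labels))
    print(youden_cutoff(toy_scores, toy_labels))
```

Maximizing Youden's J weights sensitivity and specificity equally. A screening setting that tolerates more false positives might prefer a different criterion.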

Yat Chun Au
Manwa Lawrence Ng