Please use this searchable database to view abstract information from our 53rd Annual Symposium in 2024

Abstract Title

Objective Assessment of Functional Dysphonia based on Machine Learning and High-Speed Videoendoscopy

Abstract

Objective: Functional dysphonia (FD) refers to an impairment of voice production, characterized by limitations in vocal performance and acute or persistent changes in voice quality. Because FD has diverse causes and occurs in the absence of primary organic changes, there is currently no consensus on its visual assessment. Quantitative methods could help clinicians standardize the diagnosis of FD. High-speed videoendoscopy (HSV) is a promising method for the objective evaluation of voice disorders, as its high temporal resolution (e.g., ≥4000 frames per second) allows detailed analysis of vocal fold vibrations. In this study, we propose a machine-learning-based approach to objectively assess voice quality using parameters calculated from high-speed endoscopic videos. Our primary focus is to investigate the relationship between the vibratory characteristics of the vocal folds and the resulting voice quality.
Methods: We gathered HSV recordings of the sustained vowel /i/ from both healthy subjects and patients with functional dysphonia. Each recording was assigned a hoarseness rating H ∈ {0, 1, 2, 3}, which was determined subjectively by an expert based on continuous speech of the respective subject. Glottis segmentation was used to determine the glottal area waveform (GAW) for 250 ms of each recording. Subsequently, voice parameters describing glottal dynamics, mechanics and symmetry, as well as signal periodicity and harmonicity, were computed. We employed machine learning to classify HSV recordings into two levels of hoarseness, H < 2 (normal, mild) and H ≥ 2 (moderate, severe), using the resulting output probabilities as interval-scaled severity ratings y ∈ [0, 1]. Additionally, glottal features significant for classification were identified using feature selection methods.
Results/Conclusion: The resulting classification model was evaluated in terms of its classification performance and the correlation between predicted output probabilities and subjectively determined hoarseness ratings. In addition, relevant features were analyzed in terms of their correlation with hoarseness.
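
The classification and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifier or the correlation measure, so logistic regression and Spearman rank correlation are assumed here purely for illustration. The data are synthetic toy values rather than HSV parameters, and all function and variable names are hypothetical.

```python
import math
import random


def binarize(h):
    """Map a hoarseness rating H in {0, 1, 2, 3} to a class: 0 for H < 2, 1 for H >= 2."""
    return 1 if h >= 2 else 0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit logistic regression by batch gradient descent (assumed model, not the authors')."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * gj / n for wj, gj in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def severity(w, b, x):
    """Output probability of the H >= 2 class, read as a severity score y in [0, 1]."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def ranks(values):
    """Ranks starting at 1; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return r


def spearman(a, b):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    ra, rb = ranks(a), ranks(b)
    ma, mb = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((x - ma) * (y - mb) for x, y in zip(ra, rb))
    va = sum((x - ma) ** 2 for x in ra)
    vb = sum((y - mb) ** 2 for y in rb)
    return cov / math.sqrt(va * vb)


# Synthetic toy data (NOT study data): one feature loosely tracks H, one is pure noise.
rng = random.Random(0)
H = [rng.randint(0, 3) for _ in range(200)]
X = [[h + rng.gauss(0, 1.0), rng.gauss(0, 1.0)] for h in H]
y = [binarize(h) for h in H]

w, b = fit_logistic(X, y)
scores = [severity(w, b, x) for x in X]  # continuous severity ratings
rho = spearman(scores, H)                # agreement with the four-level ratings
```

In this sketch the model is trained only on the two-class labels, while the correlation is computed against the original four-level ratings, mirroring the evaluation described in the abstract. A real analysis would evaluate on held-out recordings rather than on the training data.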

First Name: Tobias
Last Name: Schraut
Author #2 First Name: Anne
Author #2 Last Name: Schuetzenberger
Author #3 First Name: Melda
Author #3 Last Name: Kunduk
Author #4 First Name: Matthias
Author #4 Last Name: Echternach
Author #5 First Name: Michael
Author #5 Last Name: Doellinger