2024 Symposium Abstracts - THE VOICE FOUNDATION

Please use this searchable database to view abstract information from our 53rd Annual Symposium in 2024

Abstract Title	Machine Learning-Based Model for Automatic Prediction of Listeners’ Attitudes towards Dysphonic Voices
Abstract	Method: We used a database of connected speech samples (CAPE-V sentences) produced by 44 subjects of both genders, with different overall severity (OS) of vocal deviation (healthy, mild, moderate and intense) and different degrees of roughness (GR), breathiness (GB) and strain (GS). The samples were presented to 152 listeners of both genders, who judged 12 attitudes on a semantic differential scale previously validated for this study. The results of the evaluation of the 12 attributes were used to categorize the valence of the judgment as positive, neutral or negative. From the six concatenated CAPE-V sentences, 25 acoustic measurements were extracted, including fundamental frequency (fo) measurements, traditional perturbation and noise measurements, and cepstral/spectral measurements. The database was partitioned into a training set and a test set. We used upsampling to balance the data and k-fold cross-validation, repeating the validation procedure 30 times with subsets of 11 samples. We tested 10 ML classifiers and used accuracy, sensitivity, specificity, and Kappa to evaluate the performance of each classifier. Results: The Kernel Support Vector Machine (Kernel SVM) and Naive Bayes (NB) models obtained the best performances. The Kernel SVM selected four acoustic measures: minimum and maximum fo, smoothed cepstral peak prominence (CPPS) and shimmer. The Kernel SVM showed an accuracy of 0.92, with values of 1.0 and 0.80 for sensitivity and specificity, respectively, as well as a Kappa of 1.0. NB performed similarly to the Kernel SVM. However, the NB model selected 23 acoustic measurements and was excluded from the analysis because it was not considered parsimonious. Conclusion: Among the tested models, the Kernel SVM performed best in predicting the judgment of listeners’ attitudes towards dysphonic voices, based on a set of four acoustic measures.
First Name	Leonardo
Last Name	Lopes
Author #2 First Name	Deyverson
Author #2 Last Name	Evangelista
Author #3 First Name	Samuel
Author #3 Last Name	Abreu
Author #4 First Name	Marcelo
Author #4 Last Name	Ferreira
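
The abstract evaluates classifiers by accuracy, sensitivity, specificity, and Kappa. As a minimal sketch of how these four measures relate to one another, the Python function below derives them from a two-class confusion matrix. It assumes a binary outcome and uses Cohen's kappa; the abstract does not state which Kappa variant or class grouping the authors used. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute accuracy, sensitivity, specificity and Cohen's kappa
    from the four cells of a 2x2 confusion matrix.

    tp: positives predicted positive    fn: positives predicted negative
    fp: negatives predicted positive    tn: negatives predicted negative
    """
    total = tp + fn + fp + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement between predictions and reference labels.
    accuracy = (tp + tn) / total

    # Proportion of true positives (resp. true negatives) correctly identified.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")

    # Agreement expected by chance, from the row and column marginals.
    p_pos = ((tp + fn) / total) * ((tp + fp) / total)
    p_neg = ((fp + tn) / total) * ((fn + tn) / total)
    p_chance = p_pos + p_neg

    # Cohen's kappa: agreement beyond chance, scaled by its maximum.
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else float("nan")

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Hypothetical counts for one validation subset of 11 samples.
    for name, value in binary_metrics(tp=6, fn=0, fp=1, tn=4).items():
        print(f"{name}: {value:.2f}")
```

Because kappa corrects observed agreement for chance agreement, it equals 1.0 only when every sample is classified correctly, so it is usually read alongside accuracy rather than in place of it.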