A Deep Learning Approach for Feedback in Resonant Voice Therapy
Objective: Resonant voice therapy (RVT) aims to improve vocal quality by facilitating efficient vocal fold vibration, so that voice is produced more easily and more resonantly, with less effort and less impact on the vocal folds. To ensure positive treatment outcomes, voice therapies require patients to practice the learned techniques both inside and outside therapy sessions. However, therapist feedback is not available when patients practice at home, and patients often report difficulty judging how precisely they are performing the learned techniques without direct expert feedback. Hence, the goal of this work is to build a web-based application that uses deep learning models to recognize resonance in voice signals and provide immediate feedback during RVT practice.
Methods: The consonants /m/ and /n/ were acoustically recorded from 69 normal young participants (26 males, 43 females), both at comfortable pitch and loudness and, after training, in resonant versions. Common acoustic parameters, the power spectrum, and the cepstrum were computed. Feature importance analysis was performed, and the performance of different classification models was compared.
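As an illustration of the spectral features named above, the following is a minimal sketch of how a power spectrum and a cepstrum could be computed for one frame of a recording. It is not the implementation used in this work: the function names, the Hann window, the frame length, and the synthetic test frame are all assumptions, and a naive discrete Fourier transform stands in for the FFT routine that a real pipeline would use.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) discrete Fourier transform (illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def power_spectrum(frame):
    """Squared DFT magnitude of one Hann-windowed frame (assumed window)."""
    n = len(frame)
    window = [0.5 - 0.5 * math.cos(2 * math.pi * t / n) for t in range(n)]
    spectrum = dft([s * w for s, w in zip(frame, window)])
    return [abs(c) ** 2 for c in spectrum]


def cepstrum(frame, eps=1e-12):
    """Inverse DFT of the log power spectrum (one common definition).

    eps is an assumed floor that avoids log(0) in empty bins.
    """
    log_power = [math.log(p + eps) for p in power_spectrum(frame)]
    n = len(log_power)
    # The log power spectrum of a real frame is real and symmetric, so the
    # forward DFT divided by n equals the inverse DFT, and the result is real.
    return [c.real / n for c in dft(log_power)]


def harmonic_frame(n=128, f0_bin=8, harmonics=7):
    """Hypothetical test frame: equal-amplitude harmonics of bin f0_bin."""
    return [
        sum(math.sin(2 * math.pi * f0_bin * h * t / n)
            for h in range(1, harmonics + 1))
        for t in range(n)
    ]


if __name__ == "__main__":
    frame = harmonic_frame()
    ceps = cepstrum(frame)
    # Search a quefrency band away from the low-quefrency envelope terms.
    peak = max(range(8, 40), key=lambda q: ceps[q])
    print("cepstral peak at quefrency (samples):", peak)
```

For a voiced, harmonic sound, the cepstral peak falls at the fundamental period in samples, which is one reason cepstral measures are commonly used to describe voice quality. In practice the frame would come from the recorded /m/ or /n/ signal rather than from a synthetic tone.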
Results / Conclusions: ResNet18-H-DNN, a hybrid neural network architecture that integrates multiple types of neural networks and is trained on both textual and spectral data, yielded the best classification results, with 82% accuracy. This best-performing model was then implemented in an Android application, whose applicability will be tested in a further study.