An Optimal Contact-Mechanically Consistent and Flow-Separation Adapted Modeling of Vocal Fold Dynamics
Objective: Single mass-spring-damper models of the vocal folds have been shown to simulate vocal fold vibration effectively without added complexity. However, single degree-of-freedom models cannot sustain vocal fold oscillation in the presence of structural damping unless source-tract interaction is taken into account. Moreover, existing lumped models struggle to simulate vocal fold closure accurately during phonation, especially when the closure must be sustained. This study aims to develop a reliable, simplified single degree-of-freedom model of phonation that simulates sustained oscillation in a damped system without incorporating a vocal tract model. In addition, the proposed model maintains vocal fold closure for a longer duration than existing models, consistent with the physics of phonation.
Methods: High-speed videoendoscopy (HSV) data were collected from 4 normophonic subjects (2 male and 2 female) during production of the sustained vowel /i/. After the model's governing equations were derived, the glottal area waveform (GAW), extracted using a deep learning-based image segmentation technique, was used to find the optimum model parameters with the particle swarm optimization algorithm. An additional resistance force was incorporated into the model to account for flow separation, which produces the imbalance of forces required for sustained oscillation. In addition, an external structural force was applied during closure to prevent the folds from opening before the specified closure time. The 4th-order Runge-Kutta method was employed to solve the governing equations of the stiff system, with the aim of improving the stability and accuracy of the numerical calculations.
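The numerical integration step can be illustrated with a minimal sketch. It assumes a generic single degree-of-freedom equation of the form m*x'' + c*x' + k*x = F(t, x, v); the abstract does not state the governing equations, so the function names (`rk4_step`, `make_oscillator`, `simulate`) and the placeholder `extra_force` term, which stands in for the flow-separation resistance and closure forces, are hypothetical and not the authors' formulation.

```python
import math

def rk4_step(f, t, y, h):
    """Advance state y by one classical 4th-order Runge-Kutta step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k1)])
    k3 = f(t + h / 2, [yi + h / 2 * ki for yi, ki in zip(y, k2)])
    k4 = f(t + h, [yi + h * ki for yi, ki in zip(y, k3)])
    return [yi + h / 6 * (a + 2 * b + 2 * c + d)
            for yi, a, b, c, d in zip(y, k1, k2, k3, k4)]

def make_oscillator(m, c, k, extra_force=lambda t, x, v: 0.0):
    """Right-hand side of m*x'' + c*x' + k*x = extra_force(t, x, v).

    extra_force is a hypothetical placeholder for any additional
    forcing (e.g. aerodynamic or contact terms); it defaults to zero.
    """
    def rhs(t, y):
        x, v = y
        return [v, (extra_force(t, x, v) - c * v - k * x) / m]
    return rhs

def simulate(rhs, y0, h, n_steps):
    """Integrate from t = 0 and return the displacement history."""
    t, y = 0.0, list(y0)
    xs = [y[0]]
    for _ in range(n_steps):
        y = rk4_step(rhs, t, y, h)
        t += h
        xs.append(y[0])
    return xs

# Sanity check on an undamped oscillator, whose exact solution is cos(t).
xs = simulate(make_oscillator(m=1.0, c=0.0, k=1.0), [1.0, 0.0], 0.01, 100)
print(abs(xs[-1] - math.cos(1.0)))
```

With no forcing and positive damping, the displacement in this sketch decays to rest, which is the behavior that an additional force imbalance is meant to overcome.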
Results: The model parameters were optimized successfully for each subject using the particle swarm optimization method, with minimal error between the experimental GAW and the numerical estimates. The results indicate that vocal fold vibration can be sustained for the various closure durations observed in the experimental data of the different subjects.
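The parameter fitting can be sketched in the same spirit. The example below is a basic global-best particle swarm optimizer applied to a toy problem: recovering the amplitude and offset of a synthetic sinusoid by minimizing the mean squared error. The toy waveform, the cost function `mse`, and the settings (`w`, `c1`, `c2`, swarm size, iteration count) are assumptions chosen for illustration; they are not the study's model, data, or optimizer configuration.

```python
import math
import random

def pso_minimize(cost, bounds, n_particles=30, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # per-particle best positions
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]   # swarm-wide best
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                # Keep each particle inside the search box.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost

# Toy "measured" waveform: a * sin(2*pi*t) + b with a = 0.8, b = 0.2.
ts = [i / 50 for i in range(50)]
measured = [0.8 * math.sin(2 * math.pi * t) + 0.2 for t in ts]

def mse(params):
    """Mean squared error between the toy model and the toy data."""
    a, b = params
    return sum((a * math.sin(2 * math.pi * t) + b - y) ** 2
               for t, y in zip(ts, measured)) / len(ts)

fit, err = pso_minimize(mse, bounds=[(0.0, 2.0), (-1.0, 1.0)])
print(fit, err)
```

In an actual fit, the cost would compare a simulated glottal area waveform against the one extracted from the HSV recordings, with the model parameters as the search variables.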
Conclusions: The proposed model is robust in simulating sustained phonation without requiring a complex analysis of source-tract coupling. Furthermore, the model uses a minimal set of parameters while capturing the multi-physics of phonation and reducing the overall computational cost.