Please use this searchable database to view abstract information from our 53rd Annual Symposium in 2024

Abstract Title

Deep-Learning-Based Laryngeal Tissue Characterization Using High-Speed Videoendoscopy

Abstract

Objective: Understanding how laryngeal tissues move, and what characterizes them, is essential to understanding the mechanisms and dynamics of voice production. In this study, we develop a deep-learning tool that recognizes and classifies vocal fold tissues in laryngeal high-speed videoendoscopy (HSV) recordings. The tool is intended to automate the analysis of the movements of the different tissues observable in HSV data during voice production in connected speech.
Methods: HSV data were obtained from a normophonic speaker during connected speech. The acquired images were preprocessed with noise reduction and contrast adjustment. Segmentation proceeded in stages: first, a region of interest (ROI) was defined to focus on specific laryngeal structures; then morphological operations and connected component analysis (CCA) were applied to refine and label the different tissue regions. Features extracted from the segmented regions were fed into a convolutional neural network (CNN) for classification. The model was trained on a labeled dataset.
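The refine-and-label step can be illustrated with a minimal sketch. The abstract does not state which morphological operations, structuring element, or connectivity were used, so the choices here (a 3x3 opening and 4-connectivity) and the toy mask `MASK` are assumptions made purely for illustration, not the authors' pipeline.

```python
from collections import deque

NEIGHBORHOOD = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1)]

def erode(mask):
    """3x3 binary erosion; pixels outside the image count as background."""
    h, w = len(mask), len(mask[0])
    return [[int(all(0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                     for dr, dc in NEIGHBORHOOD))
             for c in range(w)] for r in range(h)]

def dilate(mask):
    """3x3 binary dilation."""
    h, w = len(mask), len(mask[0])
    return [[int(any(0 <= r + dr < h and 0 <= c + dc < w and mask[r + dr][c + dc]
                     for dr, dc in NEIGHBORHOOD))
             for c in range(w)] for r in range(h)]

def opening(mask):
    """Erosion followed by dilation: removes specks smaller than the 3x3 element."""
    return dilate(erode(mask))

def label_components(mask):
    """4-connected component labeling by breadth-first search.

    Returns (labels, count), where labels[r][c] is 0 for background
    and 1..count for each connected region.
    """
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not labels[r][c]:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count

# Hypothetical binary mask: two 3x3 regions plus one isolated noise pixel at (5, 1).
MASK = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]

_, raw_count = label_components(MASK)        # noise pixel counted as its own region
opened = opening(MASK)                       # noise pixel removed
labels, clean_count = label_components(opened)
print(raw_count, clean_count)
```

In this toy case the opening removes the isolated pixel before labeling, so only the two larger regions receive labels. On real HSV frames the mask would come from the preprocessed ROI, and a library such as OpenCV or scikit-image would normally replace these hand-written loops.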
Results and Conclusions: The trained CNN model showed potential for the automated classification of laryngeal tissues. The proposed approach identified specific anatomical structures involved in voice production, such as the glottal area, arytenoid cartilages, vocal folds, and epiglottis. Subsequent analysis of the segmented images allows extraction of geometric parameters of the identified tissues, such as area, boundary, and shape, to characterize laryngeal tissue movements. The approach offers a robust and effective method for detecting various laryngeal tissues during the production of running speech.
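As a rough illustration of the geometric parameters mentioned above, the sketch below computes area, boundary pixels, and a few simple shape descriptors for one labeled region. The abstract does not say which shape descriptors were used; the centroid, bounding box, aspect ratio, and extent here are common choices assumed for illustration, and `region_geometry` and `LABELS` are hypothetical names.

```python
def region_geometry(labels, label):
    """Geometric summary of one labeled region in a 2D label image.

    area:         number of pixels carrying `label`
    boundary:     region pixels with at least one 4-neighbor outside the region
    centroid:     mean (row, col) of the region pixels
    bbox:         (min_row, min_col, max_row, max_col)
    aspect_ratio: bounding-box width / height
    extent:       area / bounding-box area
    """
    h, w = len(labels), len(labels[0])
    pixels = [(r, c) for r in range(h) for c in range(w) if labels[r][c] == label]
    if not pixels:
        raise ValueError(f"label {label} not present")

    def on_boundary(r, c):
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if not (0 <= nr < h and 0 <= nc < w) or labels[nr][nc] != label:
                return True
        return False

    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    area = len(pixels)
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": area,
        "boundary": [p for p in pixels if on_boundary(*p)],
        "centroid": (sum(rows) / area, sum(cols) / area),
        "bbox": (min(rows), min(cols), max(rows), max(cols)),
        "aspect_ratio": width / height,
        "extent": area / (width * height),
    }

# Hypothetical label image: a single 3x4 rectangular region labeled 1.
LABELS = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
]

geometry = region_geometry(LABELS, 1)
print(geometry["area"], len(geometry["boundary"]), geometry["centroid"])
```

Computing such a summary for each tissue label in every frame would yield time series (for example, glottal area over time) from which tissue movement could be characterized.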
Acknowledgements: We acknowledge the support from NIH NIDCD K01DC017751, R21DC020003 and R01DC019402, and the ARO Young Investigator Program award W911NF-19-1-0444. Further, we thank Dr. Stephanie RC Zacharias, Mayo Clinic-Arizona, and the Michigan State University Discretionary Funding Initiative for their support for the data collection.

First Name: Sardar Nafis
Last Name: Bin Ali
Author #2 First Name: Mohsen
Author #2 Last Name: Zayernouri
Author #3 First Name: Dimitar
Author #3 Last Name: D. Deliyski
Author #4 First Name: Maryam
Author #4 Last Name: Naghibolhosseini