Chaos Analysis of Connected Speech Using a Moving Window Technique


Objective: Many traditional acoustic analyses rely primarily on sustained vowels because they are highly repeatable and easily controlled, yet they capture only a single, steady phonatory configuration. In contrast, connected speech reflects the dynamic interaction between the larynx and vocal tract, where coarticulation causes each sound to be influenced by those that precede and follow it. Thus, extracting voice data from connected speech may provide a more representative measure of everyday communication. This study is the first to apply a moving-window approach to connected speech to identify stable voice segments and to examine whether linear or nonlinear acoustic parameters better discriminate healthy from pathological phonation.

Methods: The first 12 seconds of Rainbow Passage recordings from the Kay Pentax Voice Database were analyzed for both healthy and pathological speakers. A moving-window technique (0.8-second window length, 0.25-second shift) was used to compute linear acoustic metrics (percent jitter, percent shimmer, and signal-to-noise ratio [SNR]) and nonlinear parameters (nonlinear energy difference ratio [NEDR], spectrum convergence ratio [SCR], and correlation dimension [D2]). For each parameter, the window exhibiting the lowest perturbation was identified as the most stable phonatory segment within the passage. This choice follows the principle that voice analysis should target the limit of stable phonation rather than a point in the range selected at random. To date, only descriptive comparisons have been made on the preliminary data; the full analysis will employ paired t-tests and receiver operating characteristic (ROC) analysis to quantify discrimination of pathology.
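The windowing step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the function names are hypothetical, and `placeholder_perturbation` is a simple stand-in (variability of short-frame RMS amplitude) rather than jitter, shimmer, SNR, NEDR, SCR, or D2. Any of those metrics could be passed in through the `metric` argument, provided that lower values mean more stable phonation.

```python
import numpy as np


def window_starts(n_samples, fs, win_s=0.8, shift_s=0.25):
    """Return start indices of all full windows and the window length in samples."""
    win = int(round(win_s * fs))
    hop = int(round(shift_s * fs))
    return list(range(0, n_samples - win + 1, hop)), win


def placeholder_perturbation(segment, fs, frame_s=0.01):
    """Hypothetical stand-in metric: coefficient of variation of frame RMS.

    This is NOT one of the abstract's metrics; it only illustrates the
    'lower value = more stable' convention assumed by most_stable_window.
    """
    frame = int(round(frame_s * fs))
    n = len(segment) // frame
    if n == 0:
        raise ValueError("segment is shorter than one frame")
    frames = segment[: n * frame].reshape(n, frame)
    rms = np.sqrt(np.mean(frames ** 2, axis=1))
    mean = rms.mean()
    return float(rms.std() / mean) if mean > 0 else float("inf")


def most_stable_window(signal, fs, metric=placeholder_perturbation,
                       win_s=0.8, shift_s=0.25):
    """Slide a window over the signal and return (start time in s, score)
    for the window with the lowest metric value."""
    starts, win = window_starts(len(signal), fs, win_s, shift_s)
    if not starts:
        raise ValueError("signal is shorter than one window")
    scores = [metric(signal[s:s + win], fs) for s in starts]
    best = int(np.argmin(scores))
    return starts[best] / fs, scores[best]
```

Because each parameter is scored independently, calling `most_stable_window` once per metric can return a different window for each one.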

Results: Preliminary findings indicate lower NEDR and higher SCR values in healthy speech samples than in pathological ones. Linear metrics showed no apparent difference between healthy and disordered speech. The window of least perturbation differed across metrics, even within the same linear or nonlinear category, suggesting that each acoustic parameter captures distinct aspects of vocal stability.

Conclusions: These early results demonstrate the feasibility of moving-window analysis for connected speech and suggest that nonlinear, chaos-based metrics may distinguish healthy from pathological voices more effectively than linear metrics. Ongoing analyses over the coming months will verify these trends in a larger population, incorporate additional metrics, and apply statistical testing, with the aim of establishing a standardized, objective method for segmenting connected speech and acoustically assessing voice function.

Owen Wischhoff, Grayson Bienhold, Maiwand Tarazi, Jakob Holm, Jack Jiang