Relationship between Cepstral-Based Measures and Bio-Inspired Computational Measures in Pediatric Voice


Objective: Cepstral peak prominence (CPP) is a measure of the periodicity of the vocal signal that reflects its overall harmonic structure. Cepstral measures that have provided valuable information about voice include average CPP (over time), the standard deviation (stdev) of cepstral peak (CP) height, and CP fundamental frequency (1/peak quefrency). The goal of this study is to examine the potential relationship between these cepstral measures and bio-inspired computational measures derived from models of auditory perception (pitch strength, pitch height, sharpness, and a temporal envelope model).
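As a rough illustration of the cepstral quantities named above, the following sketch computes a cepstral peak prominence and the corresponding fundamental frequency (1/peak quefrency) for a single frame. It is a generic textbook-style calculation and not the analysis pipeline used in this study: the function names, the Hann window, the 100-600 Hz search range, and the dB scaling of the cepstrum are all assumptions made for the example, and scaling conventions differ between analysis tools.

```python
import numpy as np


def synthetic_vowel(f0_hz, fs, n_samples):
    """Hypothetical test signal: harmonics of f0_hz with 1/k amplitudes."""
    t = np.arange(n_samples) / fs
    n_harm = int((fs / 2) // f0_hz)  # keep all harmonics below Nyquist
    return sum(np.cos(2 * np.pi * k * f0_hz * t) / k
               for k in range(1, n_harm + 1))


def cepstral_peak(frame, fs, f0_min=100.0, f0_max=600.0):
    """Return (cpp_db, f0_hz) for one frame.

    The search range (f0_min, f0_max) is an assumed default,
    not a value taken from the study.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hanning(len(frame))

    # Log-magnitude spectrum, then inverse FFT -> real cepstrum.
    spectrum_db = 20.0 * np.log10(np.abs(np.fft.fft(windowed)) + 1e-12)
    cepstrum = np.abs(np.fft.ifft(spectrum_db))
    cepstrum_db = 20.0 * np.log10(cepstrum + 1e-12)

    # Quefrency bins (in samples) that correspond to the F0 search range.
    lo = max(1, int(fs / f0_max))
    hi = min(int(fs / f0_min), len(frame) // 2 - 1)
    bins = np.arange(lo, hi + 1)
    quefrency_s = bins / fs

    # Cepstral peak: the largest value inside the search range.
    peak_pos = int(np.argmax(cepstrum_db[bins]))
    peak_bin = int(bins[peak_pos])

    # Prominence: peak height above a straight line fitted to the
    # cepstrum over the same quefrency range.
    slope, intercept = np.polyfit(quefrency_s, cepstrum_db[bins], 1)
    baseline = slope * quefrency_s[peak_pos] + intercept
    cpp_db = float(cepstrum_db[peak_bin] - baseline)

    # CP fundamental frequency is the reciprocal of the peak quefrency.
    f0_hz = fs / peak_bin
    return cpp_db, f0_hz
```

Applying `cepstral_peak` to successive frames of a sustained vowel would give per-frame values from which an average and a standard deviation over time could be taken.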

Methods: Cepstral-based and bio-inspired measures were calculated for sustained /a/ vowels from 93 children with voice disorders (60 male, 33 female, 4-11 years of age).

Results: Individual forward stepwise linear regressions were performed to identify which cepstral measures (average CPP, stdev of CP height, stdev of CP frequency) were key predictors of each bio-inspired measure (pitch strength, pitch height, sharpness, and temporal envelope). A stopping rule of minimum Bayesian information criterion (BIC) was used to determine the significant predictors in each regression. Age and sex were included as fixed variables in all the final models, regardless of their significance. Each bio-inspired measure (left) and its final set of cepstral predictors (right) are listed below with the resulting R2 value.
• Pitch strength: stdev CP height, average CPP (R2 = 0.82)
• Pitch height: stdev CP height, stdev CP frequency (R2 = 0.30)
• Temporal envelope: average CPP, stdev CP height, stdev CP frequency (R2 = 0.53)
• Sharpness: stdev CP frequency (R2 = 0.29)
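
The selection procedure described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the software or settings used in this study: the function names are hypothetical, BIC is computed from an ordinary least squares fit with Gaussian errors (up to an additive constant), and the search stops at the first step that fails to lower BIC, whereas some statistical packages look further along the selection path before settling on the minimum. Covariates passed as `forced` (for example age and sex) stay in every model.

```python
import numpy as np


def ols_bic(y, X):
    """BIC of an OLS fit (Gaussian errors, additive constant dropped).

    X must already contain the intercept column.
    """
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = float(np.sum((y - X @ beta) ** 2))
    return n * np.log(rss / n) + k * np.log(n)


def forward_stepwise_bic(y, candidates, forced):
    """Greedy forward selection that minimises BIC.

    candidates: dict mapping predictor name -> 1-D array (may be dropped)
    forced:     dict mapping covariate name -> 1-D array (always kept)
    Returns (selected candidate names in order of entry, final BIC).
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    base_cols = [np.ones(n)] + [np.asarray(v, dtype=float)
                                for v in forced.values()]
    selected = []
    remaining = list(candidates)
    best_bic = ols_bic(y, np.column_stack(base_cols))

    while remaining:
        # Score every model that adds exactly one more candidate.
        scores = {}
        for name in remaining:
            cols = (base_cols
                    + [np.asarray(candidates[s], dtype=float) for s in selected]
                    + [np.asarray(candidates[name], dtype=float)])
            scores[name] = ols_bic(y, np.column_stack(cols))

        winner = min(scores, key=scores.get)
        if scores[winner] >= best_bic:
            break  # no candidate lowers BIC any further
        best_bic = scores[winner]
        selected.append(winner)
        remaining.remove(winner)

    return selected, best_bic
```

One such regression would be run per bio-inspired measure, with that measure as `y` and the three cepstral measures as `candidates`.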

Conclusions: This study demonstrated that pitch strength is strongly predicted by both average CPP and variability of CP height, consistent with previous research showing that these measures are closely related to the perception of breathiness. The temporal envelope model was predicted not only by average CPP, but also by the variability of the cepstral peak in both height and quefrency location. The predictive nature of CP variability aligns with the association between temporal envelope variability and pediatric roughness perception. Despite significant predictions, the lower R2 values for pitch height, temporal envelope, and sharpness illustrate the unique associations of those models with the voice samples. Additional relationships and how they may refine our understanding of pediatric voice will be discussed.

Elizabeth Heller Murray, Victoria McKenna, David Eddins, Kevin McElfresh, Alessandro de Alarcon