The Acoustic Voice Hyperfunction Index (AVHI): Development and Preliminary Normative Modeling for Brazilian Portuguese Speakers
Introduction
Vocal hyperfunction (VHF) refers to excessive or imbalanced activity of the intrinsic and extrinsic laryngeal muscles, resulting in strained or inefficient phonation. Although it contributes to many functional and organic dysphonias, its acoustic correlates, such as smoothed cepstral peak prominence (CPPS), the harmonic amplitude difference (H1–H2*), spectral slope, and relative fundamental frequency (RFF), are often examined separately, which limits their clinical use. The Acoustic Voice Hyperfunction Index (AVHI) was developed to integrate these parameters into a single SPL-calibrated score that reflects the phonatory inefficiency associated with hyperfunctional voice use. The AVHI is designed for screening and longitudinal monitoring, not for diagnosis. This study describes its development and the establishment of preliminary normative data for Brazilian Portuguese speakers.
Methods
The AVHI is implemented in Python with a Praat-based backend through Parselmouth and includes modules for SPL calibration, sustained-vowel and connected-speech analysis, and automated RFF estimation from voiced–unvoiced transitions. Calibration uses A-weighted RMS levels with a geometric correction between 30 cm and 5 cm. The system extracts CPPS, H1–H2* (cepstrally corrected to reduce formant influence), alpha ratio, low–high spectral ratio, and spectral slope. RFF is computed from the ten vocal cycles preceding and the ten following each voiceless consonant, yielding semitone-normalized means and slopes. Each feature is expressed as a z-score relative to the control group’s mean and standard deviation, and the z-scores are combined into three weighted subcomponents (Tension, Quality, and Spread) that together form the composite AVHI score. The current validation uses a database from the Federal University of Paraíba (UFPB) containing recordings of Brazilian Portuguese speakers, including a control group with healthy voices and a clinical group with varied dysphonias. Linear regression on the control data was used to generate the initial normative model.
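Three of the computational steps described above can be illustrated with a minimal Python sketch. It is not the AVHI implementation, and it rests on assumptions the text does not state: that the geometric correction follows the inverse-square law for a point source in a free field, that RFF cycles are referenced to the steady-state cycle farthest from the consonant (a common RFF convention), and that subcomponents are weighted sums of z-scores. All function names and weights are hypothetical.

```python
import math
from statistics import mean, stdev
from typing import Mapping, Sequence


def distance_correction_db(d_from_cm: float, d_to_cm: float) -> float:
    """dB offset to refer a level measured at d_from_cm to distance d_to_cm.

    Assumption: inverse-square spreading (point source, free field).
    """
    return 20.0 * math.log10(d_from_cm / d_to_cm)


def to_semitones(f0_hz: float, ref_hz: float) -> float:
    """Express a cycle's f0 in semitones relative to a reference f0."""
    return 12.0 * math.log2(f0_hz / ref_hz)


def rff_offset(cycle_f0_hz: Sequence[float]) -> list:
    """RFF for the cycles before a voiceless consonant.

    Assumption: the first cycle (farthest from the consonant) is the reference.
    """
    ref = cycle_f0_hz[0]
    return [to_semitones(f, ref) for f in cycle_f0_hz]


def rff_onset(cycle_f0_hz: Sequence[float]) -> list:
    """RFF for the cycles after a voiceless consonant.

    Assumption: the last cycle (farthest from the consonant) is the reference.
    """
    ref = cycle_f0_hz[-1]
    return [to_semitones(f, ref) for f in cycle_f0_hz]


def z_score(value: float, control_values: Sequence[float]) -> float:
    """Normalize a feature to the control group's mean and sample SD."""
    return (value - mean(control_values)) / stdev(control_values)


def weighted_composite(z_by_feature: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Weighted sum of z-scores; the weights are supplied by the caller."""
    return sum(weights[name] * z for name, z in z_by_feature.items())
```

Under the inverse-square assumption, `distance_correction_db(5, 30)` gives the offset to add to a level measured at 5 cm so that it refers to 30 cm, roughly -15.6 dB. The source does not report the feature weights or how features are assigned to subcomponents, so any values passed to `weighted_composite` are placeholders.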
Results
The system completed calibration, segmentation, and feature extraction without failures. Control-group data provided regression-based reference values for CPPS, H1–H2*, spectral slope, and RFF, defining the initial normative database for the AVHI. The next phase will apply these norms to speakers with voice disorders to observe index behavior across phonatory conditions. Validation of the clinical dataset is ongoing, and no inferential analyses are reported.
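As an illustration of how regression-based reference values of this kind can be derived and applied, the sketch below fits an ordinary least-squares line to control-group data and expresses a new observation as a standardized deviation from the predicted value. The source does not say which predictor or predictors were used, so `x` stands for a hypothetical single predictor (for example, SPL or age), and all function names are illustrative.

```python
import math
from statistics import mean
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = intercept + slope * x on control data."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def residual_sd(x: Sequence[float], y: Sequence[float],
                slope: float, intercept: float) -> float:
    """Residual standard deviation with n - 2 degrees of freedom."""
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    return math.sqrt(sse / (len(x) - 2))


def deviation_from_norm(x_new: float, y_new: float,
                        slope: float, intercept: float, sd: float) -> float:
    """Standardized distance of an observation from the normative prediction."""
    return (y_new - (intercept + slope * x_new)) / sd


# Hypothetical control-group values (not data from the study).
control_x = [0.0, 1.0, 2.0, 3.0]
control_y = [1.0, 3.1, 4.9, 7.0]
slope, intercept = fit_line(control_x, control_y)
sd = residual_sd(control_x, control_y, slope, intercept)
```

A value near zero from `deviation_from_norm` would indicate that a speaker's feature lies close to what the control-group model predicts. Larger absolute values indicate greater departure from the norm.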
Conclusions
The Acoustic Voice Hyperfunction Index offers an integrated and reproducible framework for quantifying phonatory efficiency by combining spectral, cepstral, and vibratory stability measures into a single SPL-calibrated score. Establishing normative data for Brazilian Portuguese speakers represents an essential step toward validation and language-specific applicability. The system shows promise for future use in longitudinal monitoring and research, though current reference values apply only to Brazilian Portuguese. Continued validation will determine its sensitivity, reliability, and potential for broader cross-language adaptation in clinical and pedagogical contexts.