Evaluating Acoustic Representations and Normalization for Rhoticity Classification in Children with Speech Sound Disorders.
JASA Express Letters (2024)
Abstract
The effects of different acoustic representations and normalizations were compared for classifiers predicting perception of children's rhotic versus derhotic /ɹ/. Formant and Mel frequency cepstral coefficient (MFCC) representations for 350 speakers were z-standardized, either relative to values in the same utterance or relative to age-and-sex data for typical /ɹ/. Statistical modeling indicated that age-and-sex normalization significantly increased classifier performance. Clinically interpretable formants performed similarly to MFCCs and were endorsed for deep neural network engineering, achieving mean test-participant-specific F1-score = 0.81 after personalization and replication (σx = 0.10, med = 0.83, n = 48). Shapley additive explanations analysis indicated that the third formant most influenced fully rhotic predictions.
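The two z-standardization schemes compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of population standard deviation, and all numeric values (the F3 track and the normative mean and SD) are hypothetical placeholders chosen only to show the arithmetic.

```python
from statistics import mean, pstdev


def z_within_utterance(values):
    """Z-standardize values against the mean and SD of the same utterance.

    Assumption: population SD (pstdev) is used; the paper may differ.
    """
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant sequence")
    return [(v - mu) / sd for v in values]


def z_against_norms(values, norm_mean, norm_sd):
    """Z-standardize values against external normative statistics,
    e.g. a mean and SD for typical productions in one age-and-sex group."""
    if norm_sd <= 0:
        raise ValueError("normative SD must be positive")
    return [(v - norm_mean) / norm_sd for v in values]


# Hypothetical third-formant (F3) values in Hz for one utterance.
f3_hz = [2400.0, 2600.0, 2800.0]

# Hypothetical normative F3 mean and SD for one age-and-sex group.
NORM_MEAN_HZ, NORM_SD_HZ = 3000.0, 400.0

utterance_z = z_within_utterance(f3_hz)
norm_z = z_against_norms(f3_hz, NORM_MEAN_HZ, NORM_SD_HZ)
```

Under within-utterance normalization the scores always center on zero, so information about how far a speaker sits from typical productions is removed. Normalizing against external age-and-sex statistics keeps that offset.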
Keywords
Articulatory Phonetics, Speech Perception, Acoustic Phonetics, Speaker Verification, Acoustic Modeling