#3720. Fundamental frequency feature warping for frequency normalization and data augmentation in child automatic speech recognition
Publication date | October 2026 |
Proposal available until | 08-06-2025 |
4 total number of authors per manuscript | 0 $ |
The title of the journal is available only to authors who have already paid |
Journal’s subject area:
Language and Linguistics;
Linguistics and Language;
Communication;
Modeling and Simulation;
Computer Science Applications;
Computer Vision and Pattern Recognition;
Software
Places in the authors’ list:
1 place - free (for sale)
2 place - free (for sale)
3 place - free (for sale)
4 place - free (for sale)
More details about the manuscript: Science Citation Index Expanded and/or Social Sciences Citation Index
Abstract:
Effective child automatic speech recognition (ASR) systems have become increasingly important due to the growing use of interactive technology. The proposed fundamental frequency (fo) based technique is inspired by the tonotopic distances between formants and fo, which were developed to model human vowel perception. These tonotopic distances are reformulated as a linear relationship between fo and vowel formants on the Mel scale. A single-word ASR experiment and a continuous read-speech ASR experiment are performed to evaluate the fo-based frequency normalization and data augmentation techniques. In the continuous speech experiment, combining fo-based frequency normalization with data augmentation yielded a relative improvement of 19.3% over the baseline.
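As a rough illustration of what a linear fo-to-formant relationship on the Mel scale could imply for feature warping, the sketch below shifts a frequency in the Mel domain by the difference between a speaker's fo and a reference fo. The abstract does not give the actual model, so everything here is an assumption: the Mel formula is the widely used 2595·log10(1 + f/700) variant, and the function `warp_frequency`, its `slope` parameter, and the example fo values are hypothetical names and numbers chosen only for illustration.

```python
import math


def hz_to_mel(f_hz: float) -> float:
    """Convert Hz to Mel with the common 2595*log10(1 + f/700) formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def warp_frequency(f_hz: float, fo_hz: float, fo_ref_hz: float,
                   slope: float = 1.0) -> float:
    """Shift f_hz on the Mel scale toward a reference fo.

    Assumption (not taken from the abstract): formant positions on the
    Mel scale move linearly with Mel(fo), so a speaker can be mapped to
    a reference speaker by adding slope * (Mel(fo_ref) - Mel(fo)).
    """
    shift = slope * (hz_to_mel(fo_ref_hz) - hz_to_mel(fo_hz))
    warped_mel = max(0.0, hz_to_mel(f_hz) + shift)  # keep frequency non-negative
    return mel_to_hz(warped_mel)


# Normalization: map a child-like fo (250 Hz, illustrative) to a
# lower reference fo (120 Hz, illustrative); frequencies move downward.
normalized = warp_frequency(3000.0, fo_hz=250.0, fo_ref_hz=120.0)

# Augmentation: reuse the same warp with a different target fo to
# simulate another speaker from the same utterance.
augmented = warp_frequency(3000.0, fo_hz=250.0, fo_ref_hz=200.0)
```

Under these assumptions, normalization and augmentation differ only in how the target fo is chosen: a fixed reference for normalization, a varied target for augmentation.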
Keywords:
Child speech; Data augmentation; Frequency normalization; Fundamental frequency; Speech recognition
Contacts: