#3683. Overlapped Speech Detection and speaker counting using distant microphone arrays
Publication date: October 2026
Proposal available till: 03-06-2025
Total number of authors per manuscript: 4 | 0 $
The title of the journal is available only for the authors who have already paid for
Journal’s subject area:
Language and Linguistics;
Linguistics and Language;
Sociology and Political Science;
Speech and Hearing;
Places in the authors’ list:
1 place - free (for sale)
2 place - free (for sale)
3 place - free (for sale)
4 place - free (for sale)
Abstract:
The research investigates the problem of detecting and counting simultaneous, overlapping speakers in a multichannel, distant-microphone scenario. It considers a Temporal Convolutional Network (TCN) and a Transformer-based architecture for this task and compares them with previously proposed state-of-the-art methods based on Recurrent Neural Networks (RNN) or hybrid Convolutional-Recurrent Neural Networks (CRNN). The results show that the Transformer-based architecture performs best among all architectures, and that neural-network-based spatial localization features outperform signal-based spatial features and significantly improve performance over single-channel features alone.
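To make the task concrete, the following minimal sketch shows one common way to frame speaker counting as a frame-level labelling problem, from which voice activity and overlapped-speech labels follow directly. It is an illustration only, not the method of the manuscript: the function name `frame_labels`, the per-speaker activity matrix, and the cap `max_count` on the number of counted speakers are all assumptions introduced here.

```python
import numpy as np

def frame_labels(activity, max_count=4):
    """Derive frame-level targets from per-speaker activity.

    activity: binary array of shape (num_speakers, num_frames), where
              activity[s, t] == 1 means speaker s is talking in frame t.
    max_count: hypothetical cap; frames with more active speakers are
               assigned to the last class (an assumption of this sketch).
    """
    activity = np.asarray(activity, dtype=int)
    count = activity.sum(axis=0)                # active speakers per frame
    count_class = np.minimum(count, max_count)  # speaker-counting target
    vad = count >= 1                            # at least one speaker active
    osd = count >= 2                            # two or more speakers overlap
    return count_class, vad, osd

# Toy example: 3 speakers, 6 frames
activity = np.array([
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
])
count_class, vad, osd = frame_labels(activity)
print(count_class)  # [1 2 3 1 0 1]
```

Under this framing, a single counting model yields voice activity detection and overlapped speech detection as by-products, by thresholding the predicted count at one and two speakers respectively.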
Keywords:
Distant microphones; Overlapped Speech Detection; Spatial features; Speaker counting; Voice activity detection
Contacts :