#3683. Overlapped Speech Detection and speaker counting using distant microphone arrays

Publication date: October 2026
Proposal available till 03-06-2025
Total number of authors per manuscript: 4

The title of the journal is available only to authors who have already paid.
Journal’s subject area:
Language and Linguistics;
Linguistics and Language;
Sociology and Political Science;
Speech and Hearing;
Places in the authors’ list:
Place 1: free (for sale), 2350 $, Contract3683.1
Place 2: free (for sale), 1200 $, Contract3683.2
Place 3: free (for sale), 1050 $, Contract3683.3
Place 4: free (for sale), 900 $, Contract3683.4

Abstract:
The research investigates the problem of detecting and counting simultaneous, overlapping speakers in a multichannel, distant-microphone scenario. It considers a Temporal Convolutional Network (TCN) and a Transformer-based architecture for this task, and compares them with previously proposed state-of-the-art methods based on Recurrent Neural Networks (RNN) or hybrid Convolutional-Recurrent Neural Networks (CRNN). The results show that the Transformer-based architecture performs best among all architectures, and that neural-network-based spatial localization features outperform signal-based spatial features and significantly improve performance compared with single-channel features alone.
Keywords:
Distant microphones; Overlapped Speech Detection; Spatial features; Speaker counting; Voice activity detection
