Articles for sale

Head office address: Moscow, 123317, Moscow City, 8th Floor, Presnenskaya Embankment, 6, bldg. 2
article@123mi.ru

#3833. Corpora compilation for prosody-informed speech processing

Publication date	November 2026
Proposal available until	10-07-2025
4 total number of authors per manuscript	0 $

The journal title is disclosed only to authors who have already paid.

Journal’s subject area:

Language and Linguistics;
Linguistics and Language;
Education;
Library and Information Sciences;

Places in the authors’ list:

place 1	place 2	place 3	place 4
Free	Free	Free	Free
2350 $	1200 $	1050 $	900 $
Contract №3833.1	Contract №3833.2	Contract №3833.3	Contract №3833.4

1 place - free (for sale)
2 place - free (for sale)
3 place - free (for sale)
4 place - free (for sale)

Abstract:
Research on speech technologies requires spoken data, which is usually obtained from recordings of read speech and specifically adapted to the research needs. When the aim is to deal with the prosody involved in speech, the available data must reflect natural, conversational speech, which is usually costly and difficult to obtain. This research presents a machine learning-oriented toolkit for collecting, handling, and visualizing speech data using prosodic heuristics. The authors present two corpora resulting from these methodologies: the PANTED corpus, containing 250 h of English speech from TED Talks, and the Heroes corpus, containing 8 h of parallel English and Spanish movie speech. Their use is demonstrated in two deep learning-based applications: punctuation restoration and machine translation. The presented corpora are freely available to the research community.
Keywords:
Intensity; Parallel data; Pause; Punctuation; Speech corpus; Speech transcription; Spoken machine translation

Contacts:

Sign up for a meeting through a call center: help@buy-sell-article.com