Upravlenie Bol'shimi Sistemami, 2017, Issue 66, Pages 6–24
(Mi ubs909)
Information Technology Applications in Control
Algorithms for interpretation of prosodic features in low-bitrate speech processing
M. Bessonov (a), M. Farkhadov (b)
(a) «Russian peoples' friendship university», Moscow
(b) Institute of Control Sciences of RAS, Moscow
Abstract:
We study the language identification problem using prosodic features. Prosodic features such as melody, rhythm, and timbre are difficult to formalize mathematically. The paper proposes two algorithms for a combined description of prosodic features. The first is based on broad phonetic categories; the second is based on the cross-correlation between the speech melody and the short-term energy sequence. The fundamental frequency was estimated with the MELP algorithm. The performance of the proposed algorithms was evaluated experimentally on a database of speech recordings that were obtained from the Internet and were therefore encoded by low-bitrate vocoders. The database covers ten languages. The proposed algorithms produce the feature description, and a multi-layer neural network serves as the language classifier. Both algorithms show satisfactory classification performance, with the broad phonetic categories approach performing slightly better than the cross-correlation approach. These algorithms can be applied to a speech signal processed by low-bitrate vocoders without decoding it to the original signal.
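To make the second idea concrete, the following is a minimal sketch of a cross-correlation between a melody contour and a short-term energy sequence. It is an illustration under stated assumptions, not the authors' implementation: the fundamental frequency contour is taken as a stand-in for the melody and is generated synthetically here rather than estimated with MELP, and the frame length, hop size, lag range, and the function names `short_term_energy` and `normalized_cross_correlation` are all hypothetical choices.

```python
import numpy as np


def short_term_energy(signal, frame_len, hop):
    """Return the sum of squared samples for each frame of the signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.array([
        np.sum(signal[i * hop: i * hop + frame_len] ** 2)
        for i in range(n_frames)
    ])


def normalized_cross_correlation(x, y, max_lag):
    """Cross-correlate two equal-length sequences for lags -max_lag..max_lag.

    Both sequences are standardized first, so the value at lag 0 is
    (up to a tiny epsilon) their Pearson correlation coefficient.
    """
    x = (x - x.mean()) / (x.std() + 1e-12)
    y = (y - y.mean()) / (y.std() + 1e-12)
    n = len(x)
    lags = np.arange(-max_lag, max_lag + 1)
    values = []
    for lag in lags:
        if lag >= 0:
            a, b = x[lag:], y[:n - lag]
        else:
            a, b = x[:n + lag], y[-lag:]
        values.append(np.dot(a, b) / n)
    return lags, np.array(values)


# Synthetic stand-ins (assumptions): an amplitude-modulated tone as "speech"
# and a slowly varying contour as the "melody". Real input would come from
# a pitch tracker, which is not reproduced here.
sr = 8000                      # sampling rate in Hz (assumed)
frame_len, hop = 200, 80       # 25 ms frames, 10 ms hop (assumed)
t = np.arange(sr) / sr
signal = np.sin(2 * np.pi * 120 * t) * (1 + 0.5 * np.sin(2 * np.pi * 2 * t))

energy = short_term_energy(signal, frame_len, hop)
frame_times = np.arange(len(energy)) * hop / sr
f0 = 120 + 10 * np.sin(2 * np.pi * 2 * frame_times)  # hypothetical contour

lags, ccf = normalized_cross_correlation(f0, energy, max_lag=20)
```

In a setup like this, the vector `ccf` could serve as a fixed-length feature description passed to a classifier; how the paper actually builds its feature vector from the cross-correlation function is not stated in the abstract.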
Keywords:
language identification, neural networks, speech prosodic features, broad phonetic categories.
Received: November 9, 2016 Published: March 31, 2017
Citation:
M. Bessonov, M. Farkhadov, “Algorithms for interpretation of prosodic features in low-bitrate speech processing”, UBS, 66 (2017), 6–24
Linking options:
https://www.mathnet.ru/eng/ubs909 https://www.mathnet.ru/eng/ubs/v66/p6