Maxim Sidorov, Alexander Schmitt, Eugene S. Semenkin, “Automated recognition of paralinguistic signals in spoken dialogue systems: ways of improvement”, J. Sib. Fed. Univ. Math. Phys., 8:2 (2015), 208

Journal of Siberian Federal University. Mathematics & Physics


Journal of Siberian Federal University. Mathematics & Physics, 2015, Volume 8, Issue 2, Pages 208–216 (Mi jsfu423)

Automated recognition of paralinguistic signals in spoken dialogue systems: ways of improvement

Maxim Sidorov^a, Alexander Schmitt^a, Eugene S. Semenkin^b

^a Institute of Communications Engineering, Ulm University, Albert Einstein-Allee, 43, Ulm, 89081, Germany
^b Institute of Computer Science and Telecommunications, Siberian State Aerospace University, Krasnoyarskiy Rabochiy, 31, Krasnoyarsk, 660014, Russia

Abstract: The ability of artificial systems to recognize paralinguistic signals, such as emotions, depression, or openness, is useful in various applications. However, the performance of such recognizers is still far from perfect. In this study we consider several directions that can significantly improve it. Firstly, we propose building speaker- or gender-specific emotion models: the emotion recognition (ER) procedure is preceded by a gender or speaker identifier, and the resulting speaker- or gender-specific information is either included directly in the feature vector or used to create a separate emotion recognition model for each gender or speaker. Secondly, because feature selection is an important part of any classification problem, we propose a feature selection technique based on either a genetic algorithm or an information gain approach. Both methods yield higher performance than the baseline without feature selection. Finally, we suggest analysing not only audio signals but also combined audio-visual cues. Early fusion (feature-level fusion) is used in our investigations to combine the different modalities into a multimodal approach. The results show that the multimodal approach outperforms the single modalities on the considered corpora. The suggested methods have been evaluated on a number of emotional databases in three languages (English, German and Japanese), in both acted and non-acted settings. The results of numerical experiments are also presented in the study.
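
To make two of the ideas in the abstract concrete, the following is a minimal sketch of early (feature-level) fusion followed by information-gain feature ranking. It is not the authors' implementation: the abstract does not say how features are discretized or which features are used, so the split at the feature mean, the function names (`early_fusion`, `information_gain`, `select_top_k`), and the toy values are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Information gain of the binary split `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - remainder


def early_fusion(audio_rows, visual_rows):
    """Feature-level fusion: concatenate each sample's audio and visual vectors."""
    return [a + v for a, v in zip(audio_rows, visual_rows)]


def select_top_k(rows, labels, k):
    """Return the indices of the k features with the highest information gain.

    Assumption: each feature is binarized at its mean before scoring.
    """
    scored = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        threshold = sum(column) / len(column)
        scored.append((information_gain(column, labels, threshold), j))
    scored.sort(key=lambda item: (-item[0], item[1]))  # best gain first
    return sorted(j for _, j in scored[:k])


# Hypothetical toy data: two audio features and one visual feature per sample.
audio = [[0.1, 5.0], [0.2, 1.0], [0.9, 5.0], [0.8, 1.0]]
visual = [[3.0], [3.1], [7.0], [7.2]]
labels = ["neutral", "neutral", "angry", "angry"]

fused = early_fusion(audio, visual)
print(select_top_k(fused, labels, 2))  # [0, 2]
```

In this toy example the second audio feature is uninformative about the label, so ranking the fused vector keeps one audio feature and the visual feature. A gender- or speaker-specific variant could be sketched in the same spirit by appending the identifier's output to each fused vector or by running the selection separately on each group.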

Keywords: recognition of paralinguistic signals, machine learning algorithms, speaker-adaptive emotion recognition, multimodal approach.

Received: 19.01.2015
Received in revised form: 25.02.2015
Accepted: 20.03.2015

Document Type: Article

UDC: 519.87

Language: English

Citation: Maxim Sidorov, Alexander Schmitt, Eugene S. Semenkin, “Automated recognition of paralinguistic signals in spoken dialogue systems: ways of improvement”, J. Sib. Fed. Univ. Math. Phys., 8:2 (2015), 208–216

Citation in AMSBIB format

\Bibitem{SidSchSem15}
\by Maxim~Sidorov, Alexander~Schmitt, Eugene~S.~Semenkin
\paper Automated recognition of paralinguistic signals in spoken dialogue systems: ways of improvement
\jour J. Sib. Fed. Univ. Math. Phys.
\yr 2015
\vol 8
\issue 2
\pages 208--216
\mathnet{http://mi.mathnet.ru/jsfu423}

Linking options:

https://www.mathnet.ru/eng/jsfu423

https://www.mathnet.ru/eng/jsfu/v8/i2/p208
