I. A. Rakhmanenko, A. A. Shelupanov, E. Yu. Kostyuchenko, “Automatic text-independent speaker verification using convolutional deep belief network”, Computer Optics, 44:4 (2020), 596

Computer Optics, 2020, Volume 44, Issue 4, Pages 596–605
DOI: https://doi.org/10.18287/2412-6179-CO-621 (Mi co826)

This article is cited in 8 scientific papers.

IMAGE PROCESSING, PATTERN RECOGNITION

Automatic text-independent speaker verification using convolutional deep belief network

I. A. Rakhmanenko, A. A. Shelupanov, E. Yu. Kostyuchenko

Tomsk State University of Control Systems and Radioelectronics, prospect Lenina 40, 634050, Tomsk, Russia

Abstract: This paper examines the use of a convolutional deep belief network as a speech feature extractor for automatic text-independent speaker verification. It describes the scope and problems of automatic speaker verification systems and reviews the types of modern speaker verification systems and the speech features they use. The structure and learning algorithm of convolutional deep belief networks are described. The paper proposes using speech features extracted from three layers of a trained convolutional deep belief network. The proposed features were evaluated experimentally on two speech corpora: the authors' own corpus, containing audio recordings of 50 speakers, and the TIMIT corpus, containing audio recordings of 630 speakers. The accuracy of the proposed features was assessed with several types of classifiers. Used directly, these features did not improve accuracy over traditional spectral speech features such as mel-frequency cepstral coefficients. Used in a classifier ensemble, however, they reduced the equal error rate to 0.21% on the 50-speaker corpus and to 0.23% on the TIMIT corpus.

Keywords: speaker recognition, speaker verification, Gaussian mixture models, GMM-UBM system, speech features, speech processing, deep learning, neural networks, pattern recognition.
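The abstract reports its results as an equal error rate (EER), the operating point at which the false-acceptance rate equals the false-rejection rate. As a rough illustration only, the sketch below estimates the EER from two lists of verification scores. The function name `equal_error_rate`, the threshold sweep over observed scores, and the toy scores are assumptions made for this example; they are not taken from the paper, whose evaluation procedure is not described on this page.

```python
import numpy as np

def equal_error_rate(target_scores, impostor_scores):
    """Estimate the EER and its threshold from verification scores.

    Hypothetical helper for illustration; higher scores are assumed
    to mean "more likely the claimed speaker".
    """
    target = np.asarray(target_scores, dtype=float)
    impostor = np.asarray(impostor_scores, dtype=float)
    # Candidate thresholds: every observed score, in ascending order.
    thresholds = np.sort(np.concatenate([target, impostor]))
    # False-rejection rate: genuine trials scoring below the threshold.
    frr = np.array([(target < t).mean() for t in thresholds])
    # False-acceptance rate: impostor trials scoring at or above it.
    far = np.array([(impostor >= t).mean() for t in thresholds])
    # Pick the threshold where the two error rates are closest.
    i = int(np.argmin(np.abs(far - frr)))
    return (far[i] + frr[i]) / 2.0, thresholds[i]

# Toy scores (made up): one genuine trial scores below one impostor trial.
eer, threshold = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With only a few trials the two error curves may not cross exactly, so the sketch averages the two rates at the closest point; evaluations on large trial lists typically interpolate between thresholds instead.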

Funding agency	Grant number
Ministry of Science and Higher Education of the Russian Federation	8.9628.2017/8.9
The work was funded within the basic part of the government project of the Russian Federation Education and Science Ministry, project 8.9628.2017/8.9.

Received: 20.08.2019
Accepted: 13.10.2019

Document Type: Article

Language: Russian

Citation: I. A. Rakhmanenko, A. A. Shelupanov, E. Yu. Kostyuchenko, “Automatic text-independent speaker verification using convolutional deep belief network”, Computer Optics, 44:4 (2020), 596–605

Citation in AMSBIB format:

\Bibitem{RakSheKos20}
\by I.~A.~Rakhmanenko, A.~A.~Shelupanov, E.~Yu.~Kostyuchenko
\paper Automatic text-independent speaker verification using convolutional deep belief network
\jour Computer Optics
\yr 2020
\vol 44
\issue 4
\pages 596--605
\mathnet{http://mi.mathnet.ru/co826}
\crossref{https://doi.org/10.18287/2412-6179-CO-621}

Linking options:

https://www.mathnet.ru/eng/co826

https://www.mathnet.ru/eng/co/v44/i4/p596
