Abstract:
This paper surveys basic methods for developing acoustic and language models based on artificial neural networks for automatic speech recognition systems. It presents the hybrid and tandem approaches to combining hidden Markov models with artificial neural networks for acoustic modeling, and describes the construction of language models using feedforward and recurrent neural networks. The surveyed studies show that applying artificial neural networks at both the acoustic and language modeling stages reduces the word error rate.
Keywords:
automatic speech recognition; neural networks; acoustic models; language models.
This research is partially supported by the Council for Grants of the President of the Russian Federation (projects No. MK-5209.2015.8 and MD-254.2017.8), by the Russian Foundation for Basic Research (projects No. 15-07-04415 and 15-07-04322), and by state research projects No. 0073-2014-0005 and No. 0073-2015-0007.
Document Type:
Article
UDC:
004.522
Language: Russian
Citation:
I. S. Kipyatkova, A. A. Karpov, “Variants of deep artificial neural networks for speech recognition systems”, Tr. SPIIRAN, 49 (2016), 80–103
\Bibitem{KipKar16}
\by I.~S.~Kipyatkova, A.~A.~Karpov
\paper Variants of deep artificial neural networks for speech recognition systems
\jour Tr. SPIIRAN
\yr 2016
\vol 49
\pages 80--103
\mathnet{http://mi.mathnet.ru/trspy918}
\crossref{https://doi.org/10.15622/sp.49.5}
\elib{https://elibrary.ru/item.asp?id=27657124}
Linking options:
https://www.mathnet.ru/eng/trspy918
https://www.mathnet.ru/eng/trspy/v49/p80
This publication is cited in the following 11 articles:
Evgeny Kostyuchenko, Ivan Rakhmanenko, Lidiya Balatskaya, Lecture Notes in Computer Science, 13721, Speech and Computer, 2022, 382
Alexey Kashevnik, Igor Lashkov, Alexandr Axyonov, Denis Ivanko, Dmitry Ryumin, Artem Kolchin, Alexey Karpov, “Multimodal Corpus Design for Audio-Visual Speech Recognition in Vehicle Cabin”, IEEE Access, 9 (2021), 34986
A.A. Axyonov, D.V. Ivanko, I.B. Lashkov, D.A. Ryumin, A.M. Kashevnik, A.A. Karpov, “A methodology of multimodal corpus creation for audio-visual speech recognition in assistive transport systems”, I&C, 5 (2020), 87
Dmitrii Levonevskii, Olga Shumskaya, Alena Velichko, Mikhael Uzdiaev, Dmitrii Malov, Smart Innovation, Systems and Technologies, 154, Proceedings of 14th International Conference on Electromechanics and Robotics “Zavalishin's Readings”, 2020, 511
V. V. Kozhevnikov, M. Yu. Leontev, V. V. Prikhodko, V. A. Sergeev, A. N. Fomin, “Neirosetevye tekhnologii postroeniya intellektualnykh sistem upravleniya robotami”, Uchenye zapiski UlGU. Seriya “Matematika i informatsionnye tekhnologii”, 2019, no. 2, 36–53
Dariya Novokhrestova, Evgeny Kostyuchenko, Roman Meshcheryakov, Lecture Notes in Computer Science, 11096, Speech and Computer, 2018, 461
A. A. Karpov, R. M. Yusupov, “Multimodal Interfaces of Human–Computer Interaction”, Her. Russ. Acad. Sci., 88:1 (2018), 67
Irina Kipyatkova, Lecture Notes in Computer Science, 11096, Speech and Computer, 2018, 291