Zapiski Nauchnykh Seminarov POMI, 2023, Volume 529, Pages 43–53
(Mi znsl7418)
Wav2Vec2 without Attention: do you need Hopfield Networks for Self-Supervised Learning of Speech Representations?
D. Grebenkin, I. Bondarenko
Laboratory of Applied Digital Technologies, Novosibirsk State University
Abstract:
In this work, we consider the possibility of replacing multi-head attention with dense associative memory (DAM) layers in the wav2vec2 automatic speech recognition algorithm. We examine the hypothesis that modern Hopfield networks are better suited than multi-head attention both to the task of restoring missing fragments of an audio signal and to the speech-to-text task. Our experiments indicate that the model with the new architecture improves speech recognition quality and can be used for pretraining models on large amounts of audio data.
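The abstract describes swapping multi-head attention for a dense associative memory layer. As a rough illustration of what such a substitution could look like, the sketch below implements one retrieval step of a modern Hopfield network (the softmax update of Ramsauer et al., "Hopfield Networks is All You Need") as a drop-in module with the same input/output shape as an attention block. The paper does not specify its exact layer, so all names, projections, and defaults here (DenseAssociativeMemory, beta, the 768-dimensional hidden size) are assumptions for illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DenseAssociativeMemory(nn.Module):
    """Hypothetical sketch of a modern-Hopfield (DAM) layer that could
    stand in for multi-head attention in a wav2vec2-style encoder block.
    Not the authors' implementation."""

    def __init__(self, dim: int, beta: float = 1.0):
        super().__init__()
        self.beta = beta                        # inverse temperature of the retrieval softmax
        self.query_proj = nn.Linear(dim, dim)   # maps hidden states to retrieval queries
        self.memory_proj = nn.Linear(dim, dim)  # maps inputs to stored patterns
        self.out_proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); the sequence itself serves as the pattern store.
        q = self.query_proj(x)
        m = self.memory_proj(x)
        # One step of the modern Hopfield update:
        #   xi_new = softmax(beta * xi M^T) M
        attn = F.softmax(self.beta * q @ m.transpose(-2, -1), dim=-1)
        return self.out_proj(attn @ m)

# Usage: same tensor shapes as a self-attention layer over audio frames.
layer = DenseAssociativeMemory(dim=768)
hidden = torch.randn(2, 50, 768)  # (batch, frames, features)
out = layer(hidden)
print(out.shape)                  # torch.Size([2, 50, 768])

With beta = 1 a single update step is formally close to single-head scaled attention; the Hopfield view differs in treating the step as pattern retrieval, which is what motivates the restoration-of-missing-fragments hypothesis above.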
Key words and phrases:
speech recognition, self-attention, associative memory.
Received: 06.09.2023
Citation:
D. Grebenkin, I. Bondarenko, “Wav2Vec2 without Attention: do you need Hopfield Networks for Self-Supervised Learning of Speech Representations?”, Investigations on applied mathematics and informatics. Part II–1, Zap. Nauchn. Sem. POMI, 529, POMI, St. Petersburg, 2023, 43–53
Linking options:
https://www.mathnet.ru/eng/znsl7418
https://www.mathnet.ru/eng/znsl/v529/p43