Analysis of signals, audio and video information
Segmentation of noisy speech signals
A. G. Shishkin, S. D. Protserov
Lomonosov Moscow State University, Moscow, Russia
Abstract:
One of the most important problems in digital speech signal processing is determining which parts of an input acoustic signal contain speech and which contain background noise or silence. This problem arises in many practical applications, such as speech analysis in voice command systems, transmission of speech over a network, and automated speech recognition. However, most existing systems for automated speech analysis cannot solve this problem efficiently when the signal-to-noise ratio is too low. Moreover, their parameters have to be tuned separately for different noise levels, which prevents fully automated segmentation of noisy speech signals. In this paper we design a system for automated segmentation of speech signals distorted by additive noise of different types and intensities. Our system is based on three different convolutional neural network models and can efficiently identify speech and silence segments in noisy signals across a wide range of noise intensities and noise types.
Keywords:
speech signal, convolutional neural network, segmentation, digital signal processing.
Citation:
A. G. Shishkin, S. D. Protserov, “Segmentation of noisy speech signals”, Artificial Intelligence and Decision Making, 2021, no. 1, 75–85; Scientific and Technical Information Processing, 49:5 (2022), 356–363
Linking options:
https://www.mathnet.ru/eng/iipr93 https://www.mathnet.ru/eng/iipr/y2021/i1/p75