Abstract:
Proposed are theoretical basis and algorithmic implementation of spectral-analytical method of recognition of repeats in character sequences. The theoretical justification is based on the theorem on equivalent representation of the character sequence by the vector of continuous characteristic functions. Comparison of fragments of characteristic functions is performed in the standard metric in Euclidean space of expansion coefficients of the Fourier series of orthogonal polynomials. An essential feature of this approach is the ability to evaluate repeats at different scales. Another important feature is the possibility of efficient parallelization of data. In the development of algorithms we preferred scheme of computing with a minimal amount of references to memory, implying repetitive calculations and evaluations on demand. In this paradigm, proposed is an algorithm for calculating the coefficients of expansions in the orthogonal polynomials through the use of recurrence relations. It is shown that the algorithm for calculating the coefficients of expansions in the orthogonal polynomials can be effectively vectorized by computing with a fixed vector length. Parallelization and vectorization implemented using the OpenMP standard and extension Cilk Plus of language C/C++. The developed method effectively scales, depending on the parameters of the problem and the number of processor cores on systems with shared memory.
Citation:
A. N. Pankratov, R. K. Tetuev, M. I. Pyatkov, V. P. Toigildin, N. N. Popova, “Spectral analytical method of recognition of inexact repeats in character sequences”, Proceedings of ISP RAS, 27:6 (2015), 335–344
\Bibitem{PanTetPya15}
\by A.~N.~Pankratov, R.~K.~Tetuev, M.~I.~Pyatkov, V.~P.~Toigildin, N.~N.~Popova
\paper Spectral analytical method of recognition of inexact repeats in character sequences
\jour Proceedings of ISP RAS
\yr 2015
\vol 27
\issue 6
\pages 335--344
\mathnet{http://mi.mathnet.ru/tisp201}
\crossref{https://doi.org/10.15514/ISPRAS-2015-27(6)-21}
\elib{https://elibrary.ru/item.asp?id=25476316}
Linking options:
https://www.mathnet.ru/eng/tisp201
https://www.mathnet.ru/eng/tisp/v27/i6/p335
This publication is cited in the following 7 articles:
S.A. Makhortykh, Proceedings of the International Conference “Mathematical Biology and Bioinformatics”, 9, Proceedings of the International Conference “Mathematical Biology and Bioinformatics”, 2022
A. N. Pankratov, N. M. Pankratova, “Spectral Method for Detecting Inexact Repeats in Character Sequences”, Pattern Recognit. Image Anal., 32:3 (2022), 622
D. A. Tikhonov, L. I. Kulikova, A. V. Efimov, “The study of interhelical angles in the structural motifs formed by two helices”, Matem. biologiya i bioinform., 14, Suppl. (2019), 1–17
Dmitry Koznov, Dmitry Luciv, George Chernishev, Dmitry Grigoryev, 2019 Actual Problems of Systems and Software Engineering (APSSE), 2019, 126
D. A. Tikhonov, L. I. Kulikova, A. V. Efimov, “The study of interhelical angles in pairs of helices in protein molecules”, Preprinty IPM im. M. V. Keldysha, 2018, 176, 25 pp.
A.N. Pankratov, R.K. Tetuev, M.I. Pyatkov, Proceedings of the International Conference “Mathematical Biology and Bioinformatics”, 7, Proceedings of the International Conference “Mathematical Biology and Bioinformatics”, 2018
D. A. Tikhonov, L. I. Kulikova, A. V. Efimov, “Issledovanie mezhspiralnykh uglov v strukturnykh motivakh, obrazovannykh dvumya spiralyami”, Matem. biologiya i bioinform., 12:1 (2017), 83–101