Matematicheskaya Biologiya i Bioinformatika
Matematicheskaya Biologiya i Bioinformatika, 2020, Volume 15, Issue 2, Pages 313–337
DOI: https://doi.org/10.17537/2020.15.313
(Mi mbb435)
 

This article is cited in 1 scientific paper (total in 1 paper)

Review Articles

The complexity of DNA sequences. Different approaches and definitions

V. D. Gusev, L. A. Miroshnichenko

Sobolev Institute of Mathematics, Siberian Branch of the Russian Academy of Sciences, Novosibirsk
Abstract: Complexity is an important quantitative characteristic of symbolic sequences (texts, strings); at the intuitive level it reflects the degree of their “non-randomness”. A. N. Kolmogorov formulated the most general definition of complexity: he proposed measuring the complexity of an object (a symbolic sequence) by the length of the shortest description from which the object can be uniquely reconstructed. Since no program is guaranteed to find the shortest description, in practice various algorithmic approximations are used for this purpose; they are considered in this paper. Along with definitions of complexity that assume a sequence can be reconstructed from its “description”, a number of measures are considered that do not imply such reconstruction and are instead based on computing certain quantitative characteristics. Of interest is not only the quantitative assessment of complexity but also the identification and classification of the structural regularities that determine its specific value. In one form or another, these regularities manifest as repetition in the broadest sense. The measures considered are conventionally divided into statistical measures, which take into account the frequency of occurrence of symbols or short “words” in the text; “dictionary” measures, which estimate the number of distinct “subwords”; and “structural” measures, which are based on identifying long repeated fragments of text and determining the relationships between them. Most of the methods are designed for sequences of arbitrary linguistic nature. The special attention paid to DNA sequences, reflected in the title of the article, is due to the importance of the object, the different types of repetition it exhibits, and the numerous examples of using the concept of complexity in solving problems of classification and evolution of various biological objects.
Local structural features detected in DNA sequences in sliding-window mode are of considerable interest, since zones of low complexity in the genomes of various organisms are often associated with the regulation of basic genetic processes.
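To make the three families of measures more concrete, here is a minimal Python sketch of one simple representative of each: a statistical measure (Shannon entropy of k-mer frequencies), a “dictionary” measure (the fraction of possible k-mers actually observed), and a compression-based approximation of description length. The function names, the choice of zlib as the compressor, and the default parameters are all illustrative assumptions and are not taken from the article, which may define these measures differently.

```python
import math
import zlib
from collections import Counter

# All names below are hypothetical; each function assumes len(seq) >= k.

def kmer_entropy(seq: str, k: int = 1) -> float:
    """Statistical measure: Shannon entropy (in bits) of k-mer frequencies."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def dictionary_ratio(seq: str, k: int = 3, alphabet_size: int = 4) -> float:
    """Dictionary measure: distinct k-mers observed / maximum possible."""
    n_windows = len(seq) - k + 1
    observed = len({seq[i:i + k] for i in range(n_windows)})
    return observed / min(alphabet_size ** k, n_windows)


def compression_ratio(seq: str) -> float:
    """Compressed size / raw size, using zlib as a stand-in compressor.

    This is only a crude proxy for the length of a short description,
    not the Kolmogorov complexity itself.
    """
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


def sliding_profile(seq: str, window: int, step: int, measure):
    """Evaluate `measure` on each window; returns (start, value) pairs."""
    return [
        (start, measure(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Under these assumptions, a call such as `sliding_profile(genome, 200, 50, kmer_entropy)` would yield a per-window profile in which unusually low values flag candidate low-complexity zones; the window and step sizes here are arbitrary.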
Key words: DNA sequences, algorithms, complexity, entropy, data compression, statistical measures, linguistic measure of complexity, structural measures of complexity.
Funding agency: Ministry of Education and Science of the Russian Federation
Grant number: 0314-2019-0015
The study was carried out within the framework of the state contract of the Sobolev Institute of Mathematics (project no. 0314-2019-0015).
Received 23.10.2020, 14.11.2020, Published 30.11.2020
Document Type: Article
Language: Russian
Citation: V. D. Gusev, L. A. Miroshnichenko, “The complexity of DNA sequences. Different approaches and definitions”, Mat. Biolog. Bioinform., 15:2 (2020), 313–337
Citation in format AMSBIB
\Bibitem{GusMir20}
\by V.~D.~Gusev, L.~A.~Miroshnichenko
\paper The complexity of DNA sequences. Different approaches and definitions
\jour Mat. Biolog. Bioinform.
\yr 2020
\vol 15
\issue 2
\pages 313--337
\mathnet{http://mi.mathnet.ru/mbb435}
\crossref{https://doi.org/10.17537/2020.15.313}
Linking options:
  • https://www.mathnet.ru/eng/mbb435
  • https://www.mathnet.ru/eng/mbb/v15/i2/p313