Vestnik Sankt-Peterburgskogo Universiteta. Seriya 10. Prikladnaya Matematika. Informatika. Protsessy Upravleniya
Vestnik Sankt-Peterburgskogo Universiteta. Seriya 10. Prikladnaya Matematika. Informatika. Protsessy Upravleniya, 2017, Volume 13, Issue 2, Pages 168–181
DOI: https://doi.org/10.21638/11701/spbu10.2017.204
(Mi vspui330)
 

Computer science

The streaming processing of SAR data in distributed environment with Apache Spark

V. P. Potapov, M. A. Kostylev, S. E. Popov

Institute of Computational Technologies of the Siberian Branch of the Russian Academy of Sciences, 6, Academician M. A. Lavrentiev pr., Novosibirsk, 630090, Russian Federation
Abstract: This article presents an approach to building a distributed software system, based on massively parallel technology, for pre- and postprocessing of SAR images. The distinctive features of the system are its ability to work in real time with large volumes of streaming data and its ability to run existing algorithms that were not designed for distributed processing on multiple nodes without changing their implementation. Distributed processing technologies are compared, and Apache Spark is selected on the basis of that comparison. The article demonstrates how automatic processing of input SAR images can be organised as a sequence of operations performed according to defined conditions. Processing results are stored in the system as fault-tolerant distributed collections of data (RDD, Resilient Distributed Datasets), which makes it possible to obtain intermediate results and save them to the distributed file system HDFS as new satellite images become available and are processed by the sequence of algorithms. An implementation of specific SAR data processing tasks based on the proposed approach is described (phase estimation, coregistration, interferogram creation, and phase unwrapping with the region-growing method). A scheme of the phase unwrapping algorithm that can use GPUs and NVIDIA CUDA technology is presented, and an adaptation of the algorithm for massively parallel systems is shown. The algorithm implementation is focused on processing a pair of SAR images on one node. Performance gains are achieved by simultaneously processing multiple images, whose number equals the number of cluster nodes. An example implementation of methods for working with streaming binary data (BinaryRecordStream) is shown; these methods monitor the arrival of new SAR data in HDFS and read and write this data as binary files with a fixed record size in bytes. A directory and the size of one record are used as the input parameters.
The results of testing the developed algorithms on a demonstration cluster are presented. They show that processing can be up to eight times faster on an eight-node cluster than sequential processing of the same number of images on one node. The test results indicate that the performance of the presented algorithms can be improved without any changes to their implementation, which in turn justifies applying a distributed approach to SAR data processing. Refs 26. Figs 4. Tables 3.
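The fixed-size binary records mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use Spark; it only shows, in plain Python, what splitting a binary stream into records of a given byte length means. The function name `iter_fixed_records` and the 12-byte demo payload are hypothetical. In Spark Streaming, the `StreamingContext.binaryRecordsStream(directory, recordLength)` method takes the same two inputs (a monitored directory and a record length), and it is presumably what the abstract refers to.

```python
import io
from typing import BinaryIO, Iterator


def iter_fixed_records(stream: BinaryIO, record_length: int) -> Iterator[bytes]:
    """Yield consecutive records of exactly record_length bytes from a buffered stream."""
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    while True:
        record = stream.read(record_length)
        if not record:
            return  # clean end of data
        if len(record) != record_length:
            # The data size is not a multiple of the record length.
            raise ValueError(
                f"truncated record: got {len(record)} of {record_length} bytes"
            )
        yield record


# Hypothetical demo payload: 12 bytes split into three 4-byte records.
demo = io.BytesIO(bytes(range(12)))
records = list(iter_fixed_records(demo, record_length=4))
```

Rejecting a trailing partial record, rather than silently padding or dropping it, is a design choice of this sketch; a fixed-length reader generally needs the data size to be a multiple of the record size.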
Keywords: Apache Spark, Apache Hadoop, distributed information systems, SAR interferometry, processing algorithms.
Received: September 15, 2016
Accepted: April 11, 2017
Document Type: Article
UDC: 004.042
Language: Russian
Citation: V. P. Potapov, M. A. Kostylev, S. E. Popov, “The streaming processing of SAR data in distributed environment with Apache Spark”, Vestnik S.-Petersburg Univ. Ser. 10. Prikl. Mat. Inform. Prots. Upr., 13:2 (2017), 168–181
Citation in format AMSBIB
\Bibitem{PotKosPop17}
\by V.~P.~Potapov, M.~A.~Kostylev, S.~E.~Popov
\paper The streaming processing of SAR data in distributed environment with Apache Spark
\jour Vestnik S.-Petersburg Univ. Ser. 10. Prikl. Mat. Inform. Prots. Upr.
\yr 2017
\vol 13
\issue 2
\pages 168--181
\mathnet{http://mi.mathnet.ru/vspui330}
\crossref{https://doi.org/10.21638/11701/spbu10.2017.204}
\elib{https://elibrary.ru/item.asp?id=29816739}
Linking options:
  • https://www.mathnet.ru/eng/vspui330
  • https://www.mathnet.ru/eng/vspui/v13/i2/p168