Borisenko O., Pastukhov R., Kuznetsov S., “Deploying Apache Spark virtual clusters in cloud environments using orchestration technologies”, Proceedings of ISP RAS, 28:6 (2016), 111–120

Proceedings of the Institute for System Programming of the RAS

Proceedings of the Institute for System Programming of the RAS, 2016, Volume 28, Issue 6, Pages 111–120
DOI: https://doi.org/10.15514/ISPRAS-2016-28(6)-8 (Mi tisp88)

This article is cited in 1 scientific paper (total in 1 paper)

Deploying Apache Spark virtual clusters in cloud environments using orchestration technologies

Borisenko O.^a, Pastukhov R.^a, Kuznetsov S.^abc

^a Institute for System Programming of the Russian Academy of Sciences
^b Lomonosov Moscow State University
^c Moscow Institute of Physics and Technology (State University)

Abstract: Apache Spark is a framework for fast computation on Big Data using the MapReduce model. Cloud environments make Big Data processing more flexible because they allow virtual clusters to be created on demand. One of the most powerful open-source cloud environments is OpenStack. The main goal of this project is to make it possible to create virtual clusters with Apache Spark and other Big Data tools in OpenStack. There are three approaches to doing this. The first is to use the OpenStack REST APIs to create instances and then deploy the environment. The Apache Spark core team uses this approach to create clusters in the proprietary Amazon EC2 cloud. Almost the same method has been implemented for OpenStack environments. However, because the OpenStack API changes frequently, this solution has been deprecated since the Kilo release. The second approach is to integrate virtual cluster creation into OpenStack as a built-in service. ISP RAS has provided several patches implementing a universal Spark job engine for OpenStack Sahara and OpenStack Swift integration with Apache Spark as a drop-in replacement for Apache Hadoop. This approach makes Spark clusters available as a service in the PaaS service model. Since OpenStack releases are less frequent than Apache Spark releases, this approach may be inconvenient for developers who need the latest releases. The third solution uses Ansible for orchestration. It is implemented in a loosely coupled way, which makes it possible to add any auxiliary tool or even to use another cloud environment. It also allows any Apache Spark and Apache Hadoop versions to be chosen for deployment in virtual clusters. All of the listed solutions are available under the Apache 2.0 license.

Keywords: Apache Spark, Openstack, Amazon EC2, Map-Reduce, HDFS, virtual cluster, cloud computing, Big Data, Apache Ignite.

Funding agency: Russian Foundation for Basic Research
Grant number: 14-07-00602

Document Type: Article

Language: Russian

Citation: Borisenko O., Pastukhov R., Kuznetsov S., “Deploying Apache Spark virtual clusters in cloud environments using orchestration technologies”, Proceedings of ISP RAS, 28:6 (2016), 111–120

Citation in format AMSBIB

\Bibitem{BorPasKuz16}
\by Borisenko~O., Pastukhov~R., Kuznetsov~S.
\paper Deploying Apache Spark virtual clusters in cloud environments using orchestration technologies
\jour Proceedings of ISP RAS
\yr 2016
\vol 28
\issue 6
\pages 111--120
\mathnet{http://mi.mathnet.ru/tisp88}
\crossref{https://doi.org/10.15514/ISPRAS-2016-28(6)-8}
\elib{https://elibrary.ru/item.asp?id=27679173}

Linking options:

https://www.mathnet.ru/eng/tisp88

https://www.mathnet.ru/eng/tisp/v28/i6/p111
