D. D. Yantsen, M. L. Zymbler, “Representative sampling algorithm for database systems based on the partitioned parallelism”, Vestn. YuUrGU. Ser. Vych. Matem. Inform., 3:4 (2014), 36

Vestnik Yuzhno-Ural'skogo Gosudarstvennogo Universiteta. Seriya "Vychislitelnaya Matematika i Informatika"

Vestnik Yuzhno-Ural'skogo Gosudarstvennogo Universiteta. Seriya "Vychislitelnaya Matematika i Informatika", 2014, Volume 3, Issue 4, Pages 36–50
DOI: https://doi.org/10.14529/cmse140402 (Mi vyurv54)

Computer Science, Engineering and Control

Representative sampling algorithm for database systems based on the partitioned parallelism

D. D. Yantsen, M. L. Zymbler

South Ural State University (Chelyabinsk, Russian Federation)

Abstract: Sampling is a popular approach to processing very large databases in a wide range of applications, e.g. data mining, histogram construction, and query execution cost estimation. Using a sample instead of the original database can reduce the accuracy of the results, but this loss is offset by a reduction in processing time. Representative sampling preserves certain characteristics of the database in the sample. However, existing representative sampling algorithms cannot be used in parallel database systems, because they do not take into account how the data are distributed across the compute nodes of a cluster system. In this paper we propose a representative sampling algorithm for parallel relational database systems based on partitioned parallelism. We present the results of computational experiments with the proposed algorithm, which show that it adequately preserves the representativeness properties of a database distributed across the nodes of a cluster system.
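
The abstract does not describe the proposed algorithm itself, so the following is not the authors' method. It is only a minimal Python sketch of the general idea of sampling from a partitioned table: each node draws rows from its own fragment, with quotas proportional to fragment size, so that the sample keeps the per-node distribution of the data. The names `allocate` and `sample_partitioned`, the node labels, and the choice of proportional allocation are all assumptions made for illustration.

```python
import random
from typing import Dict, List, Sequence


def allocate(sizes: Sequence[int], total: int) -> List[int]:
    """Split `total` sample slots across fragments in proportion to their sizes.

    Uses largest-remainder rounding with integer arithmetic, so the quotas
    always sum exactly to `total`. (Hypothetical helper, not from the paper.)
    """
    n_rows = sum(sizes)
    if n_rows == 0 or not 0 <= total <= n_rows:
        raise ValueError("sample size must be between 0 and the number of rows")
    quotas = [total * s // n_rows for s in sizes]
    remainders = [total * s % n_rows for s in sizes]
    # Hand out the slots lost to rounding, largest remainder first.
    order = sorted(range(len(sizes)), key=lambda i: remainders[i], reverse=True)
    for i in order[: total - sum(quotas)]:
        quotas[i] += 1
    return quotas


def sample_partitioned(
    fragments: Dict[str, List[dict]], total: int, seed: int = 0
) -> Dict[str, List[dict]]:
    """Draw a sample of `total` rows, each node sampling only its own fragment."""
    rng = random.Random(seed)
    names = list(fragments)
    quotas = allocate([len(fragments[name]) for name in names], total)
    # In a real cluster each of these draws would run locally on its node.
    return {name: rng.sample(fragments[name], q) for name, q in zip(names, quotas)}


if __name__ == "__main__":
    # Assumed toy data: three nodes holding 600, 300 and 100 rows.
    fragments = {
        "node0": [{"id": i} for i in range(600)],
        "node1": [{"id": i} for i in range(600, 900)],
        "node2": [{"id": i} for i in range(900, 1000)],
    }
    sample = sample_partitioned(fragments, total=50)
    print({name: len(rows) for name, rows in sample.items()})
```

In this sketch a sample that ignored fragment sizes (for example, the same number of rows from every node) would over-represent the small fragments; proportional quotas are one simple way to avoid that.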

Keywords: relational databases, parallel database systems, representative sampling.

Funding: Russian Foundation for Basic Research, grant no. 12-07-00443-а

Received: 11.08.2014

Document Type: Article

UDC: 004.65, 004.622

Language: Russian

Citation: D. D. Yantsen, M. L. Zymbler, “Representative sampling algorithm for database systems based on the partitioned parallelism”, Vestn. YuUrGU. Ser. Vych. Matem. Inform., 3:4 (2014), 36–50

Citation in format AMSBIB

\Bibitem{YanTsy14}
\by D.~D.~Yantsen, M.~L.~Zymbler
\paper Representative sampling algorithm for database systems based on the partitioned parallelism
\jour Vestn. YuUrGU. Ser. Vych. Matem. Inform.
\yr 2014
\vol 3
\issue 4
\pages 36--50
\mathnet{http://mi.mathnet.ru/vyurv54}
\crossref{https://doi.org/10.14529/cmse140402}

Linking options:

https://www.mathnet.ru/eng/vyurv54

https://www.mathnet.ru/eng/vyurv/v3/i4/p36
