Universal technology of information objects proximity assessment
L. A. Kuznetsov
Russian Presidential Academy of National Economy and Public Administration (Lipetsk Branch), 3 Internatsional'naya Str., Lipetskaya oblast, Lipetsk 398050, Russian Federation
Abstract:
The paper outlines a technology for determining the degree of similarity between information objects represented as text or graphic images. Objects are formalized by probabilistic models. The structure of a model is defined by an algebra on a minimal set of graphic components of the object, and the probability distributions on this algebra serve as quantitative characteristics of the object's structure. The amount of information in an object is estimated by entropy, and the similarity measure for information objects is likewise based on entropy. The paper describes a method for estimating the proximity of text and graphic objects and gives several examples of how the estimation algorithms are implemented. The developed method is shown to be more efficient than methods described in the literature. The technology for forming images of information objects and comparing their semantic content is universal and can be adapted to the meaningful characteristics of the objects being analyzed.
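As a rough illustration of the general idea of an entropy-based similarity measure, the sketch below builds an empirical probability distribution over the components of each object, computes Shannon entropy, and compares two objects. The abstract does not state the paper's actual measure, so the Jensen-Shannon divergence is swapped in here purely as a stand-in assumption. The function names `distribution`, `entropy`, and `js_similarity` are hypothetical, and whitespace-separated words stand in for the "minimal set of components".

```python
import math
from collections import Counter


def distribution(components):
    """Empirical probability distribution over an object's components."""
    counts = Counter(components)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def entropy(p):
    """Shannon entropy in bits of a distribution given as {component: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def js_similarity(p, q):
    """Similarity in [0, 1] derived from the Jensen-Shannon divergence.

    Assumption: this is a stand-in measure, not the one defined in the paper.
    1.0 means identical distributions, 0.0 means disjoint supports.
    """
    keys = set(p) | set(q)
    # Mixture distribution of the two objects.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}
    jsd = entropy(m) - 0.5 * (entropy(p) + entropy(q))
    # Clamp to guard against small floating-point drift.
    return 1.0 - max(0.0, min(1.0, jsd))


if __name__ == "__main__":
    a = distribution("the cat sat on the mat".split())
    b = distribution("the dog sat on the rug".split())
    print(f"entropy(a) = {entropy(a):.3f} bits")
    print(f"similarity(a, b) = {js_similarity(a, b):.3f}")
```

For graphic objects, the same functions could in principle be applied to any discrete components extracted from an image (for example, quantized pixel blocks), though how the paper chooses those components is not specified in the abstract.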
Keywords:
information object; text; image; probabilistic model; semantic similarity; entropy; measure of similarity.
Received: 10.12.2013
Citation:
L. A. Kuznetsov, “Universal technology of information objects proximity assessment”, Inform. Primen., 8:2 (2014), 130–144
Linking options:
https://www.mathnet.ru/eng/ia318 https://www.mathnet.ru/eng/ia/v8/i2/p130