This article is cited in 6 scientific papers (total in 6 papers)
Conceptual framework for supracorpora databases
M. G. Kruzhkov
Federal Research Center "Computer Science and Control" of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119133, Russian Federation
Abstract:
The paper provides an overview of the concept, main structural constituents, and functions of supracorpora databases (SCDBs). Supracorpora databases are a novel type of structured information resource that significantly expands the capabilities of linguistic text corpora, parallel corpora in particular. The paper outlines the principal features and limitations of parallel corpora and demonstrates how SCDBs extend these features and overcome the limitations. Supracorpora databases allow linguistic experts to establish, record, and annotate translation correspondences between language units in the source and target texts, relying on faceted classification categories that the researchers compose themselves according to their requirements. The article also describes the general structure of the SCDB architecture developed at FRC CSC RAS, which incorporates corpus and subcorpus constituents that interact with one another as parts of a common database.
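As a rough illustration of the annotation model the abstract describes, the Python sketch below records translation correspondences between source and target units and tags each one with researcher-defined facets. It is not the SCDB schema developed at FRC CSC RAS: the class names (`Span`, `Correspondence`), the `select` helper, the facet names (`unit_type`, `strategy`), and the sample data are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Span:
    """A language unit located in one text of a parallel corpus (hypothetical)."""
    text_id: str
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


@dataclass
class Correspondence:
    """A translation correspondence linking a source unit to a target unit."""
    source: Span
    target: Span
    # Facet name -> value chosen by the annotator; the facets are illustrative.
    facets: dict[str, str] = field(default_factory=dict)


def select(correspondences, **wanted):
    """Return the correspondences whose facets match every requested value."""
    return [
        c for c in correspondences
        if all(c.facets.get(name) == value for name, value in wanted.items())
    ]


# Invented sample annotations, for illustration only.
annotations = [
    Correspondence(Span("src", 0, 4), Span("tgt", 0, 7),
                   {"unit_type": "connective", "strategy": "literal"}),
    Correspondence(Span("src", 10, 15), Span("tgt", 12, 12),
                   {"unit_type": "connective", "strategy": "omission"}),
    Correspondence(Span("src", 20, 26), Span("tgt", 21, 30),
                   {"unit_type": "verb_form", "strategy": "literal"}),
]

# Combine two facets to narrow the query, as a faceted classification allows.
literal_connectives = select(annotations, unit_type="connective", strategy="literal")
```

Keeping facets as free-form name/value pairs reflects the idea that researchers compose the classification categories themselves; a production database would more likely store them in dedicated tables.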
Keywords:
corpus linguistics, supracorpora database, parallel corpus, linguistic annotation, information technologies, faceted classification.
Received: 14.08.2021
Citation:
M. G. Kruzhkov, “Conceptual framework for supracorpora databases”, Sistemy i Sredstva Inform., 31:3 (2021), 101–112
Linking options:
https://www.mathnet.ru/eng/ssi785
https://www.mathnet.ru/eng/ssi/v31/i3/p101