This article is cited in 4 scientific papers (total in 4 papers)
Approaches to annotation of discourse relations in linguistic corpora
M. G. Kruzhkov
Institute of Informatics Problems, Federal Research Center “Computer Science and Control” of the Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
Abstract:
This paper examines the Supracorpora Database of Connectives
(SCDB-Connectives) that is based on data from parallel corpora. The
SCDB-Connectives provides structural and semantic annotation of Russian
connectives and their translation correspondences in French (and, eventually, in other
languages). The SCDB-Connectives annotation approach is compared to the latest
developments in the area of annotation of discourse relations: the annotated corpus
of discourse relations Penn Discourse Treebank (PDTB) and the proposed standard
for annotation of semantic relations ISO 24617-8. Some of the important differences
are discussed. PDTB and ISO 24617-8 support annotation of both explicit
and implicit discourse relations, while SCDB-Connectives annotates only explicit relations,
i.e., those expressed by connectives. Furthermore, PDTB and ISO 24617-8 provide
a superior framework for annotating text spans as relation arguments, which allows
annotating attribution for these arguments, such as source and type of the linked
propositions. In addition, ISO 24617-8 specifies argument roles for asymmetrical
discourse relations. On the other hand, the principal advantage of the
SCDB-Connectives is that it supports annotation of both connectives and their translation
correspondences in parallel corpora, opening up new possibilities for contrastive
studies. The SCDB-Connectives is based on a relational database rather than on the
XML format, which helps to manage complex cross-linguistic data efficiently.
Benefits of semantic annotation of connectives for both theoretical and practical
purposes are also discussed.
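To make the relational-database point concrete, the following is a minimal sketch of how connectives and their translation correspondences might be stored and queried. It is not the actual SCDB-Connectives schema: the table names (`connective`, `correspondence`), the column names, the relation label `contrast`, and the sample rows are all hypothetical, chosen only to illustrate why cross-linguistic links are easy to express as joins.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table layout."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE connective (
            id       INTEGER PRIMARY KEY,
            lang     TEXT NOT NULL,   -- language code, e.g. 'ru' or 'fr'
            form     TEXT NOT NULL,   -- surface form of the connective
            relation TEXT NOT NULL    -- illustrative semantic label
        );
        CREATE TABLE correspondence (
            source_id INTEGER NOT NULL REFERENCES connective(id),
            target_id INTEGER NOT NULL REFERENCES connective(id)
        );
    """)
    # Invented sample data: three occurrences of Russian "но" and
    # the French forms aligned with them in a parallel text.
    conn.executemany(
        "INSERT INTO connective VALUES (?, ?, ?, ?)",
        [
            (1, "ru", "но", "contrast"),
            (2, "fr", "mais", "contrast"),
            (3, "ru", "но", "contrast"),
            (4, "fr", "cependant", "contrast"),
            (5, "ru", "но", "contrast"),
            (6, "fr", "mais", "contrast"),
        ],
    )
    conn.executemany(
        "INSERT INTO correspondence VALUES (?, ?)",
        [(1, 2), (3, 4), (5, 6)],
    )
    conn.commit()
    return conn


def translation_counts(conn, source_form):
    """Count how often each target form translates the given source form."""
    return conn.execute(
        """
        SELECT t.form, COUNT(*) AS n
        FROM correspondence AS c
        JOIN connective AS s ON s.id = c.source_id
        JOIN connective AS t ON t.id = c.target_id
        WHERE s.form = ?
        GROUP BY t.form
        ORDER BY n DESC, t.form
        """,
        (source_form,),
    ).fetchall()


if __name__ == "__main__":
    for form, n in translation_counts(build_demo_db(), "но"):
        print(form, n)
```

In this layout each aligned pair is a single row in `correspondence`, so a contrastive question such as "which French forms render a given Russian connective, and how often" reduces to one join and one grouping, whereas an XML encoding would typically require resolving links across separate documents.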
Keywords:
discourse relations; discourse connectives; corpus linguistics; parallel
corpora; supracorpora databases.
Received: 07.09.2017
Citation:
M. G. Kruzhkov, “Approaches to annotation of discourse relations in linguistic corpora”, Inform. Primen., 11:4 (2017), 118–125
Linking options:
https://www.mathnet.ru/eng/ia509 https://www.mathnet.ru/eng/ia/v11/i4/p118