Informatika i Ee Primeneniya [Informatics and its Applications]
Informatika i Ee Primeneniya [Informatics and its Applications], 2014, Volume 8, Issue 2, Pages 98–110
DOI: https://doi.org/10.14357/19922264140210
(Mi ia315)
 

This article is cited in 9 scientific papers (total in 9 papers)

Information technologies for corpus studies: underpinnings for cross-linguistic database creation

N. V. Buntman (a), Anna A. Zaliznyak (b, c), I. M. Zatsman (b), M. G. Kruzhkov (b), E. Yu. Loshchilova (b), D. V. Sitchinava (d)

a Faculty of Foreign Languages and Area Studies, M.V. Lomonosov Moscow State University, 31-a Lomonosov Str., Moscow 119192, Russian Federation
b Institute of Informatics Problems, Russian Academy of Sciences, 44-2 Vavilov Str., Moscow 119333, Russian Federation
c Institute of Linguistics, Russian Academy of Sciences, 1-1 Bolshyi Kislovskiy pereulok, Moscow 125009, Russian Federation
d Institute of Russian Language, Russian Academy of Sciences, 18/2 Volkhonka Str., Moscow 119019, Russian Federation
Abstract: The paper considers an information technology for creating cross-linguistic databases of Russian texts with their French translations (parallel texts). The underlying principles of the developed database provide a unique combination of three types of bilingual search: lexical, grammatical, and lexico-grammatical. A distinctive feature of the technology is that it simultaneously creates a Russian-French parallel subcorpus within the Russian National Corpus and a cross-linguistic database of Russian verbal lexico-grammatical forms and their French functional equivalents. The subcorpus and the database are aligned at different levels: the former at the level of sentences, the latter at the level of constructions. The academic relevance of the database stems from its support for developing a bilingual contrastive grammar and from its role in creating a Russian grammar built on a modern empirical base and the information technologies of corpus linguistics. Its main practical application is improving the quality of machine translation.
Keywords: parallel corpus; information technology; cross-linguistic databases; bilingual lexical grammar search; corpus linguistics; contrastive grammar.
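The three search types named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the `search` function, the tag names, and the three sample pairs are all hypothetical, chosen only to show how lexical, grammatical, and lexico-grammatical queries could differ over construction-level alignments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One construction-level alignment (hypothetical record layout)."""
    ru_form: str          # Russian verbal form as it occurs in the text
    ru_lemma: str         # dictionary form of the Russian verb
    ru_tags: frozenset    # grammatical features of the Russian form
    fr_equivalent: str    # French functional equivalent
    sentence_id: int      # assumed link back to a sentence-aligned subcorpus


# Toy data invented for illustration; not taken from the actual database.
PAIRS = [
    AlignedPair("сказал", "сказать", frozenset({"pf", "past"}), "a dit", 1),
    AlignedPair("говорил", "говорить", frozenset({"ipf", "past"}), "parlait", 2),
    AlignedPair("скажет", "сказать", frozenset({"pf", "fut"}), "dira", 3),
]


def search(pairs, lemma=None, tags=None):
    """Filter aligned pairs by lemma and/or grammatical tags.

    lemma only    -> lexical search
    tags only     -> grammatical search
    lemma + tags  -> lexico-grammatical search
    """
    wanted = frozenset(tags or ())
    return [
        p for p in pairs
        if (lemma is None or p.ru_lemma == lemma) and wanted <= p.ru_tags
    ]


lexical = search(PAIRS, lemma="сказать")
grammatical = search(PAIRS, tags={"past"})
lexico_grammatical = search(PAIRS, lemma="сказать", tags={"past"})
```

Treating the lexico-grammatical query as the intersection of the other two keeps a single filter function; a real system would presumably query indexed database tables rather than scan a list.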
Received: 29.03.2014
Document Type: Article
Language: Russian
Citation: N. V. Buntman, Anna A. Zaliznyak, I. M. Zatsman, M. G. Kruzhkov, E. Yu. Loshchilova, D. V. Sitchinava, “Information technologies for corpus studies: underpinnings for cross-linguistic database creation”, Inform. Primen., 8:2 (2014), 98–110
Citation in format AMSBIB
\Bibitem{BunZalZat14}
\by N.~V.~Buntman, Anna~A.~Zaliznyak, I.~M.~Zatsman, M.~G.~Kruzhkov, E.~Yu.~Loshchilova, D.~V.~Sitchinava
\paper Information technologies for corpus studies: underpinnings for cross-linguistic database creation
\jour Inform. Primen.
\yr 2014
\vol 8
\issue 2
\pages 98--110
\mathnet{http://mi.mathnet.ru/ia315}
\crossref{https://doi.org/10.14357/19922264140210}
\elib{https://elibrary.ru/item.asp?id=21646367}
Linking options:
  • https://www.mathnet.ru/eng/ia315
  • https://www.mathnet.ru/eng/ia/v8/i2/p98
© Steklov Mathematical Institute RAS, 2024