Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, 2013, Volume 155, Book 4, Pages 99–108
(Mi uzku1245)
Semi-automatic generation of linear event extraction patterns for free texts
D. Dzendzik (a, b), S. Serebryakov (b)
a Saint-Petersburg State University, Saint Petersburg, Russia
b Hewlett-Packard Laboratories, Saint Petersburg, Russia
Abstract:
In this paper we describe a semi-automatic approach to generating event extraction patterns for free texts. The algorithm consists of four steps: we automatically extract possible events from a corpus of free-text documents, cluster them using dependency-based parse tree paths, validate random samples from each cluster, and generate linear patterns from the positive event clusters. We compare the approach with a system that uses handcrafted patterns.
Keywords:
event extraction, linear patterns, regular expressions, TextMARKER, RUTA.
Received: 31.07.2013
Citation:
D. Dzendzik, S. Serebryakov, “Semi-automatic generation of linear event extraction patterns for free texts”, Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, 155, no. 4, Kazan University, Kazan, 2013, 99–108
Linking options:
https://www.mathnet.ru/eng/uzku1245 https://www.mathnet.ru/eng/uzku/v155/i4/p99