Синтаксически управляемая разметка нестандартных текстов (на материале "Катехизиса" 1595 г. М. Даукши)

Collection:
Mokslo publikacijos / Scientific publications
Document Type:
Straipsnis / Article
Language:
Rusų kalba / Russian
Title:
Синтаксически управляемая разметка нестандартных текстов (на материале "Катехизиса" 1595 г. М. Даукши)
Alternative Title:
Method of syntactically-constrained morphological annotation (as applied to "Katechismas" of 1595 by M. Daukša)
In the Journal:
Индоевропейское языкознание и классическая филология [Indo-European linguistics and classical philology]. 2018, 22 (1), p. 38-49. Proceedings of the readings in memory of Professor I. M. Tronsky, 18-20 June 2018.
Keywords:
LT
Daukša; Istoriniai tekstynai; Istorinis korpusas; Kalbinė variacija; Katekizmas; Lietuvių kalba; Lingvistinis variavimas; Morfologinis anotavimas; Morfologinė anotacija (ženklinimas)
EN
Catechism; Daukša; Historical corpora; Linguistic variation; Lithuanian language; Morphological annotation
Summary / Abstract:

EN
This article analyses existing methods of morphological annotation for historical corpora. It proposes a new method of unsupervised, dictionary-free morphological tagging, based on applying syntactic dependency constraints to the set of possible morphological interpretations of word finals. The procedure starts with a draft set of orthographic, morphological and syntactic rules, which are adjusted and refined as the text is processed. The method is specifically tailored to highly inflected languages of the ‘classical’ Indo-European type. Annotation ambiguity is further reduced by applying a set of language-neutral constraints, such as the well-known principle of projectivity and the minimization of the number of possible word stems. The application of the method to tagging M. Daukša's Catechism of 1595 is described. [From the publication]

ISSN / ISBN:
2306-9015 (ISSN); 9785020403512 (ISBN)
Permalink:
https://www.lituanistika.lt/content/81237
Updated:
2020-03-26 06:30:47