A Comparison of Lithuanian morphological analyzers

Collection:
Scientific publications
Document Type:
Part of the book
Language:
English
Title:
A Comparison of Lithuanian morphological analyzers
Summary / Abstract:

Keywords: Gold-standard corpora; Lithuanian morphological annotators; Lithuanian morphological analysers; Gold-standard corpus; Experimental evaluation.

In this paper we present comparative research revealing the strengths and weaknesses of the two most popular publicly available Lithuanian morphological analyzers, Lemuoklis and Semantika.lt. Their performance in lemmatization, part-of-speech tagging, and fine-grained annotation of morphological categories (such as case, gender, tense, etc.) was evaluated on a morphologically annotated gold-standard corpus covering four domains: administrative, fiction, scientific, and periodical texts. Semantika.lt significantly outperformed Lemuoklis by ∼1.7%, ∼2.5%, and ∼8.1% on the lemmatization, part-of-speech tagging, and fine-grained annotation tasks, achieving accuracies of ∼98.0%, ∼95.3%, and ∼86.8%, respectively. Semantika.lt was also superior on the administrative, fiction, and periodical texts; however, Lemuoklis yielded similar performance on the scientific texts and even surpassed Semantika.lt in the fine-grained annotation task. [From the publication]

DOI:
10.1007/978-3-319-64206-2
ISBN:
9783319642055
Permalink:
https://www.lituanistika.lt/content/76974
Updated:
2022-03-08 15:19:37