The Digitization of the dictionary of the Lithuanian language

Collection:
Mokslo publikacijos / Scientific publications
Document Type:
Knygos dalis / Part of the book
Language:
Anglų kalba / English
Title:
The Digitization of the dictionary of the Lithuanian language
Keywords:
LT
Leksikografija / Lexicography.
Summary / Abstract:

LT: Reikšminiai žodžiai / Keywords: Elektroninis žodynas; Kompiuterinė / elektroninė leksikografija; Kompiuterinė leksikografija; Kompiuterinė, arba elektroninė leksikografija; Leksikografijos duomenų bazė; Leksikografinė duomenų bazė; Teksto atpažinimas; Computer lexicography; Computer or electronic lexicography; Computer/electronic lexicography; Electronic dictionary; Electronic dictionary, Lithuanian language; Lexicographic database; Text recognition.

EN: The Dictionary of the Lithuanian Language (Lietuvių kalbos žodynas, LKŽ) is a lexicographic milestone that covers the lexicon of Lithuanian from 1547 to 2001. The digitization of the Dictionary aims to create a lexicographic database with search engines that allow instant retrieval of information. We present a technology for converting the text of the Dictionary into a set of structured data records. The program automatically determines what each portion of text stands for (e.g. headword, grammatical properties, meanings of a word, illustrations) and puts it into the corresponding field of a data structure created for each entry. The central idea is that the transfer of the human-oriented text of the Dictionary into structured data is performed fully automatically. Entering the data manually would have been a formidable task because of the size of the Dictionary (22,000 pages, 11,000,000 words of text). [From the publication]
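
The conversion step described in the abstract, in which portions of an entry's text are recognized and placed into the fields of a per-entry record, might look roughly like the following Python sketch. It is an illustration only and is not taken from the publication: the entry layout, the sample entry, and the names `Entry`, `Sense` and `parse_entry` are all assumptions made for this example, and the real Dictionary text is far more varied than this toy format.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field


@dataclass
class Sense:
    """One numbered meaning with its illustrative examples."""
    definition: str
    illustrations: list[str] = field(default_factory=list)


@dataclass
class Entry:
    """Structured record built from the text of one dictionary entry."""
    headword: str
    grammar: str
    senses: list[Sense] = field(default_factory=list)


# Toy layout (an assumption, not the actual LKŽ format):
#   "<headword> <grammar label>. 1. <definition>: <example>; <example>. 2. ..."
ENTRY_RE = re.compile(r"^(?P<headword>\S+)\s+(?P<grammar>[^.]+)\.\s+(?P<body>.+)$")
SENSE_SPLIT_RE = re.compile(r"\s*\d+\.\s+")


def parse_entry(text: str) -> Entry:
    """Split one entry in the toy layout into headword, grammar and senses."""
    match = ENTRY_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized entry layout: {text!r}")
    entry = Entry(match["headword"], match["grammar"].strip())
    # Each numbered chunk of the body becomes one Sense record.
    for chunk in SENSE_SPLIT_RE.split(match["body"]):
        chunk = chunk.strip().rstrip(".")
        if not chunk:
            continue
        definition, _, examples = chunk.partition(":")
        illustrations = [e.strip() for e in examples.split(";") if e.strip()]
        entry.senses.append(Sense(definition.strip(), illustrations))
    return entry


if __name__ == "__main__":
    # Invented sample entry, used only to exercise the parser.
    sample = (
        "namas sm. 1. building for living: Stato namą; Grįžo į namus. "
        "2. household: Visas namas miega."
    )
    print(parse_entry(sample))
```

Once every entry is held in a record of this kind, a search engine can query individual fields (for example, headwords only) instead of scanning running text, which is the instant retrieval the abstract refers to.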

ISBN:
9789955704539
Permalink:
https://www.lituanistika.lt/content/71700
Updated:
2019-12-15 12:21:04