Lietuvių kalbos gramatikos informacinė sistema: morfologija

Mokslo publikacijos / Scientific publications
Document Type:
Straipsnis / Article
Lietuvių kalba / Lithuanian
Alternative Title:
Information system of Lithuanian grammar
In the Journal:
Lietuvių kalba. 2016, 10, 1 pdf (20 pp.)
Vilnius. Vilniaus kraštas (Vilnius region); Lietuva (Lithuania); Morfologija / Morphology; Žodžių daryba. Žodžio dalys / Word formation. Parts of a word.
Summary / Abstract:

LT. Reikšminiai žodžiai (keywords): Gramatika; Gramatiniai požymiai; Informacinė sistema; Kompiuterinė kalbotyra; Kompiuterinė lingvistika; Morfema; Morfeminis; Morfologija; Tekstynai; Computational linguistics; Computer linguistics; Corpora; Grammar; Grammatical features; Information system; Morpheme; Morphemic; Morphology.

EN. The article presents a brief overview of studies in the field of computational morphology in Latvian, Czech, Russian, and English, and discusses in more detail the studies carried out in Lithuania. In the database developed by the Institute of Mathematics and Informatics of Vilnius University, morphemes are marked by the use of different fonts. A particularly uninformative way of marking morphemes graphically has been noted in the database of Vytautas Magnus University, in which morphemes are separated from each other by hyphens. Occasionally such information is even misleading, because the same marking is used in words that have different morphemic structures. Examples of German and Estonian morphological analysers demonstrate that such tools are suitable only for specialists, since they include many abbreviations and symbols that are not comprehensible to the general public. Probably the most comprehensive information in this respect is provided by the Russian morphological analyser. It has been noted that the morphological analysis tools for Lithuanian make many mistakes: some information systems contain words that do not exist in Lithuanian, e.g. *blizgėjas, while others fail to recognise many Lithuanian words, e.g. toliaregis, apyrankė, nebeatsinešdavau, and in such cases report that "this text is not in Lithuanian or it is grammatically incorrect." The article describes an information system of Lithuanian grammar that is in its initial stages of development. It is targeted at non-professionals and therefore provides morphological information in a particularly clear and explicit fashion. In addition, one of the key goals of the information system is accuracy and reliability of information, so it makes use of an error-protection tool.
Words will be added to the database by fitting them into a generalised format of a Lithuanian word. Users of the tool are provided with two types of information about words: morphological and morphemic. The morphological part provides all relevant data about the word as a whole, i.e. the part of speech the word belongs to and its relevant grammatical features. For a noun, for example, the tool indicates its case, number, gender, etc., whereas for a verb it indicates the tense, person, number, mood, and so on. In addition, the tool shows the lemma of a word and, in the case of derivatives and compounds, the underlying words. The morphemic part gives a graphic representation of the structure of a word: it not only segments the word into morphemes but also gives detailed information about each morpheme. Different types of morphemes are marked using different colours, followed by more detailed information about the relevant features of each morpheme, e.g. suffixes: derivational, inflectional; endings: pronominal ending, shortened ending, etc. The article presents figures with tentative results of word analysis of a test sample. [From the publication]
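The two kinds of information described in the abstract, word-level (morphological) and morpheme-level (morphemic), could be held in a record along the following lines. This is only a minimal sketch and not the system's actual format: the class names, field names, label strings, and the choice of lemma are all hypothetical, and the consistency check is a simple illustration, not the error-protection tool the article mentions. The example word, nebeatsinešdavau, is one of those cited in the abstract.

```python
from dataclasses import dataclass, field


@dataclass
class Morpheme:
    text: str         # surface form of the morpheme
    kind: str         # coarse type, e.g. "prefix", "root", "suffix", "ending"
    detail: str = ""  # finer label, e.g. "inflectional" (hypothetical label set)


@dataclass
class WordAnalysis:
    form: str                 # the word form being analysed
    lemma: str                # dictionary form (assumed here, not taken from the article)
    part_of_speech: str
    features: dict = field(default_factory=dict)   # word-level grammatical features
    morphemes: list = field(default_factory=list)  # ordered list of Morpheme objects

    def is_consistent(self) -> bool:
        # Illustrative sanity check: the morphemes must spell out the word form.
        return "".join(m.text for m in self.morphemes) == self.form

    def describe(self) -> list:
        # One labelled line per morpheme, instead of bare hyphen-separated segments.
        lines = []
        for m in self.morphemes:
            label = m.kind if not m.detail else f"{m.kind}: {m.detail}"
            lines.append(f"{m.text} [{label}]")
        return lines


# Hypothetical record for a word form cited in the abstract.
analysis = WordAnalysis(
    form="nebeatsinešdavau",
    lemma="atsinešti",
    part_of_speech="verb",
    features={
        "tense": "past frequentative",
        "person": "1",
        "number": "singular",
        "mood": "indicative",
    },
    morphemes=[
        Morpheme("ne", "prefix", "negation"),
        Morpheme("be", "prefix"),
        Morpheme("at", "prefix"),
        Morpheme("si", "reflexive affix"),
        Morpheme("neš", "root"),
        Morpheme("dav", "suffix", "inflectional"),
        Morpheme("au", "ending"),
    ],
)

if __name__ == "__main__":
    print(analysis.is_consistent())
    for line in analysis.describe():
        print(line)
```

Keeping a type label on every morpheme, not only separators between segments, is what lets two words with the same hyphenation but different morphemic structure be told apart.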

2020-08-22 11:23:08