Comparison of deep learning approaches for Lithuanian sentiment analysis

Collection:
Mokslo publikacijos / Scientific publications
Document Type:
Straipsnis / Article
Language:
Anglų kalba / English
Title:
Comparison of deep learning approaches for Lithuanian sentiment analysis
In the Journal:
Baltic journal of modern computing [BJMC]. 2022, vol. 10, iss. 3, p. 283-294
Keywords:
LT
Lietuvių kalba / Lithuanian language; Kalbos vartojimas. Sociolingvistika / Language use. Sociolinguistics.
Summary / Abstract:

EN
Sentiment analysis is one of the oldest Natural Language Processing problems, still relevant and challenging today. It is usually formulated and solved as a supervised machine learning problem. In this research, we are solving the three-class sentiment analysis problem for the non-normative Lithuanian language. The contribution of our research is related to applying the innovative BERT-based multilingual sentence transformer models to the Lithuanian sentiment analysis problem. For comparison purposes, we have also investigated traditional Deep Learning approaches, such as fastText or BERT word embeddings with the Convolutional Neural Network as the classifier. The best accuracy ∼0.788 was achieved with the purely monolingual model, i.e., fastText (trained on the very large and diverse Lithuanian corpus) and the Convolutional Neural Network (refined in various text classification tasks). The backbone of the second-best approach (reaching ∼0.762) is the multilingual sentence-transformer-based model, which is the trend in text classification tasks, especially for the English language. Keywords: Sentiment analysis, monolingual vs. multilingual models, word vs. sentence embeddings, transformer models, the Lithuanian language. [From the publication]

DOI:
10.22364/bjmc.2022.10.3.02
ISSN:
2255-8950; 2255-8942
Related Publications:
A Comparison of approaches for sentiment classification on Lithuanian internet comments / Jurgita Kapočiūtė-Dzikienė, Algis Krupavičius, Tomas Krilavičius. Proceedings of the 4th biennial international workshop on Balto-Slavic natural language processing. Stroudsburg (PA): Association for Computational Linguistics, 2013. P. 2-11.
Permalink:
https://www.lituanistika.lt/content/105555
Updated:
2023-11-23 21:47:07