"The Corpus of Lithuanian Children Language": development and application for modern studies in language acquisition

Kalbotyra. 2018, vol. 71, pp. 7-25
Tekstynų lingvistika; Kalbos įsisavinimas; Vaikų kalba; Lietuvių kalba
Corpus linguistics; Language acquisition; Child language; Lithuanian
This paper describes The Corpus of Lithuanian Children’s Language and its possible applications for modern studies of first language acquisition. First, the procedure of data collection for the Corpus is discussed. Next, the main methodological principles of longitudinal and experimental data compilation and transcription are described. Finally, the paper introduces studies in developmental psycholinguistics that have been carried out so far and that demonstrate possible ways of applying the Corpus data for different scientific purposes. The Corpus of Lithuanian Children’s Language, developed at Vytautas Magnus University, comprises typical and atypical, longitudinal and experimental data on Lithuanian language development. The Corpus was compiled using different methodological approaches, such as natural observation and semi-experiment. The longitudinal data (conversations between the target children and their caretakers), compiled according to the requirements of natural observation, includes transcribed and morphologically annotated speech of two typically developing children, one late talker, one early talker, one child from a low-SES family, and a pair of twins. The data was collected during the period 1993–2017 and can be divided into three cohorts. The semi-experimental data (~124 hours) comes from numerous studies of narratives and spontaneous dialogues elicited from typically developing and language-impaired monolingual and bilingual (pre-)school-age children. From the very beginning of data collection for The Corpus of Lithuanian Children’s Language, studies of the developmental changes in typical child language have been carried out. Over the past decade, these studies have been supplemented by statistical analysis of elicited semi-experimental data; the majority of these studies deal with typical vs. atypical (delayed or impaired) language acquisition and with differences between the acquisition of Lithuanian in monolingual vs. bi-/polylingual settings. The paper provides an overview of the data of The Corpus of Lithuanian Children’s Language, which has been collected since 1993 but still needed to be structured according to the methodology of data compilation employed and its possible applications for different scientific purposes. [From the publication]

