Words and Paradigms in the Mental Lexicon

    Loo, Kaidi
  • This dissertation examines the comprehension and production of Estonian case-inflected nouns. Estonian is a morphologically complex Finno-Ugric language with 14 cases in both singular and plural for each noun. Because storing millions of forms in memory seems implausible, languages like Estonian are often taken to be prime candidates for rule-driven, morpheme-based processing. However, not every Estonian noun actually occurs in all 28 of its case forms; a noun tends to occur only in the forms that make sense given its meaning. For instance, for jalg 'foot/leg' the nominative plural jalad 'feet/legs' is very common, whereas the essive singular jalana 'as a foot/leg' is rarely used. Furthermore, Estonian inflected forms cluster into inflectional paradigms, which typically come with only a few inflected variants from which the other forms in the paradigm can be predicted. Hence, the number of forms that a speaker would need to memorize is much smaller than the number of forms that one can understand or produce. Based on these observations, we aimed to clarify the lexical-distributional properties that co-determine lexical processing in Estonian. Using a large number of items and generalized mixed-effects modeling, we tested the influence of a number of lexical measures, such as lemma frequency, whole-word frequency, morphological family size, inflectional entropy, orthographic length and orthographic neighbourhood density (all calculated on the basis of a 15-million token Estonian corpus). Importantly, we hypothesized that a new measure, inflectional paradigm size, i.e., the number of forms of a given paradigm that are actually attested in use, may affect morphological processing in Estonian. We conducted four psycholinguistic experiments with native speakers of Estonian varying in age (21-69 years): two word naming tasks, a lexical decision task and a semantic categorization task.
Results of the semantic categorization task with 200 inflected forms showed a facilitatory effect of inflectional paradigm size in both response times and accuracy. In the word naming task with a similar number of items, a facilitatory effect of whole-word frequency was found. In the two remaining large-scale studies, a lexical decision task and a word naming task with over 2,000 inflected forms, both whole-word frequency and inflectional paradigm size again emerged as the strongest predictors. In line with the behavioural data, eye movement data collected during the word naming task further confirmed whole-word frequency and inflectional paradigm size as the main predictors of Estonian inflected word processing. Further analyses of pupil dilation supported these findings, but also suggested large individual differences in processing patterns. In summary, our findings suggest that a surprising amount of item-specific knowledge is available during language processing. This contradicts a purely decompositional approach to the processing of complex words, even for a language as morphologically rich as Estonian.
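
As an illustration of two of the lexical measures named above, the following minimal Python sketch computes the number of attested forms in a paradigm and an inflectional entropy for a single lemma. It assumes a common formulation of inflectional entropy (Shannon entropy over the relative frequencies of a lemma's attested forms); the dissertation itself may define or estimate these measures differently. The function name `paradigm_measures` and the frequency counts are hypothetical and are not taken from the corpus.

```python
import math


def paradigm_measures(form_counts):
    """Return (paradigm_size, inflectional_entropy) for one lemma.

    form_counts maps each inflected form to its corpus frequency.
    Assumption: a form counts as "attested" if its frequency is above zero,
    and entropy is Shannon entropy (in bits) over the attested forms.
    """
    attested = {form: n for form, n in form_counts.items() if n > 0}
    total = sum(attested.values())
    if total == 0:
        return 0, 0.0
    entropy = -sum((n / total) * math.log2(n / total) for n in attested.values())
    return len(attested), entropy


# Invented counts for three forms of jalg 'foot/leg' (illustration only).
toy_counts = {"jalg": 4, "jalad": 4, "jalana": 0}
size, entropy = paradigm_measures(toy_counts)
print(size, entropy)
```

With these invented counts, two of the three forms are attested and equally frequent, so the paradigm size is 2 and the entropy is 1 bit; a paradigm dominated by a single form would have an entropy close to 0.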

    Spring 2018
    Doctor of Philosophy
