An analysis of the processing of multiword units in sentence reading and unit presentation using eye movement data: Implications for theories of MWUs

    Columbus, Georgina C
  • Multiword units (MWUs) have been the subject of much research in psycholinguistics because of their syntactic and semantic idiosyncrasies. While studies traditionally focused on idioms (a piece of cake), more recent work has turned to another type: lexical bundles (in the middle of). How MWUs are stored and retrieved remains a central question in the literature, and answering it will add to our understanding of language processing. To date, though, there have been few investigations comparing the processing of different types of MWUs. This dissertation aims to fill that gap through analyses of eye movement data during normal sentence reading and trigram reading. The sentence reading results suggest that the familiarity rating for the MWU types analysed here is a relevant predictor of MWU processing. Surprisingly, however, individual word frequency has more predictive capacity for MWU reading times than does MWU frequency, and it explains much of the variance. Overall, the three MWU types investigated here are distinguished from one another in fixation durations on words, particularly for idioms and lexical bundles. For sentence reading times, in contrast, the effects of the MWU types are cancelled out, suggesting that processing difficulties may have been resolved at the sentence level. The second study investigates MWU type effects when the units are read without context. Each MWU in this study is a trigram taken from the Google Web1T n-gram corpus (Brants & Franz, 2006) using stratified sampling across n-gram frequencies. The trigrams were coded for MWU type based on the categories used in Chapter 1. The results show that MWU effects are visible at the trigram level even without context. Somewhat surprisingly, however, there is also evidence of MWU types affecting processing of the first word in the first fixation duration, and of the first bigram in the subgaze duration. The findings suggest the semantic composition of MWUs is apparent to the reader very early. Our results support a usage-based model of language access and storage, such as those put forward by Bybee (e.g., 2006), Pierrehumbert (2001) and Bod (1998), where individual word frequency and unit frequency both affect reading times.

    Doctor of Philosophy
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
    University of Alberta
    • Department of Linguistics
    • R. Harald Baayen (Linguistics)
    • Patrick A. Bolger (Linguistics)
    • John Newman (Linguistics)
    • Inbal Arnon (Psychology, University of Haifa)
    • Jeremy Caplan (Psychology)