Modelling phonetic reduction in a corpus of spoken English using Random Forests and Mixed-Effects Regression

  • Author / Creator
    Dilts, Philip C
  • In this thesis, phonetic reduction in the Buckeye Corpus (Pitt et al. 2005) of conversational speech is modelled using advanced statistical techniques. Two measures of phonetic reduction are modelled: reduction in the duration of words and deletion of segments from words. Statistical modelling techniques are used to predict how much of each type of reduction is observed in the corpus. Predictor variables are selected from a number of broad classes, including demographic, phonetic, predictability, syntactic, semantic, and pragmatic variables. The broad scope of these variables leads to a generalizable picture of the factors leading to reduction in spontaneous speech. Two modelling techniques with complementary properties are applied to the modelling task: Random Forest (RF) models (Breiman 2001) and Linear Mixed-Effects Regression (LMER) models. RF models handle complex interactions and highly collinear predictor variables much more easily than LMER models do. Conversely, LMER models allow each word form and speaker to differ in their response to reduction-predicting variables. LMER models can also easily incorporate predictor variables composed of a large number of unordered categories. Both of these properties of LMER models are effectively impossible to incorporate into current RF models on the scale required for the present study. Results relating to the variables or combinations of variables that correlate with reduction or improve model prediction are described. Possible explanations for the results and implications for the nature of the processes underlying reduction during spontaneous speech are explored. Results relating to the modelling process are also discussed. In particular, random forest modelling indicated that several potential interactions between variables were overlooked in initial LMER modelling. When these interactions were included in a second round of LMER modelling, several were found to improve prediction significantly.
The results of the present study may lead to improvements in speech recognition and speech production technologies. The results also suggest that random forests can be used to improve regression models of language data.
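
As an illustration of the LMER property described above (word forms and speakers each differing in their response to predictors), a generic mixed-effects model can be written as below. This is a textbook-style sketch, not the specification used in the thesis: the symbols $y_i$, $x_{ik}$, $\beta$, $b$, $\Sigma$, and the index sets $K$ and $P$ are all assumptions introduced here for illustration.

```latex
% Generic sketch (assumed notation, not taken from the thesis).
% y_i      : reduction measure for token i (e.g. word duration)
% x_{ik}   : value of predictor k for token i
% w[i],s[i]: word form and speaker of token i
% K        : predictors given by-word and by-speaker random slopes
% P        : predictor pairs entered as interactions
\begin{align}
y_i &= \beta_0 + \sum_{k} \beta_k x_{ik}
      + \sum_{(j,k) \in P} \beta_{jk}\, x_{ij} x_{ik} \nonumber \\
    &\quad + b_{0,w[i]} + b_{0,s[i]}
      + \sum_{k \in K} \bigl( b_{k,w[i]} + b_{k,s[i]} \bigr) x_{ik}
      + \varepsilon_i, \\
\mathbf{b}_{w} &\sim \mathcal{N}(\mathbf{0}, \Sigma_{\mathrm{word}}), \qquad
\mathbf{b}_{s} \sim \mathcal{N}(\mathbf{0}, \Sigma_{\mathrm{speaker}}), \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2).
\end{align}
```

Under this reading, the random intercepts and slopes $b$ let each word form and speaker depart from the population-level effects $\beta_k$, and the second round of LMER modelling would correspond to enlarging the set $P$ with interactions suggested by the random forests.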

  • Subjects / Keywords
  • Graduation date
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.