
Development and Evaluation of Interpretable Machine Learning Models for Mitigating Winter Road Safety – An Empirical Investigation

  • Author / Creator
    Shuai, Zehua
  • In Canada, winter crashes account for a significant portion of crashes each year. This thesis investigates the utility of machine learning (ML) for understanding and mitigating winter road risks. Although ML models can achieve high predictive performance on complex data, they are often hard to interpret because their predictions come without explanations that users can validate. Numerous studies have applied ML models to traffic safety modeling; however, the transparency of model predictions remains a challenge. To address this problem, this study developed highly interpretable ML models with the assistance of explainable artificial intelligence (XAI), which helps determine the contribution and direction of each feature's effect on the ML predictions. The first part of this thesis analyzed the characteristics of snowstorm-related crashes, while the second part developed a winter crash frequency model. The high-complexity ML models from both parts were then evaluated using SHapley Additive exPlanations (SHAP) to understand the inner workings behind the model predictions.
    In the first part, a model for classifying crash-inducing snowstorm events was built using a dataset of 231 snowstorm events occurring over 21 friction testing routes in the City of Edmonton. Interpretability was addressed by integrating SHAP with a Support Vector Machine (SVM) model. Using the Radial Basis Function (RBF) kernel, the SVM model achieved an accuracy of 87.2% and a recall of 80%. SHAP global explanations revealed that duration, road length, and precipitation were the most significant factors influencing crash-inducing snowstorms, and also exposed some counterintuitive feature effects. To understand these counterintuitive effects more clearly, local explanations were applied to closely examine representative snowstorm events, confirming the model's applicability in practical scenarios and informing future enhancements. This study also highlighted the critical role of maintenance activities, such as plowing and anti-icing, in mitigating accident risks.
    In the second part, an in-depth analysis of winter crash frequency was conducted using a dataset of 26,970 winter crashes over a four-year period. In the data collection step, Ordinary Kriging (OK) was evaluated and found to be a valuable tool for interpolating traffic volume at unknown locations. The analysis first explored spatial patterns through Hot Spot Analysis (HSA), identifying high and low crash clusters. High crash frequency areas were associated with high traffic volume, high functional road class, and commercial land use, while low crash frequency areas were typically residential with lower traffic volumes and speed limits. Next, both micro- and macro-level variables were fused to build a crash frequency model. Three high-performance tree-based models (XGBoost, Random Forest, and LightGBM) were compared. XGBoost emerged as the best-performing model, with a testing R² of 92.67%, an MAE of 3.64, and an RMSE of 5.77. With the larger dataset, significantly more stable SHAP analysis results were obtained, enhancing the understanding of feature interactions. The global analysis indicated that road type, speed limit, and the presence of traffic enforcement cameras contributed most to the model. Local explanations were then used to differentiate the key characteristics of high and low crash frequency locations.
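
For readers unfamiliar with the three evaluation metrics named in this paragraph, the following is a minimal sketch of how R², MAE, and RMSE are conventionally computed. The function name `regression_metrics` and the crash counts are hypothetical illustrations, not data or code from the thesis. R² is returned as a fraction, so the reported 92.67% would correspond to 0.9267.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for paired observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    mean_true = sum(y_true) / n
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    mae = sum(abs(e) for e in errors) / n                # mean absolute error
    rmse = math.sqrt(ss_res / n)                         # root mean squared error
    r2 = 1 - ss_res / ss_tot                             # coefficient of determination
    return r2, mae, rmse


# Hypothetical crash counts at five locations (illustrative values only).
observed = [2, 10, 4, 25, 9]
predicted = [3, 8, 4, 22, 11]
r2, mae, rmse = regression_metrics(observed, predicted)
```

MAE and RMSE are in the units of the target (here, crashes per location), and RMSE penalizes large errors more heavily, which is why it is never smaller than MAE.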
    The framework presented in this thesis underscores the importance of integrating interpretability techniques in practical applications to enhance winter road safety. More interpretable models provide greater insight into the fairness and trustworthiness of model decisions, enhance the understanding of winter road safety, and aid maintenance personnel in effective decision-making and resource allocation. This thesis also recommends larger datasets for more stable models and consistent predictions, ultimately improving model reliability and decision-making processes.
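
The idea behind the SHAP explanations discussed in the abstract can be sketched with an exact Shapley value computation on a toy model. This is an illustration under stated assumptions, not the thesis implementation: `toy_risk`, its coefficients, the feature names, and the baseline event are all hypothetical, and the SHAP library estimates these values with more efficient methods than the brute-force enumeration shown here.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, instance, baseline):
    """Exact Shapley values of `model` at `instance` relative to `baseline`.

    Features outside a coalition are set to their baseline value.
    Cost grows exponentially with the number of features.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return model(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(x):
    """Hypothetical snowstorm risk score; NOT the model from the thesis."""
    return 0.5 * x["duration_h"] + 0.2 * x["precip_mm"] - 2.0 * x["plowed"]


# Hypothetical storm to explain, and a hypothetical reference storm.
storm = {"duration_h": 10.0, "precip_mm": 5.0, "plowed": 1.0}
reference = {"duration_h": 4.0, "precip_mm": 2.0, "plowed": 0.0}
phi = shapley_values(toy_risk, storm, reference)
```

Each value in `phi` gives both the size and the direction of a feature's contribution: in this toy setup, longer duration and more precipitation push the score up while plowing pushes it down, and the contributions sum to the difference between the score for `storm` and the score for `reference`.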

  • Subjects / Keywords
  • Graduation date
    Fall 2024
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-j5zr-hd14
  • License
    This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.