Impacts of Model Choice in XAI

  • Author / Creator
    Alexander, Graham
  • Explainable artificial intelligence (XAI) models are becoming increasingly important as restrictions grow on corporate use of black-box models whose predictions affect people’s lives and yet cannot be interpreted. Black boxes do not inspire trust in end users and are difficult for developers to train and debug.
    Model-agnostic explanation methods, like SHAP [23], can be used post hoc to shed light on these black-box predictions. With access to a model’s predictions, SHAP can generate scores for relative feature importance. This work focuses on explanations generated for Natural Language Processing (NLP), where the features SHAP uses are words.
    There are currently no generally accepted methods for generating explanations in NLP. However, SHAP can calculate an importance score for each word, and the most important words can be taken as the explanation. SHAP should be structure-agnostic, meaning its output should not be influenced by the number or types of layers in the model, only by the quality of the prediction. Otherwise, SHAP explanations cannot be fairly compared across models, because SHAP may be biased towards certain structures.
    Importance scores from SHAP are converted into a mask that either includes or ignores each word of the input, yielding the generated explanation. The Eraser [10] dataset provides human-annotated explanations for NLP tasks that can serve as a gold standard for the explanations generated by SHAP. An F1 score comparing the generated explanation against the human-annotated one then gives a measure of explanation quality (see the sketch following this record).
    This work investigates whether the quality of explanations generated by SHAP is structure-agnostic. Using a dataset with ground-truth explanations for a sentiment analysis task, we compare the SHAP output across different types of models. Our main finding is that CNN models using intrinsic explanation underperformed CNN models without intrinsic explanation, while having nearly identical accuracy. These findings demonstrate that the underlying model can impact SHAP’s performance and that SHAP may favour certain model structures.

  • Subjects / Keywords
  • Graduation date
    Fall 2023
  • Type of Item
    Thesis
  • Degree
    Master of Science
  • DOI
    https://doi.org/10.7939/r3-k41k-p339
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
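
A minimal sketch of the scoring step described in the abstract, assuming per-word importance scores from SHAP and a human-annotated rationale mask; all tokens, scores, and function names below are illustrative assumptions, not taken from the thesis:

    # Toy sketch: turn per-word importance scores (e.g. from SHAP) into a
    # binary rationale mask and score it against a human-annotated rationale
    # with token-level F1. All values below are made up for illustration.
    from typing import List

    def scores_to_mask(scores: List[float], k: int) -> List[int]:
        """Keep the k highest-scoring tokens as the generated explanation."""
        top = set(sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k])
        return [1 if i in top else 0 for i in range(len(scores))]

    def token_f1(pred: List[int], gold: List[int]) -> float:
        """Token-level F1 between the generated mask and the gold rationale."""
        tp = sum(1 for p, g in zip(pred, gold) if p and g)
        fp = sum(1 for p, g in zip(pred, gold) if p and not g)
        fn = sum(1 for p, g in zip(pred, gold) if g and not p)
        if tp == 0:
            return 0.0
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        return 2 * precision * recall / (precision + recall)

    # Hypothetical sentiment example: six tokens, made-up SHAP scores,
    # gold rationale marking "acting" and "dreadful" as the explanation.
    tokens      = ["the", "acting", "was", "truly", "dreadful", "."]
    shap_scores = [0.01, 0.40, 0.02, 0.15, 0.55, 0.00]
    gold_mask   = [0, 1, 0, 0, 1, 0]

    pred_mask = scores_to_mask(shap_scores, k=2)
    print(pred_mask)                                  # -> [0, 1, 0, 0, 1, 0]
    print(round(token_f1(pred_mask, gold_mask), 2))   # -> 1.0

The top-k thresholding here mirrors the mask construction the abstract describes; the thesis itself may use a different rule for deciding which words the mask includes.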