Towards Automated and Accurate Radiology Report Generation

Nguyen, Tran Nhat Hoang

doi:10.7939/r3-0efj-8c40

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Nguyen, Tran Nhat Hoang
Radiology reports are the primary medium through which physicians communicate findings and diagnoses from patients' medical scans. Examples include reports on chest radiographs, brain CT scans, and retinal images. However, writing these reports is tedious, error-prone, and time-consuming, even for experienced radiologists. Moreover, a pandemic such as COVID-19 can exacerbate these problems for health care systems worldwide. This thesis therefore explores automating disease diagnosis and accurately generating radiology reports to alleviate the burden on medical doctors.
This thesis describes a new, fully differentiable end-to-end paradigm that consists of three complementary modules: Classifier, Generator, and Interpreter. Specifically, taking chest radiographs and related information as inputs, the classifier module produces state-aware disease embeddings by polarizing visual disease features into different directions, referred to as disease states (e.g., positive, negative, uncertain, or unmentioned). With this awareness of the disease states, a semantic version of the disease representation is formed, referred to as EnricheD DIsease Embeddings (EDDIE), and passed to a transformer-based generator to produce meaningful medical reports. The generated reports are then fed to the interpreter to ensure consistency with the disease classification checklist. This three-step approach ensures that the visual information is semantically rich enough to generate medical reports. Meanwhile, the generated reports must accurately describe the detected diseases, which avoids overfitting to any dominant class (e.g., due to imbalanced datasets) or to any language metric (i.e., by cheating the generation process).
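The classifier and interpreter steps described above can be illustrated with a small sketch. The abstract does not specify how the embeddings are computed, so everything here is an assumption made for illustration: the dot-product scoring, the probability-weighted mix of state embeddings, the additive combination with the visual feature, and the helper names `enrich` and `interpreter_consistent` are all hypothetical. The thesis's interpreter is part of a differentiable model, whereas the check below is a plain comparison. The transformer-based generator is omitted.

```python
import math

# The four disease states named in the abstract.
STATES = ["positive", "negative", "uncertain", "unmentioned"]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def enrich(disease_feature, state_embeddings):
    """Return (state probabilities, enriched embedding) for one disease.

    Assumption: each state is scored by the dot product between the
    visual disease feature and that state's embedding.
    """
    scores = [dot(disease_feature, state_embeddings[s]) for s in STATES]
    probs = softmax(scores)
    dim = len(disease_feature)
    # Assumption: mix the state embeddings by probability, then add the
    # result to the visual feature to obtain a state-aware embedding.
    mixed = [
        sum(p * state_embeddings[s][i] for p, s in zip(probs, STATES))
        for i in range(dim)
    ]
    enriched = [f + m for f, m in zip(disease_feature, mixed)]
    return dict(zip(STATES, probs)), enriched


def interpreter_consistent(classifier_states, report_states):
    """True when the states read back from a generated report match the
    classifier's checklist for every disease (hypothetical check)."""
    return all(report_states.get(d) == s for d, s in classifier_states.items())


# Toy 2-D state embeddings pointing in four different directions.
state_embeddings = {
    "positive": [1.0, 0.0],
    "negative": [-1.0, 0.0],
    "uncertain": [0.0, 1.0],
    "unmentioned": [0.0, -1.0],
}
probs, enriched = enrich([2.0, 0.0], state_embeddings)
predicted = max(probs, key=probs.get)
print(predicted, enriched)
```

In this toy setup a visual feature aligned with the "positive" direction is assigned the positive state, and the enriched embedding is pushed further along that direction, which is one simple reading of "polarizing" features toward disease states.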
The proposed model is evaluated on different datasets using commonly used metrics for language fluency and clinical accuracy, together with human evaluation. Empirical evaluations demonstrate that the proposed model makes more accurate diagnoses and generates more fluent and precise reports than existing baselines. Moreover, noticeable performance gains are consistently observed when additional contextual information is available, such as the patients' clinical background documents and extra scans from different views.
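The abstract does not name the specific metrics, so the following is only a sketch of one common way to quantify clinical accuracy: micro-averaged precision, recall, and F1 over the positive disease findings in reference versus generated reports. The function name `clinical_prf` and the toy label sets are hypothetical, and the step that extracts disease labels from report text is assumed to have been done already.

```python
def clinical_prf(reference, generated):
    """Micro-averaged precision, recall, and F1 over positive findings.

    `reference` and `generated` are parallel lists of sets, one set of
    positive disease labels per report.
    """
    tp = fp = fn = 0
    for ref, gen in zip(reference, generated):
        tp += len(ref & gen)   # findings present in both reports
        fp += len(gen - ref)   # findings only in the generated report
        fn += len(ref - gen)   # findings missed by the generated report
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy labels for two reports (illustrative, not results from the thesis).
reference = [{"edema", "cardiomegaly"}, {"pneumonia"}]
generated = [{"edema"}, {"pneumonia", "effusion"}]
precision, recall, f1 = clinical_prf(reference, generated)
print(precision, recall, f1)
```

Here two findings are matched, one is spurious, and one is missed, so precision, recall, and F1 all come out to 2/3. A metric of this kind rewards reports that describe the right diseases, independently of how closely the wording matches the reference.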
Graduation date

Spring 2022
Type of Item

Thesis
Degree

Master of Science
DOI

https://doi.org/10.7939/r3-0efj-8c40
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Master's
Department
- Department of Electrical and Computer Engineering
Specialization
- Signal and Image Processing
Supervisor / co-supervisor and their department(s)
- Cheng, Li (ECE)