Towards Automated and Accurate Radiology Report Generation

  • Author / Creator
    Nguyen, Tran Nhat Hoang
  • Radiology reports are the primary medium through which physicians communicate findings and diagnoses from patients' medical scans. Examples include reports on chest radiographs, CT scans of the brain, and retinal images. However, writing medical reports is tedious, error-prone, and time-consuming, even for experienced radiologists. Moreover, a pandemic such as COVID-19 could exacerbate these problems for health care systems worldwide. Therefore, this thesis explores automating disease diagnosis and accurately generating radiology reports to alleviate the burden on medical doctors.
    This thesis describes a new, fully differentiable end-to-end paradigm consisting of three complementary modules: Classifier, Generator, and Interpreter. Specifically, taking chest radiographs and related information as inputs, the classifier module produces state-aware disease embeddings by polarizing visual disease features into different directions, referred to as disease states (e.g., positive, negative, uncertain, or unmentioned). With this awareness of the disease states, a semantic version of the disease representation is formed, referred to as EnricheD DIsease Embeddings (EDDIE), and passed to a transformer-based generator to produce meaningful medical reports. The generated reports are then fed to the interpreter to ensure consistency with the disease classification checklist. This three-step approach ensures that the visual information is always semantically rich enough to generate medical reports. At the same time, the generated reports must accurately describe the detected diseases, which avoids overfitting to any dominant class (e.g., due to imbalanced datasets) or to a language metric (i.e., by cheating the generation process).
    The proposed model is evaluated on different datasets using commonly used metrics for language fluency and clinical accuracy, as well as human evaluation. Empirical evaluations demonstrate that the proposed model makes more accurate diagnoses and generates more fluent and precise reports than existing baselines. Moreover, noticeable performance gains are consistently observed when additional contextual information is available, such as patients' clinical background documents and extra scans from different views.

  • Graduation date
    Spring 2022
  • Degree
    Master of Science
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
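
The abstract's idea of state-aware disease embeddings can be illustrated with a small sketch. This is not the thesis implementation: the function names (`softmax`, `enrich`, `predicted_state`), the additive combination of a visual feature with a probability-weighted mix of state embeddings, the embedding size, and all numeric values are assumptions chosen only to make the idea concrete. Only the four disease states come from the abstract.

```python
import math
import random

# The four disease states named in the abstract.
STATES = ["positive", "negative", "uncertain", "unmentioned"]


def softmax(logits):
    """Turn raw scores into a probability distribution over states."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def enrich(disease_feature, state_logits, state_embeddings):
    """Hypothetical enrichment step (an assumption, not the thesis method):
    add a probability-weighted mix of state embeddings to the visual feature."""
    probs = softmax(state_logits)
    dim = len(disease_feature)
    mixed = [
        sum(p * emb[i] for p, emb in zip(probs, state_embeddings))
        for i in range(dim)
    ]
    enriched = [f + m for f, m in zip(disease_feature, mixed)]
    return enriched, probs


def predicted_state(probs):
    """Return the name of the most probable disease state."""
    return STATES[max(range(len(probs)), key=probs.__getitem__)]


# Toy inputs: one disease, a 4-dimensional feature, random state embeddings.
rng = random.Random(0)
dim = 4
state_embeddings = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in STATES]
feature = [0.5, -0.2, 0.1, 0.0]      # illustrative visual disease feature
logits = [2.0, 0.1, -1.0, -1.0]      # illustrative classifier scores per state

enriched, probs = enrich(feature, logits, state_embeddings)
state = predicted_state(probs)
```

In a full pipeline of the kind the abstract describes, vectors like `enriched` would be passed to the generator, and an interpreter would classify the generated report so that its predicted states can be compared against `probs`; both of those stages are omitted here.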