Leveraging Natural Language Processing and Machine Learning Techniques to find Frailty Deficits from Clinical Dataset

  • Author / Creator
    Bazrafkan, Mehrnoosh
  • Introduction: Frailty is a syndrome that is often associated with aging. It
    can be identified through specific frailty scales or a comprehensive
    assessment by a healthcare provider. Alberta appears to have no specific
    billing or diagnostic codes for frailty, so healthcare providers may
    instead use assessments or codes for related conditions, such as muscle
    weakness or decreased physical activity, to identify it.

    Purpose: This project aims to use Natural Language Processing (NLP)
    algorithms to extract frailty keywords from structured and unstructured
    clinical datasets, identify frailty deficits, and classify patients as
    frail or non-frail using machine learning algorithms.

    Methods: The dataset included 450 patients over the age of 60, with
    medical information related to diseases and clinical frailty scales. We
    first clean the medical notes using NLP techniques and remove negation
    terms, then extract keywords from the clinical notes and structured
    datasets. Finally, we use resampling techniques to deal with the
    imbalanced clinical datasets and feed the extracted keywords into machine
    learning classifiers to classify patients as frail or not frail.

    Results: Of the machine learning classifiers applied to this task, Random
    Forest and Decision Tree, with a score of 0.95, performed better than the
    LR, KNN, NB, SVM, and neural network models.

    Conclusion: NLP algorithms can effectively extract frailty keywords from
    Electronic Medical Record (EMR) notes. Moreover, comparing the results
    shows that using both structured and unstructured data gives better
    results than using structured data alone.

  • Subjects / Keywords
  • Graduation date
    Spring 2023
  • Type of Item
  • Degree
    Master of Science
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
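
The Methods steps summarized in the abstract (negation handling, keyword extraction, and resampling of an imbalanced dataset) can be illustrated with a minimal sketch. It is not the thesis's implementation: the keyword list, the negation cues, the fixed three-token negation window, the sample notes, and the choice of random oversampling as the resampling technique are all assumptions made for illustration, and the function names `extract_keywords` and `random_oversample` are hypothetical.

```python
import random
import re

# Hypothetical keyword list and negation cues, chosen for illustration only.
FRAILTY_KEYWORDS = {"weakness", "fall", "fatigue", "weight loss"}
NEGATION_CUES = {"no", "not", "denies", "without"}
NEGATION_WINDOW = 3  # assumed: tokens after a cue that count as negated


def extract_keywords(note):
    """Return frailty keywords found in a note outside any negated span."""
    tokens = re.findall(r"[a-z]+", note.lower())
    negated = [False] * len(tokens)
    for i, tok in enumerate(tokens):
        if tok in NEGATION_CUES:
            for j in range(i + 1, min(i + 1 + NEGATION_WINDOW, len(tokens))):
                negated[j] = True
    found = set()
    for keyword in FRAILTY_KEYWORDS:
        words = keyword.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words and not any(negated[i:i + n]):
                found.add(keyword)
    return found


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until classes are equal."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Made-up notes and labels, not taken from the thesis dataset.
notes = [
    "Patient reports fatigue and muscle weakness.",
    "No weakness; denies fatigue.",
    "Walks daily without difficulty.",
]
labels = ["frail", "non-frail", "non-frail"]

features = [extract_keywords(note) for note in notes]
balanced_rows, balanced_labels = random_oversample(features, labels)
```

The balanced keyword sets would then be encoded as feature vectors and passed to a classifier such as a random forest or decision tree. A fixed token window is a crude stand-in for negation detection; it ignores sentence boundaries and would need replacing with a proper clinical negation method for real notes.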