Towards Natural Language Modelling of Clinical Depression

  • Author / Creator
    Farruque, Nawshad
  • Abstract
    Traditional survey-based methods for clinical depression detection are not always effective: because of cognitive bias while filling out depression questionnaires, a patient's answers may not reflect their actual mental health condition. Ample earlier work has established that social media language reflects a user's real-time mental health status. Motivated by this potential, in this dissertation we describe a framework for natural language modelling of clinical depression from public social media posts, e.g., tweets from a Twitter user's timeline. Such modelling requires extracting depression symptoms from the posts, then following clinical psychiatry guidelines to calculate depression scores for all two-week episodes, and finally inferring from these scores whether a user is depressed. In this process, the first important challenge is the scarcity of data for developing a Depression Symptoms Detection (DSD) model. To address data scarcity, we follow two steps. First, we curate a Clinical Expert Annotated Depression Symptoms tweets (CEADS) dataset. We introduce important innovations for curating a higher-quality CEADS dataset that reflects both clinicians' insights and the depression symptoms distribution of self-disclosing depressed Twitter users. Second, we train our DSD model on the CEADS dataset and make it more robust with our proposed Semi-supervised Learning (SSL) framework, in which we iteratively harvest depression symptoms tweets and re-train the DSD model. Moreover, we propose a Zero-Shot Learning (ZSL) model to make the iterative data harvesting process more effective.
    Further, with the help of the DSD model, we develop our Temporal User-level Clinical Depression Detection (TUD) model, which extracts clinical depression scores from a user's Twitter timeline, much as a depression rating scale such as the Patient Health Questionnaire-9 (PHQ-9) would. Finally, we draw conclusions on user-level clinical depression modelling using (1) depression-score-based features, (2) purely semantic-representation-based features, (3) their temporal representations, and (4) experiments with various clinical depression detection settings across several data distributions. To the best of our knowledge, our experiments and analyses are unique in the literature.
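
    The episode-level scoring described in the abstract can be illustrated with a minimal sketch. It is not the thesis's TUD model. It assumes that a symptom detector has already labelled each post with a set of symptom names, that each symptom is scored 0 to 3 per two-week episode by the number of days on which it appears (the day bands below are illustrative assumptions loosely modelled on PHQ-9 frequency categories), and that a user is flagged when any episode reaches an assumed cut-off of 10. The names `episode_scores`, `is_depressed`, `frequency_to_item_score`, and the symptom labels are hypothetical.

    ```python
    from collections import defaultdict
    from datetime import date

    EPISODE_DAYS = 14  # two-week episode, as described in the abstract
    CUTOFF = 10        # assumed decision threshold, not taken from the thesis


    def frequency_to_item_score(days_present):
        """Map the number of days a symptom appears in an episode to 0-3.

        The day bands are assumptions chosen for illustration only.
        """
        if days_present == 0:
            return 0
        if days_present <= 7:
            return 1
        if days_present <= 11:
            return 2
        return 3


    def episode_scores(posts):
        """Return {episode_index: score} for (date, symptom_set) pairs.

        Episodes are consecutive 14-day windows counted from the earliest post.
        """
        posts = sorted(posts, key=lambda p: p[0])
        if not posts:
            return {}
        start = posts[0][0]
        # episode index -> symptom -> set of distinct days it was detected
        seen = defaultdict(lambda: defaultdict(set))
        for day, symptoms in posts:
            episode = (day - start).days // EPISODE_DAYS
            for symptom in symptoms:
                seen[episode][symptom].add(day)
        return {
            episode: sum(frequency_to_item_score(len(days))
                         for days in by_symptom.values())
            for episode, by_symptom in seen.items()
        }


    def is_depressed(posts):
        """Flag a user if any two-week episode reaches the assumed cut-off."""
        return any(score >= CUTOFF for score in episode_scores(posts).values())
    ```

    In this sketch, a timeline in which four symptoms are each detected on twelve days of the same episode scores 12 for that episode and is flagged, whereas one symptom appearing once per episode scores 1 and is not.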

  • Subjects / Keywords
  • Graduation date
    Spring 2023
  • Type of Item
  • Degree
    Doctor of Philosophy
  • DOI
  • License
    This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.