Towards Natural Language Modelling of Clinical Depression

Farruque, Nawshad

doi:10.7939/r3-2y0q-mv51

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Farruque, Nawshad
Traditional survey-based methods for detecting clinical depression are not always effective: patients may not report their actual mental health condition because of cognitive biases exhibited while filling out depression questionnaires. Ample earlier work has established that social media language reflects a user's real-time mental health status. Motivated by this potential of social media posts, in this dissertation we describe a framework for natural language modelling of clinical depression from public social media posts, e.g., tweets from a Twitter user's timeline. Such modelling requires extracting depression symptoms from the social media posts, then following clinical psychiatry guidelines to calculate depression scores for all two-week episodes; based on these scores, we then infer whether a user is depressed. In this process, the first important challenge is the scarcity of data for developing a Depression Symptoms Detection (DSD) model.
To address data scarcity, we follow two steps. First, we curate a Clinical Expert Annotated Depression Symptoms tweets (CEADS) dataset. We introduce important innovations for curating a higher-quality CEADS dataset that reflects both clinicians' insights and the depression symptoms distribution of self-disclosing depressed Twitter users. Second, we train our DSD model on the CEADS dataset and make it more robust with our proposed Semi-supervised Learning (SSL) framework. In this framework, we iteratively harvest depression symptoms tweets and re-train our DSD model. Moreover, we propose a Zero-Shot Learning (ZSL) model to make our iterative data harvesting process more effective.
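The iterative "harvest and re-train" idea can be illustrated with a generic self-training (pseudo-labelling) loop. The sketch below is not the thesis's SSL framework: the word-count classifier, the confidence threshold, the symptom labels, and every function name are hypothetical stand-ins, and the ZSL component is omitted entirely.

```python
from collections import Counter

def train(labeled):
    """Toy stand-in for a DSD model: word counts per symptom label."""
    counts = {}
    for text, label in labeled:
        counts.setdefault(label, Counter()).update(text.lower().split())
    return counts

def predict(model, text):
    """Return (label, confidence); confidence is the winning label's share
    of all matched word counts, or (None, 0.0) if no word is known."""
    words = text.lower().split()
    scores = {label: sum(c[w] for w in words) for label, c in model.items()}
    total = sum(scores.values())
    if total == 0:
        return None, 0.0
    best = max(scores, key=scores.get)
    return best, scores[best] / total

def self_train(seed, unlabeled, threshold=0.8, max_rounds=5):
    """Repeatedly train, pseudo-label confident posts, and re-train."""
    labeled = list(seed)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train(labeled)
        harvested, remaining = [], []
        for text in pool:
            label, confidence = predict(model, text)
            if label is not None and confidence >= threshold:
                harvested.append((text, label))   # confident: harvest it
            else:
                remaining.append(text)            # uncertain: keep for later
        if not harvested:
            break                                 # nothing new was learned
        labeled.extend(harvested)
        pool = remaining
    return train(labeled), labeled

# Hypothetical seed annotations and unlabelled posts.
seed = [("i cannot sleep at night", "sleep"),
        ("i feel so tired and drained", "fatigue")]
unlabeled = ["cannot sleep again", "so tired today",
             "awake again", "nice weather"]
model, labeled = self_train(seed, unlabeled)
```

In this toy run, "awake again" is only harvested in the second round, after the first round's harvest has taught the model the word "again"; that is the effect an iterative loop of this kind relies on. Posts with no known words are never labelled.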
Further, with the help of the DSD model, we develop our Temporal User-level Clinical Depression Detection (TUD) model, which extracts clinical depression scores from a user's Twitter timeline, much as a depression rating scale such as the Patient Health Questionnaire-9 (PHQ-9) would. Finally, we draw conclusions on user-level clinical depression modelling using (1) depression-score-based features, (2) purely semantic-representation-based features, (3) their temporal representations, and (4) experiments with various clinical depression detection settings across several data distributions. To the best of our knowledge, our experiments and analyses are unique in the literature.
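To make the idea of PHQ-9-like scoring over two-week episodes concrete, here is a minimal sketch. It is not the TUD model. It assumes a symptom detector has already tagged each post with PHQ-9 symptom labels (the label names are hypothetical), that episodes are fixed 14-day blocks counted from the first post, that the number of days on which a symptom appears maps to the 0-3 item score as shown, and that the commonly used PHQ-9 screening cutoff of 10 decides the outcome. None of these choices is stated in the abstract.

```python
from collections import defaultdict
from datetime import date, timedelta

EPISODE_DAYS = 14  # two-week episode, as described in the abstract
CUTOFF = 10        # assumption: commonly used PHQ-9 screening cutoff

def item_score(days_with_symptom):
    """Assumed mapping from day count to a PHQ-9-style 0-3 item score."""
    if days_with_symptom == 0:
        return 0
    if days_with_symptom <= 7:
        return 1   # roughly "several days"
    if days_with_symptom <= 11:
        return 2   # roughly "more than half the days"
    return 3       # roughly "nearly every day"

def episode_scores(posts):
    """posts: iterable of (date, set of symptom labels).
    Returns {episode index: summed item scores} for non-empty episodes."""
    posts = sorted(posts, key=lambda p: p[0])
    if not posts:
        return {}
    start = posts[0][0]
    # episode index -> symptom label -> set of days the symptom appeared
    days = defaultdict(lambda: defaultdict(set))
    for day, symptoms in posts:
        episode = (day - start).days // EPISODE_DAYS
        for symptom in symptoms:
            days[episode][symptom].add(day)
    return {
        episode: sum(item_score(len(d)) for d in by_symptom.values())
        for episode, by_symptom in days.items()
    }

def is_depressed(posts):
    """Assumed decision rule: any episode at or above the cutoff."""
    return any(score >= CUTOFF for score in episode_scores(posts).values())

# Hypothetical timeline: four symptoms on 12 consecutive days, then one
# isolated post in the following episode.
start = date(2023, 1, 1)
demo_posts = [
    (start + timedelta(days=i), {"low_mood", "sleep", "fatigue", "appetite"})
    for i in range(12)
]
demo_posts.append((start + timedelta(days=20), {"sleep"}))
demo_scores = episode_scores(demo_posts)
```

Counting distinct days rather than posts keeps a burst of tweets on a single day from inflating an item score, which mirrors how PHQ-9 items ask about frequency over the past two weeks.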
Graduation date

Spring 2023
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/r3-2y0q-mv51
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Computing Science
Supervisor / co-supervisor and their department(s)
- Goebel, Randy (Computing Science)
- Zaiane, Osmar (Computing Science)