This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Development and Validation of an Automated Essay Scoring Framework by Integrating Deep Features of English Language
Open Access


Keywords
automated scoring
feature extraction
essay evaluation
machine learning
large-scale assessment
Degree grantor
University of Alberta
Author or creator
Latifi, Syed Muhammad Fahad
Supervisor and department
Gierl, Mark (Educational Psychology)
Examining committee member and department
Bulut, Okan (Educational Psychology)
Lai, Hollis (Faculty of Medicine)
Cormier, Damien (Educational Psychology)
Gierl, Mark (Educational Psychology)
Department
Department of Educational Psychology
Specialization
Measurement, Evaluation and Cognition
Graduation date
2016-06:Fall 2016
Degree
Doctor of Philosophy
Abstract
Automated scoring methods have become an important topic in the assessment of 21st-century skills. Recent developments in computational linguistics and natural language processing have given rise to more rationally based methods for extracting and modeling language features. The language features from Coh-Metrix are based on theoretical and empirical foundations from psycholinguistics, discourse processing, corpus linguistics, and computing science. The primary purpose of this research was to study the effectiveness of Coh-Metrix features for the development and validation of a three-stage automated essay scoring (AES) framework, using essay samples collected in a standardized testing situation. A second purpose of this study was to evaluate: 1) the scoring concordance and discrepancy between the AES framework and the gold standard, 2) feature informedness as a function of dimensionality reduction, 3) two distinct machine learning methods, and 4) scoring performance relative to human raters and the current state of the art in AES. This study was conducted using methods and processes from data science; however, the foundational methodology comes from the fields of machine learning and natural language processing. Moreover, the human raters were considered the “gold standard” and, hence, the validation process relies primarily on comparing the scores produced by the AES framework with the scores produced by the human raters. The findings from this study clearly suggest the value and effectiveness of Coh-Metrix features for the development of an automated scoring framework. The measures of concordance confirm that the features used to develop the scoring models reliably captured the construct of writing quality, and no systematic pattern of discrepancy was found in the machine scoring. However, the studied features had varying degrees of informedness across essay types, and the ensemble-based machine learning method consistently performed better. On aggregate, the AES framework was found to be superior to the studied state of the art in machine scoring. Finally, the limitations of this study were described and directions for future research were discussed.
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
File Details

File format: pdf (PDF/A)
Mime type: application/pdf
File size: 2799180 bytes
Last modified: 2016:11:16 14:25:04-07:00
Filename: Latifi_SyedMuhammadFahad_201609_PhD.pdf
Original checksum: 6457f7e71b5d1526fe481a2eb68b8304