Are Large Language Models Good Essay Graders?
- Author / Creator
- Kundu, Anindita
We evaluate the effectiveness of Large Language Models (LLMs) in assessing essay quality, focusing on their alignment with human grading processes. Specifically, we investigate the applicability of LLMs such as GPT-3.5T and Llama-2 in the Automated Essay Scoring (AES) task, a crucial natural language processing (NLP) application in education. Our study explores both zero-shot and few-shot learning approaches, employing various prompting techniques to enhance performance.
Using the ASAP dataset, a well-known dataset for the AES task, we compare the numeric grades assigned by the LLMs to the scores provided by human raters. Our research reveals that both GPT-3.5T and Llama-2 generally assign lower scores than the human raters do. Furthermore, neither LLM correlates well with the human scores. In particular, GPT-3.5T tends to be harsher and more misaligned with human evaluations than Llama-2. On the other hand, both LLMs not only reliably detect spelling and grammar mistakes but also seem to take those mistakes into account when computing their scores. Additionally, we extended our analysis to include the most recent release, Llama-3, which shows promising improvements in alignment with human scores. This suggests that newer generations of LLMs have the potential to be more effective in AES tasks. Overall, our results offer a cautiously optimistic view of using LLMs as tools to assist in the grading of written essays, highlighting both their current limitations and their future potential.
- Graduation date
- Fall 2024
- Type of Item
- Thesis
- Degree
- Master of Science
- License
- This thesis is made available by the University of Alberta Library with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.