Methods for determining whether subscore reporting is warranted in large-scale achievement assessments

Babenko, Oksana Illivna

doi:10.7939/R3QP9P

Communities and Collections

Graduate and Postdoctoral Studies (GPS), Faculty of / Theses and Dissertations

Author / Creator

Babenko, Oksana Illivna
Officials of large-scale assessment programs often want to report subscale scores in addition to the total test score. However, beyond the reliability of the reported scores, evidence that subscales reveal real differences in student performance must be obtained to support the reporting of subscale scores. This study examined three methods for determining whether subscore reporting is warranted in large-scale achievement assessments: two correlational methods, namely correlations corrected for attenuation, r’, and the proportional reduction in mean squared error, PRMSE (Haberman, 2005; Sinharay et al., 2007), and the agreement method (Kelley, 1923). Whereas the correlation-based methods consider student performance on pairs of measures in terms of ranked positions, the agreement method takes into account the actual differences between students’ standard scores on the pairs of measures being compared. The correlational methods revealed that, with the possible exception of one subscale difference, the subscales did not differ among themselves or from the total test for the English Reading (N = 128,089) and Mathematics (N = 127,596) assessments considered in this study. In contrast, Kelley’s agreement method indicated that one to five percent of students had differences between their scores on the English Reading subscales that were greater than the difference expected by chance. For the Mathematics assessment, however, the results of the agreement method were uninterpretable, with two exceptions. In agreement with Sinharay et al. (2007), it was concluded that for the detection methods to work, three conditions need to be met: one substantive (a multidimensional construct for which scores are wanted for each dimension) and two statistical (high reliabilities of, and low intercorrelations among, the subscales).
The results for replicated random samples (n = 250, 500, 1,000, 2,000, and 5,000) revealed that the statistics for the three detection methods were accurate and precise estimators of the corresponding population parameters.
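As an illustration of the first correlational method named in the abstract, the standard correction for attenuation divides the observed correlation between two measures by the square root of the product of their reliabilities. The sketch below is a minimal Python example under stated assumptions: the function name and the numeric values are hypothetical and are not taken from the thesis, and it does not reproduce the thesis's PRMSE or agreement analyses.

```python
import math


def disattenuated_correlation(r_xy, rel_x, rel_y):
    """Correct an observed correlation for unreliability in both measures.

    r_xy  : observed correlation between measures X and Y
    rel_x : reliability estimate for X (e.g., coefficient alpha)
    rel_y : reliability estimate for Y
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values for two subscales (illustration only).
r_prime = disattenuated_correlation(r_xy=0.72, rel_x=0.80, rel_y=0.90)
print(round(r_prime, 3))
```

A corrected correlation close to 1 would suggest that, once measurement error is accounted for, the two subscales order students in nearly the same way, which is the kind of evidence the abstract describes as weighing against separate subscore reporting. Sampling error can push the corrected value above 1, so it is best read as an estimate rather than a bounded correlation.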
Subjects / Keywords
- Large-scale assessment of student achievement
- Subscore reporting
Graduation date

Fall 2011
Type of Item

Thesis
Degree

Doctor of Philosophy
DOI

https://doi.org/10.7939/R3QP9P
License

This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for non-commercial purposes. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.

Language

English
Institution

University of Alberta
Degree level

Doctoral
Department
- Department of Educational Psychology
Supervisor / co-supervisor and their department(s)
- Cui, Ying (Educational Psychology)
- Rogers, W. Todd (Educational Psychology)
Examining committee members and their departments
- Anderson, John (Educational Psychology and Leadership Studies)
- Mrazik, Martin (Educational Psychology)
- Parrila, Rauno (Educational Psychology)
- Norris, Stephen (Educational Policy Studies)