  • http://hdl.handle.net/10402/era.27793
  • Methods for determining whether subscore reporting is warranted in large-scale achievement assessments
  • Babenko, Oksana Illivna
  • English
  • subscore reporting
    large-scale assessment of student achievement
  • Sep 30, 2011 3:56 PM
  • Thesis
  • English
  • Adobe PDF
  • 560019 bytes
  • Officials of large-scale assessment programs often want to report subscale scores in addition to the total test score. However, beyond the reliability of the reported scores, evidence that subscales reveal real differences in student performance must be obtained to support the reporting of subscale scores. This study examined three methods for determining whether subscore reporting is warranted in large-scale achievement assessments: two correlational methods, namely correlations corrected for attenuation, r’, and the proportional reduction of the mean squared error, PRMSE (Haberman, 2005; Sinharay et al., 2007), and the agreement method (Kelley, 1923). Whereas the correlation-based methods consider student performances on pairs of measures in terms of ranked positions, the agreement method takes into account actual differences between students’ standard scores on the pairs of measures being compared. The correlational methods revealed that, with one possible subscale difference, the subscales did not differ among themselves or from the total test for the English Reading (N = 128,089) and Mathematics (N = 127,596) assessments considered in this study. In contrast, Kelley’s agreement method revealed that one to five percent of students had differences between their scores on the English Reading subscales that were greater than the difference expected due to chance. However, with two exceptions for the Mathematics assessment, the results of the agreement method were uninterpretable. In agreement with Sinharay et al. (2007), it was concluded that for the detection methods to work, three conditions need to be met: one substantive (a multidimensional construct for which scores are wanted for each dimension) and two statistical (high reliabilities of and low intercorrelations among subscales).
The results for replicated random samples (n = 250, 500, 1,000, 2,000, and 5,000) revealed that the statistics for the three detection methods were accurate and precise estimators of the corresponding population parameters.
  • Doctoral
  • Doctor of Philosophy
  • Department of Educational Psychology
  • Fall 2011
  • Rogers, W. Todd (Educational Psychology)
    Cui, Ying (Educational Psychology)
  • Parrila, Rauno (Educational Psychology)
    Mrazik, Martin (Educational Psychology)
    Norris, Stephen (Educational Policy Studies)
    Anderson, John (Educational Psychology and Leadership Studies)
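
The abstract above names the correlation corrected for attenuation, r’, as one of the correlational methods. As a rough illustration only, the standard Spearman correction divides the observed correlation between two subscales by the square root of the product of their reliabilities. The sketch below is not taken from the thesis: the function name and the input values are hypothetical, chosen only to show the arithmetic.

```python
import math


def disattenuated_correlation(r_xy, rel_x, rel_y):
    """Spearman's correction for attenuation: r' = r_xy / sqrt(rel_x * rel_y).

    r_xy  : observed correlation between two subscale scores
    rel_x : reliability estimate of the first subscale (0 < rel_x <= 1)
    rel_y : reliability estimate of the second subscale (0 < rel_y <= 1)
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Hypothetical values for illustration; these are not results from the thesis.
r_prime = disattenuated_correlation(r_xy=0.72, rel_x=0.80, rel_y=0.90)
print(round(r_prime, 3))
```

A corrected correlation close to 1 would suggest that the two subscales rank students in much the same way, which fits the abstract's conclusion that low intercorrelations among subscales are needed before subscores add information.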