This file is in the following communities:

Graduate Studies and Research, Faculty of


This file is in the following collections:

Theses and Dissertations

Comparing the correctness of classical test theory and item response theory in evaluating the consistency and accuracy of student proficiency classifications (Open Access)


Other title
Evaluating decision consistency and accuracy
Degree grantor
University of Alberta
Author or creator
Gundula, Augustine M
Supervisor and department
Rogers, Todd (Educational Psychology)
Buck, George (Educational Psychology)
Examining committee member and department
Petersen, Stewart (Phys Ed and Rec Faculty)
Plake, Barbara (University of Nebraska-Lincoln)
Bouffard, Marcel (Phys Ed and Rec Faculty)
Whelton, William (Educational Psychology)
Department
Department of Educational Psychology
Specialization
Measurement, Evaluation and Cognition
Degree
Doctor of Philosophy
Abstract
The purposes of this study were 1) to compare the values of decision consistency (DC) and decision accuracy (DA) yielded by three commonly used estimation procedures: the Livingston-Lewis (LL) and compound multinomial (CM) procedures, both of which are based on classical test theory, and Lee’s IRT procedure, which is based on item response theory; and 2) to determine how accurate and precise these procedures are. Two population data sources were used: the Junior Reading (N = 128,103) and Mathematics (N = 127,639) assessments administered by the Education Quality and Accountability Office (EQAO) and the three entrance examinations administered by the University of Malawi (U of M; N = 6,191). To determine the degree of bias and the level of precision for both DC and DA, 100 replicated random samples were selected at each of four sample sizes (n = 1,500, 3,000, 4,500, 6,000) for the EQAO populations and each of two sample sizes (n = 1,500, 3,000) for the U of M population. At the population level, there was an interaction between the three procedures and the four cut-scores. While the differences among the three procedures in the values of DC and DA tended to be small for one or both extreme cut-scores, the differences tended to be larger when the cut-score was closer to the population mean. The IRT procedure tended to provide the highest values for both DC and DA, followed in turn by the CM and LL procedures. At the sample level, the estimates of DC and DA yielded by the three estimation procedures were unbiased and precise. Consequently, the findings at the population level are applicable at the sample level. Therefore, based on the findings of the present study, the compound multinomial procedure should be used to determine DC and DA when classical test score theory is used to analyze a test and its items, and the IRT procedure should be used to determine DC and DA when item response theory is used to analyze a test and its items.
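The replication design described in the abstract can be illustrated with a minimal Python sketch. It does not implement the LL, CM, or Lee IRT procedures, which estimate DC and DA from a single test administration. Instead it assumes the usual textbook definitions (DC as classification agreement between two parallel forms, DA as agreement between observed and true classifications) and computes them directly on simulated data in which the true scores are known. The simulated examinees, the cut-scores, the error variance, and all function names are hypothetical and are not taken from the thesis.

```python
import random
import statistics


def classify(score, cut_scores):
    """Return a category index: the number of cut-scores at or below the score."""
    return sum(score >= cut for cut in cut_scores)


def agreement(pairs, cut_scores):
    """Proportion of (score_a, score_b) pairs that fall in the same category."""
    same = sum(classify(a, cut_scores) == classify(b, cut_scores) for a, b in pairs)
    return same / len(pairs)


def bias_and_precision(population, statistic, n, replications=100, seed=0):
    """Compare estimates from replicated random samples with the population value."""
    rng = random.Random(seed)
    population_value = statistic(population)
    estimates = [statistic(rng.sample(population, n)) for _ in range(replications)]
    bias = statistics.mean(estimates) - population_value  # mean estimate minus population value
    standard_error = statistics.stdev(estimates)  # spread of the estimates across replications
    return bias, standard_error


# Hypothetical simulated examinees: a true score plus two noisy parallel-form scores.
rng = random.Random(1)
examinees = []
for _ in range(20000):
    true_score = rng.gauss(50, 10)
    examinees.append((true_score, true_score + rng.gauss(0, 4), true_score + rng.gauss(0, 4)))

CUTS = [35, 50, 65]  # hypothetical cut-scores, not taken from the study


def decision_consistency(sample):
    """Agreement between classifications based on the two parallel forms."""
    return agreement([(form1, form2) for _, form1, form2 in sample], CUTS)


def decision_accuracy(sample):
    """Agreement between classifications based on the true score and on form 1."""
    return agreement([(true, form1) for true, form1, _ in sample], CUTS)


dc_bias, dc_se = bias_and_precision(examinees, decision_consistency, n=1500)
da_bias, da_se = bias_and_precision(examinees, decision_accuracy, n=1500)
print(f"DC: bias={dc_bias:+.4f}, SE={dc_se:.4f}")
print(f"DA: bias={da_bias:+.4f}, SE={da_se:.4f}")
```

In this sketch a bias near zero and a small standard error would correspond to what the abstract calls unbiased and precise estimates. The study itself applies the same idea to the estimates produced by each of the three procedures rather than to directly observed agreement.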
Permission is hereby granted to the University of Alberta Libraries to reproduce single copies of this thesis and to lend or sell such copies for private, scholarly or scientific research purposes only. Where the thesis is converted to, or otherwise made available in digital form, the University of Alberta will advise potential users of the thesis of these terms. The author reserves all other publication and other rights in association with the copyright in the thesis and, except as hereinbefore provided, neither the thesis nor any substantial portion thereof may be printed or otherwise reproduced in any material form whatsoever without the author's prior written permission.
File Details

File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 469076
Last modified: 2015:10:12 13:10:32-06:00
Filename: Gundula Augustine.pdf
Original checksum: 343bbd1832bcd84f1f0abdf68b8da8f0
Well formed: false
Valid: false
Status message: Invalid page tree node offset=338972
Status message: Unexpected error in findFonts java.lang.ClassCastException: edu.harvard.hul.ois.jhove.module.pdf.PdfSimpleObject cannot be cast to edu.harvard.hul.ois.jhove.module.pdf.PdfDictionary offset=3065
Page count: 100