Educational and Psychological Measurement
Estimating Standard Errors of Cut Scores for Item Rating and Mapmark Procedures: A Generalizability Theory Approach

Ping Yin

ACT, Inc., ping.yin@act.org

James Sconing

ACT, Inc.

Standard-setting methods are widely used to determine the cut scores that examinees must meet on a test to reach a given performance standard. Because standard setting is a measurement procedure, it is important to evaluate the variability of the cut scores that result from the standard-setting process. This study uses generalizability theory to estimate standard errors of cut scores resulting from two standard-setting methods: the item rating (Angoff-based) and mapmark (bookmark-based) methods. Two generalizability (G) study designs and four decision (D) study designs were examined, along with the impact of varying different aspects of the study design and the universes of generalization. Results suggest that cut scores were generally consistent for both methods. The first round of standard setting contributed the most to the overall variability for the mapmark method. The results also make clear that no single standard error is associated with a given cut score.
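The abstract does not spell out the G or D study designs, so the following is only a minimal sketch of the general idea, not the authors' analysis. It assumes a fully crossed raters-by-items design with one rating per cell (as might arise from Angoff-style item ratings), estimates variance components from the usual ANOVA expected mean squares, and then computes a standard error for the mean rating under two hypothetical universes of generalization: raters and items both random, or items treated as fixed. The function names, the design, and the example ratings are all assumptions made for illustration.

```python
import numpy as np


def variance_components(ratings):
    """Estimate variance components for a crossed raters x items design.

    `ratings` is a 2-D array (rows = raters, columns = items) with one
    rating per cell. This single-facet-pair design is an assumption for
    illustration, not the design used in the study.
    """
    x = np.asarray(ratings, dtype=float)
    n_r, n_i = x.shape
    grand = x.mean()
    rater_means = x.mean(axis=1)
    item_means = x.mean(axis=0)

    # Mean squares for raters, items, and the residual (interaction + error).
    ms_r = n_i * np.sum((rater_means - grand) ** 2) / (n_r - 1)
    ms_i = n_r * np.sum((item_means - grand) ** 2) / (n_i - 1)
    resid = x - rater_means[:, None] - item_means[None, :] + grand
    ms_ri = np.sum(resid ** 2) / ((n_r - 1) * (n_i - 1))

    # Solve the expected-mean-square equations; truncate negatives at zero.
    return {
        "rater": max((ms_r - ms_ri) / n_i, 0.0),
        "item": max((ms_i - ms_ri) / n_r, 0.0),
        "residual": ms_ri,
    }


def cut_score_se(vc, n_raters, n_items, items_random=True):
    """Standard error of the mean rating for a D study with the given sizes.

    With items_random=True, error comes from sampling both raters and items.
    With items_random=False, items are fixed and only rater-related
    components contribute.
    """
    var = vc["rater"] / n_raters + vc["residual"] / (n_raters * n_items)
    if items_random:
        var += vc["item"] / n_items
    return float(np.sqrt(var))


if __name__ == "__main__":
    # Hypothetical ratings: 4 raters x 5 items, on a 0-1 probability scale.
    demo = np.array([
        [0.55, 0.70, 0.40, 0.65, 0.80],
        [0.60, 0.75, 0.35, 0.60, 0.85],
        [0.50, 0.65, 0.45, 0.70, 0.75],
        [0.65, 0.80, 0.50, 0.60, 0.90],
    ])
    vc = variance_components(demo)
    for n_raters in (4, 8, 16):
        print(
            n_raters,
            cut_score_se(vc, n_raters, 5, items_random=True),
            cut_score_se(vc, n_raters, 5, items_random=False),
        )
```

Even in this simplified setting, the standard error changes with the number of raters and with whether items are treated as random or fixed, which is one way to read the abstract's point that no single standard error attaches to a given cut score.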

Key Words: standard setting • item rating method • mapmark method • generalizability theory

This version was published on February 1, 2008

Educational and Psychological Measurement, Vol. 68, No. 1, 25-41 (2008)
DOI: 10.1177/0013164407301546

