Valid and Reliable Assessments

ACTE Quality CTE Program of Study Framework

Discussion:

Valid and Reliable Assessments | Origin: HQ103

What stood out to me most was adaptive comparative judgment: judges pick the ‘better’ of two pieces of work, over and over, and the win-loss records sort everything into a reliable rank order without scoring line by line. It hit .96 reliability across judges from different backgrounds, which is remarkable for open-ended work, and I don't need the software; a bracket or a rank-order lineup gets most of the way there. It fixes a problem I'd named but never solved: performance and portfolio work is where trades scoring falls apart, because two instructors looking at the same project disagree. The module also named something I already run: using my advisory council to review assessments is a validity strategy, not just an alignment exercise. The simplest check in it stuck with me too: if students pass my interim assessment but fail the credential exam, the assessment isn't measuring what I thought. The credential exam is the standard candle, the assessment you trust and calibrate the rest of the pathway against. I'll pilot a low-tech comparative-judgment process on portfolio work and make "does this predict the credential exam" an explicit question in my summer cohort's review gate.
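For anyone who wants to try the low-tech version, here is a rough sketch of how I'd tally it, assuming each judge decision is written down as a (winner, loser) pair. The function names and the sample labels are hypothetical, and a plain win share stands in for the adaptive pairing and statistical model the real software uses, so treat it as an illustration rather than the module's method. The second function is the simple validity check: count students who passed the interim assessment but failed the credential exam.

```python
from collections import defaultdict


def rank_by_wins(judgments):
    """Rank items by the share of pairwise comparisons they won.

    judgments: list of (winner, loser) tuples, one per judge decision.
    This is a simple win-share tally, not the adaptive model itself.
    """
    wins = defaultdict(int)
    seen = defaultdict(int)
    for winner, loser in judgments:
        wins[winner] += 1
        seen[winner] += 1
        seen[loser] += 1
    # Highest win share first; ties are broken by label so output is stable.
    return sorted(seen, key=lambda item: (-wins[item] / seen[item], item))


def interim_pass_credential_fail(results):
    """Count students who passed the interim assessment but failed the credential exam.

    results: list of (passed_interim, passed_credential) booleans.
    """
    return sum(1 for interim, credential in results if interim and not credential)


# Hypothetical portfolio labels and judge decisions, for illustration only.
judgments = [
    ("A", "B"), ("A", "C"), ("B", "C"),
    ("A", "B"), ("B", "C"), ("A", "C"),
]
ranking = rank_by_wins(judgments)
print(ranking)
```

A nonzero count from the second function is the prompt for the review-gate question: if students clear the interim assessment and then miss the credential exam, the interim assessment needs another look.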

Related Learning Opportunities

HQ101 - High-Quality CTE: Standards-aligned and Integrated Curriculum

HQ102 - High-Quality CTE: Sequencing and Articulation

HQ103 - High-Quality CTE: Student Assessment

HQ104 - High-Quality CTE: Prepared and Effective Program Staff

HQ105 - High-Quality CTE: Engaging Instruction

HQ106 - High-Quality CTE: Access and Supports

HQ107 - High-Quality CTE: Facilities, Equipment, Technology and Materials

HQ108 - High-Quality CTE: Business and Community Partnerships

HQ109 - High-Quality CTE: Student Career Development

HQ110 - High-Quality CTE: Career and Technical Student Organizations (CTSOs)

HQ111 - High-Quality CTE: Work-Based Learning

HQ112 - High-Quality CTE: Data and Program Improvement