What stood out to me most was adaptive comparative judgment: judges pick the ‘better’ of two pieces of work, over and over, and the win-loss records sort everything into a reliable rank order without scoring line by line. It hit .96 reliability across judges from different backgrounds, which for open-ended work is remarkable, and I don't need the software; a bracket or a rank-order lineup gets most of the way there. It fixes a problem I'd named but never solved: performance and portfolio work is where trades scoring falls apart, because two instructors looking at the same project disagree. The module also named something I already run: using my advisory council to review assessments is a validity strategy, not just alignment. The simplest check in it stuck with me: if students pass my interim assessment but fail the credential exam, the assessment isn't measuring what I thought. The credential exam is the standard candle, the assessment you trust and calibrate the rest of the pathway against. I'll pilot a low-tech comparative-judgment process on portfolio work and make "does this predict the credential exam" an explicit question in my summer cohort's review gate.
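
To make the low-tech version concrete for myself, here is a rough sketch of how a stack of "which is better?" decisions could be turned into a rank order. It is only a simple win-fraction tally, not the adaptive pairing or the statistical model that dedicated comparative-judgment software would use, and the portfolio labels and judgments are made up for illustration. The function name `rank_by_wins` is my own.

```python
from collections import defaultdict


def rank_by_wins(judgments):
    """Rank items by the fraction of comparisons they won.

    judgments: list of (winner, loser) pairs, one per judge decision.
    Returns item labels sorted best to worst; ties break alphabetically.
    """
    wins = defaultdict(int)
    appearances = defaultdict(int)
    for winner, loser in judgments:
        wins[winner] += 1
        appearances[winner] += 1
        appearances[loser] += 1
    # Sort by descending win fraction, then by label for a stable order.
    return sorted(appearances, key=lambda item: (-wins[item] / appearances[item], item))


# Hypothetical data: four portfolios (A-D), each pair judged once.
judgments = [
    ("B", "A"), ("B", "C"), ("B", "D"),
    ("C", "A"), ("C", "D"),
    ("D", "A"),
]

ranking = rank_by_wins(judgments)
print(ranking)
```

With real judges, each pair would ideally be seen by more than one instructor, so a single disagreement doesn't decide the order. A plain tally like this treats every win as equal regardless of who the opponent was, which is one reason it only gets "most of the way there."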