
Common mistakes in rubrics and their disadvantages

Analytic rubrics take more time to create and use.
There are more possibilities for raters to disagree. It is more difficult to achieve intra- and inter-rater reliability on all of the dimensions of an analytic rubric than on the single score yielded by a holistic rubric (see the sketch after this list for one way to measure agreement).
There is some evidence that raters tend to evaluate grammar-related categories more harshly than they do other categories (McNamara, 1996), thereby overemphasizing the role of accuracy in the profile of a learner's proficiency.
There is some evidence that when raters are asked to make multiple judgments, they really make only one. Care must be taken to avoid a "halo effect" and to focus on the individual criteria so that diverse information about the learner's performance is not lost.
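One way to put a number on inter-rater reliability for a single rubric dimension is Cohen's kappa, which corrects raw agreement for chance. A minimal sketch in Python (the two score lists are hypothetical):

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: two-rater agreement on one rubric dimension, corrected for chance."""
    n = len(rater_a)
    # Observed agreement: fraction of papers both raters scored identically.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Expected chance agreement, from each rater's marginal score distribution.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum((freq_a[s] / n) * (freq_b[s] / n) for s in freq_a.keys() | freq_b.keys())
    return (p_o - p_e) / (1 - p_e)

# Hypothetical 1-4 scores from two raters on the same six papers.
print(round(cohens_kappa([4, 3, 3, 2, 4, 1], [4, 3, 2, 2, 3, 1]), 2))  # -> 0.56
```

A kappa near 1 means near-perfect agreement and a kappa near 0 means agreement no better than chance; with an analytic rubric this must be computed separately for each dimension, which is exactly why reliability is harder to achieve than with a single holistic score.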

Michael,

Thanks for the references. When you speak of rater reliability, that is an especially important component when institutions provide the same rubric to many instructors who teach the same course. Thanks for your input.

I wonder if adding very specific quantitative criteria for the grammar category might help with reliability. For example, superior might be 0-1 mistakes, acceptable might be 2-5, and weak might be 6 or more (depending on the assignment).
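To make that concrete, here is a minimal sketch in Python of such a band mapping (the cut-offs are just the hypothetical ones above):

```python
def grammar_rating(error_count: int) -> str:
    """Map a grammar-error count to a rating band (hypothetical cut-offs)."""
    if error_count <= 1:
        return "superior"    # 0-1 mistakes
    if error_count <= 5:
        return "acceptable"  # 2-5 mistakes
    return "weak"            # 6 or more mistakes

print(grammar_rating(3))  # -> acceptable
```

The appeal is that two raters counting the same errors must land in the same band; the trade-off, as the reply below notes, is that the criterion starts to behave like a checklist item rather than a judgment.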

Krystal,

Many people do this, but when we get into quantities, the rubric becomes more of a checklist than a rubric. If you need to do this for a few criteria, you may be OK, but if all of your criteria are like this, it's not a rubric; it's a checklist. Make sense?

Rezaei, A. R., & Lovorn, M. (2010). Reliability and validity of rubrics for assessment through writing. Assessing Writing. doi:10.1016/j.asw.2010.01.003

The above citation (sorry if the format does not display correctly here) is an article that I recently read & critiqued. It is directly related to the initial poster's comment about grammar being a primary focus for raters, and how this causes an inaccurate or unreliable score.
The research indicates that scoring grammar and mechanics is fairly easy for raters. The most difficult task for a rater is determining the level of critical thinking. Therefore, it is the content of student work that is often rated incorrectly. For example, an essay riddled with mechanical errors, yet displaying high-level thinking, will be scored lower than the opposite: an essay with minimal mechanical errors that contains no in-depth thoughts or ideas.

Actually, as a teacher, I view it as a double-edged sword. There are pros and cons for individuals in this type of situation.

Melissa,

Thanks for the resource. It's great when we get resources here as we can continue to build upon them. Thanks for the input.

I think the difficulty sometimes comes from deciding which things deserve a quantitative value and which do not. My course is a general education science course, so the criteria that carry a quantitative value often come down to whether the students get a certain number of facts straight.

Dania,

Interesting thought. Quantity or quality? That gives us food for thought.

One mistake I have noticed over and over in rubrics that have been provided to me is awarding points for work that is not performed. Using a discussion rubric as an example: responding to two other students by the deadline may be worth 5 points, responding to one student by the deadline may be worth 3 points, and failing to respond may be worth 1 point. Should students receive a point for no work? Is the distribution for the other posts fair?
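Encoded as a minimal sketch in Python (the point values are the hypothetical ones above), the problem is easy to see:

```python
# Hypothetical discussion-rubric point scheme as described above.
POINTS_BY_RESPONSE_COUNT = {
    2: 5,  # responded to two other students by the deadline
    1: 3,  # responded to one student by the deadline
    0: 1,  # failed to respond, yet still earns a point
}

def discussion_points(responses_by_deadline: int) -> int:
    # Treat more than two responses the same as two (the rubric's maximum).
    return POINTS_BY_RESPONSE_COUNT[min(responses_by_deadline, 2)]

print(discussion_points(0))  # -> 1 point awarded for no work performed
```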

Angela,

I agree. I do not provide any points for work not performed. I have a zero (0) column. Thanks for your input.
