I do not believe in "grading on the curve." That approach sets grading standards based entirely on the particular group of students who happen to be in the course at the time. Instead, I set my own standards and created my assessments to reflect them; I graded on a percent-correct basis.

Students knew the percent-correct cut-offs that were used to assign grades. Total points from the exams and the in-class quizzes were summed to assign final course grades. I did "wiggle" the cut-offs down (never up) when that seemed appropriate as I assigned final grades. For example, a student who earned 85% of the available points overall was guaranteed an A, but sometimes that cut-off was lowered to 84% of the available points. This process allowed me to compensate for exams that turned out to be too difficult.
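As a concrete illustration, here is a minimal sketch of this kind of percent-correct grade assignment with a downward "wiggle." Only the 85% cut-off for an A comes from the example above; the other cut-offs, the letter-grade scale, and the wiggle amount are hypothetical placeholders, not the actual values used in the course.

```python
# Minimal sketch: assign a letter grade from the percent of available points
# earned, with an optional downward "wiggle" of the cut-offs.
# Only the 85% A cut-off comes from the text; everything else is hypothetical.

def assign_grade(points_earned, points_possible, wiggle=0.0):
    """Return a letter grade; `wiggle` lowers every cut-off by that many
    percentage points (cut-offs are only ever lowered, never raised)."""
    percent = 100.0 * points_earned / points_possible
    cutoffs = [(85.0, "A"), (75.0, "B"), (65.0, "C"), (55.0, "D")]  # hypothetical except 85
    for cutoff, letter in cutoffs:
        if percent >= cutoff - wiggle:
            return letter
    return "F"

# Example: 84% misses the published 85% A cut-off,
# but earns an A once the cut-off is wiggled down by one point.
print(assign_grade(84, 100))              # "B"
print(assign_grade(84, 100, wiggle=1.0))  # "A"
```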

I also looked at the pattern of grades each student received across all assessments before assigning final grades. Because statistics is a hierarchical discipline, my tests were also hierarchical: items on each successive exam required understanding of concepts from earlier parts of the course. If a student fell on the borderline between two grades, I let the grades on assessments near the end of the course decide, because the hierarchical structure gave students multiple opportunities to show their understanding of earlier material.

Students received one point for each multiple-choice and matching item they answered correctly. The point values for the problem and interpretation items varied depending on what was required. I scored those analytically, based on a rubric I created before grading the exams (although I sometimes altered the rubric while grading when students did things I hadn't anticipated), and students could receive partial credit on them. If a student made a numerical mistake that rendered all subsequent answers incorrect, I did not give credit for the step containing the mistake, but I graded the rest of the item using the student's incorrect answer as the basis for the subsequent work. I was interested in what students did and did not understand, so I do not believe in allowing a simple numerical mistake to invalidate all of the remaining parts of an item.
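To make the follow-through idea concrete, here is a hypothetical sketch of scoring a two-part statistics item in which part (b) is graded against the answer implied by the student's own part (a) result. The item, numbers, point values, and tolerance are invented for illustration; the actual grading was done by hand against a rubric.

```python
# Hypothetical illustration of "follow-through" scoring: a later part of an
# item is graded against what the correct answer would be GIVEN the student's
# (possibly incorrect) earlier answer, so one numerical slip does not
# invalidate the rest of the item.

def score_follow_through(student_mean, student_z, x=75.0, sd=10.0, tol=0.01):
    """Part (a): compute the mean (suppose the correct mean is 70).
       Part (b): compute z = (x - mean) / sd using the mean from part (a)."""
    points = 0
    if abs(student_mean - 70.0) <= tol:
        points += 1  # part (a) correct
    # Part (b) is checked against the student's own mean, right or wrong.
    expected_z = (x - student_mean) / sd
    if abs(student_z - expected_z) <= tol:
        points += 1  # credit for correct method even if part (a) was wrong
    return points

# A student who miscomputes the mean as 72 but applies the z formula correctly
# still earns the part (b) point: z = (75 - 72) / 10 = 0.30.
print(score_follow_through(student_mean=72.0, student_z=0.30))  # 1
```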

I always did a simple item analysis to examine the quality of my items and to identify common student misunderstandings. I selected two groups of four or five test papers: one group contained the highest total test scores (the high achievers) and the other the lowest total test scores (the low achievers). For each item, I calculated the percentage of students in the two groups combined who answered it correctly; this served as an estimate of the percentage of the entire class who answered the item correctly, and therefore of the item's difficulty and of how many students understood the concepts underlying it. I also calculated the percentage of students in each group separately who answered the item correctly, and subtracted the low achievers' percentage from the high achievers' percentage. Because my tests were meant to differentiate among students based on their understanding, this difference estimated how well each item succeeded in doing so. When most items differentiated the two groups, the test exhibited reasonable internal consistency (or reliability).
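The calculation itself is simple enough to sketch. The code below estimates each item's difficulty from the combined high and low groups and its discrimination as the difference in percentage correct between the two groups; the group sizes and item scores shown are hypothetical, not data from an actual class.

```python
# Sketch of the item analysis described above, on hypothetical data.

def item_analysis(high_group, low_group):
    """high_group, low_group: lists of papers; each paper is a list of 0/1 item scores.

    Returns, for each item, (difficulty, discrimination):
      difficulty     = % of the combined groups answering the item correctly
      discrimination = % correct in the high group minus % correct in the low group
    """
    n_items = len(high_group[0])
    results = []
    for i in range(n_items):
        p_high = 100.0 * sum(paper[i] for paper in high_group) / len(high_group)
        p_low = 100.0 * sum(paper[i] for paper in low_group) / len(low_group)
        combined = high_group + low_group
        difficulty = 100.0 * sum(paper[i] for paper in combined) / len(combined)
        results.append((difficulty, p_high - p_low))
    return results

# Hypothetical example: 4 high achievers and 4 low achievers, 3 items.
high = [[1, 1, 0], [1, 1, 1], [1, 1, 0], [1, 0, 1]]
low  = [[1, 0, 0], [0, 0, 0], [1, 0, 1], [0, 0, 0]]
for i, (diff, disc) in enumerate(item_analysis(high, low), start=1):
    print(f"Item {i}: {diff:.0f}% correct overall, discrimination {disc:+.0f} points")
```

An item that nearly everyone misses (very low difficulty percentage) or that fails to separate the two groups (discrimination near zero) is the kind of item flagged for removal or rewording, as described next.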

This process also allowed me to identify items that were too difficult (or ambiguous). When I found extremely difficult items, I removed them from the students' test scores and used them as extra credit.
