
Comparing Rubrics

Analysis on the Explore Tab.

The purpose of this report is to determine the extent to which performance on one rubric predicts performance on another. It could be used, for example, to determine whether coursework scores predict field performance.

If there is a significant relationship, the report will show you a p-value. A p-value below .001 means there is less than a 1 in 1000 chance that a relationship this strong would arise at random. A result like this suggests that the relationship is systematic, for example a product of teaching and learning, and not a chance pattern in the data.

The question here is whether performance on one assessment is significantly related to performance on another selected assessment, and whether this relationship (if any) is anything but random. You can see the relationship demonstrated in the scatter plot shown below:

Here is an excerpt from an example report…

Rubric 26: RPR S Planning and management of instruction
Rubric 23: RPR P Instructional strategies for reading and writing
118 students have valid scores on both these rubrics.


Crosstab

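Rows show the score on rubric 23; columns show the score on rubric 26.

| Score on rubric 23 | 26: 0 | 26: 1 | 26: 2 | 26: 3 |
|---|---|---|---|---|
| 0 | 1 | 1 | - | - |
| 1 | 1 | 2 | 1 | - |
| 2 | - | 5 | 89 | 6 |
| 3 | - | 2 | 9 | 1 |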

…this part of the report shows the combined distribution of scores across the two rubrics. Review it visually first, to see whether the pattern suggests problems that might arise in interpreting the data. It is quite common, for example, for the vast majority of the scores to fall in a single category, in which case any interpretation will be based on only a few outliers. You can also see relationships well in the scatter plot above.
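As a rough illustration of that visual check, the sketch below measures how concentrated the example crosstab is. The cell counts are copied from the excerpt; the function name `modal_cell_share` is hypothetical and is not part of the report itself.

```python
# Cell counts copied from the example crosstab:
# (score on rubric 23, score on rubric 26) -> number of students.
CROSSTAB = {
    (0, 0): 1, (0, 1): 1,
    (1, 0): 1, (1, 1): 2, (1, 2): 1,
    (2, 1): 5, (2, 2): 89, (2, 3): 6,
    (3, 1): 2, (3, 2): 9, (3, 3): 1,
}


def modal_cell_share(crosstab):
    """Return the most populated cell and the fraction of all students in it."""
    total = sum(crosstab.values())
    cell, count = max(crosstab.items(), key=lambda item: item[1])
    return cell, count / total


total_students = sum(CROSSTAB.values())
cell, share = modal_cell_share(CROSSTAB)
print(total_students, cell, round(share * 100, 1))
```

In this example about three quarters of the 118 students sit in the single cell where both scores are 2, so any relationship found rests on the remaining students.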

The next section of the report shows overall statistics for the ‘target’ rubric (rubric 23 in this example)…

Overall mean score on 23: 2.0
Standard deviation: 0.45
Standard error of mean: 0.04

… this is the mean and standard deviation on rubric 23, across all samples. The ‘standard error of mean’ (SEM) is the standard deviation divided by the square root of (N-1), the usual sampling formula for this statistic. In general terms, the SEM indicates how accurately we have estimated the mean. If the SEM is large, differences are unlikely to be significant, because they will be ‘buried in the noise’, as an engineer would say.
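The figures above can be reproduced from the crosstab as a check. This is a minimal sketch, assuming the standard deviation is the population form (dividing by N) before the SEM divides by the square root of (N-1); the report does not say which form it uses, and both give the same rounded values here. The name `mean_sd_sem` is hypothetical.

```python
import math

# Students at each score level on rubric 23 (row totals of the example crosstab).
COUNTS_23 = {0: 2, 1: 4, 2: 100, 3: 12}


def mean_sd_sem(counts):
    """Return (mean, standard deviation, standard error of mean) for grouped scores."""
    n = sum(counts.values())
    mean = sum(score * k for score, k in counts.items()) / n
    # Assumption: population variance (divide by N).
    variance = sum(k * (score - mean) ** 2 for score, k in counts.items()) / n
    sd = math.sqrt(variance)
    # SEM as described in the text: SD divided by the square root of (N - 1).
    sem = sd / math.sqrt(n - 1)
    return mean, sd, sem


mean, sd, sem = mean_sd_sem(COUNTS_23)
print(round(mean, 1), round(sd, 2), round(sem, 2))
```

Rounded as shown, the values agree with the report's 2.0, 0.45 and 0.04.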

The final section of the report attempts to find significant differences in performance on the second rubric, among groups whose performance differed on the first rubric…

| Criterion Level | % equal or higher on 26 | mean score on 23 | 't' statistic | remarks |
|---|---|---|---|---|
| 2 | 89.8 | 2.1 | 1.23 | Students scoring 2 or more on the 'RPR S Planning and management of instruction' did not differ significantly on the 'RPR P Instructional strategies for reading and writing' (p>.10) |
| 3 | 5.9 | 2.2 | 2.62 | Students scoring 3 or more on the 'RPR S Planning and management of instruction' scored much better than average on the 'RPR P Instructional strategies for reading and writing' (p<.001) |

…each row looks at the group of students who performed at or above a specific level on the first rubric. For example, the last row looks at students who performed at 3.0 or higher on rubric 26.

The ‘% equal or higher’ column shows what proportion this group represents of the whole population.

The ‘mean score on’ column shows the average score of this group on the second rubric, which can be compared with the mean for the whole sample. In this case, the mean for the subgroup was 2.2, compared to 2.0 for the whole sample (indicated previously).

The ‘t’ statistic is the difference between this group mean and the overall mean, expressed in units of the SEM. This statistical measure is the simplest way of comparing two populations. High numbers are ‘significant.’ If the number is negative, this indicates that the subgroup performed less well on the target rubric – similar to a negative correlation.
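The calculation described here can be sketched from the example crosstab. This assumes the SEM is the population standard deviation divided by the square root of (N-1), as in the formula given earlier; the function name `criterion_row` is hypothetical and the sketch is not the report's actual implementation.

```python
import math

# Cell counts copied from the example crosstab:
# (score on rubric 23, score on rubric 26) -> number of students.
CROSSTAB = {
    (0, 0): 1, (0, 1): 1,
    (1, 0): 1, (1, 1): 2, (1, 2): 1,
    (2, 1): 5, (2, 2): 89, (2, 3): 6,
    (3, 1): 2, (3, 2): 9, (3, 3): 1,
}


def criterion_row(crosstab, level):
    """Return (% at or above `level` on 26, subgroup mean on 23, t statistic)."""
    n = sum(crosstab.values())
    mean = sum(s23 * k for (s23, _), k in crosstab.items()) / n
    # Assumption: population variance, then SEM = SD / sqrt(N - 1).
    variance = sum(k * (s23 - mean) ** 2 for (s23, _), k in crosstab.items()) / n
    sem = math.sqrt(variance) / math.sqrt(n - 1)

    # Subgroup: students scoring at or above the criterion level on rubric 26.
    group = {cell: k for cell, k in crosstab.items() if cell[1] >= level}
    group_n = sum(group.values())
    group_mean = sum(s23 * k for (s23, _), k in group.items()) / group_n

    # t: subgroup mean minus overall mean, in units of the SEM.
    return 100 * group_n / n, group_mean, (group_mean - mean) / sem


rows = {level: criterion_row(CROSSTAB, level) for level in (2, 3)}
for level, (pct, group_mean, t) in rows.items():
    print(level, round(pct, 1), round(group_mean, 2), round(t, 2))
```

Under these assumptions the percentages (89.8 and 5.9) and 't' values (1.23 and 2.62) match the report. The level 3 subgroup mean comes out at about 2.14 where the report displays 2.2, so the report may round that column differently.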

The ‘remarks’ column provides a verbal interpretation of the results on this row. It will say either that the group ‘did not differ significantly’ (p>.10), that it was ‘better’ (or worse) with p<.10, or that it was ‘much better’ (or worse) with p<.001. In this case, the first row showed no significant difference, and the second row showed a very significant difference. ‘Significant’ here is used in its statistical sense and simply means ‘unlikely to have arisen by chance’. It is up to the reader to assess whether a difference of only 0.2 of a score level is ‘significant’ in real-world terms.

We use a process of separating out individual groups here, rather than the more common regression-correlation statistics, for a number of reasons. Firstly, assessment data is usually highly concentrated, and not spread across the whole range of scores. Regression is dominated by the effects of outliers, and is likely to provide a misleading picture. Secondly, the number of assessments is often quite small, and the correlation is likely to be non-significant overall, even where individual comparisons may have revealed a prediction. Finally, the analysis of individual levels allows us to derive a simple English-language statement supported by the data, where a correlation is often very global and hard to interpret.

Norms are not needed for these statistics – the analysis provides its own control variables.

Sampling

Assessments are included in the analysis wherever a single student has been assessed on both rubrics. Both assessments must have been fully completed; partial assessments are not counted. Again, the units of comparison are assessments on the whole rubric, not on individual criteria.
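The inclusion rule can be illustrated with a short sketch. The record layout (`student`, `rubric`, `score`, `complete`) and the function name `paired_scores` are assumptions made for this example, as is the simplification that each student has at most one assessment per rubric; the sample records are made up.

```python
def paired_scores(assessments, first_rubric, second_rubric):
    """Return {student: (score on first rubric, score on second rubric)}
    for students with a fully completed assessment on both rubrics."""
    scores = {first_rubric: {}, second_rubric: {}}
    for record in assessments:
        # Partial assessments and other rubrics are ignored.
        if record["complete"] and record["rubric"] in scores:
            scores[record["rubric"]][record["student"]] = record["score"]
    in_both = scores[first_rubric].keys() & scores[second_rubric].keys()
    return {
        student: (scores[first_rubric][student], scores[second_rubric][student])
        for student in sorted(in_both)
    }


# Made-up records for illustration only.
assessments = [
    {"student": "A", "rubric": 26, "score": 2, "complete": True},
    {"student": "A", "rubric": 23, "score": 2, "complete": True},
    {"student": "B", "rubric": 26, "score": 3, "complete": True},
    {"student": "B", "rubric": 23, "score": 2, "complete": False},  # partial: excluded
    {"student": "C", "rubric": 26, "score": 1, "complete": True},   # one rubric only: excluded
]

pairs = paired_scores(assessments, 26, 23)
print(pairs)
```

Only student A contributes a pair here, because B's second assessment is partial and C was assessed on one rubric only.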

After running the report, you can save it externally or keep a record of it in the library.
