Inter-Rater Reliability Reporting in Taskstream LAT by Watermark

If you have used the Multiple Evaluation or Outside Evaluation feature in any of your DRF Programs, the Performance by DRF Category Report shows the full multiple evaluation history and lets you compare specific evaluators.

Reporting for Multiple Evaluations

To access the Performance by DRF Category Report, first click the TS Coordinator menu, then click the DRF Program Reports link.

A navigation page showing the DRF assessment system options, with the DRF Program Reports link highlighted among the available tools for templates, programs, and reporting.

Click the link for the Performance by DRF Category Report.

A Performance/Outcome Assessment Reports menu showing the option titled Performance by DRF Category highlighted, with a description indicating that the report shows outcomes for individuals or groups on each requirement or category of the DRF.

Select whether to run the report on a single DRF template or on one or more DRF programs.

A report selection panel showing options to choose a DRF template or program for running a report, with radio buttons labeled Select by DRF Template and Select by Program.

Select on whom you wish to report.

A report selection panel showing options for choosing whom to include in the report, including all authors in selected programs, a random sample of authors, authors evaluated by a particular evaluator, authors grouped with a particular evaluator, a single author with a name entry field, advanced search filters, and authors eligible for an outside evaluation.

All Authors in one or more programs using selected DRF Template or Program(s)

This option provides a view of the names of the Programs where the selected DRF Template was used. Select the Programs you want to run reports on.

Random sample of Authors in one or more Programs

This option prompts you to select the sample size of the group you want to return. You can include anywhere from 1% to 50% of the Authors enrolled in the Program. You will also need to choose the Programs from which to pull the sample.

All Authors evaluated by a particular Evaluator

This option provides a list of Evaluators in each Program using the selected DRF Template. Select the Evaluators you wish to include in your report.

All Authors grouped with a particular Evaluator

This option provides a list of Evaluators in each Program using the selected DRF Template with whom Authors are grouped. Select the Evaluators you wish to include in your report.

A single Author

Type a name in the search box provided.

Advanced search

This option lets you filter by the demographics collected by your organization.

A notice panel showing a message that organizations not currently collecting demographic information can contact the support team at support@watermarkinsights.com to explore filtering options for reporting data.

Authors that are eligible for an outside evaluation (enabled by special permission)

These evaluations serve as a basis for comparison with the original evaluation (e.g., for measuring inter-rater reliability). If you select this option, the next screen prompts you to select the programs on which to run a report comparing outside evaluations with final evaluations. You will not be able to filter by date.

In the Filter by evaluation date area, you can choose whether you want to:

  • Include all evaluations
  • Include only items evaluated between certain dates

Please note: When there are multiple evaluations for a single item, the system ONLY includes the latest evaluation.
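
As a rough illustration of the note above, the following Python sketch keeps only the most recent evaluation for each item. The record layout and the function name latest_per_item are hypothetical and are not part of Taskstream LAT. How this rule interacts with the date range filter is not stated in this article, so the sketch does not model it.

```python
from datetime import date

# Hypothetical evaluation records: (author, item, evaluation date, score).
evaluations = [
    ("Author A", "Science Standards", date(2023, 3, 1), 2.0),
    ("Author A", "Science Standards", date(2023, 4, 15), 3.0),
    ("Author B", "Science Standards", date(2023, 3, 20), 4.0),
]


def latest_per_item(records):
    """Keep only the most recent evaluation for each (author, item) pair."""
    latest = {}
    for author, item, evaluated_on, score in records:
        key = (author, item)
        # Replace the stored evaluation only if this one is more recent.
        if key not in latest or evaluated_on > latest[key][0]:
            latest[key] = (evaluated_on, score)
    return latest


result = latest_per_item(evaluations)
```

In this sample data, Author A has two evaluations for the same item, and only the later one is retained.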

A filter panel showing options to include all evaluations or limit items by an evaluation date range, with fields for entering start and end dates, and a Continue button highlighted along with a Cancel button.

Click Continue to generate the report.

From the main results page, click the link (or magnifying glass icon) for a category or requirement in the DRF to view the details for that area.

A Folio Areas Assessed panel showing a linked item labeled Science Standards, accompanied by a magnifying‑glass icon, with the access level listed as Subset.

The category/requirement report shows the performance of the Authors who have access to the DRF area selected.

A DRF Program Report page showing final scores for authors, with filters for changing the view, displaying demographics, and jumping to specific rubric tasks. The table lists author names, final scores, rubric links, individual criterion scores, average rubric score, submission dates, evaluation dates, and evaluator names.

From the Change View drop-down menu, choose Multiple Evaluation & Reconciliation History, then click Go.

A DRF Program Report settings panel showing options to change the report view using a dropdown menu set to Multiple Evaluation & Reconciliation History, with a Go button highlighted. Additional filters include selecting authors, comparing ranges, displaying demographics, and showing individual evaluations, standard deviation, or rubric criterion scores.

The initial view shows how many evaluations each Author’s submission received, along with the results if multiple evaluations were performed. We recommend selecting the checkboxes in the Show menu to display additional details.

A DRF Program Report showing individual evaluations, standard deviation, and rubric criterion scores selected and highlighted, with a results table listing an author’s evaluation details, including completion count, average score, range, standard deviation, reconciled evaluation scores, evaluator names, and individual criterion scores for each rubric category.
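
To make the summary columns concrete, here is a minimal Python sketch of how an average, range, and standard deviation could be derived from one submission's scores. The function name summarize and the sample scores are hypothetical. The sketch assumes the sample standard deviation; this article does not say whether Taskstream LAT uses the sample or the population form.

```python
import statistics


def summarize(scores):
    """Summarize the scores given to one submission by several evaluators."""
    return {
        "completed": len(scores),
        "average": statistics.mean(scores),
        "range": max(scores) - min(scores),
        # Assumption: sample standard deviation (divides by n - 1).
        "std_dev": statistics.stdev(scores) if len(scores) > 1 else None,
    }


summary = summarize([3, 4, 2])
```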

You can also compare results between two specific evaluators using the Compare menus.

A comparison interface showing two dropdown menus for selecting names to compare, with a ‘Go’ button.

If there is enough data to compare between the two evaluators, the system will calculate the percent agreement as well as the Pearson Correlation Coefficient.

A Summary Statistics section showing percent agreement with score pair details, an input field for adjusting the score range with a Change button, and a Pearson Correlation Coefficient value with score pair details.

When you use the Compare menus to compare two Evaluators:

  • One Evaluator’s scores are compared against the other Evaluator’s scores.
  • For each Author evaluated by the selected Evaluators, you can see a column for the individual evaluations completed by each Evaluator and a column for the range between those scores.

What is the Percent Agreement?

Percent agreement is the number of evaluations where the two Evaluators' scores fall within a certain score range of each other, divided by the number of evaluations where both Evaluators have entered a score (the number of score pairs).

To change the score range:

  1. Enter the score range you want to use.
  2. To display those evaluations completed by both selected Evaluators that fall within your indicated range, click Change.
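
The percent agreement calculation described above can be sketched in Python as follows. This is an illustration only, not Taskstream LAT's implementation: the function name percent_agreement is hypothetical, and the sketch assumes that two scores agree when their absolute difference is no larger than the score range you enter.

```python
def percent_agreement(pairs, score_range=0.0):
    """Return (percent, agreeing pairs, total score pairs)."""
    # Only evaluations where both Evaluators entered a score form a score pair.
    valid = [(x, y) for x, y in pairs if x is not None and y is not None]
    if not valid:
        return None, 0, 0
    # Assumption: agreement means the scores differ by at most score_range.
    agreeing = sum(1 for x, y in valid if abs(x - y) <= score_range)
    return 100.0 * agreeing / len(valid), agreeing, len(valid)


# Hypothetical scores from two Evaluators; None means no score was entered.
pairs = [(3, 3), (4, 3), (2, 4), (None, 3), (1, 1)]

exact = percent_agreement(pairs)        # score range of 0: exact matches only
adjacent = percent_agreement(pairs, 1)  # scores within 1 point of each other
```

With this sample data there are four score pairs. Two match exactly (50%), and three fall within one point of each other (75%), which shows how widening the score range raises the reported agreement.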

What is the Pearson Correlation Coefficient?

The Pearson Correlation Coefficient is one of several statistics that can be used to measure inter-rater reliability.

A notice section showing a message explaining that Taskstream LAT provides the computation regardless of appropriateness and recommending users have or acquire expertise in inter‑rater reliability before using the coefficient for analysis.

The formula for Pearson Correlation Coefficient is as follows:

\[
r = \frac{\sum XY - \dfrac{(\sum X)(\sum Y)}{N}}{\sqrt{\left(\sum X^2 - \dfrac{(\sum X)^2}{N}\right)\left(\sum Y^2 - \dfrac{(\sum Y)^2}{N}\right)}}
\]

Where:
r = Pearson Correlation Coefficient
X = Eval 1
Y = Eval 2
N = the number of score pairs
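
For readers who want to check a value by hand, the following Python sketch applies the formula above, using the same X, Y, and N definitions. The function name pearson_r and the sample scores are hypothetical; this is not Taskstream LAT's code. Returning None when the coefficient is undefined is a choice made for this sketch.

```python
import math


def pearson_r(pairs):
    """Pearson Correlation Coefficient for a list of (X, Y) score pairs."""
    n = len(pairs)
    if n < 2:
        return None  # not enough score pairs to compare
    sum_x = sum(x for x, _ in pairs)
    sum_y = sum(y for _, y in pairs)
    sum_xy = sum(x * y for x, y in pairs)
    sum_x2 = sum(x * x for x, _ in pairs)
    sum_y2 = sum(y * y for _, y in pairs)

    # Numerator: sum of cross-products minus the product of sums divided by N.
    numerator = sum_xy - (sum_x * sum_y) / n
    # Denominator: square root of the product of the X and Y terms.
    term_x = sum_x2 - (sum_x ** 2) / n
    term_y = sum_y2 - (sum_y ** 2) / n
    if term_x * term_y <= 0:
        return None  # undefined when one Evaluator gave identical scores
    return numerator / math.sqrt(term_x * term_y)


# Hypothetical pairs: (Eval 1 score, Eval 2 score) for four Authors.
r = pearson_r([(1, 2), (2, 4), (3, 6), (4, 8)])
```

In this sample, Eval 2 is always exactly double Eval 1, so the scores are perfectly linearly related even though the two Evaluators never give the same score. This is one reason percent agreement and the correlation coefficient can tell different stories.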

If a rubric was used for evaluation, you can also compare the criterion scores given by the selected Evaluators. The Summary Statistics display the Percent Agreement and Pearson Correlation Coefficient between the individual rubric criterion scores for all valid pairs.
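
One way to picture the criterion-level comparison is sketched below in Python. The data layout, criterion names, and the function name criterion_pairs are hypothetical. The sketch assumes a valid pair is a criterion that both Evaluators scored for the same Author; this article does not define the term more precisely.

```python
# Hypothetical rubric criterion scores per Author for two Evaluators.
eval_1 = {
    "Author A": {"Content": 3, "Organization": 4},
    "Author B": {"Content": 2, "Organization": None},
}
eval_2 = {
    "Author A": {"Content": 3, "Organization": 3},
    "Author B": {"Content": 4, "Organization": 2},
}


def criterion_pairs(first, second):
    """Collect (author, criterion, x, y) where both Evaluators gave a score."""
    pairs = []
    for author, criteria in first.items():
        other = second.get(author, {})
        for criterion, x in criteria.items():
            y = other.get(criterion)
            # Assumption: a pair is valid only if both scores are present.
            if x is not None and y is not None:
                pairs.append((author, criterion, x, y))
    return pairs


pairs = criterion_pairs(eval_1, eval_2)
exact_matches = sum(1 for _, _, x, y in pairs if x == y)
percent = 100.0 * exact_matches / len(pairs)
```

Here the unscored Organization criterion for Author B is dropped, leaving three valid pairs, one of which is an exact match.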

Reporting for Outside Evaluations

To report on Outside Evaluations for a requirement in your DRF Program, first click the TS Coordinator menu, then click the DRF Program Reports link.

A navigation screen showing the Taskstream LAT header with menu options and a highlighted TS Coordinator tab, along with instructions to click the DRF Program Reports link, and a DRF Assessment System section showing various program options including a highlighted DRF Program Reports link.

Click the link for the Performance by DRF Category Report.

A report menu section showing Performance/Outcome Assessment Reports with a highlighted Performance by DRF Category link and a description explaining that it shows outcomes for individual or group on each requirement or category of the DRF.

Select whether to run the report on a single DRF template or on one or more DRF programs.

A selection panel showing options to choose a DRF template or programs for running a report, with radio buttons for Select by DRF Template and Select by Program.

In the Select on whom you wish to report section, select Authors that are eligible for an outside evaluation.

A selection panel showing multiple options for choosing whom to run the report on, including authors in programs, random samples, authors evaluated by or grouped with an evaluator, a single author with a name field, and an advanced search option, with the highlighted choice for authors that are eligible for an outside evaluation.

In the Filter by evaluation date area, you can choose whether you want to:

  • Include all evaluations
  • Include only items evaluated between certain dates

Please note: When there are multiple evaluations for a single item, the system ONLY includes the latest evaluation.

A filter panel showing options to include all evaluations or limit items by a selected date range with start and end date fields, along with Cancel and a highlighted Continue button.

Click Continue to generate the report.

In the main results, you will see a summary of results for completed outside evaluations.

A results table showing folio areas requiring outside evaluation, the number of authors evaluated, the highlighted Results for Group (Outside Evaluation) column, and corresponding average scores with bar graphs.

Click the link or magnifying glass for a specific requirement to drill into the detailed results.

A list of folio areas requiring outside evaluation showing links for Science Standards and the highlighted Instructional Design option, each with a magnifying glass icon.

The detailed results let you compare the original final evaluation that was released to the student with the outcome of the outside evaluation. The percent agreement and Pearson Correlation Coefficient are also calculated here. For an explanation of these calculations, see "What is the Percent Agreement?" and "What is the Pearson Correlation Coefficient?" earlier in this article.

A scoring table showing author names with email addresses, their recorded scores, and buttons to view work and evaluations for each entry, along with a Summary Statistics panel showing percent agreement with score pair details, a score range field with a Change button, and the Pearson Correlation Coefficient value.

Click the link below to compare the Taskstream features of Multiple Evaluations versus Outside Evaluation:

Multiple Evaluation vs. Outside Evaluation
