Data Science at Watermark Insights
The Predict functionality for the Student Success & Engagement (SS&E) platform is built and maintained by the Data Science team at Watermark Insights. This team sits at the center of the Watermark Suite, using data from multiple sources to align and inform the SS&E Predict functionality. As the team deepens its understanding of how data relates across platforms, what it learns strengthens each individual platform.
Outcomes Measured
As of January 1st, 2024, predictions offered by Data Science for the SS&E solution focus on the two student outcomes defined below. These are the two outcomes predicted for the risk levels and risk factors analyses, and define the boundaries for what success means within SS&E.
Persistence:
When a student graduates, earns a credential, or attempts credit in an academic term within six months after the evaluated term ends. Commonly called “term-to-term persistence” or “term-to-term retention”.
Course Completion:
When a student earns an A, B, C, Pass, or Satisfactory in the enrolled course. D grades, Fails, and Withdrawals do not count as a completed course.
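The two outcome definitions above can be read as simple yes/no rules. The following is a minimal sketch, not Watermark's implementation: the function names, the grade spellings, and the approximation of "six months" as 183 days are all assumptions made for illustration, as is the choice to treat a later term as "within six months" when its start date falls inside that window. Grade variants such as plus/minus grades are not covered by the definition and are not handled here.

```python
from datetime import date, timedelta

# Grades that count as a completed course, per the definition above.
# The exact spellings are an assumption for this sketch.
COMPLETING_GRADES = {"A", "B", "C", "PASS", "SATISFACTORY"}

# Assumption: "six months" is approximated as 183 days.
PERSISTENCE_WINDOW = timedelta(days=183)


def completed_course(grade: str) -> bool:
    """Return True if the grade counts as a completed course."""
    return grade.strip().upper() in COMPLETING_GRADES


def persisted(term_end: date,
              earned_credential: bool,
              later_term_starts: list[date]) -> bool:
    """Return True if the student meets the persistence definition.

    later_term_starts holds the start dates of later terms in which the
    student attempted credit (a hypothetical input shape).
    """
    if earned_credential:
        # Graduating or earning a credential counts as persistence.
        return True
    deadline = term_end + PERSISTENCE_WINDOW
    return any(term_end < start <= deadline for start in later_term_starts)
```

For example, under these assumptions a student whose term ends on May 15 and who attempts credit in a term starting August 26 of the same year counts as persisting, while a student with a D grade does not count as completing the course.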
Model Philosophy
SS&E’s predictive models are centered on the students they evaluate. These models combine your institution’s data with a broader suite of institutional data to find deeper insights into every type of learner, which improves prediction accuracy compared with using one institution’s data in isolation. Students are grouped with similar populations across institutions so that each model closely matches their student profile. This allows us to create specialized models that make the most of the available information.
As students move through their programs, they are associated with whichever model best fits their student profile, allowing our models to accommodate changes in student profiles throughout their educational career.
Risk Factors and Risk Levels
We perform two distinct analyses as part of the SS&E Predict functionality.
Risk Levels
We use various modeling techniques to build models from all the data available to us about each student, for each term and course they are enrolled in. From those models, we provide a probability of persistence and a probability of completing each enrolled course. Each student has one persistence risk score for the term and one course completion risk score for every course they are enrolled in. Separately for persistence and for each course, students are grouped into one of three categories based on their score relative to their peers. These groupings are always split in the following way:
- Low Risk - Top 60% of Student Scores
- Medium Risk - Next 30% of Student Scores
- High Risk - Bottom 10% of Student Scores
Institutions often develop varying engagement strategies by level. For example, high risk students may require more hands-on engagement from the institution than the students in the medium or low risk categories.
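The 60/30/10 split described above can be illustrated with a short ranking routine. This is a sketch under stated assumptions, not the production grouping logic: the function name, the dictionary input, and the way ties and group boundaries are resolved (by sorted position) are all hypothetical. It assumes a higher score means a higher predicted probability of success, so the top-scoring students are low risk.

```python
def assign_risk_levels(scores: dict[str, float]) -> dict[str, str]:
    """Group students into risk levels by score relative to their peers.

    scores maps a student identifier to a predicted probability of
    success (persistence or course completion).
    """
    # Rank students from highest score to lowest.
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    levels = {}
    for position, student in enumerate(ranked):
        # Integer arithmetic avoids floating-point edge cases at the
        # 60% and 90% boundaries.
        if position * 10 < n * 6:
            levels[student] = "Low Risk"      # top 60% of scores
        elif position * 10 < n * 9:
            levels[student] = "Medium Risk"   # next 30% of scores
        else:
            levels[student] = "High Risk"     # bottom 10% of scores
    return levels
```

With ten students, this places the six highest scores in Low Risk, the next three in Medium Risk, and the single lowest score in High Risk. Because the grouping is relative, a student's level depends on the peer group as well as on their own score.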
Where do Risk Levels Appear?
Within the SS&E solution, risk levels can be seen on the student tab in multiple places. When viewing students in a filter or population, their persistence risk level and highest course completion risk level display below their profile picture, allowing you to see their risk across both dimensions at a glance. The persistence risk indicator is on the left, and the course completion risk indicator is on the right.
On each student’s profile, the persistence score will be displayed on the top right corner with the overall percent likelihood they will persist as predicted by the persistence model.
Clicking on the progress bar shows the top risk and success factors.
The course completion risk level is likewise available for each course under the course tab on the student page. The percentage next to the risk label is the probability that the student will complete the course. Clicking on the progress bar shows the top risk and success factors for this enrollment.
Risk Factors
A risk factor is a description of a single characteristic of a student or their enrollments. It can provide context around why a student has the risk level they do, or what challenges they may be facing.
Historical data is used to determine the factors most associated with risk and success at each institution through analysis of average success rates. These factors are ranked based on the historical success rates within an institution’s student body.
The three factors most associated with risk and success for each term and course a student is enrolled in are displayed in SS&E.
An example of all the data used to rank the factors can be seen below.
In the example above, a subset of all historical data is used. For each category, we calculate the percentage of students who persisted to the following term (the success rate) and the percentage of students who belong to that population. Factors are then ranked by success rate: any factor above the average success rate is assigned as a “Success Factor”, and any factor below the average success rate is assigned as a “Risk Factor”.
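The ranking just described can be sketched in a few lines. This is an illustration under assumptions, not the actual analysis: the function name, the record layout, and the example factor names are hypothetical, the "average success rate" is taken to be the overall rate across all students in the data, and the source does not say how a factor exactly at the average is labeled, so the sketch marks it "Unlabeled".

```python
def rank_factors(students: list[dict]) -> list[dict]:
    """Rank factors by historical success rate.

    Each student record is assumed to look like
    {"factors": ["Full-time", ...], "persisted": True}.
    """
    total = len(students)
    if total == 0:
        return []
    # Overall success rate across the whole population.
    average = sum(s["persisted"] for s in students) / total

    # Count students and successes within each factor category.
    counts: dict[str, int] = {}
    successes: dict[str, int] = {}
    for s in students:
        for factor in s["factors"]:
            counts[factor] = counts.get(factor, 0) + 1
            successes[factor] = successes.get(factor, 0) + int(s["persisted"])

    ranked = []
    for factor, count in counts.items():
        rate = successes[factor] / count
        if rate > average:
            label = "Success Factor"
        elif rate < average:
            label = "Risk Factor"
        else:
            label = "Unlabeled"  # tie handling is an assumption
        ranked.append({
            "factor": factor,
            "success_rate": rate,
            "population_share": count / total,
            "label": label,
        })
    # Highest success rate first.
    ranked.sort(key=lambda row: row["success_rate"], reverse=True)
    return ranked
```

In a toy population of four students, where both full-time students persisted and one of two part-time students persisted, the overall success rate is 75%. "Full-time" (100%) ranks first as a Success Factor, and "Part-time" (50%) ranks below it as a Risk Factor.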