Overview
How to configure and manage automated duplicate detection (i.e., deduplication) to help prevent duplicate person records from being created in SS&E as records are entered into the system.
Note: Automated Duplicate Detection can only be enabled via a feature flag in Institution Administration that is accessible to any Watermark user login.
Automated Duplicate Detection
To start using Duplicate Detection, please reach out to your CSM to coordinate activation.
Automated Duplicate Detection includes the following functionality:
- Automated Duplicate Detection allows institutions to easily merge prospects with their corresponding applicant/student records as prospects move through the application and enrollment process.
- The duplicate detection job runs automatically and can also be run on-demand from within SS&E.
- Only prospect person types are allowed to be merged into another record or deleted.
- Prospects can only be merged into another prospect, an applicant or a student person type record.
- Prospect merging is case sensitive. Fields match only when the values are exactly the same, including case and white space. For example, if one of the names has an extra space at the end, or if the case differs, the system will not find a match.
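To illustrate the exact-match behavior described above, here is a minimal Python sketch. It is not SS&E code: `exact_match` and `tidy` are hypothetical helpers, and `tidy` shows one way an institution might trim stray white space in its own data before entering it (an assumption about local data preparation, not a product feature).

```python
def exact_match(a: str, b: str) -> bool:
    """Return True only when both values are character-for-character identical."""
    return a == b


def tidy(value: str) -> str:
    """Hypothetical pre-import cleanup: trim the ends and collapse runs of white space."""
    return " ".join(value.split())


print(exact_match("Jordan", "Jordan"))         # True
print(exact_match("Jordan", "jordan"))         # False: case differs
print(exact_match("Jordan", "Jordan "))        # False: trailing space
print(exact_match("Jordan", tidy("Jordan ")))  # True once the stray space is removed
```

Note that `tidy` leaves case untouched, so records that differ only in capitalization would still not match.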
When does the Automated Duplicate Detection job run?
- By default, the automated duplicate detection job runs twice a day, at 8am and at 1pm.
- The duplicate detection job will run automatically when a new duplicate detection rule set is created.
- The duplicate detection job will also run automatically after a rule set is edited in a way that may change the duplicate records found, such as:
- The rule set status is changed in either direction (Active <-> Inactive)
- An enabled field (set to "Yes") is changed in either direction (Yes <-> No)
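The edit triggers listed above can be sketched as a simple comparison of a rule set before and after an edit. This is an illustration only, under stated assumptions: `RuleSet`, its attribute names, and `edit_triggers_job` are hypothetical and do not describe SS&E internals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleSet:
    name: str
    active: bool
    enabled_fields: frozenset  # fields set to "Yes"


def edit_triggers_job(before: RuleSet, after: RuleSet) -> bool:
    """True when the edit could change which duplicates are found."""
    return (before.active != after.active
            or before.enabled_fields != after.enabled_fields)


original = RuleSet("Duplicate Email", True, frozenset({"personal_email"}))
renamed = RuleSet("Email Match", True, frozenset({"personal_email"}))
deactivated = RuleSet("Duplicate Email", False, frozenset({"personal_email"}))
widened = RuleSet("Duplicate Email", True, frozenset({"personal_email", "last_name"}))

# A rename is assumed here not to affect the results, so it does not trigger a run.
print(edit_triggers_job(original, renamed))      # False
print(edit_triggers_job(original, deactivated))  # True
print(edit_triggers_job(original, widened))      # True
```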
How to set up automated duplicate detection rule set(s)
Duplicate detection is configured by setting up a "Duplicate Detection Rule Set" on the "Duplicate Detection" menu listed under People & Roles Administration.
- The Duplicate Detection Rule Set displays which fields are configured for the system to match duplicate records on.
- Once potential duplicates are identified, the system lists the matching records in a user-friendly interface where they can first be reviewed, and then be merged or ignored.
- SS&E allows each institution to create up to two duplicate detection rule sets.
- For example, one rule set may be configured a certain way and run all the time, while another rule set may be more refined and run on a case-by-case basis when searching for something specific.
- In any case, if two rule sets are active, then they will both run automatically and produce two separate lists of potential duplicates.
Rule Set Fields
The available duplicate detection rule set fields are:
- First Name
- Last Name
- Personal Email Address
- Birth Date
- Current Addresses - Address Line 1, City, State, Zip Code
Duplicate Detection Attributes
- Person records must be associated with at least one of the following in order to be considered a duplicate that a prospect can be merged with:
- An active prospect person type
- An active applicant person type
- A current student status
- For information on student statuses, please see the Student User Status article
- An "active" person type means that a prospect or applicant person type appears on the person record in People Administration. This is unrelated to whether the person record has an active (enabled) or inactive user status, which has no effect on duplicate detection.
- By design, prospects default to the Inactive person record status, which means that prospect person type users cannot log in to SS&E.
- By design, person records with a staff or faculty role are excluded from being considered for merging, and will not be considered by automated duplicate detection.
- When searching on 'Address', only current person addresses will be considered by the automated duplicate detection process.
For more information about how a Person Address is set to current, click here.
To learn more about person types, click here.
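As a rough illustration of the eligibility attributes above, the following Python sketch checks whether a person record could be considered as a merge target. The `Person` class and its attribute names are hypothetical, chosen only for this example; they are not SS&E field names.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    person_types: set = field(default_factory=set)  # e.g. {"prospect", "applicant"}
    has_current_student_status: bool = False
    roles: set = field(default_factory=set)         # e.g. {"staff"}
    user_enabled: bool = False                      # login status; ignored below


def is_merge_candidate(p: Person) -> bool:
    # Staff and faculty records are excluded by design.
    if p.roles & {"staff", "faculty"}:
        return False
    # Otherwise the record needs a prospect or applicant person type,
    # or a current student status.
    return bool(p.person_types & {"prospect", "applicant"}) or p.has_current_student_status


print(is_merge_candidate(Person(person_types={"prospect"})))                     # True
print(is_merge_candidate(Person(has_current_student_status=True)))               # True
print(is_merge_candidate(Person(person_types={"applicant"}, roles={"faculty"}))) # False
print(is_merge_candidate(Person(user_enabled=True)))                             # False
```

The last call shows that an enabled user status alone does not make a record eligible.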
How to configure Automated Duplicate Detection
To set up automated duplicate detection (deduplication), go to SS&E Administration -> People & Roles -> Duplicate Detection.
- Select "New" to create a rule set (maximum of two).
- Set the rule set to Active to enable the rule set so that it will run automatically.
- Set the rule set to Inactive to disable the rule set so that it will not run.
- On the "Info" tab, select the field(s) that the system will use to search for matching records and detect duplicates.
- In each rule set, enable (set to 'Yes') the fields that the system should use to detect duplicates.
- If multiple fields are selected within a rule set, all of the selected values must match.
- Select "Edit" to change the duplicate person detection rule.
- Select "Detect New Duplicates" from the navigation bar to run the duplicate detection job on demand.
- Whenever a rule set is created or edited in a way that the detected duplicates would change, the duplicate detection job will run automatically.
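The rule that all selected fields must match can be sketched by grouping records on the combined values of the enabled fields. This is a simplified illustration, not the SS&E implementation: `detect_pairs`, the record layout, and the field keys are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def detect_pairs(records, enabled_fields):
    """Return pairs of record ids whose values match exactly on every enabled field."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[f] for f in enabled_fields)
        groups[key].append(rec["id"])
    pairs = []
    for ids in groups.values():
        pairs.extend(combinations(ids, 2))
    return pairs


records = [
    {"id": 1, "first_name": "Ana", "last_name": "Diaz", "personal_email": "ana@example.edu"},
    {"id": 2, "first_name": "Ana", "last_name": "Diaz", "personal_email": "adiaz@example.edu"},
    {"id": 3, "first_name": "Ana", "last_name": "Diaz", "personal_email": "ana@example.edu"},
]

# Names only: all three records match each other.
print(detect_pairs(records, ["first_name", "last_name"]))
# Names plus email: only records 1 and 3 match on every field.
print(detect_pairs(records, ["first_name", "last_name", "personal_email"]))
```

Enabling more fields narrows the results, which is why a broad rule set and a more refined one can produce different lists.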
Duplicate Detection Interface
The duplicate detection interface has five tabs used to review and merge/ignore the results of a specific duplicate detection rule set.
The following five tabs are further described below.
Info Tab
The info tab displays information about the specific deduplication rule set.
- Select the rule set name to view the rule set details
- Only active duplicate detection rule sets run automatically
- Inactive rule sets are not considered in automated duplicate detection and cannot be run manually either
"Duplicate Email" Rule Set Example
- In this rule set, the Info tab shows that only 'Personal Email Address' is enabled, i.e., set to Yes.
- The Auto Mergeable tab shows that one match was found based on this specific rule set. Looking at the records will show that they have the exact same personal email address.
Manual Duplicate Detection
The duplicate detection job can also be run manually, on demand, at any time from SS&E Administration -> People & Roles -> Duplicate Detection, then selecting 'Detect New Duplicates'.
Detect New Duplicates Job - Automated vs. On Demand
Automated Deduplication Job
The duplicate detection job does NOT run automatically after person records are imported from the SIS system.
While the duplicate detection job is running and the system is detecting new duplicates, the following messages will display on the UI:
- On the Edit Rule Set page:
"New duplicates are currently being detected. Edits to rule sets will not take affect until this process completes."
- On the Auto Mergeable tab:
"New duplicates are currently being detected. They will be added to the list after this process completes."
On-Demand Manual Deduplication Job
The Duplicate Detection job can be run manually in Duplicate Detection Administration from the secondary navigation bar.
When manually running the Detect New Duplicates job, the following message will display:
"New duplicates are currently being detected. They will be added to the list after this process completes."
Auto Mergeable Tab
The Auto Mergeable tab displays duplicate person records that can only be merged in one direction.
- On the Auto Mergeable tab, the left side of the record is always a prospect person type, and the right side is always an applicant or a student person type, i.e., not a prospect.
- To exclude duplicate records from the auto merge, check the box in the first column of each record that you wish to ignore, then select the bulk action 'Ignore' displayed below the search fields.
- Select “Auto Merge” from the secondary navigation bar to bulk merge all of the unchecked duplicates found for the specific rule set.
- Auto Merge will always merge the left side prospect record into the right side person record.
On the Auto Mergeable tab, users can also:
- Review and merge auto mergeable records individually by selecting a record, which opens the Duplicate Review screen.
- If a record should not be merged, mark the item as ignored and then use the Ignored bulk action.
Review Tab
The Review tab displays potential duplicate person records that can be merged in either direction.
Select a record pair by clicking in the "Review and Merge" column, which opens the Duplicate Review screen, where users can manually review, merge, or ignore the pair of detected duplicates.
- On the Review tab, it is most likely that the potential duplicates are both prospect person types.
- On the Review tab, users can merge the duplicate person records in either direction.
- Users should always manually review the detected duplicates displayed on the Review tab.
- For each pair of potential duplicates, users should review each record to decide if they are duplicates and if/how to merge them together.
- If the records are NOT duplicates, select "ignore" to mark the item as ignored.
- Ignored duplicates will move automatically to the Ignored tab where they can be reviewed again to be ignored or unignored.
Duplicate Review Screen Example
- In the above example, after reviewing both records, we decided to merge the right side record into the left side record.
- To merge the records on the Duplicate Review screen, we selected the arrow pointing left. A message was then displayed notifying the user that "This action is irreversible. Are you sure you want to merge right into left?"
- Select OK to complete the merge.
Unmergeable Tab
The Unmergeable tab displays duplicate person records that were detected by the specific duplicate detection rule set but that the system determined are not eligible to be merged in either direction.
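One way to picture how a detected pair might land on the Auto Mergeable, Review, or Unmergeable tab is the sketch below. It is inferred only from the merge rules stated in this article (only prospects can be merged, and only into a prospect, applicant, or student record); the actual criteria SS&E uses may differ, and `classify_pair` is a hypothetical function.

```python
PROSPECT = "prospect"
MERGE_TARGETS = {"prospect", "applicant", "student"}


def classify_pair(type_a: str, type_b: str) -> str:
    """Guess the tab for a pair of detected duplicates, given each record's person type."""
    a_is_prospect = type_a == PROSPECT
    b_is_prospect = type_b == PROSPECT
    if a_is_prospect and b_is_prospect:
        return "Review"          # either record could be merged into the other
    if a_is_prospect and type_b in MERGE_TARGETS:
        return "Auto Mergeable"  # one direction only: prospect into the other record
    if b_is_prospect and type_a in MERGE_TARGETS:
        return "Auto Mergeable"
    return "Unmergeable"         # neither record can be merged into the other


print(classify_pair("prospect", "applicant"))  # Auto Mergeable
print(classify_pair("prospect", "prospect"))   # Review
print(classify_pair("applicant", "student"))   # Unmergeable
```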
Ignored Tab
The Ignored tab displays all duplicate person records that have been marked as ignored.
- Users should review the duplicates displayed on the Ignored list using the Duplicate Review screen.
- After reviewing the ignored duplicates, select 'Unignore' if records should be merged instead.
What's Changed?
Before:
Prospect merging could only be done through a manual process. Users had to know which duplicate person record to merge a prospect with and how to find it.
After the automated duplicate detection release:
Based on each institution's duplicate detection rule sets, SS&E will automatically run the duplicate detection job as person records are entered into SS&E and present potential duplicates on a user-friendly interface for merging decisions.
- Prospects may be entered via the bulk prospects CSV import process, an Inquiry Form, or by "individual" prospect entry from the Prospect person type page.