Auto-Merging Duplicates - Data Doctor Knowledge Base

Auto-Merge takes the manual work out of duplicate cleanup by automatically merging duplicate records based on your defined rules and confidence thresholds. It's designed to handle high-volume deduplication safely while maintaining a complete audit trail for every merge.

How Auto-Merge Works

When Auto-Merge runs, it processes duplicate groups identified by your rules and intelligently combines them into single, authoritative records. Here's what happens behind the scenes:

Confidence Filtering

Auto-Merge only merges duplicate groups whose confidence score meets or exceeds your threshold. Groups that score below it are left for manual review.

Smart Record Selection

Automatically determines the master record based on your preference: most recently modified (default) or oldest record in the group.

Complete Artifact Creation

Before any record is deleted, a comprehensive artifact is created containing all field values and relationship data for full recoverability.

Reversible Merges

Every auto-merge can be undone, subject to the practical limitations described under The Artifact System. Artifacts store everything needed to restore deleted records and reconstruct relationships.

Understanding Confidence Scores

When your duplicate rule uses fuzzy matching, each duplicate group receives a confidence score representing how closely the records match. This score is calculated based on the similarity of field values across your matching criteria.

  • High: 90-100%
  • Medium: 70-89%
  • Low: 50-69%

Example: "John Smith" vs "Jon Smith" might score 92% confidence, while "John Smith" vs "J. Smithson" might score 68%. Auto-Merge lets you set the minimum confidence threshold: only groups that meet or exceed it are merged automatically.
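To make the idea of a similarity-based score concrete, here is a minimal Python sketch. It is not Data Doctor's scoring algorithm, which this article does not describe. It assumes a stand-in technique (Python's difflib.SequenceMatcher ratio, averaged across the matching fields), and the function names field_similarity and group_confidence are hypothetical.

```python
from difflib import SequenceMatcher


def field_similarity(a: str, b: str) -> float:
    """Return a 0.0-1.0 similarity for two field values, ignoring case and outer spaces."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def group_confidence(record_a: dict, record_b: dict, fields: list) -> float:
    """Average the per-field similarity over the matching fields, as a percentage.

    ASSUMPTION: equal weighting of fields and difflib as the similarity measure.
    The real product may weight fields or use a different fuzzy matcher.
    """
    if not fields:
        raise ValueError("at least one matching field is required")
    scores = [
        field_similarity(str(record_a.get(f) or ""), str(record_b.get(f) or ""))
        for f in fields
    ]
    return round(100 * sum(scores) / len(scores), 1)
```

With this stand-in measure, identical values score 100, and "John Smith" vs "Jon Smith" scores higher than "John Smith" vs "J. Smithson". The exact percentages will differ from the ones Data Doctor reports.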

Rules using only exact matching will always show 100% confidence, as there's no variation in match quality.
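The bands and the threshold check described in this section can be sketched in a few lines of Python. This is an illustration only: the group dictionaries, the "confidence" key, and both function names are assumptions made for the example, and the label returned for scores under 50% is a placeholder because this article does not name a band for them.

```python
def confidence_band(score: float) -> str:
    """Map a 0-100 confidence score to the bands listed in this article."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 90:
        return "High"
    if score >= 70:
        return "Medium"
    if score >= 50:
        return "Low"
    return "Unbanded"  # ASSUMPTION: the article defines no band below 50%


def split_by_threshold(groups: list, threshold: float) -> tuple:
    """Split groups into (auto-merge candidates, left for manual review).

    A group qualifies when its score meets or exceeds the threshold.
    """
    to_merge = [g for g in groups if g["confidence"] >= threshold]
    for_review = [g for g in groups if g["confidence"] < threshold]
    return to_merge, for_review
```

For example, with a threshold of 90, a group scored at exactly 90 would be merged, while a group scored at 68 would stay in the review queue.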

Master Record Selection

When merging a duplicate group, one record is designated as the master (kept) and others are merged into it. You can configure which record becomes the master:

  • Recently Modified (Default): The record with the most recent LastModifiedDate becomes the master. This prioritizes records that have been actively maintained and are likely to have the most current information.
  • Oldest Record: The record with the earliest CreatedDate becomes the master. This preserves your original record of truth and is useful when historical continuity matters (e.g., maintaining the original account creation date for tenure tracking).
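The two preferences above come down to a maximum over LastModifiedDate or a minimum over CreatedDate. A minimal Python sketch, assuming records are plain dictionaries keyed by those Salesforce field names; the function name select_master and the preference strings are hypothetical, and tie-breaking behavior is not specified in this article.

```python
from datetime import datetime


def select_master(records: list, preference: str = "recently_modified") -> dict:
    """Pick the record to keep from a duplicate group.

    ASSUMPTION: on a tie, the first matching record in the list wins.
    """
    if not records:
        raise ValueError("a duplicate group needs at least one record")
    if preference == "recently_modified":
        return max(records, key=lambda r: r["LastModifiedDate"])
    if preference == "oldest":
        return min(records, key=lambda r: r["CreatedDate"])
    raise ValueError(f"unknown master preference: {preference!r}")


group = [
    {"Id": "A", "CreatedDate": datetime(2019, 3, 1), "LastModifiedDate": datetime(2021, 6, 1)},
    {"Id": "B", "CreatedDate": datetime(2020, 8, 15), "LastModifiedDate": datetime(2024, 1, 10)},
]
```

Here the default preference would keep record B (most recently modified), while the oldest-record preference would keep record A.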

The Artifact System

Every time a record is deleted through Auto-Merge, Data Doctor creates a detailed merge artifact that captures:

  • All field values from the deleted record at the time of merge
  • Parent relationships (lookup and master-detail fields pointing to other records)
  • Child relationships (related records that referenced the deleted record)
  • The master record it was merged into
  • Timestamp and user context of the merge operation
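As a mental model, the items above could be represented by a structure like the following Python sketch. The class name, field names, and the build_artifact helper are hypothetical and are not Data Doctor's actual storage format. The calculated_fields parameter reflects the limitation noted in this article that formula and rollup values are not stored.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class MergeArtifact:
    deleted_record_id: str
    master_record_id: str
    field_values: dict          # stored field values at the time of merge
    parent_relationships: dict  # lookup/master-detail field name -> parent record Id
    child_relationships: list   # references to records that pointed at the deleted record
    merged_at: datetime
    merged_by: str


def build_artifact(record, master_id, children, lookup_fields,
                   calculated_fields, user_id, now=None):
    """Snapshot a record before it is deleted (hypothetical helper)."""
    stored = {k: v for k, v in record.items() if k not in calculated_fields}
    parents = {f: record[f] for f in lookup_fields if record.get(f)}
    return MergeArtifact(
        deleted_record_id=record["Id"],
        master_record_id=master_id,
        field_values=stored,
        parent_relationships=parents,
        child_relationships=list(children),
        merged_at=now or datetime.now(timezone.utc),
        merged_by=user_id,
    )
```

The snapshot is taken before deletion, so the order of operations matters: build the artifact first, then merge.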

Important Limitations: While the artifact system enables merge reversal, there are practical constraints to be aware of:

  • Time Sensitivity: The longer you wait to undo a merge, the more your data may have changed. Related records might have been modified or deleted, or new relationships might have been created, which makes a clean restoration more complex.
  • Cascading Changes: If child records were re-parented to the master during merge, and those records have since been updated, restoring the original parent relationship may not restore the child record's previous state.
  • Formula & Rollup Fields: Calculated fields are not stored in artifacts. After restoration, these fields will recalculate based on current data, which may differ from their original values.
  • External System Sync: If your Salesforce data syncs with external systems, those systems won't automatically reflect an undo operation. Manual reconciliation may be required.
  • Storage Considerations: Artifacts consume storage. For high-volume merge operations, consider your org's storage limits and establish artifact retention policies.
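The time-sensitivity and cascading-change constraints can be illustrated with a small pre-check that compares an artifact against current data before an undo. This Python sketch is an assumption-laden illustration, not the product's undo logic: the artifact layout (a dictionary with master_record_id and child_relationships) and the function name plan_undo are hypothetical.

```python
def plan_undo(artifact: dict, current_children: dict) -> tuple:
    """Sort an artifact's child references into (safe to re-parent, needs review).

    current_children maps child record Id -> the child's current field values.
    """
    to_reparent, needs_review = [], []
    for ref in artifact["child_relationships"]:
        child = current_children.get(ref["child_id"])
        if child is None:
            # The child was deleted after the merge, so nothing can be restored.
            needs_review.append((ref["child_id"], "child no longer exists"))
        elif child.get(ref["field"]) != artifact["master_record_id"]:
            # Someone moved the child again; restoring could overwrite that change.
            needs_review.append((ref["child_id"], "child was re-parented after the merge"))
        else:
            to_reparent.append(ref["child_id"])
    return to_reparent, needs_review
```

The longer the gap between merge and undo, the more entries are likely to land in the review list rather than the safe list.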

Best Practice: Review Auto-Merge results regularly, especially when first enabling the feature. Address any incorrect merges promptly: the sooner you undo a merge, the cleaner the restoration will be.

Scheduling Auto-Merge

Auto-Merge can be configured to run automatically after your duplicate rule completes its scan. When enabled, it processes newly identified duplicates without manual intervention.

  • Smart Schedule: Let Data Doctor automatically determine the optimal schedule based on your object's size and existing scheduled rules. Auto-Merge will run immediately after the rule scan completes.
  • Manual Schedule: Set your own frequency (daily, weekly, monthly), preferred day, and time. Auto-Merge executes immediately following the scheduled scan, so you maintain full control over when merging occurs.
  • No Schedule: Run the rule only when triggered manually. Auto-Merge will still execute after manual rule runs if enabled, giving you on-demand deduplication.

To enable scheduled Auto-Merge, toggle the Auto-Merge option when configuring your rule's schedule settings. You'll also set your confidence threshold and master record preference at this time.
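The settings involved can be summarized as a small configuration model. The Python sketch below is only a way to reason about which options go together. Data Doctor is configured through its UI, and every name here (RuleSchedule, the mode and preference strings, validate) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

SCHEDULE_MODES = {"smart", "manual", "none"}
FREQUENCIES = {"daily", "weekly", "monthly"}
MASTER_PREFERENCES = {"recently_modified", "oldest"}


@dataclass
class RuleSchedule:
    mode: str = "smart"
    frequency: Optional[str] = None       # manual schedules only
    preferred_day: Optional[str] = None   # manual schedules only
    run_time: Optional[str] = None        # manual schedules only, e.g. "02:00"
    auto_merge: bool = False
    confidence_threshold: Optional[float] = None  # set when auto_merge is on
    master_preference: str = "recently_modified"

    def validate(self) -> list:
        """Return a list of human-readable problems (empty if consistent)."""
        problems = []
        if self.mode not in SCHEDULE_MODES:
            problems.append(f"unknown schedule mode: {self.mode!r}")
        if self.mode == "manual" and self.frequency not in FREQUENCIES:
            problems.append("manual schedules need a frequency: daily, weekly, or monthly")
        if self.auto_merge:
            if self.confidence_threshold is None:
                problems.append("Auto-Merge needs a confidence threshold")
            elif not 0 <= self.confidence_threshold <= 100:
                problems.append("confidence threshold must be between 0 and 100")
            if self.master_preference not in MASTER_PREFERENCES:
                problems.append(f"unknown master preference: {self.master_preference!r}")
        return problems
```

Note that Auto-Merge is independent of the schedule mode in this model: a rule with no schedule can still have Auto-Merge enabled, in which case it runs after each manual scan.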

Running Auto-Merge Manually

You can trigger Auto-Merge on demand from two locations within Data Doctor:

Object Health Dashboard

Data Doctor → Object Health → [Select Object]

View duplicate metrics for any object and run Auto-Merge across all rules for that object. Ideal for periodic bulk cleanup or when you want to process duplicates across multiple rules simultaneously.

Rule Manager

Data Doctor → Rules → [Select Rule] → Actions

Run Auto-Merge for a specific rule. Use this when you want targeted cleanup based on a particular matching strategy, or when testing a new rule's Auto-Merge configuration before enabling it on a schedule.

Tip: Before running Auto-Merge for the first time on a rule, consider doing a manual review of the identified duplicates to ensure your confidence threshold and matching logic are producing accurate results.

Common Questions

What happens to child records when duplicates are merged?

Child records of the deleted record (for example, activities and other records shown in its related lists) are automatically re-parented to the master record. This ensures no data is orphaned during the merge process.
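Conceptually, re-parenting means rewriting the relationship field on each child so that it points at the master. A minimal Python sketch of that idea, assuming child records are dictionaries; the function name reparent_children is hypothetical and this is not the product's implementation.

```python
def reparent_children(children: list, deleted_id: str, master_id: str,
                      relationship_field: str) -> list:
    """Point every child of the deleted record at the master record.

    Updates the child dictionaries in place and returns the Ids that were moved,
    which is the kind of detail a merge artifact would need for a later undo.
    """
    moved = []
    for child in children:
        if child.get(relationship_field) == deleted_id:
            child[relationship_field] = master_id
            moved.append(child["Id"])
    return moved
```

Children that already belong to the master, or to an unrelated record, are left untouched.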

Can I exclude certain duplicate groups from Auto-Merge?

Yes. In the Duplicate Manager, you can mark specific duplicate groups as "Ignored." Ignored groups will be skipped by Auto-Merge even if they meet your confidence threshold.

How do I know what Auto-Merge has done?

Every Auto-Merge operation is logged in the Merge History, accessible from the Data Doctor dashboard. You can view which records were merged, when, by which rule, and access the artifact for each deleted record.

Need Help? If you have questions about configuring Auto-Merge settings or want guidance on setting appropriate confidence thresholds for your data, contact our support team.