Salesforce Duplicate Management: 5 Phase Process to Deduplicating Your Salesforce Org Data
Everyone working with Salesforce has experienced duplicates. In your role as a salesperson, you just followed up on a lead only to find out a colleague is already in contact with this person – that’s a duplicate lead!
Anecdotal evidence is not the only sign that duplicates are a problem: research from Experian shows that 83% of companies encounter challenges in cross-channel marketing, with inaccurate data as the main reason for these challenges.
Salesforce has a deduplication feature built in, known as Matching Rules and Duplicate Rules. In this post, we will take a quick look at the basics of this feature, go through the process of deduplication, and discuss how to decide on the best approach for your org.
What is Salesforce Duplicate Management?
The Salesforce Duplicate Management feature consists of Matching Rules and Duplicate Rules.
- Matching rule: Consists of criteria to identify duplicate records. Salesforce comes with three standard rules: one for business accounts, one for contacts and leads and one for person accounts.
- Duplicate rule: Determines in what situations the matching rule is applied and what should happen when a duplicate is found using the matching rule.
Duplicate rules can trigger two possible outcomes:
- alert the user who is creating a duplicate
- block the creation of a duplicate record
Recently, Salesforce has improved Duplicate Management with a batch process: the ability to run deduplication periodically to detect duplicates en masse in your existing data. A batch process runs one matching rule at a time and is available on the ‘Performance’ and ‘Unlimited’ editions of Salesforce.
Ideal Scenarios for Salesforce Duplicate Management
Salesforce Duplicate Management is ideally suited to SMBs or companies that are new to Salesforce. When you work with standard objects, have small record counts (< 10,000 records) and have only one or two ways for new records to enter your Salesforce organization, there is probably no need to look further.
The number of newly created records is also an important consideration. Tens of new records each day are easily deduplicated with Salesforce Duplicate Management. If we are talking about hundreds or even thousands, automation and customization become more important.
For example, a law firm that uses Salesforce to manage relationships with its 100 clients and a handful of new leads each month will do fine with Salesforce Duplicate Management.
When your organization does not fit this description, read on.
Signs It’s Time to Look for Other Solutions
In order to save you time looking for an optimal solution, I have listed some important limitations to Salesforce Duplicate Management.
- You don’t want to lose any data.
Salesforce only alerts when a duplicate is created by a manual insert. For all other entry methods, Salesforce will block the insert of a duplicate. This means you will lose relevant data when a duplicate is created by API or import. Other solutions will automatically merge the two records or store them in a list for a manual review. Both of these options guarantee you will not lose any important data!
- You need to batch deduplicate objects other than Lead, Contact and (Person) Account.
It is pretty obvious that for many organizations these objects are the most important ones. But if you want to batch process cases, opportunities or custom objects, you’ll need to look further.
- You need to process large amounts of data.
We all want SFDC to be faster, especially when running large batch jobs. If the waiting gets on your nerves or you need your data scrubbed FAST, there are solutions out there that let you offload the heavy processing to your computer or their infrastructure. Salesforce recommends using other solutions when you have many records.
“In an org with many records, duplicate jobs can fail.” (Source)
Another thing to consider is that you need to manually process all found duplicate pairs from a Salesforce batch job. This will take enormous amounts of time when working with large amounts of data. Other solutions let you automate the processing as well.
- You need a cross-object batch process.
Chances are good that your new Lead is already present in your Salesforce organization, but as a Contact. To make sure you don’t work with duplicate data, cross-object matching is imperative. Salesforce can alert on manual insert for all objects, but batch jobs are restricted to
- You need to automatically process duplicates.
Blocking is also an automatic action, but in our view blocking is not desirable. If your marketing automation processes run all the time, you cannot wait for manual review of duplicates. In that case, fully automatic merging of duplicates saves you time, does not delete valuable data and makes sure your marketing and sales machine keeps running 24/7.
By now, you probably have a rough idea whether Salesforce Duplicate Management is enough for you or whether you need another solution. Don’t start looking at random solutions; follow a process for the best results.
Deduplicating your Salesforce Org Data: A 5 Phase Process
How do you go about deduplicating your Salesforce data? Maybe Salesforce Duplicate Management fits your needs, but should you need more deduplication capabilities, the AppExchange offers many alternatives.
I recommend tackling deduplication in five phases to guarantee you can achieve the best results:
- Data Requirements
- Process Requirements
- Tool Selection
- Implementation
- Maintenance
Phase 1: Data Requirements
In the first phase of a deduplication project, you focus on your data requirements alone.
1. List all objects that need deduplicating
Make a list of all objects that need to be deduplicated. These can be standard or custom objects. Indicate if you want to look for duplicates within the object, across objects or both.
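To make the "across objects" case concrete, here is a minimal sketch of a Lead-to-Contact comparison on email address. It is not how Salesforce or any AppExchange app implements matching. The record dictionaries, the `Id` and `Email` keys, and the function names are assumptions made for illustration.

```python
def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so trivially different emails compare equal."""
    return email.strip().lower()


def leads_matching_contacts(leads, contacts):
    """Return (lead_id, contact_id) pairs that share an email address.

    Records are plain dicts with assumed "Id" and "Email" keys.
    """
    # Index contacts by normalized email for a single pass over the leads.
    contacts_by_email = {}
    for contact in contacts:
        email = contact.get("Email")
        if email:
            contacts_by_email.setdefault(normalize_email(email), []).append(contact["Id"])

    matches = []
    for lead in leads:
        email = lead.get("Email")
        if not email:
            continue  # records without an email cannot be matched on this field
        for contact_id in contacts_by_email.get(normalize_email(email), []):
            matches.append((lead["Id"], contact_id))
    return matches
```

A within-object check would compare the leads against each other in the same way. Listing both directions up front tells you which comparisons a tool has to support.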
2. List relevant fields for each object
All deduplication tools find duplicate records based on the values of fields. Most tools have some default settings to match objects such as leads, contacts and accounts, but you will get the best results when you fine-tune the settings.
In order to fine-tune, you need to list the fields that can give away a duplicate record. To use ‘Contacts’ as an example, relevant fields are ‘First Name’, ‘Last Name’, ‘Email Address’ and ‘Account Name’. Your organization could use relevant fields such as ‘Birthdate’ as well. Salesforce has a page on their matching rules.
3. Indicate matching method for all fields
Different matching options are available depending on the tool. Generally speaking, you have 3 types of matching algorithms:
- Exact, e.g. ‘ACME Inc.’ matches only with ‘ACME Inc.’
- Partial exact, e.g. ‘ACME’ matches with ‘ACME Inc.’
- Fuzzy, e.g. ‘AKME Inc.’ matches with ‘ACME Inc.’
Some tools offer matching algorithms for specific fields (domain names, telephone numbers).
At this stage, it’s OK to just choose from the main three. When you start the implementation, check whether more tailored algorithms are available.
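The three matching types can be sketched in a few lines of Python. This is an illustration only: each tool defines these methods in its own way. Treating "partial exact" as a case-insensitive prefix match and using a similarity threshold of 0.85 for fuzzy matching are assumptions, not vendor definitions.

```python
from difflib import SequenceMatcher


def exact_match(a: str, b: str) -> bool:
    """Exact: the two values must be identical."""
    return a == b


def partial_exact_match(a: str, b: str) -> bool:
    """Partial exact (assumed here): one value is a prefix of the other, ignoring case."""
    a, b = a.lower(), b.lower()
    return a.startswith(b) or b.startswith(a)


def fuzzy_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Fuzzy: similarity ratio at or above an assumed threshold, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold
```

With these definitions, ‘ACME’ and ‘ACME Inc.’ pass the partial exact check but not the exact one, and ‘AKME Inc.’ and ‘ACME Inc.’ pass only the fuzzy check. Lowering the threshold finds more duplicates at the cost of more false positives.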
4. List words to ignore
Especially for Accounts, a lot of words are not really part of an organization’s name. Think of Acme LLC vs Acme. It is a good idea to add words like plc, LLC, GmbH, etc. to an ignore list. Many tools can ignore these words when matching.
|Object|Field|Matching method|Words to ignore|
|Account|Account Name|Fuzzy|Plc, llc, inc, corp, group|
|Account|Account Phone|Partial Exact| |
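The ignore list from step 4 can be applied to a name before any comparison takes place. The sketch below shows one possible way to do that. The word set combines the examples mentioned above, and the function name and tokenization rule are assumptions rather than the behavior of a specific tool.

```python
import re

# Assumed ignore list, taken from the examples in this article.
IGNORE_WORDS = frozenset({"plc", "llc", "inc", "corp", "group", "gmbh"})


def normalize_account_name(name: str, ignore_words: frozenset = IGNORE_WORDS) -> str:
    """Lowercase the name, drop punctuation, and remove ignore-list words."""
    tokens = re.findall(r"\w+", name.lower())
    kept = [token for token in tokens if token not in ignore_words]
    return " ".join(kept)
```

After normalization, ‘Acme LLC’, ‘ACME, Inc.’ and ‘Acme’ all reduce to the same value, so even an exact comparison on the normalized name would treat them as duplicates.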
Phase 2: Process Requirements
In the second phase of a deduplication project, you focus on the process. Answer the following questions:
- Do I need cleanup, prevention or both?
- Do I prefer an all-in-one solution or are multiple tools OK?
- Do I need to automate?
- Who needs access, and who should be able to perform merges?
- Which methods are used to create new records?
- Are there compliance and legal issues to consider (GDPR for example)?
- What is your budget? You should base this on the cost of having duplicates, but that’s a topic for another article.
Phase 3: Tool Selection
In addition to Salesforce Duplicate Management, the AppExchange offers many other deduplication tools. A quick search on ‘duplicates’ in the Salesforce AppExchange shows you all the major deduplication apps. Most of them offer a free version or a trial. As you have listed all your requirements in the previous steps, now it’s pretty easy to narrow down your tool selection based on the information on the vendors’ websites and the AppExchange. Select the best two or three solutions for more thorough testing in a trial.
Don’t just test the app – also ask support a few questions. Before paying for an app, you need to make sure both the app and support meet your expectations!
Phase 4: Implementation
When you have selected your tool of choice, start configuring your matching and review the results, but do not automate anything yet. Switch to automatic merging only once you have found the optimal settings.
Merging duplicates is often irreversible and should never be done without proper testing of matching and merging rules.
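One way to picture the "review first, automate later" approach is a dry run that only lists candidate pairs and merges nothing. The sketch below is illustrative and makes several assumptions: records are plain dictionaries with an `Id` key, similarity comes from Python's standard `difflib`, and the 0.85 threshold is arbitrary. Real tools provide their own review screens for this step.

```python
from difflib import SequenceMatcher
from itertools import combinations


def find_candidate_pairs(records, field, threshold=0.85):
    """Return (id_a, id_b, score) for record pairs whose field values look alike.

    Nothing is merged here; the output is meant for manual review.
    """
    pairs = []
    for a, b in combinations(records, 2):
        score = SequenceMatcher(None, a[field].lower(), b[field].lower()).ratio()
        if score >= threshold:
            pairs.append((a["Id"], b["Id"], round(score, 2)))
    # Highest-scoring pairs first, so reviewers see the most likely duplicates at the top.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Hypothetical sample data for a dry run.
sample = [
    {"Id": "001A", "Name": "ACME Inc."},
    {"Id": "001B", "Name": "AKME Inc."},
    {"Id": "001C", "Name": "Globex Corp."},
]
review_list = find_candidate_pairs(sample, "Name")
```

Going through a list like this with different thresholds shows how many false positives a setting produces before any irreversible merge takes place. Comparing every pair is fine for a small sample but does not scale to large record counts.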
Phase 5: Maintenance
A one-time cleanup is only a temporary fix for your duplicate issues. Prevention and scheduled cleanups are essential to keeping your Salesforce squeaky clean. But set & forget is not the way to go. New objects, fields and entry methods are regularly added to any Salesforce environment. Make sure to update your deduplication settings to reflect these changes.
To recap, Salesforce has a basic deduplication solution for the most commonly used objects (Leads, Contacts, (Person) Accounts), allowing alerting and blocking on the creation of duplicates. A batch process for deduplicating existing data is also available for higher-end Salesforce plans. When you need all data to be saved rather than blocked, or you need other objects, cross-object matching, faster processing or automatic processing, it is time to look for other solutions. The first step is to list your requirements, the second step is to match these to the solutions in the AppExchange, and the last step is to test drive one or two apps.
You are welcome to test drive our app Duplicate Check for Salesforce. First, install the Duplicate Check app from the AppExchange, then request a trial. Contact us and mention this article to get your trial extended to a month.
When you have an API user, a duplicate rule that is set to block should not run for that user. You can leave the rule on if it is report only. With this in mind, Salesforce Matching and Duplicate Rules are NOT incompatible with data loads and migrations.
UPDATE to my previous post: report only is compatible with an API load or integration, but alerts and blocks are not.
I would add one more solution: the No Duplicates app. It’s a free app available on the AppExchange.