Meet Stacey: The Head of Customer Support for Akmos, a large (fictional) online retailer. She works in Salesforce day in, day out. Her role as head of the support team means that she spends a lot of time on the phone dealing with some of the most complex customer support issues. To manage these high-stakes situations, she needs to be armed with the right data.
One day, she’s on the phone with a customer, Joseph, who’s chasing up a missing order. The problem is, Stacey simply can’t see any evidence of this order in his customer records. Tensions rise before she finally realizes what has happened. It’s a tale as old as time: Salesforce duplicates. Joseph’s order is certainly in the CRM somewhere, but it’s attached to a different record from the one she’s looking at.
Deterministic Reasoning: Opportunities, Challenges, and Limitations
Finding, managing, and resolving duplicates is a challenge as old as Salesforce itself. And for most of this period, one prevailing method has dominated: Deterministic reasoning. In short, this involves applying rules-based logic to the challenge of identifying and resolving duplicates.
To see this method in action, let’s consider a more straightforward example. Joseph logs onto the website and purchases an item, but uses a different email. A new Salesforce record is generated. But other than the email, all the information matches his existing details, meaning the chances of this being a duplicate record are very high. The system can therefore use straightforward text matching to spot the duplicate and merge it with the existing record. Then, predefined rules like “always preserve the oldest record” help to decide how details are merged and which field is considered authoritative.
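As a rough illustration of the example above, here is a minimal Python sketch of rules-based matching followed by an “always preserve the oldest record” merge. The `Contact` fields, the choice of which fields must match, and the sample values other than Joseph’s name are assumptions made for illustration; this is not how Salesforce or any particular product implements matching.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Contact:
    # Hypothetical record shape, chosen for this example only.
    name: str
    email: str
    phone: str
    address: str
    created: date


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so trivial differences don't block a match."""
    return " ".join(value.lower().split())


def is_rule_based_duplicate(a: Contact, b: Contact) -> bool:
    """Assumed rule: name, phone, and address must all match after normalization."""
    fields = ("name", "phone", "address")
    return all(normalize(getattr(a, f)) == normalize(getattr(b, f)) for f in fields)


def merge_keep_oldest(a: Contact, b: Contact) -> Contact:
    """Predefined survivorship rule: the oldest record is treated as authoritative."""
    return a if a.created <= b.created else b


# Sample data is made up: same person, new email address on the second record.
existing = Contact("Joseph Hale", "joseph.hale2639@gmail.com", "555-0100", "1 Main St", date(2021, 3, 4))
incoming = Contact("Joseph Hale", "jhale@example.com", "555-0100", "1 main st", date(2024, 6, 1))

if is_rule_based_duplicate(existing, incoming):
    survivor = merge_keep_oldest(existing, incoming)
```

Because only the email differs, the rule fires and the older record survives. The rule never asks why the records differ, which is exactly where the edge cases begin.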
In situations like this, deterministic reasoning works well. Indeed, most Salesforce duplicates fall into this category. And this rules-based approach certainly has its benefits; it excels at imposing consistency on large datasets, which is perfect for organizations like Akmos. But while rules-based systems can merge most duplicates correctly, they can’t get it right 100% of the time.
That’s because there will always be some situations where the logic-based approach runs into issues. Here are some examples:
- Two different customers have very similar names, so their records are flagged as duplicates even though they belong to different people.
- A married couple shares a surname, address, and potentially an email address. Here, the records get merged even though they represent two distinct people.
- On the other hand, a customer may be registered under an uncommon nickname or shortened name, as well as their full name. In this case, it may not be immediately clear that they’re the same person.
In these edge cases, the issue is that deterministic rules can’t understand nuance or intent. In practice, that means that even with effective rules-based duplication controls, there will always be a small number of records that are merged incorrectly or missed altogether. And even a single incorrect record can create significant customer issues.
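To make two of these edge cases concrete, here is a small Python sketch. Both rules and all sample values (including the spouse’s name) are hypothetical; they simply show how a loose rule can merge two people while a strict rule can miss one person.

```python
def same_household_rule(a: dict, b: dict) -> bool:
    """Hypothetical loose rule: same surname and address means same person."""
    return (a["last"].lower() == b["last"].lower()
            and a["address"].lower() == b["address"].lower())


def exact_name_rule(a: dict, b: dict) -> bool:
    """Hypothetical strict rule: first and last name must match exactly."""
    return ((a["first"].lower(), a["last"].lower())
            == (b["first"].lower(), b["last"].lower()))


joseph = {"first": "Joseph", "last": "Hale", "address": "1 Main St"}
spouse = {"first": "Maria", "last": "Hale", "address": "1 Main St"}
joe = {"first": "Joe", "last": "Hale", "address": "1 Main St"}

# The loose rule treats a married couple as one person (false positive).
false_positive = same_household_rule(joseph, spouse)

# The strict rule fails to link 'Joe' and 'Joseph' (false negative).
false_negative = not exact_name_rule(joseph, joe)
```

Tightening one rule tends to loosen the other failure mode, which is why tuning rules alone rarely removes the last few errors.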
It’s easy to think that this small number of incorrect records isn’t an issue. But customer service agents won’t know which ones are incorrect until after they’ve had an embarrassing conversation with a customer. This means agents struggle to rely on their information and communicate confidently, even in the majority of situations where the information is correct.
AI: A New Approach to Identifying Duplicates
When a human casts their eye over the conflicts we introduced in the last section, they can generally resolve them in a few seconds. But if getting humans to resolve every duplicate record at scale were feasible, you wouldn’t be using deterministic reasoning in the first place. Until recently, this resulted in something of an impasse.
But now, the developments in AI we’ve seen over recent years present us with a genuinely robust alternative. That’s because AI can bring a human-like level of reasoning, using contextual factors to better identify duplicates and decide how to resolve them. It does this by interpreting patterns, evaluating probabilities, and delivering context-aware verification.
The process has two main stages:
1. Verifying Duplicates
When potential duplicates have been identified, AI can apply logic and context to decide whether or not these records belong to the same person, just as a human would.
If the names on the records are similar, it can compare contact details to decide whether they’re different people. If contact details are the same but names are different, it can consider factors like which contact details are the same, or whether both names or just the first are different. These techniques allow for more accurate duplicate identification.
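The checks described above can be sketched in Python. To be clear, this is a hand-written heuristic standing in for the contextual judgment an AI model would make, not a model itself. The nickname table, field names, and verdict labels are all assumptions for illustration.

```python
# Tiny illustrative nickname table; a real system would need far broader coverage.
NICKNAMES = {"joe": "joseph", "liz": "elizabeth"}


def canonical_first(name: str) -> str:
    """Map a known nickname to its full form; otherwise return the name lowercased."""
    key = name.strip().lower()
    return NICKNAMES.get(key, key)


def shared_contact_fields(a: dict, b: dict) -> list:
    """List which contact details are non-empty and identical on both records."""
    fields = ("email", "phone", "address")
    return [f for f in fields if a[f] and a[f].lower() == b[f].lower()]


def verify(a: dict, b: dict) -> str:
    """Return a rough verdict from name similarity plus contact overlap."""
    same_first = canonical_first(a["first"]) == canonical_first(b["first"])
    same_last = a["last"].lower() == b["last"].lower()
    shared = shared_contact_fields(a, b)

    if same_first and same_last and shared:
        return "same person"
    if same_last and not same_first and shared:
        # Shared surname and contact details but a different first name
        # often points to a household rather than a duplicate.
        return "likely different people"
    return "needs review"
```

The point of the sketch is the shape of the decision: which details match matters as much as how many do.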
2. Merging Records
Next, AI can use this contextual understanding to decide how best to resolve the conflicting records. To do this, it assesses which record best represents the truth, what field data should survive, and when differing records still describe the same entity.
Here’s an example: Let’s pretend that two records show the same person, one under the name Joseph Hale and the other under the name Joe Hale. Here, reasoning can compare these names to the name in the email address (joseph.hale2639@gmail.com) to decide that the full name should be the canonical one. Then, ‘Joe’ can be stored as a preferred or alternative name.
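Here is a minimal Python sketch of that name-versus-email comparison. The function names and the scoring approach are assumptions for illustration; it simply counts how many parts of each candidate name appear in the email’s local part.

```python
import re


def name_tokens_in_email(email: str) -> set:
    """Extract alphabetic tokens from the part of the email before the '@'."""
    local_part = email.split("@", 1)[0].lower()
    return set(re.findall(r"[a-z]+", local_part))


def pick_canonical_name(candidates: list, email: str) -> tuple:
    """Rank candidate names by how many of their parts appear in the email.

    Returns (canonical_name, alternative_names). Ties keep the input order.
    """
    tokens = name_tokens_in_email(email)

    def score(name: str) -> int:
        return sum(part.lower() in tokens for part in name.split())

    ranked = sorted(candidates, key=score, reverse=True)
    return ranked[0], ranked[1:]


canonical, alternatives = pick_canonical_name(
    ["Joe Hale", "Joseph Hale"], "joseph.hale2639@gmail.com"
)
# canonical == "Joseph Hale", alternatives == ["Joe Hale"]
```

Here both parts of ‘Joseph Hale’ appear in the address, while only the surname of ‘Joe Hale’ does, so the full name wins and ‘Joe Hale’ is kept as an alternative.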
The key difference from the deterministic approach is context. A rules-based approach might always prefer the longest name, the original record, or the most recent contact details. But, at best, this will only be right most of the time.
Reasoning and Rules: Complementary, Not Competing
It might be tempting to think that we should simply do away with the deterministic approach entirely and deploy AI at every stage of the process. This would be a mistake, because neither approach can be relied upon 100% of the time on its own.
While the deterministic approach is great for consistency and scale, it lacks nuance and context. Conversely, AI models excel at context and pattern recognition, but can lack transparency, consistency, or structure.
The best approach, therefore, is to combine both methods. Let’s go back to Joseph, Stacey, and Akmos to see what this looks like in practice:
- Joseph purchases a product. This time, he’s used an alternative email address and put his name down as ‘Joe’. This is a particularly complex conflict, since both the email address and the first name are different, even though his phone number and address are the same.
- The deterministic approach flags it as a potential duplicate, as usual. Now, however, the AI can take over.
- First, it decides whether or not this is a genuine duplicate. With context, this isn’t a difficult job: ‘Joe’ is a common variation of ‘Joseph’, and while the email addresses are different, the surname, phone number, and address are the same on both records.
- Then, it decides how to merge the records. Both emails contain the name ‘Joseph’, so this is selected as the canonical name. Then, the most recent email can be selected as the preferred choice, with the original being preserved as an alternative.
This means that when Stacey needs to get in touch with Joseph to discuss his order, she has a single unified view of all his customer interactions, regardless of the varying names and contact details he’s used.
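The hybrid flow in this walkthrough can be sketched in Python as a cheap deterministic filter followed by a pluggable verification step. Everything here is an assumption for illustration: the record fields, the `toy_verify` stand-in (a real system would call a model at that point), and the simplification that the existing record’s name is kept as canonical.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


def flag_candidate(a: Record, b: Record) -> bool:
    """Deterministic stage: exact comparison on phone and address."""
    return a["phone"] == b["phone"] and a["address"].lower() == b["address"].lower()


def toy_verify(a: Record, b: Record) -> bool:
    """Stand-in for the AI stage: same surname, first names equal after nickname lookup."""
    nicknames = {"joe": "joseph"}  # illustrative only
    first_a, last_a = a["name"].lower().split()[0], a["name"].lower().split()[-1]
    first_b, last_b = b["name"].lower().split()[0], b["name"].lower().split()[-1]
    return last_a == last_b and nicknames.get(first_a, first_a) == nicknames.get(first_b, first_b)


def resolve(existing: Record, incoming: Record,
            verify: Callable[[Record, Record], bool]) -> List[Record]:
    """Return one merged record if both stages agree, otherwise both records unchanged."""
    if not flag_candidate(existing, incoming) or not verify(existing, incoming):
        return [existing, incoming]
    merged = dict(existing)                  # simplification: keep the existing name as canonical
    merged["email"] = incoming["email"]      # most recent email becomes the preferred one
    merged["alt_email"] = existing["email"]  # original email preserved as an alternative
    merged["alt_name"] = incoming["name"]    # 'Joe Hale' stored as an alternative name
    return [merged]


# Sample values are made up apart from the names used in the article.
existing = {"name": "Joseph Hale", "email": "joseph.hale2639@gmail.com",
            "phone": "555-0100", "address": "1 Main St"}
incoming = {"name": "Joe Hale", "email": "joseph.h@example.com",
            "phone": "555-0100", "address": "1 Main St"}

result = resolve(existing, incoming, toy_verify)
```

Passing the verification step in as a function keeps the deterministic stage testable on its own and makes it straightforward to swap the toy check for a model-backed one.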
Plauti: Combining Rules and AI to Banish Duplicates
At Plauti, this hybrid philosophy is fundamental to how we approach duplicates. We think it’s about more than just resolving a handful of complex edge cases in your Salesforce environment. In fact, this is a conceptual change in how we think about data quality. The combination of the deterministic and AI-based approaches creates a shift away from static rule curation towards dynamic reasoning systems that can learn, understand, and adapt.
Or to put it in simpler terms: We use mathematics for what can be defined and reasoning for what must be understood. This approach allows organizations to configure, guide, and apply their own business logic within an intelligent and structured framework that remains fully under their organizational control.
This means customers like Joseph can rely on an effective and personalized level of service, whatever details they register under. And it means customer support agents like Stacey can trust the information at their fingertips and communicate confidently with their customers. This makes for a better experience for everybody involved.
Check out Plauti Deduplicate to find out more.