Often, when talking with people who have been put in charge of building a data migration or integration with Salesforce, the first question they ask is “What tool should we use?”. My response is always something along the lines of: “whoa, slow down! We need to gather the requirements first, then have our requirements drive that decision.” Naturally, the next question is: “What are the key requirements that drive that decision?”.
In this post, I will guide you to choosing the ETL Tool/Middleware that’s right for you, by asking you to ask yourself these 9 questions.
Question 1: Are we performing a Data Migration or building an Integration?
Even though I always say that data migrations must be held to the same coding standard as an integration, it’s still an important factor when choosing the right tool, because when it comes to integrations we have to think about the long term supportability of the code.
If we are coding a data migration, we (or our Dev) know that we are the only ones that will ever run the code, we will be the ones to debug it, and the code will likely never be deployed to a production server. This means that we only need to worry about our own skill set, and we can probably bend the rules a bit when it comes to corporate tooling policy, as the code will never need to be maintained by anyone else.
Also for data migrations, it probably doesn’t make sense to make a new investment in an expensive tool that won’t be used long term.
If you don’t know the difference between a Data migration and Integration, please see this article.
Question 2: For Data Migrations, what is the skill set of the person coding it?
Performing data migrations are complex enough without having to use a new tool. So preference should be given to the tool we know best if it’s the right tool for the job. (Note: The APEX data loader is almost never the right tool for the job because it doesn’t have any data transformation capabilities and is relatively difficult to automate)
Q) For Integrations, what is the skill set of the people who will be maintaining it?
If we are building an integration then we need to be more concerned with the skill set of the people who will be tasked with maintaining it, then with the people who will be coding it. Coding the integration may take a few months, while it will be maintained it for years to come. Plus the people maintaining it are usually our client’s personnel, it’s not fair to code it in such a way that they have to hire new staff and retain a new skill set for years to come, just to suit our coding preferences.
Question 3: For Integrations, what are the organization’s standard ETL/Integration Tools?
Though it’s generally the case, It’s not safe to assume that the skill set of the people who will be maintaining the Integration code, is the organization’s standard – and we should confirm what is. This is also the time to discuss if the organization is planning on shifting to a new tool set, and if so, perhaps this project should be on the new platform.
Question 4: For Integrations, does it need to be real time, or can we use scheduled batch jobs?
The reason to ask this is obvious – we need to ensure the tool selected supports real time integrations if that is the requirement.
Question 5: Besides Salesforce, what other connectors do we need?
We need to ensure the tool selected has a connector to any system or database we need to connect to – or at least, has the ability to build one. If we need to build connectors, we need to factor in that cost, as well as maintenance and any special licenses required.
Question 6: Can we have a copy of your Coding Standards & Best Practices Documentation?
We need to make sure we are following corporate standards and best practices, and we need to make sure the tool we select is in line with those policies. Again this is much more important for integrations as opposed to migrations as you can often get and exception for data migrations.
Question 7: What is the organization’s long term Data Strategy?
It’s important to understand where the organization wants to go in terms of data strategy, architecture, governance and policy. For example
- If the organization is going though initiative to move infrastructure to the cloud, perhaps we should look at cloud based integration tools.
- If the organization is looking to move towards API based or ESB based integrations perhaps we should be looking at MuleSoft or something similar.
Ask for all documents related to corporate data governance.
Question 8: What are the organization’s Data Security Policies?
In my experience, companies never compromise on their Data Security Policies, and these can have significant impact to our design and the tool selected. For example:
- Some companies have security zones where some zones are not allowed to be open to the internet. With Salesforce being a cloud system, we can’t directly integrate with any system living in those zones. So, we will have to design around it – and get our design approved by the organizations InfoSec team.
- If we are performing a data migration, will security policy allow for a data backup to be taken off site? Can a production backup be restored to a dev server? Must the data be anonymized in certain environments?
- Some companies require that all data be encrypted at rest (and of course in transit), if we are staging data we need to ensure our staging area supports at rest encryption and that its enabled.
- Many organizations require that all tools used within the organization pass a security review and a vendor due diligence (they don’t want to use a vendor that will go under in six weeks). It’s important to understand if this is the case, what the vetting process is, and how long it takes. I have recommended an existing tool, even though it was not the ideal choice, just to avoid the vetting process as it would put us at risk of missing deadlines.
Question 9: Can we have a copy of your NOC’s requirements for deploying and monitoring Jobs?
Often an organization’s NOC (Network Operations Center), is the front line support for all in prod systems, they often have requirements that must be met before any system can be put into production. Requirements like needed documentation (runbooks etc.) or requirements revolving around error handling & logging, as well as mandates on things like which scheduling software must be used. We need to ensure the tool selected supports all of these requirements.
The 9 questions featured in this article are a good place for you to start when gathering requirements for your ETL Tool/Middleware, to guide you in the right direction.
These FAQs are adapted from my book “Developing Data Migrations and Integrations with Salesforce: Patterns and Best Practices” (Apress December 2018)
Have a question you would like to see as a part of my FAQ blog series? Email it to me! – [email protected]