
6 Ways to Extract Data from Salesforce Data Cloud

By Timo Kovala

It’s no secret that Salesforce wants to handle as much of your customer data as possible. After all, that has been the company’s winning strategy ever since its inception.

Data Cloud is the company’s fastest-growing platform in terms of year-on-year revenue growth, and it extends Salesforce well beyond its previous position as your trusted CRM system. With 130+ native connectors to stream data in (as of January 2025), Data Cloud is capable of ingesting, unifying, harmonizing, and enriching virtually any kind of data, structured or unstructured, to be used within your Salesforce architecture. Needless to say, Data Cloud is a powerhouse of data ingestion.

That’s all well and good, but what if you want to get data out of Data Cloud? There are times when you cannot, or choose not to, act upon data within the Salesforce platform. Take, for example, analytics tools like Power BI or Looker Studio, or martech platforms like Adobe Campaign or HubSpot. In these cases, Data Cloud still has you covered, but you need some creativity to make it work. Next, we will explore six ways of exporting data from Salesforce Data Cloud to third-party platforms.

1. Data Activations

Data activation in Salesforce Data Cloud refers to the process of utilizing unified, enriched, and real-time customer data for specific business actions or interactions across different channels, platforms, or systems. It enables businesses to ‘activate’ their data by sending it to downstream systems, like Marketing Cloud Engagement, Sales Cloud, or analytics applications, to deliver personalized customer experiences or actionable insights.

The payload of a data activation is a segment, along with its associated attributes. A segment can have three types of attributes: direct attributes, related attributes, and calculated insights. A direct attribute is included in the DMO being segmented on and can only have one data point per segment member (e.g. contact phone or email). Related attributes can have a many-to-one relation with the segmented DMO (e.g. past purchases or email engagement). Calculated insights are custom-built calculations (e.g. customer lifetime value or customer rating). Without these attributes, a segment would essentially be just a list of individuals along with their contact information. Attributes provide added context for content personalization, triggering, and subsegmentation on the target platform.

Data activation is the extraction option of choice when working with external ad audiences or customer segments for sales or support initiatives. It is relatively simple to configure, although building the segment itself does have some pitfalls to avoid.

The biggest limitation of data activations is latency. By default, segments refresh every 12 or 24 hours. You can also build up to 20 rapid segments that cut the refresh schedule down to one or four hours. This is still far from real-time, but it should be acceptable for most external segment activation use cases. You wouldn’t need to refresh a Meta or Google Ads audience more frequently than hourly, for instance.
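
To make the hand-off concrete, here is a minimal sketch of what the receiving side might look like, assuming a cloud file storage (Amazon S3) activation target. The bucket name, key prefix, and column names are illustrative assumptions, not Data Cloud defaults – check your own activation configuration for the actual file layout.

```python
# Minimal sketch: pick up a segment activation file from an (assumed) Amazon S3
# activation target and hand the records to a downstream platform.
# Bucket, prefix, and column names are illustrative, not Data Cloud defaults.
import csv
import io

import boto3

s3 = boto3.client("s3")
BUCKET = "my-activation-bucket"  # assumption: configured as the activation target
PREFIX = "data-cloud/segments/high-value-customers/"

def latest_activation_file(bucket: str, prefix: str) -> str:
    """Return the key of the most recently written activation file."""
    response = s3.list_objects_v2(Bucket=bucket, Prefix=prefix)
    objects = response.get("Contents", [])
    if not objects:
        raise RuntimeError("No activation files found yet")
    return max(objects, key=lambda o: o["LastModified"])["Key"]

def read_segment_members(bucket: str, key: str) -> list[dict]:
    """Parse the activated CSV into a list of per-member attribute dicts."""
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    return list(csv.DictReader(io.StringIO(body)))

if __name__ == "__main__":
    key = latest_activation_file(BUCKET, PREFIX)
    for member in read_segment_members(BUCKET, key):
        # e.g. push the email plus attributes to the ad or campaign platform of your choice
        print(member.get("Email"), member.get("CustomerLifetimeValue"))
```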

2. Data Actions

Data actions are a close cousin to data activations. Both are methods of ‘activating’ data to Salesforce or an external platform. Instead of segments, data actions push events based on changes in data model objects or calculated insights to the target platform. These events can then be used to trigger automated processes or marketing workflows, or to feed real-time insights into a data lake. Use data actions when a single, immediate action needs to happen based on customer activity or real-time updates. Conversely, data actions are not suitable for bulk updates or handling large datasets.

The thing that confused me at first was that you could use both data activations and data actions with Marketing Cloud Engagement. Both options have the same output – a data extension that can be used in a journey or an automation. The main difference is that data activation is run on a schedule, whereas data actions are near real-time. The downside of a data action is that the payload is simpler and event-based, limiting your options for dynamic content and personalization.

| Scenario | Use Data Actions | Use Data Activations |
| --- | --- | --- |
| Real-time engagement | Send data immediately to trigger a journey or workflow (e.g. cart abandonment email). | Not suitable – better for pre-scheduled updates. |
| Batch segment updates | Not suitable – event-based data is difficult to use for updating large datasets. | Use for exporting and updating large audience segments. |
| Event-driven triggers | Ideal when triggering journeys based on customer actions or changes in behavior. | Not suitable – better for predefined, static updates. |
| High-volume data updates | Limited scalability for bulk data movements. | Better suited for bulk updates or large datasets. |
| Dynamic segmentation | Use when a customer qualifies for a segment and an immediate trigger is required. | Use for ongoing, scheduled updates to campaigns. |
| Custom workflows or logic | Useful for conditional logic or highly specific workflows that involve external systems. | Not designed for highly custom workflows. |
READ MORE: Types of Data Targets in Data Cloud (Activations vs. Data Actions)
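
To illustrate the event-based nature of data actions, here is a minimal sketch of a webhook receiver that a data action target could point at. The payload field names are assumptions for illustration – inspect an actual data action payload to see the exact event schema your org sends.

```python
# Minimal sketch of a webhook endpoint that a Data Cloud data action target could call.
# Field names in the payload are illustrative assumptions, not the documented schema.
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/data-cloud/events")
def handle_data_action():
    event = request.get_json(force=True)

    # React to a single, immediate signal – e.g. an abandoned cart
    event_type = event.get("eventType")              # assumed field name
    customer_id = event.get("unifiedIndividualId")   # assumed field name

    if event_type == "CartAbandoned" and customer_id:
        trigger_cart_abandonment_email(customer_id)

    return jsonify({"status": "accepted"}), 202

def trigger_cart_abandonment_email(customer_id: str) -> None:
    # Placeholder for a call into your marketing or messaging platform
    print(f"Triggering cart abandonment email for {customer_id}")

if __name__ == "__main__":
    app.run(port=8080)
```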

3. Data Cloud-Triggered Flows

Flow is Salesforce’s go-to declarative automation tool, and the fact that Data Cloud DMOs can trigger flows is a huge boon for admins. Data Cloud combines unstructured, semi-structured, and structured data from various sources into a flexible data model, and Flow allows you to trigger processes in your CRM without having to make changes to your Salesforce data model. Let’s say you have a calculated insight for identifying propensity to buy, and you want to trigger a series of actions in Sales Cloud when that score reaches a threshold. Flow is likely your best bet for a scalable, real-time solution.

There is one aspect of Flow that is particularly useful for extracting data from Data Cloud: the HTTP Callout action. The Flow HTTP Callout is one of the few ways admins can make calls to external platforms declaratively. In Data Cloud’s context, this means that non-developers can set up real-time integrations based on DMOs and calculated insights, without code. This opens up pathways for powering automations in external systems with rich, unified data. Think: updating points on an external loyalty platform, or adding calculated insights to an external customer database.
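
As an illustration of the receiving end, here is a hypothetical loyalty-platform endpoint that a Data Cloud-triggered flow could POST to via the HTTP Callout action. The path, field names, and points logic are assumptions, not any specific vendor’s API; the point of the sketch is that the callout works best against an endpoint with a stable, well-defined schema.

```python
# Hypothetical loyalty-platform endpoint that a Data Cloud-triggered flow could call
# via Flow's HTTP Callout action. Paths, field names, and the points logic are
# illustrative assumptions, not any specific vendor's API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI(title="Loyalty API")

class PointsUpdate(BaseModel):
    customer_id: str         # e.g. the unified individual ID passed along by the flow
    propensity_score: float  # e.g. a calculated insight value

class PointsResult(BaseModel):
    customer_id: str
    bonus_points: int

@app.post("/loyalty/points", response_model=PointsResult)
def update_points(update: PointsUpdate) -> PointsResult:
    # Award bonus points when the propensity-to-buy score crosses a threshold
    bonus = 500 if update.propensity_score >= 0.8 else 0
    return PointsResult(customer_id=update.customer_id, bonus_points=bonus)

# Run with: uvicorn loyalty_api:app --reload
# FastAPI publishes an OpenAPI spec at /openapi.json, which can be handy when
# registering the endpoint for use with the Flow HTTP Callout.
```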

READ MORE: Salesforce Data Cloud-Triggered Flow: A Quick Guide

4. Data Shares

Ever since Dreamforce ‘23, the terms zero-copy, zero-ETL, and Bring Your Own Lake (BYOL) have become synonymous with Data Cloud. These three terms essentially mean the same thing: you bring your own database, plug it in, and Data Cloud is able to reference it without needing to replicate or transform it in between. I see this as ultimately what drove the name change from Salesforce Customer Data Platform (CDP) to Data Cloud. Without its zero-copy/zero-ETL architecture, Data Cloud would have remained a run-of-the-mill CDP with a Salesforce logo on it. I’m glad they went with the approach they did.

Behind the scenes, what powers these zero-copy integrations is a feature called data share. Salesforce has partnered up with key players in the data lake space: Amazon, Databricks, Google, and Snowflake (as of January 2025). With these partners, Data Cloud is able to read and write data on an external data lake or warehouse without the need to move that data. This means that you gain all the benefits of Data Cloud (e.g. profile unification, data enrichment, and enablement of autonomous AI agents), without incurring the cost or issues associated with data replication. These include transaction and storage costs, inconsistencies in data, potential sync errors, and the risk of poor data trust.

Data share should be your extraction method of choice whenever it is available to you. In general, you want to avoid replicating data of any sort whenever possible. The catch is that when drawing data directly from the source, latency (i.e. non-real-time access) often becomes an issue – and Data Cloud is one of the few data platforms capable of delivering zero-copy access at scale. The only major limitation is the number of official zero-copy partners. Luckily, since Data Cloud is Salesforce’s no. 1 priority alongside Agentforce, this list is likely to grow in the coming months.
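
For a sense of what consuming a data share looks like from the warehouse side, here is a minimal sketch of querying a Data Cloud data share from Snowflake in Python. The account, credentials, and object names are placeholders – the share has to be mounted as a database in Snowflake first, and the exposed object names will follow your own Data Cloud setup.

```python
# Minimal sketch: query a Data Cloud data share that has been mounted as a
# database in Snowflake. Connection details and object names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="my_account",    # placeholder
    user="my_user",          # placeholder
    password="my_password",  # use key-pair or SSO auth in practice
    warehouse="ANALYTICS_WH",
)

try:
    cur = conn.cursor()
    # The shared database/schema/table names depend on how the share was mounted
    cur.execute(
        """
        SELECT Id__c, CustomerLifetimeValue__c
        FROM DATA_CLOUD_SHARE.UNIFIED.UNIFIED_INDIVIDUAL__DLM
        LIMIT 100
        """
    )
    for row in cur.fetchall():
        print(row)
finally:
    conn.close()
```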

READ MORE: Snowflake and Salesforce Data Cloud: A Practical Guide

5. MuleSoft Anypoint Connector

MuleSoft is a Salesforce-owned integration platform that enables businesses to connect their applications, data, and devices seamlessly. Its flagship product, Anypoint Platform, provides tools for designing, building, deploying, and managing APIs and integrations. MuleSoft is well-suited for creating a network of reusable APIs that streamline business processes and enable data exchange between systems. MuleSoft enhances the power of Data Cloud by serving as the integration layer that connects it to external systems and data sources.

The crucial element that MuleSoft offers over third-party integration platforms is the ability to not only ingest data but also natively orchestrate workflows with Data Cloud. MuleSoft’s real-time bidirectional data sync ensures that integrated platforms always work with the most up-to-date insights and unified profile data. This is key, especially for personalized customer-facing processes. In addition, MuleSoft is a Salesforce product – its native connector ensures faster time-to-value and better reliability compared to other integration platforms.

Whether or not to use MuleSoft for Data Cloud-based orchestration hinges on your existing integration architecture. If MuleSoft is already in use and is the preferred tool over others, like Azure or Boomi, it’s likely your best option. If you don’t currently use an integration platform, MuleSoft is a strong contender, especially for enterprise-level businesses. On the other hand, adding to or replacing your existing integration platforms with MuleSoft is a tougher discussion.

READ MORE: What Is MuleSoft? (+ How It Works With Salesforce)

6. APIs and Webhooks

Going with Salesforce’s mantra of ‘clicks, not code’, direct API integrations should be your option of last resort. API calls count against your org’s limits and place an added burden on your system architecture and data governance, so use them with care. You should always try native connectors and built-in functionality first before going the programmatic route. Data Cloud comes equipped with several useful APIs for your custom integration needs. In addition, Data Cloud provides custom webhooks as data action targets for event-based integrations.

APIs enable retrieval and querying of unified customer profiles and other DMOs stored in Data Cloud. API integrations become complex when several datasets need to be retrieved and combined. This is where Data Cloud’s data graphs feature comes into play. With data graphs, you connect data from several related DMOs into a single view, which is automatically kept up to date. This improves query performance and decreases the number of required API calls.

Understanding which API to use is key to achieving the desired result – refer to the table below for short descriptions and use cases for each Data Cloud API.

| API | Purpose | Example Use Cases |
| --- | --- | --- |
| Profile API | Retrieve data stored in Profile DMOs. | Export customer data to CRMs, analytics platforms, or marketing tools. |
| Query API | Extract data stream, DMO, and unified DMO data via SQL queries. | Large-volume queries, external app integrations, or on-demand querying by external data platforms. |
| Calculated Insights API | Retrieve calculated insights with SQL. | Enrich an external customer database or analytics platform with custom metrics or dimensions. |
| Data Graph API | Query data and metadata from data graphs. | Build integrations with Data Cloud requiring relational datasets in near real time. |
| Metadata API | Retrieve metadata on calculated insights, DMOs, and their relationships to other objects. | Enable multi-platform data governance by extracting information on original data sources, segmentation criteria, and field mappings. |
| Webhook Data Action Target | Send HTTP requests based on changes in DMOs and calculated insights. | Set up event-based triggers, notifications, or alerts to external systems. |
READ MORE: Introducing Data Graphs (In Data Cloud)
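
As a concrete example of the programmatic route, here is a minimal sketch of calling the Query API from Python. The endpoint path and payload shape follow the Query API v2 pattern as I understand it, and the instance URL, token, and object names are placeholders – verify everything against the current Data Cloud API documentation and your authentication setup.

```python
# Minimal sketch: run a SQL query against Data Cloud via the Query API.
# Instance URL, token acquisition, and object/field names are placeholders;
# verify the endpoint, payload, and response shape against the current docs.
import requests

DATA_CLOUD_INSTANCE = "https://your-instance.c360a.salesforce.com"  # placeholder
ACCESS_TOKEN = "<data-cloud-access-token>"  # obtained via the OAuth token exchange

def run_query(sql: str) -> list:
    response = requests.post(
        f"{DATA_CLOUD_INSTANCE}/api/v2/query",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": "application/json",
        },
        json={"sql": sql},
        timeout=30,
    )
    response.raise_for_status()
    # The rows are returned under "data"; check the docs for the exact structure
    return response.json().get("data", [])

if __name__ == "__main__":
    rows = run_query(
        "SELECT ssot__Id__c, ssot__FirstName__c "
        "FROM ssot__Individual__dlm LIMIT 10"
    )
    for row in rows:
        print(row)
```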

Summary: Choosing the Right Method Is Key

In this article, we covered six different methods for extracting data from Data Cloud – that is, making data stored in DMOs and calculated insights accessible to external platforms. In most cases, all you need is a segment for an ad campaign or a trigger for a workflow. If your use case is more complex, a custom connection is required.

Here’s where integration platforms like MuleSoft can help. With or without an integration platform, knowing which API to use is key, as there is some overlap. With Data Cloud, there are often several paths to the same end result. As always, try to solve your requirements with the simplest, no-code solution whenever possible.

The Author

Timo Kovala

Timo is a Marketing Architect at Capgemini, working with enterprises and NGOs to ensure a sound marketing architecture and user adoption. He is certified in Salesforce, Marketing Cloud Engagement, and Account Engagement.

Comments:

    Jose S
    January 17, 2025 5:27 pm
You can also use the JDBC or Python connectors.
