Salesforce — Historical data

This connector captures data from Salesforce objects into Flow collections. It uses batch processing and is ideal for syncing your historical Salesforce data.

A separate connector is available for real-time Salesforce data capture. For help using both connectors in parallel, contact your Estuary account manager.

This connector is available for use in the Flow web application. For local development or open-source workflows, ghcr.io/estuary/source-salesforce:dev provides the latest connector image. You can also follow the link in your browser to see past image versions.

This connector is based on an open-source connector from a third party, with modifications for performance in the Flow system. You can find their documentation here, but keep in mind that the two versions may be significantly different.

Supported data resources

This connector can capture the following Salesforce standard objects, if present in your account:

  • Account
  • Contact
  • User
  • OpportunityFieldHistory
  • LeadHistory
  • Opportunity
  • Campaign
  • Case
  • ContractLineItem
  • Entitlement
  • Lead
  • LiveChatTranscript
  • MessagingSession
  • Quote
  • QuoteLineItem
  • ServiceAppointment
  • ServiceContract
  • Task
  • UserServicePresence
  • WorkOrder
  • WorkOrderLineItem

Custom objects aren't currently supported. Each captured object is mapped to a Flow collection through a separate binding.

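Because most Salesforce accounts contain large volumes of data, you may only want to capture a subset of the available objects. There are two ways to control this:

  • Create a dedicated read-only Salesforce user whose profile has read access to only the objects you want to capture (see Setup).
  • Filter objects with the streams_criteria property when you configure the connector (see Configuration).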

Prerequisites

Using OAuth2 to authenticate with Salesforce in the Flow web app

If you're using the Flow web app, you'll be prompted to authenticate with Salesforce using OAuth. You'll need the credentials of a Salesforce user, such as the dedicated read-only user described under Setup.

Configuring the connector specification manually

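If you're working with flowctl and writing specifications in a local development environment, you'll need to manually supply OAuth credentials. You'll need the client ID, client secret, and refresh token of a Salesforce developer application, which you obtain as described under Setup.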

Setup

Create a read-only Salesforce user

Creating a dedicated read-only Salesforce user is a simple way to specify which objects Flow will capture. This is useful if you have a large amount of data in your Salesforce organization.

  1. While signed in as an administrator, create a new profile by cloning the standard Minimum Access profile.

  2. Edit the new profile's permissions. Grant it read access to all the objects you'd like to capture with Flow.

  3. Create a new user, applying the profile you just created. You'll use this user's email address and password to authenticate Salesforce in Flow.

Create a developer application and generate authorization tokens

To manually write a capture specification for Salesforce, you need to create and configure a developer application. Through this process, you'll obtain the client ID, client secret, and refresh token.

  1. Create a new developer application.

    a. When selecting Scopes for your app, select Manage user data via APIs (api), Perform requests at any time (refresh_token, offline_access), and Manage user data via Web browsers (web).

  2. Edit the app to ensure that Permitted users is set to All users may self-authorize.

  3. Locate the Consumer Key and Consumer Secret. These are equivalent to the client ID and client secret, respectively.

  4. Follow the Salesforce Web Server Flow. The final POST response will include your refresh token.
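
The web server flow in step 4 comes down to two HTTP steps: sending a user to an authorization URL, then exchanging the returned authorization code for tokens with a POST. The sketch below only builds the URL and the request without sending anything. The endpoint paths and parameter names reflect the standard Salesforce OAuth 2.0 web server flow. The helper names, the redirect URI, and all credential values are hypothetical placeholders, so confirm the details against the Salesforce documentation before relying on it.

```python
from urllib.parse import urlencode
from urllib.request import Request


def login_host(is_sandbox: bool = False) -> str:
    # Sandbox organizations authenticate against test.salesforce.com.
    return "https://test.salesforce.com" if is_sandbox else "https://login.salesforce.com"


def build_authorize_url(client_id: str, redirect_uri: str, is_sandbox: bool = False) -> str:
    """URL to open in a browser; Salesforce redirects back with ?code=..."""
    params = {
        "response_type": "code",
        "client_id": client_id,        # the app's Consumer Key
        "redirect_uri": redirect_uri,  # must match the app's callback URL
    }
    return f"{login_host(is_sandbox)}/services/oauth2/authorize?{urlencode(params)}"


def build_token_request(code: str, client_id: str, client_secret: str,
                        redirect_uri: str, is_sandbox: bool = False) -> Request:
    """Final POST of the flow; it is built here but not sent."""
    body = urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "client_id": client_id,
        "client_secret": client_secret,  # the app's Consumer Secret
        "redirect_uri": redirect_uri,
    }).encode("utf-8")
    return Request(
        f"{login_host(is_sandbox)}/services/oauth2/token",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
```

Sending the request, for example with urllib.request.urlopen, returns the final POST response that contains the refresh token.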

Configuration

You configure connectors either in the Flow web app or by directly editing the Flow specification file. See connectors to learn more about using connectors. The values and specification sample below provide configuration details specific to the batch Salesforce source connector.

Properties

Endpoint

The properties in the table below reflect the manual authentication method. If you're working in the Flow web app, you'll use OAuth2, so you won't need the /credentials values listed here.

| Property | Title | Description | Type | Required/Default |
|---|---|---|---|---|
| /credentials | | | object | Required |
| /credentials/auth_type | Authorization type | Set to Client | string | |
| /credentials/client_id | Client ID | The Salesforce Client ID, also known as a Consumer Key, for your developer application. | string | Required |
| /credentials/client_secret | Client Secret | The Salesforce Client Secret, also known as a Consumer Secret, for your developer application. | string | Required |
| /credentials/refresh_token | Refresh Token | The refresh token generated by your developer application. | string | Required |
| /is_sandbox | Sandbox | Whether you're using a Salesforce Sandbox. | boolean | false |
| /start_date | Start Date | Start date in the format YYYY-MM-DD. Data added on and after this date will be captured. If this field is blank, all data will be captured. | string | |
| /streams_criteria | Filter Salesforce Objects (Optional) | Filter Salesforce objects for capture. | array | |
| /streams_criteria/-/criteria | Search criteria | Possible criteria are "starts with", "ends with", "contains", "exacts", "starts not with", "ends not with", "not contains", and "not exacts". | string | "contains" |
| /streams_criteria/-/value | Search value | Search term used with the selected criterion to filter objects. | string | |
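
To make the eight streams_criteria options concrete, here is a minimal sketch of how such a filter could be evaluated against object names. It is an illustration of what the criterion names suggest, not the connector's actual code. The function names are hypothetical, and two behaviors are assumptions: matching is case-insensitive, and an object is kept if it matches any one of the listed filters.

```python
# Each criterion maps to a predicate over (object name, search term).
CRITERIA = {
    "starts with": lambda name, term: name.startswith(term),
    "ends with": lambda name, term: name.endswith(term),
    "contains": lambda name, term: term in name,
    "exacts": lambda name, term: name == term,
    "starts not with": lambda name, term: not name.startswith(term),
    "ends not with": lambda name, term: not name.endswith(term),
    "not contains": lambda name, term: term not in name,
    "not exacts": lambda name, term: name != term,
}


def matches(stream_name: str, criteria: str = "contains", value: str = "") -> bool:
    # ASSUMPTION: comparison ignores case.
    return CRITERIA[criteria](stream_name.lower(), value.lower())


def filter_streams(stream_names, streams_criteria):
    # ASSUMPTION: an empty filter keeps everything, and a name is kept
    # when at least one filter entry matches it.
    if not streams_criteria:
        return list(stream_names)
    return [
        name for name in stream_names
        if any(matches(name, c.get("criteria", "contains"), c["value"])
               for c in streams_criteria)
    ]


objects = ["Account", "WorkOrder", "WorkOrderLineItem", "QuoteLineItem"]
selected = filter_streams(objects, [{"criteria": "starts with", "value": "Work"}])
```

With the filter used in this example, selected holds WorkOrder and WorkOrderLineItem.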

Bindings

| Property | Title | Description | Type | Required/Default |
|---|---|---|---|---|
| /cursorField | Cursor field | Field used as a cursor to track data replication; typically a timestamp field. | array, null | |
| /stream | Stream | Salesforce object from which a collection is captured. | string | Required |
| /syncMode | Sync Mode | Connection method. | string | Required |
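
The role of a cursor field can be illustrated with a small sketch of cursor-based incremental replication in general: only records whose cursor value is newer than the last one seen are emitted, and the cursor then advances. This is not the connector's implementation. The function name and sample records are hypothetical, and the sketch assumes the cursor is a single top-level field holding ISO 8601 timestamps in one consistent format, so that string comparison matches time order.

```python
def incremental_sync(records, cursor_field, last_cursor=None):
    """Return (new_records, new_cursor) for one sync pass."""
    key = cursor_field[0]  # cursorField is an array; use its first element
    fresh = [r for r in records if last_cursor is None or r[key] > last_cursor]
    fresh.sort(key=lambda r: r[key])
    # Keep the previous cursor when nothing new arrived.
    new_cursor = fresh[-1][key] if fresh else last_cursor
    return fresh, new_cursor


rows = [
    {"Id": "a", "SystemModstamp": "2022-01-02T00:00:00Z"},
    {"Id": "b", "SystemModstamp": "2022-01-05T00:00:00Z"},
]

# First pass: everything is new, and the cursor moves to the latest timestamp.
first, cursor = incremental_sync(rows, ["SystemModstamp"])

# Second pass: only records modified after the stored cursor are emitted.
rows.append({"Id": "c", "SystemModstamp": "2022-01-09T00:00:00Z"})
second, cursor = incremental_sync(rows, ["SystemModstamp"], cursor)
```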

Sample

This sample specification reflects the manual authentication method.

captures:
  ${PREFIX}/${CAPTURE_NAME}:
    endpoint:
      connector:
        image: ghcr.io/estuary/source-salesforce:dev
        config:
          credentials:
            auth_type: Client
            client_id: {your_client_id}
            client_secret: {secret}
            refresh_token: {XXXXXXXX}
          is_sandbox: false
          start_date: 2022-01-01
          streams_criteria:
            - criteria: "starts with"
              value: "Work"
    bindings:
      - resource:
          cursorField: [SystemModstamp]
          stream: WorkOrder
          syncMode: incremental
        target: ${PREFIX}/WorkOrder
      - resource:
          cursorField: [SystemModstamp]
          stream: WorkOrderLineItem
          syncMode: incremental
        target: ${PREFIX}/WorkOrderLineItem
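
When you capture many objects, writing one binding per object by hand gets repetitive. The sketch below generates binding entries in the same shape as the sample. The helper name and the acmeCo prefix are hypothetical, and it assumes every listed object can use SystemModstamp as its cursor, which you should verify for each object. The output is JSON, which is also valid YAML, so it can be adapted into a specification file.

```python
import json


def make_bindings(prefix, streams, cursor_field="SystemModstamp"):
    """Build one binding per Salesforce object, mirroring the sample's shape."""
    return [
        {
            "resource": {
                "cursorField": [cursor_field],
                "stream": stream,
                "syncMode": "incremental",
            },
            "target": f"{prefix}/{stream}",
        }
        for stream in streams
    ]


bindings = make_bindings("acmeCo", ["WorkOrder", "WorkOrderLineItem"])
print(json.dumps(bindings, indent=2))
```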