Google Analytics Data API
This connector captures data from Google Analytics 4 properties into Flow collections via the Google Analytics Data API.
It’s available for use in the Flow web application. For local development or open-source workflows, ghcr.io/estuary/source-google-analytics-data-api-native:dev provides the latest version of the connector as a Docker image. You can also follow the link in your browser to see past image versions.
Supported data resources
The following data resources are supported:
- Daily active users
- Devices
- Four-weekly active users
- Locations
- Pages
- Traffic sources
- Website overview
- Weekly active users
Each is fetched as a report and mapped to a Flow collection through a separate binding.
You can also capture custom reports.
Prerequisites
To use this connector, you'll need:
- The Google Analytics Data API enabled on the Google project with which your Analytics property is associated. (Unless you actively develop with Google Cloud, you'll likely have just one option.)
- Your Google Analytics 4 property ID.
Authentication
Your Google username and password are required to authenticate the connector using OAuth2.
Configuration
You configure connectors either in the Flow web app, or by directly editing a specification file. See connectors to learn more about using connectors. The values and specification sample below provide configuration details specific to the Google Analytics Data API source connector.
Properties
Endpoint
The following properties reflect the manual authentication method. If you authenticate directly with Google in the Flow web app, some of these properties aren't required.
Property | Title | Description | Type | Required/Default |
---|---|---|---|---|
/property_id | Property ID | The identifier of the Google Analytics 4 property whose events are tracked. | string | Required |
/custom_reports | Custom Reports | A JSON array describing the custom reports you want to sync from Google Analytics. Learn more about custom reports. | string | |
/start_date | Start Date | The date from which you'd like to replicate data, in the format YYYY-MM-DDT00:00:00Z. All data generated after this date will be replicated. | string | Defaults to 30 days before the present |
/credentials | Credentials | Credentials for the service | object | |
/credentials/credentials_title | Authentication Method | Set to OAuth Credentials. | string | Required |
/credentials/client_id | OAuth Client ID | The OAuth app's client ID. | string | Required |
/credentials/client_secret | OAuth Client Secret | The OAuth app's client secret. | string | Required |
/credentials/refresh_token | Refresh Token | The refresh token received from the OAuth app. | string | Required |
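To illustrate the Start Date format from the table above, here is a minimal Python sketch that produces a value 30 days before a given moment in the form YYYY-MM-DDT00:00:00Z. The function name `default_start_date` is hypothetical, and truncating to midnight UTC is an assumption; the connector's exact default calculation is not documented here.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def default_start_date(now: Optional[datetime] = None, days_back: int = 30) -> str:
    """Return midnight UTC `days_back` days before `now` as YYYY-MM-DDT00:00:00Z.

    Assumption: the date is truncated to midnight UTC to match the documented format.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    # Normalize to UTC, step back the requested number of days, keep only the date.
    start = (now.astimezone(timezone.utc) - timedelta(days=days_back)).date()
    return f"{start.isoformat()}T00:00:00Z"


# Example with a fixed reference time so the result is reproducible.
reference = datetime(2025, 3, 9, 17, 0, tzinfo=timezone.utc)
start_date = default_start_date(reference)
```

The resulting string can be pasted into the Start Date property if you want to set it explicitly rather than rely on the default.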
Bindings
Property | Title | Description | Type | Required/Default |
---|---|---|---|---|
/name | Data resource | Name of the data resource. | string | Required |
/interval | Interval | Interval between data syncs, as an ISO 8601 duration such as PT5M. | string | |
Custom reports
You can include data beyond the default data resources with Custom Reports. These replicate the functionality of Custom Reports in the Google Analytics Web console.
Set the Custom Reports property to a JSON array, serialized as a string, with the following schema:
[{"name": "<report-name>", "dimensions": ["<dimension-name>", ...], "metrics": ["<metric-name>", ...]}]
Filters are also supported. See Google's documentation for examples of filters and valid filter syntax.
[{"name": "<report-name>", "dimensions": ["<dimension-name>", ...], "metrics": ["<metric-name>", ...], "dimensionFilter": "<filter-object>", "metricFilter": "<another-filter-object>"}]
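Because the property takes a JSON array encoded as a string, it can be easier to build the reports as native data structures and serialize them, rather than writing the string by hand. The following Python sketch does that. The helper name `build_custom_reports` is hypothetical, and treating `name`, `dimensions`, and `metrics` as mandatory keys is an assumption based on the schema shown above.

```python
import json


def build_custom_reports(reports):
    """Check each report for the keys shown in the schema, then serialize to a JSON string."""
    for report in reports:
        for key in ("name", "dimensions", "metrics"):
            # Assumption: every report needs these three keys, as in the schema above.
            if key not in report:
                raise ValueError(f"custom report is missing {key!r}: {report}")
    return json.dumps(reports)


reports = [
    {
        "name": "my_custom_report_with_a_filter",
        "dimensions": ["browser"],
        "metrics": ["totalUsers"],
        # Optional filter object; see Google's documentation for valid filter syntax.
        "dimensionFilter": {
            "filter": {"fieldName": "browser", "stringFilter": {"value": "Chrome"}}
        },
    }
]

# This string is what goes into the Custom Reports property.
custom_reports = build_custom_reports(reports)
print(custom_reports)
```

Serializing with `json.dumps` avoids quoting mistakes in nested filter objects, which are easy to make when editing the string directly.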
Sample
This sample reflects the manual authentication method.
captures:
  ${PREFIX}/${CAPTURE_NAME}:
    endpoint:
      connector:
        image: ghcr.io/estuary/source-google-analytics-data-api-native:dev
        config:
          custom_reports: '[{"name": "my_custom_report_with_a_filter", "dimensions": ["browser"], "metrics": ["totalUsers"], "dimensionFilter": {"filter": {"fieldName": "browser", "stringFilter": {"value": "Chrome"}}}}]'
          credentials:
            credentials_title: OAuth Credentials
            client_id: <secret>
            client_secret: <secret>
            refresh_token: <secret>
          start_date: "2025-02-07T17:00:00Z"
          property_id: "123456789"
    bindings:
      - resource:
          name: daily_active_users
          interval: PT5M
        target: ${PREFIX}/daily_active_users
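Before publishing a hand-edited specification, it can help to check the endpoint config against the Required column of the Endpoint table. The sketch below is one way to do that in Python, with the config represented as a plain dictionary. The names `REQUIRED_POINTERS`, `lookup`, and `missing_required` are hypothetical, and the check applies only to the manual authentication method; it is not part of the connector.

```python
# Properties marked Required in the Endpoint table (manual authentication method).
REQUIRED_POINTERS = [
    "/property_id",
    "/credentials/credentials_title",
    "/credentials/client_id",
    "/credentials/client_secret",
    "/credentials/refresh_token",
]


def lookup(config, pointer):
    """Resolve a simple slash-separated pointer; return None if any segment is absent."""
    node = config
    for part in pointer.strip("/").split("/"):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node


def missing_required(config):
    """List the required pointers that are absent or empty in `config`."""
    return [p for p in REQUIRED_POINTERS if lookup(config, p) in (None, "")]


# Mirrors the `config` section of the sample, with placeholder secrets.
config = {
    "property_id": "123456789",
    "start_date": "2025-02-07T17:00:00Z",
    "credentials": {
        "credentials_title": "OAuth Credentials",
        "client_id": "<secret>",
        "client_secret": "<secret>",
        "refresh_token": "<secret>",
    },
}

problems = missing_required(config)
```

An empty `problems` list means every required property is present; otherwise it names the pointers to fill in.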
Performance considerations
Data sampling
The Google Analytics Data API enforces compute thresholds for ad-hoc queries and reports. If a threshold is exceeded, the API applies sampling to limit the number of sessions analyzed for the specified time range. Google's documentation lists these thresholds. If your account is on the Analytics 360 tier, you're less likely to run into these limitations.