
Firebolt

This Flow connector materializes delta updates of Flow collections into Firebolt FACT or DIMENSION tables.

To interface between Flow and Firebolt, the connector uses Firebolt's method for loading data: First, it stores data as JSON documents in an S3 bucket. It then references the S3 bucket to create a Firebolt external table, which acts as a SQL interface between the JSON documents and the destination table in Firebolt.
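
As a rough illustration of what that external table looks like, a hand-written equivalent might resemble the sketch below. The bucket, prefix, columns, and credentials are hypothetical placeholders; the connector generates and manages the real definition based on your collection's schema.

-- Hypothetical sketch of a Firebolt external table over the staged JSON files.
-- The connector derives the actual columns from the Flow collection's schema.
CREATE EXTERNAL TABLE IF NOT EXISTS table_name_external (
  id TEXT,
  name TEXT
)
CREDENTIALS = (AWS_KEY_ID = '...' AWS_SECRET_KEY = '...')
URL = 's3://my-bucket/my-prefix/'
OBJECT_PATTERN = '*.json'
TYPE = (JSON);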

It is available for use in the Flow web application. For local development or open-source workflows, ghcr.io/estuary/materialize-firebolt:dev provides the latest version of the connector as a Docker image. You can also follow the link in your browser to see past image versions.

Prerequisites

To use this connector, you'll need:

  • A Firebolt database with at least one engine
    • The engine must be started before creating the materialization.
    • It's important that the engine stays up throughout the lifetime of the materialization. To ensure this is the case, select Edit Engine on your engine. In the engine settings, set Auto-stop engine after to Always On.
  • An S3 bucket where JSON documents will be stored prior to loading
    • The bucket must be in a supported AWS region matching your Firebolt database.
    • The bucket may be public, or may be accessible by an IAM user. To configure your IAM user, see the steps below.
  • At least one Flow collection
tip

If you haven't yet captured your data from its external source, start at the beginning of the guide to create a dataflow. You'll be referred back to this connector-specific documentation at the appropriate steps.

Setup

For non-public buckets, you'll need to configure access in AWS IAM.

  1. Follow the Firebolt documentation to set up an IAM policy and role, and add it to the external table definition.

  2. Create a new IAM user. During setup:

    1. Choose Programmatic (access key) access. This ensures that an access key ID and secret access key are generated. You'll use these to configure the connector.

    2. On the Permissions page, choose Attach existing policies directly and attach the policy you created in step 1.

  3. After creating the user, download the IAM credentials file. Take note of the access key ID and secret access key and use them to configure the connector. See the Amazon docs if you lose your credentials.

Configuration

To use this connector, begin with data in one or more Flow collections. Use the properties below to configure a Firebolt materialization, which will direct Flow data to your desired Firebolt tables via an external table.

Properties

Endpoint

| Property | Title | Description | Type | Required/Default |
|---|---|---|---|---|
| /aws_key_id | AWS key ID | AWS access key ID for accessing the S3 bucket. | string | |
| /aws_region | AWS region | AWS region the bucket is in. | string | |
| /aws_secret_key | AWS secret access key | AWS secret key for accessing the S3 bucket. | string | |
| /database | Database | Name of the Firebolt database. | string | Required |
| /engine_url | Engine URL | Engine URL of the Firebolt database, in the format: <engine-name>.<organization>.<region>.app.firebolt.io. | string | Required |
| /password | Password | Firebolt password. | string | Required |
| /s3_bucket | S3 bucket | Name of S3 bucket where the intermediate files for the external table will be stored. | string | Required |
| /s3_prefix | S3 prefix | A prefix for files stored in the bucket. | string | |
| /username | Username | Firebolt username. | string | Required |

Bindings

| Property | Title | Description | Type | Required/Default |
|---|---|---|---|---|
| /table | Table | Name of the Firebolt table to store materialized results in. The external table will be named after this table with an _external suffix. | string | Required |
| /table_type | Table type | Type of the Firebolt table to store materialized results in. See the Firebolt docs for more details. | string | Required |

Sample

materializations:
  ${PREFIX}/${mat_name}:
    endpoint:
      connector:
        config:
          database: my-db
          engine_url: my-db-my-engine-name.my-organization.us-east-1.app.firebolt.io
          password: secret
          # For public S3 buckets, only the bucket name is required
          s3_bucket: my-bucket
          username: firebolt-user
        # Path to the latest version of the connector, provided as a Docker image
        image: ghcr.io/estuary/materialize-firebolt:dev
    # If you have multiple collections you need to materialize, add a binding for each one
    # to ensure complete data flow-through
    bindings:
      - resource:
          table: table-name
          table_type: fact
        source: ${PREFIX}/${source_collection}

Delta updates

Firebolt is an insert-only system; it doesn't support updates or deletes. Because of this, the Firebolt connector operates only in delta updates mode. Firebolt stores all deltas — the unmerged collection documents — directly.

In some cases, this will affect how materialized views look in Firebolt compared to other systems that use standard updates.
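
If you want the current state of each document rather than its full history of deltas, you can reduce over the deltas yourself at query time. The sketch below assumes a hypothetical key column id and a timestamp field updated_at in your documents; substitute whatever key and ordering fields your collection actually has.

-- Hypothetical: keep only the most recent delta per id.
-- 'id', 'updated_at', and 'table_name' are assumed names; adjust to your schema.
SELECT *
FROM (
  SELECT
    t.*,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY updated_at DESC) AS rn
  FROM table_name t
) latest
WHERE rn = 1;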

Reserved words

Firebolt has a list of reserved words, which may not be used in identifiers. Collection field names that match a reserved word will automatically be quoted as part of a Firebolt materialization.
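
For example, a field named order is written to a quoted column, so you also quote it when querying (the table name here is hypothetical):

-- "order" is a reserved word, so the column must be double-quoted.
SELECT "order"
FROM table_name;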

The reserved words are:

all, alter, and, array, between, bigint, bool, boolean, both, case, cast, char, concat, copy, create, cross, current_date, current_timestamp, database, date, datetime, decimal, delete, describe, distinct, double, doublecolon, dow, doy, drop, empty_identifier, epoch, except, execute, exists, explain, extract, false, fetch, first, float, from, full, generate, group, having, if, ilike, in, inner, insert, int, integer, intersect, interval, is, isnull, join, join_type, leading, left, like, limit, limit_distinct, localtimestamp, long, natural, next, not, null, numeric, offset, on, only, or, order, outer, over, partition, precision, prepare, primary, quarter, right, row, rows, sample, select, set, show, text, time, timestamp, top, trailing, trim, true, truncate, union, unknown_char, unnest, unterminated_string, update, using, varchar, week, when, where, with