Snowflake

This connector materializes Flow collections into tables in a Snowflake database. It allows both standard and delta updates. Snowpipe Streaming is additionally available for delta update bindings.

The connector first uploads data changes to a Snowflake table stage. From there, it transactionally applies the changes to the Snowflake table.

ghcr.io/estuary/materialize-snowflake:dev provides the latest connector image. You can also follow the link in your browser to see past image versions.

Prerequisites

To use this connector, you'll need:

A Snowflake account that includes:
- A target database, to which you'll materialize data
- A schema — a logical grouping of database objects — within the target database
- A virtual warehouse
- A user with a role assigned that grants the appropriate access levels to these resources.
- The correct timezone setting (see Timestamp Data Type Mapping)
See the script below for details.
Know your Snowflake account's host URL. This is formatted using your Snowflake account identifier, for example, orgname-accountname.snowflakecomputing.com.
At least one Flow collection

tip

If you haven't yet captured your data from its external source, start at the beginning of the guide to create a dataflow. You'll be referred back to this connector-specific documentation at the appropriate steps.

Setup

To meet the prerequisites, copy and paste the following script into the Snowflake SQL editor, replacing the variable names in the first five lines.

If you'd like to use an existing database, warehouse, and/or schema, be sure to set database_name, warehouse_name, and estuary_schema accordingly. If you specify a new name, the script will create the item for you. You can set estuary_role and estuary_user to whatever you'd like.

Check the All Queries check box, and click Run.

set database_name = 'ESTUARY_DB';
set warehouse_name = 'ESTUARY_WH';
set estuary_role = 'ESTUARY_ROLE';
set estuary_user = 'ESTUARY_USER';
set estuary_schema = 'ESTUARY_SCHEMA';
-- create role and schema for Estuary
create role if not exists identifier($estuary_role);
grant role identifier($estuary_role) to role SYSADMIN;
-- Create snowflake DB
create database if not exists identifier($database_name);
use database identifier($database_name);
create schema if not exists identifier($estuary_schema);
-- create a user for Estuary
create user if not exists identifier($estuary_user)
default_role = $estuary_role
default_warehouse = $warehouse_name;
grant role identifier($estuary_role) to user identifier($estuary_user);
grant all on schema identifier($estuary_schema) to identifier($estuary_role);
-- create a warehouse for estuary
create warehouse if not exists identifier($warehouse_name)
warehouse_size = xsmall
warehouse_type = standard
auto_suspend = 60
auto_resume = true
initially_suspended = true;
-- grant Estuary role access to warehouse
grant USAGE
on warehouse identifier($warehouse_name)
to role identifier($estuary_role);
-- grant Estuary access to database
grant CREATE SCHEMA, MONITOR, USAGE on database identifier($database_name) to role identifier($estuary_role);
-- change role to ACCOUNTADMIN for STORAGE INTEGRATION support to Estuary (only needed for Snowflake on GCP)
use role ACCOUNTADMIN;
grant CREATE INTEGRATION on account to role identifier($estuary_role);
use role sysadmin;
COMMIT;

Key-pair Authentication

As username and password authentication was deprecated in April 2025, you need to authenticate using key-pair authentication, also known as JWT authentication.

To set up your user for key-pair authentication, first generate a key-pair in your shell:

# generate a private key
openssl genrsa 2048 | openssl pkcs8 -topk8 -inform PEM -out rsa_key.p8 -nocrypt
# generate a public key
openssl rsa -in rsa_key.p8 -pubout -out rsa_key.pub
# read the public key and copy it to clipboard
cat rsa_key.pub

-----BEGIN PUBLIC KEY-----
MIIBIj...
-----END PUBLIC KEY-----

Then assign the public key with your Snowflake user using these SQL commands:

ALTER USER $estuary_user SET RSA_PUBLIC_KEY='MIIBIjANBgkqh...'

Verify the public key fingerprint in Snowflake matches the one you have locally:

DESC USER $estuary_user;
SELECT TRIM((SELECT "value" FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()))
  WHERE "property" = 'RSA_PUBLIC_KEY_FP'), 'SHA256:');

Then compare with the local version:

openssl rsa -pubin -in rsa_key.pub -outform DER | openssl dgst -sha256 -binary | openssl enc -base64

Now you can use the generated private key when configuring your Snowflake connector.

tip

Key-pair authentication is required for delta updates bindings to use Snowpipe Streaming.

Configuration

To use this connector, begin with data in one or more Flow collections. Use the below properties to configure a Snowflake materialization, which will direct one or more of your Flow collections to new Snowflake tables.

Properties

Endpoint

Property	Title	Description	Type	Required/Default
`/host`	Host (Account URL)	The Snowflake Host used for the connection. Example: orgname-accountname.snowflakecomputing.com (do not include the protocol).	string	Required
`/database`	Database	Name of the Snowflake database to which to materialize	string	Required
`/schema`	Schema	Database schema for bound collection tables (unless overridden within the binding resource configuration) as well as associated materialization metadata tables	string	Required
`/warehouse`	Warehouse	Name of the data warehouse that contains the database	string
`/role`	Role	Role assigned to the user	string
`/account`	Account	The Snowflake account identifier	string
`/credentials`	Credentials	Credentials for authentication	object	Required
`/credentials/auth_type`	Authentication type	`jwt` is the only supported authentication method currently	string	Required
`/credentials/user`	User	Snowflake username	string	Required
`/credentials/password`	Password	Deprecated	string	Deprecated
`/credentials/privateKey`	Private Key	Required if using jwt authentication	string	Required
`/advanced/disableFieldTruncation`	Disable Field Truncation	Disables truncation of large materialized fields	boolean

Bindings

Property	Title	Description	Type	Required/Default
`/table`	Table	Table name	string	Required
`/schema`	Alternative Schema	Alternative schema for this table	string
`/delta_updates`	Delta updates	Whether to use standard or delta updates	boolean

Sample

Key-pair authentication:

materializations:
  ${PREFIX}/${mat_name}:
    endpoint:
  	    connector:
    	    config:
              database: acmeCo_db
              host: orgname-accountname.snowflakecomputing.com
              schema: acmeCo_flow_schema
              warehouse: acmeCo_warehouse
              credentials:
                auth_type: jwt
                user: snowflake_user
                privateKey: |
                  -----BEGIN PRIVATE KEY-----
                  MIIEv....
                  ...
                  ...
                  ...
                  ...
                  ...
                  -----END PRIVATE KEY-----
    	    image: ghcr.io/estuary/materialize-snowflake:dev
  # If you have multiple collections you need to materialize, add a binding for each one
    # to ensure complete data flow-through
    bindings:
  	- resource:
      	table: ${table_name}
    source: ${PREFIX}/${source_collection}

(DEPRECATED) User and password authentication:

materializations:
  ${PREFIX}/${mat_name}:
    endpoint:
  	    connector:
    	    config:
              database: acmeCo_db
              host: orgname-accountname.snowflakecomputing.com
              schema: acmeCo_flow_schema
              warehouse: acmeCo_warehouse
              credentials:
                auth_type: user_pasword
                user: snowflake_user
                password: secret
    	    image: ghcr.io/estuary/materialize-snowflake:dev
  # If you have multiple collections you need to materialize, add a binding for each one
    # to ensure complete data flow-through
    bindings:
  	- resource:
      	table: ${table_name}
    source: ${PREFIX}/${source_collection}

Sync Schedule

This connector supports configuring a schedule for sync frequency. You can read about how to configure this here.

Snowflake compute is priced per second of activity, with a minimum of 60 seconds. Inactive warehouses don't incur charges. To keep costs down, you'll want to minimize your warehouse's active time.

To accomplish this, we recommend a two-pronged approach:

Configure your Snowflake warehouse to auto-suspend after 60 seconds.

This ensures that after each transaction completes, you'll only be charged for one minute of compute, Snowflake's smallest granularity.

Use a query like the one shown below, being sure to substitute your warehouse name:
```
ALTER WAREHOUSE ESTUARY_WH SET auto_suspend = 60;
```
Configure the materialization's Sync Schedule based on your requirements for data freshness.

Delta updates

This connector supports both standard (merge) and delta updates. The default is to use standard updates.

Enabling delta updates will prevent Flow from querying for documents in your Snowflake table, which can reduce latency and costs for large datasets. If you're certain that all events will have unique keys, enabling delta updates is a simple way to improve performance with no effect on the output. However, enabling delta updates is not suitable for all workflows, as the resulting table in Snowflake won't be fully reduced.

You can enable delta updates on a per-binding basis:

    bindings:
  	- resource:
      	table: ${table_name}
        delta_updates: true
    source: ${PREFIX}/${source_collection}

Performance considerations

Optimizing performance for standard updates

When using standard updates for a large dataset, the collection key you choose can have a significant impact on materialization performance and efficiency.

Snowflake uses micro partitions to physically arrange data within tables. Each micro partition includes metadata, such as the minimum and maximum values for each column. If you choose a collection key that takes advantage of this metadata to help Snowflake prune irrelevant micro partitions, you'll see dramatically better performance.

For example, if you materialize a collection with a key of /user_id, it will tend to perform far worse than a materialization of /date, /user_id. This is because most materializations tend to be roughly chronological over time, and that means that data is written to Snowflake in roughly /date order.

This means that updates of keys /date, /user_id will need to physically read far fewer rows as compared to a key like /user_id, because those rows will tend to live in the same micro-partitions, and Snowflake is able to cheaply prune micro-partitions that aren't relevant to the transaction.

Snowpipe Streaming

Snowpipe Streaming is the lowest-latency method to load data into Snowflake. Snowpipe Streaming is used by default for delta updates bindings. This method of ingress writes rows directly to Snowflake tables and scales compute automatically based on load.

Timestamp Data Type Mapping

The Snowflake materialization connector requires setting an expected timestamp type. These types map to Snowflake's timestamp data types with some caveats for TIMESTAMP_NTZ.

Available options in Estuary include:

TIMESTAMP_LTZ

Stored as a UTC point-in-time. Snowflake performs automatic timezone normalization.
TIMESTAMP_NTZ (discard TZ)

Stored as a wall-clock time without timezone. The source timezone is discarded.

This NTZ variant is recommended for existing tasks that already use TIMESTAMP_NTZ.

The type is written as TIMESTAMP_NTZ_DISCARD when working directly with the materialization specification.
TIMESTAMP_NTZ (normalize to UTC)

Stored as a wall-clock time without timezone. The connector normalizes to UTC prior to storage within Snowflake.

This NTZ variant is recommended for new tasks that use TIMESTAMP_NTZ, as it aligns well with Snowflake's default behavior.

The type is written as TIMESTAMP_NTZ_NORMALIZE when working directly with the materialization specification.
TIMESTAMP_TZ

Stored as a timestamp with associated timezone.

You do not need to explicitly set the TIMESTAMP_TYPE_MAPPING configuration in Snowflake. However, if you do, the value in Snowflake must match the value in Estuary.

Reserved words

Snowflake has a list of reserved words that must be quoted in order to be used as an identifier. Flow automatically quotes fields that are in the reserved words list. You can find this list in Snowflake's documentation here and in the table below.

caution

In Snowflake, objects created with quoted identifiers must always be referenced exactly as created, including the quotes. Otherwise, SQL statements and queries can result in errors. See the Snowflake docs.

Reserved words
account	from	qualify
all	full	regexp
alter	grant	revoke
and	group	right
any	gscluster	rlike
as	having	row
between	ilike	rows
by	in	sample
case	increment	schema
cast	inner	select
check	insert	set
column	intersect	some
connect	into	start
connection	is	table
constraint	issue	tablesample
create	join	then
cross	lateral	to
current	left	trigger
current_date	like	true
current_time	localtime	try_cast
current_timestamp	localtimestamp	union
current_user	minus	unique
database	natural	update
delete	not	using
distinct	null	values
drop	of	view
else	on	when
exists	or	whenever
false	order	where
following	organization	with
for

Prerequisites​

Setup​

Key-pair Authentication​

Configuration​

Properties​

Endpoint​

Bindings​

Sample​

Sync Schedule​

Delta updates​

Performance considerations​

Optimizing performance for standard updates​

Snowpipe Streaming​

Timestamp Data Type Mapping​

Reserved words​