Amazon RDS for MariaDB
This is a change data capture (CDC) connector that captures change events from a MariaDB database via the Binary Log. It's derived from the MySQL capture connector, so the same configuration applies, but the setup steps look somewhat different.
This connector is available for use in the Flow web application. For local development or open-source workflows, ghcr.io/estuary/source-mariadb:dev provides the latest version of the connector as a Docker image. You can also follow the link in your browser to see past image versions.
Prerequisites
To use this connector, you'll need a MariaDB database set up with the following.
- The binlog_format system variable must be set to ROW.
- The binary log retention period should be set to 168 hours (the maximum allowed by RDS).
  - This value may be set lower if necessary, but we discourage doing so as this may increase the likelihood of unrecoverable failures.
- A database user with appropriate permissions:
  - REPLICATION CLIENT and REPLICATION SLAVE privileges.
  - Permission to read the tables being captured.
  - Permission to read from information_schema tables, if automatic discovery is used.
- If the table(s) to be captured include columns of type DATETIME, the time_zone system variable must be set to an IANA zone name or numerical offset, or the capture configured with a timezone to use by default.
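As a quick way to check these prerequisites, the following read-only statements can be run from any MariaDB client. This is a minimal sketch: the column aliases are illustrative, and the list of excluded system schemas is an assumption you may need to adjust for your instance.

```sql
-- Expect Value = ROW.
SHOW GLOBAL VARIABLES LIKE 'binlog_format';

-- Shows the current time zone settings. The MariaDB default, SYSTEM, is
-- neither an IANA zone name nor a numerical offset.
SELECT @@GLOBAL.time_zone AS global_time_zone,
       @@SESSION.time_zone AS session_time_zone;

-- Lists DATETIME columns, which are the ones affected by the time zone
-- requirement. The excluded schema names are an assumption.
SELECT table_schema, table_name, column_name
FROM information_schema.columns
WHERE data_type = 'datetime'
  AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema', 'sys');
```

If the last query returns no rows for the tables you plan to capture, the time zone requirement does not apply to them.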
Setup
- Allow connections to the database from the Estuary Flow IP address.
  - Modify the database, setting Public accessibility to Yes.
  - Edit the VPC security group associated with your database, or create a new VPC security group and associate it with the database. Refer to the steps in the Amazon documentation. Create a new inbound rule and a new outbound rule that allow all traffic from the Estuary Flow IP addresses.
  Info: Alternatively, you can allow secure connections via SSH tunneling. To do so:
  - Follow the guide to configure an SSH server for tunneling.
  - When you configure your connector as described in the Configuration section, include the additional networkTunnel configuration to enable the SSH tunnel. See Connecting to endpoints on secure networks for additional details and a sample.
- Create an RDS parameter group to enable replication in MariaDB.
  - Create a parameter group. Create a unique name and description and set the following properties:
    - Family: mariadb10.6
    - Type: DB Parameter group
  - Modify the new parameter group and update the following parameters:
    - binlog_format: ROW
  - Associate the parameter group with the database and set Backup Retention Period to 7 days. Reboot the database to allow the changes to take effect.
- Switch to your MariaDB client. Run the following commands to create a new user for the capture with appropriate permissions:
CREATE USER IF NOT EXISTS flow_capture IDENTIFIED BY 'secret';
GRANT REPLICATION CLIENT, REPLICATION SLAVE ON *.* TO 'flow_capture';
GRANT SELECT ON *.* TO 'flow_capture';
- Run the following command to set the binary log retention to 7 days, the maximum value which RDS MariaDB permits:
CALL mysql.rds_set_configuration('binlog retention hours', 168);
- In the RDS console, note the instance's Endpoint and Port. You'll need these for the address property when you configure the connector.
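Once the setup steps are complete, a few read-only statements can help confirm that the changes took effect. This is a sketch rather than a required part of the setup. It assumes the user was created as flow_capture with the default host, as in the commands above.

```sql
-- Confirm the capture user's privileges. Newer MariaDB versions may
-- display REPLICATION CLIENT under the name BINLOG MONITOR.
SHOW GRANTS FOR 'flow_capture';

-- Confirm the retention setting. Look for the 'binlog retention hours'
-- row and check that its value is 168.
CALL mysql.rds_show_configuration;

-- Confirm that binary logging is active by listing the current binlog files.
SHOW BINARY LOGS;
```

If SHOW BINARY LOGS reports that binary logging is not in use, check that the backup retention period is set and that the instance was rebooted after the parameter group was associated.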
Capturing from Read Replicas
This connector supports capturing from a read replica of your database, provided that binary logging is enabled on the replica and all other requirements are met. To create a read replica:
- Follow RDS instructions to create a read replica of your MariaDB database.
- Modify the replica and set the following:
  - DB parameter group: the parameter group you created previously
  - Backup retention period: 7 days
  - Public access: Publicly accessible
- Reboot the replica to allow the changes to take effect.
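After the reboot, you can check from a client connected to the replica that binary logging is enabled there. The statement below is a sketch. Including log_slave_updates is an assumption on our part: it is the MariaDB variable that controls whether replicated changes are written to the replica's own binary log, but this page does not list it as a requirement.

```sql
-- Run while connected to the read replica's endpoint, not the primary.
-- Expect log_bin = ON and binlog_format = ROW.
SHOW GLOBAL VARIABLES
WHERE Variable_name IN ('log_bin', 'binlog_format', 'log_slave_updates');
```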
Backfills and performance considerations
When a MariaDB capture is initiated, by default, the connector first backfills, or captures the targeted tables in their current state. It then transitions to capturing change events on an ongoing basis.
This is desirable in most cases, as it ensures that a complete view of your tables is captured into Flow. However, you may find it appropriate to skip the backfill, especially for extremely large tables.
In this case, you may turn off backfilling on a per-table basis. See properties for details.
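To help decide which tables are large enough to make skipping the backfill worth considering, you can ask MariaDB for approximate table sizes. This is a rough sketch: the schema name mydb is a placeholder, and for InnoDB tables the table_rows value is an estimate rather than an exact count.

```sql
-- Approximate row counts and on-disk size per table, largest first.
-- Replace 'mydb' (a placeholder) with the schema you plan to capture.
SELECT table_name,
       table_rows AS approx_rows,
       ROUND((data_length + index_length) / 1024 / 1024) AS approx_size_mb
FROM information_schema.tables
WHERE table_schema = 'mydb'
  AND table_type = 'BASE TABLE'
ORDER BY (data_length + index_length) DESC;
```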
Configuration
You configure connectors either in the Flow web app, or by directly editing the catalog specification file. See connectors to learn more about using connectors. The values and specification sample below provide configuration details specific to the MariaDB source connector.