Script used at SneakersAPI.dev to replicate data from ClickHouse to PostgreSQL, driven by a YAML configuration file.
Key features:
- Replicates data from ClickHouse to PostgreSQL.
- Manages primary keys, indexes, and destination column types.
- Time-series data can be synced incrementally via a cursor to avoid full table scans.
- Batch processing uses temporary tables on a separate thread and connection (see the sketch after this list).
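The temp-table batch path can be illustrated with a minimal Go sketch. This is an assumption about the general pattern, not the script's actual code: the `products` table, its columns, and the connection string are made up, and the real script streams rows from ClickHouse and runs this work on its own connection.

```go
package main

import (
	"database/sql"
	"fmt"

	_ "github.com/lib/pq" // PostgreSQL driver (any database/sql-compatible driver works)
)

// upsertBatch sketches the temp-table strategy: load a batch into a temporary
// table, then merge it into the destination with INSERT ... ON CONFLICT.
// The products table and its columns are hypothetical.
func upsertBatch(db *sql.DB, batch [][2]any) error {
	tx, err := db.Begin()
	if err != nil {
		return err
	}
	defer tx.Rollback()

	// The temporary table mirrors the destination schema and is dropped on commit.
	if _, err := tx.Exec(
		`CREATE TEMP TABLE products_tmp (LIKE products INCLUDING DEFAULTS) ON COMMIT DROP`,
	); err != nil {
		return err
	}

	// Load the batch. COPY would be faster for 50k-row batches; plain INSERTs
	// keep the sketch short.
	for _, row := range batch {
		if _, err := tx.Exec(
			`INSERT INTO products_tmp (id, name) VALUES ($1, $2)`, row[0], row[1],
		); err != nil {
			return err
		}
	}

	// Merge into the destination, updating rows whose primary key already exists.
	if _, err := tx.Exec(`
		INSERT INTO products (id, name)
		SELECT id, name FROM products_tmp
		ON CONFLICT (id) DO UPDATE SET name = EXCLUDED.name`,
	); err != nil {
		return err
	}
	return tx.Commit()
}

func main() {
	db, err := sql.Open("postgres", "postgres://user:pass@localhost/db?sslmode=disable")
	if err != nil {
		panic(err)
	}
	defer db.Close()
	fmt.Println(upsertBatch(db, [][2]any{{1, "air-max-90"}, {2, "dunk-low"}}))
}
```

Merging through ON CONFLICT on the configured primary key is what makes each batch an upsert; the cursor feature presumably narrows the source query so only rows newer than the last synced value are fetched.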
About performance:
Measured from table creation to the last upsert, with a batch size of 50k rows:
- 800k rows with 5 columns: around 10s, 80k rows/s
- 170k rows with 18 columns: around 5s, 34k rows/s
Note: This tool might not be the best fit for high data volumes; we have only tested it with fewer than 10 million rows.
Configuration is done via a YAML file. See config.example.yml for reference.
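For orientation only, here is a rough Go sketch of how such a YAML configuration could be modelled and loaded with gopkg.in/yaml.v3. Every key name below (tables, primary_key, cursor, and so on) is an assumption; config.example.yml is the authoritative reference.

```go
package main

import (
	"fmt"
	"os"

	"gopkg.in/yaml.v3"
)

// Config mirrors the kind of settings the README describes (primary keys,
// indexes, destination column types, optional cursor). Field and key names
// here are illustrative assumptions, not the actual schema.
type Config struct {
	Tables []struct {
		Name       string            `yaml:"name"`
		Query      string            `yaml:"query"`       // ClickHouse source query (assumed key)
		PrimaryKey []string          `yaml:"primary_key"` // assumed key
		Indexes    []string          `yaml:"indexes"`     // assumed key
		Columns    map[string]string `yaml:"columns"`     // destination type overrides (assumed key)
		Cursor     string            `yaml:"cursor"`      // column used for incremental sync (assumed key)
	} `yaml:"tables"`
}

func loadConfig(path string) (*Config, error) {
	raw, err := os.ReadFile(path)
	if err != nil {
		return nil, err
	}
	var cfg Config
	if err := yaml.Unmarshal(raw, &cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	cfg, err := loadConfig("config.yml")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%d table(s) configured\n", len(cfg.Tables))
}
```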
```bash
export CLICKHOUSE_DSN=<clickhouse_dsn>
export DATABASE_URL=<database_url>

go run . [-only=<table_name>] [-drop=<table_name>] [-config=<path>]
```

Flags:
- `-only=<table_name>`: Run only the specified table instead of all tables.
- `-drop=<table_name>`: Drop the table after processing and reset its cursor, if any.
- `-config=<path>`: Path to the configuration file. Defaults to `config.yml`.
```bash
docker build -t replication .

docker run -e CLICKHOUSE_DSN=<clickhouse_dsn> \
  -e DATABASE_URL=<database_url> \
  replication \
  [-only=<table_name>] \
  [-config=<path>]
```