Skip to content

michelp/metagration

Repository files navigation

Tests


Metagration: Logical PostgreSQL Migration

Migrating logically replicated PostgreSQL databases is a delicate dance of applying the right script at the right time, and enduring possible downtime making sure replicas are correctly up to date. Consider the warnings from the documentation:

  • The database schema and DDL commands are not replicated. The initial schema can be copied by hand using pg_dump --schema-only. Subsequent schema changes would need to be kept in sync manually.

Metagration is a PostgreSQL migration tool written in PostgreSQL. Metagration "up/down" scripts are stored procedures and applied in-database by the database. Creating and managing metagrations and actually running them are completely decoupled.

Metagrations can be managed and replicated like any other data in your database using whatever favorite tool you are already familar with. Using tools like pglogical you can then apply metagrations across logically replicated cluster at the exact same point in time in the WAL stream. Metagration keeps track of restore points before all changes so entire clusters can be Point-In-Time-Recovered to the same point in the transcation log, avoiding migration induced conflict errors.

Metagration has support for 100% of PostgreSQL's features, because it is PostgreSQL:

  • Up/Down scripts are stored procedures in any pl language.

  • No external tools, any PostgreSQL client can manage metagrations.

  • Cloud-friendly Trusted Language Extension for PostgreSQL >= 18.

  • One simple function for new SQL scripts.

  • Procedures can be transactional, and transaction aware.

  • Generates Point In Time Recovery restore points before migration.

  • Metagrations can export/import to/from SQL files.

  • Metagrations are just rows so pg_dump/pg_restore them.

  • Can use pgTAP for powerful migration verification.

  • Postgres docker container entrypoint friendly.

Intro

Metagrations are DDL change scripts wrapped in PostgreSQL stored procedures run in a specific order either "up" or "down". A metagration is a script defined entirely within the database, there is no external migration tool or language.

Metagration scripts are what move the database from one revision to the next. Each script has a forward "up" procedure, and optionally a backward "down" procedure to undo the "up" operation. Script procedures can be written in any supported stored procedure language. Metagration strictly enforces the revision order of the scripts applied.

Metagration comes with a simple create function for writing fast up and down scripts in plpgsql, which often look exactly like their SQL counterparts:

# SELECT metagration.new_script(
      'CREATE TABLE public.foo (id bigserial);',
      'DROP TABLE public.foo;'
      );
 new_script
--------
      1

This creates a new script with revision 1. The function metagration.new_script(up[, down]) expands the up and down code into dynamically created plpgsql functions. Once the script is created, it can then be run with metagration.run()

# CALL metagration.run();
# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | foo  | table | postgres

Now add another script with an unfortunate table name to be reverted:

# SELECT metagration.new_script(
    'CREATE TABLE public.bad (id bigserial);',
    'DROP TABLE public.bad;
    );
 new_script
--------
      2
# CALL metagration.run();
# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | foo  | table | postgres
 public | bad  | table | postgres

Now revision 2 can be reverted by calling metagration.run() with a specific target revision, in this case back to 1, and the bad table gets dropped:

postgres=# CALL metagration.run(1);
# \dt
        List of relations
 Schema | Name | Type  |  Owner
--------+------+-------+----------
 public | foo  | table | postgres

The current, previous, and next revisions can be queried:

# SELECT metagration.previous_revision();
 previous_revision
------------------
                 0
# SELECT metagration.current_revision();
 current_revision
-----------------
                1
# SELECT metagration.next_revision();
 next_revision
--------------
             2

Metagrations can also be run with a relative revision parameter passed as a text string:

CALL metagration.run('1');  -- go forward one revision
CALL metagration.run('-3');  -- go back three revisions

A log of all migrations, their start and end revisions, times, and restore points are stored in the metagration.log table:

# SELECT * FROM metagration.log ORDER BY migration_start;
 revision_start | revision_end |        migration_start        |         migration_end         | txid |           restore_point           | restore_point_lsn
----------------+--------------+-------------------------------+-------------------------------+------+-----------------------------------+-------------------
              0 |            1 | 2020-05-13 23:13:02.830335+00 | 2020-05-13 23:13:02.831964+00 |  505 | 0|1|2020-05-13|23:13:02.830335+00 | 0/183F408
              1 |            3 | 2020-05-13 23:13:02.841926+00 | 2020-05-13 23:13:02.8432+00   |  505 | 1|3|2020-05-13|23:13:02.841926+00 | 0/1841A20
              3 |            4 | 2020-05-13 23:13:02.846628+00 | 2020-05-13 23:13:02.847429+00 |  505 | 3|4|2020-05-13|23:13:02.846628+00 | 0/1844730
              4 |            1 | 2020-05-13 23:13:02.848043+00 | 2020-05-13 23:13:02.850642+00 |  505 | 4|1|2020-05-13|23:13:02.848043+00 | 0/18459C0
              1 |            4 | 2020-05-13 23:13:02.852157+00 | 2020-05-13 23:13:02.858205+00 |  505 | 1|4|2020-05-13|23:13:02.852157+00 | 0/1846790

Before each metagration a recovery restore point is created with pg_create_restore_point and can be used for Point In Time Recovery to the point just before the migration and other recovery tasks. The current transaction id is also saved.

Dynamic Metagrations

Metagration scripts are stored procedures, and can be fully dynamic in terms of the SQL they execute when run. To facilitate this, the run() function accepts an optional args jsonb argument that is passed to each script when run. This allows scripts to respond to dynamic variables at run time.

For plpgsql scripts built with new_script, optional local variable declarations can also be provided, in the following example, the index variable i in the FOR loops are declared in the up_declare and down_declare parameters to new_script() shown here:

SELECT new_script(
$up$
    FOR i IN (SELECT * FROM generate_series(1, (args->>'target')::bigint, 1)) LOOP
        EXECUTE format('CREATE TABLE %I (id serial)', 'foo_' || i);
    END LOOP;
$up$,
$down$
    FOR i IN (SELECT * FROM generate_series(1, (args->>'target')::bigint, 1)) LOOP
        EXECUTE format('DROP TABLE %I', 'foo_' || i);
    END LOOP;
$down$,
    up_declare:='i bigint',
    down_declare:='i bigint'
    );

To run, pass an integer value for the target jsonb key in args:

# CALL metagration.run(args:=jsonb_build_object('target', 3));
# \dt+
                      List of relations
 Schema |  Name | Type  |  Owner   |  Size   | Description 
--------+-------+-------+----------+---------+-------------
 public | foo_1 | table | postgres | 0 bytes | 
 public | foo_2 | table | postgres | 0 bytes | 
 public | foo_3 | table | postgres | 0 bytes | 

If your up script depends on args, it's likely your down scripts do too. Pass them as well to revert or, get the args used in the up migration from the migration.log table where they are saved.

# CALL metagration.run('-1', args:=jsonb_build_object('target', 3));
# \dt+
                      List of relations
 Schema |  Name   | Type  |  Owner   |  Size   | Description 
--------+---------+-------+----------+---------+-------------

# SELECT migration_args FROM metagration.log 
  WHERE revision_end = metagration.current_revision() 
  ORDER BY migration_end DESC LIMIT 1;
  
 migration_args 
----------------
 {"target": 3}
(1 row)

Import and Exporting

The obvious question is, if metagrations are stored procedures that makes DDL changes, who CREATEs the metagrations? They can be created programatically as shown above with new_script or by inserting directly into the metagraiton.script table. They can be imported and exported using any PostgreSQL client or admin tool. Because metagrations are in-database, they are dumped and restored when the database is backed up.

You can still check your metagrations into source control and stream them into a new database when you initialize it, then call metagrate.run().

Since this process of creating metagrations is decoupled from the actual act of migration, it can be done using any of the many database management tools for PostgreSQL. Because metagration scripts are stored procedures, they are stateless database objects that can be exported, imported, dropped and re-created as necessary.

A helpful tool for doing this metagration.export(). The export() function will generate SQL script file that CREATE OR REPLACEs the migration scripts and optionally clear and run them. Simply capture the output of this function, for example with:

psql -A -t -U postgres -c 'select metagration.export()' > export_file.sql

And then check it in to your source control. The scripts can then be imported with psql or any other PostgreSQL client:

psql < export_file.sql

This will import all the migrations but not run them, for that you still call metagration.run() or pass run_migrations:=true as shown below.

If metagration.export(replace_scripts:=true) is called the generated script will truncate the script and log tables and re-insert all the exported scripts.

If metagration.export(transactional:=true) the generated script will wrap itself in BEGIN/COMMIT.

If metagration.export(run_migrations:=true) the generated script will run the migrations immediately after inserting them.

Docker Entrypoint

Metagrations can be used easily from the standard postgres docker container entry point directory. The SQL code dumped from export(run_migrations:=true) (see above) can be dropped into the /docker-entrypoint-initdb.d/ directory in the container and the migrations will be inserted and automatically run when the new container is initialized.

Installation

Metagration is distributed as a Trusted Language Extension (TLE) for PostgreSQL 18+.

Prerequisites

  • PostgreSQL 18 or higher
  • pg_tle extension installed

Installation Steps

  1. Build the TLE installer:
make tle

This generates install-tle.sql from the source.

  1. Install pg_tle in your database:
CREATE EXTENSION pg_tle;
  1. Install metagration TLE:
psql your_database -f install-tle.sql
  1. Create the extension:
CREATE EXTENSION metagration;

Uninstallation

DROP EXTENSION metagration CASCADE;
SELECT pgtle.uninstall_extension('metagration');

Development

Run the test suite:

make test

This builds the TLE installer and runs all pgTAP tests in Docker.

Security

Metagration follows PostgreSQL security best practices to protect against SQL injection and privilege escalation attacks.

Built-in Security Features

  1. search_path Protection: All functions and procedures use SET search_path = metagration, pg_catalog, pg_temp to prevent search_path injection attacks.

  2. Schema Validation: The script_schema column is constrained to valid PostgreSQL identifiers using a CHECK constraint, preventing malicious schema injection.

  3. Identifier Quoting: All dynamic SQL uses proper quoting with %I (identifiers) and %L (literals) format specifiers.

  4. SECURITY INVOKER: All procedures explicitly run with the caller's privileges, following the principle of least privilege.

Permission Setup

After installation, configure permissions using the provided procedure:

-- Create a role for migration management
CREATE ROLE migration_admin;

-- Configure recommended permissions
CALL metagration.setup_permissions('migration_admin');

-- Grant the role to trusted users
GRANT migration_admin TO your_user;

This setup:

  • Revokes PUBLIC access to metagration schemas
  • Grants minimal required permissions to the migration role
  • Allows only authorized users to create and run migrations

Trust Requirements

Important: Metagration allows users to execute arbitrary SQL through migration scripts. This is by design - it's the core feature. Therefore:

  • Only grant migration permissions to trusted users
  • Migration scripts execute with the caller's privileges
  • Review migration scripts before running in production
  • Consider using separate roles for different environments (dev/staging/prod)

Recommendations

  1. Use Dedicated Roles: Create environment-specific roles:

    CREATE ROLE migration_dev;
    CREATE ROLE migration_prod;
  2. Audit Migration Scripts: Review all scripts in metagration.script table:

    SELECT revision, comment, up_script, down_script
    FROM metagration.script
    WHERE revision > 0
    ORDER BY revision;
  3. Monitor Migration Activity: Check the log table:

    SELECT * FROM metagration.log
    ORDER BY migration_start DESC
    LIMIT 10;
  4. Test in Staging: Always test migrations in a non-production environment first.

How does it work?

Metagration scripts are stored procedures run in revision order. This means that revision 2 is always run after 1, and before 3 when migrating forward, and the opposite when going backwards. It is not possible to insert "in between" two existing revisions, even if their revisions are not consecutive. A BEFORE INSERT trigger enforces that new scripts must have a revision > max(revision) for all existing scripts. While you can disable this trigger to bulk import revisions you will be responsible for their revision order being correct.

When a script is created with metagration.new_script() the up and down code are substituted into the body dynamically generated plpgsql procedure. You don't have to use new_script(), a script can be written in any supported language that can write stored procedures, such as python and javascript.

One and only one script at a time can be is_current = true. This is enforced with a UNIQUE partial index. The procedure metagration.run() also does a 'LOCK ... SHARE MODE' on the script table when it runs ensuring only one metagration script can run at a time.

About

Metagration: PostgreSQL Migrator in PostgreSQL

Topics

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Contributors 2

  •  
  •