- Overview
- Features
- Requirements
- Architecture
- Installation
- Docker Regression Testing
- Configuration
- Usage
- Functions Reference
- GUC Parameters
- Security Model
- Key Rotation
- Supported Algorithms
- License
column_encrypt is a PostgreSQL extension that provides transparent column-level encryption. It allows database users to use their own keys when inserting data into a table so that data is stored encrypted in designated columns. Users who hold the correct key can read and write encrypted columns without any modification to their SQL queries — encryption and decryption happen transparently at the type input/output level.
- Two custom encrypted data types:
ENCRYPTED_TEXTandENCRYPTED_BYTEA - Two-tier key model: Key Encryption Key (KEK) wraps the Data Encryption Key (DEK), so the DEK is never stored in plaintext
- AES encryption via the
pgcryptoextension - Key version header embedded in every ciphertext for future key rotation tracking
- Version-aware decryption using the ciphertext header and session-loaded keys
- Single-role security model with unified
column_encrypt_userrole - Automatic log masking for sensitive key operations (defense-in-depth)
- Key registry metadata with explicit
pending,active,retired, andrevokedstates - Key rotation support with both full-table and batched re-encryption helpers
- Optional blind-index helpers for scalable equality lookup patterns
- Hash/equality semantics on decrypted plaintext so comparisons remain correct across key rotation
- Session introspection helper to show which key versions are currently loaded in the backend
- Cast support from
bool,inet,cidr,xml, andcharactertoencrypted_text
- PostgreSQL 14 or later (tested with PostgreSQL 14, 15, 16, 17, and 18)
pgcryptoextension (listed as a dependency incolumn_encrypt.controland auto-installed if not already present)- A C compiler and PostgreSQL development headers (
postgresql-develon RPM-based systems,libpq-dev/postgresql-server-dev-*on Debian/Ubuntu)
The extension registers two custom base types (encrypted_text, encrypted_bytea) backed by variable-length bytea storage.
- On
INSERT/UPDATE: the type input function (col_enc_text_in/col_enc_bytea_in) transparently encrypts the supplied plaintext value using the session's currently loaded DEK. - On
SELECT: the type output function (col_enc_text_out/col_enc_bytea_out) transparently decrypts the stored ciphertext using the matching session-loaded key for that ciphertext version. - A 2-byte key version header is prepended to every ciphertext and used during decryption and re-encryption workflows. The header uses an unambiguous format: bit 15 (0x8000) is set as a flag, with the key version (1-32767) stored in the low 15 bits in network byte order. This ensures cross-platform compatibility and eliminates ambiguity when reading ciphertext. Legacy data written before this format is auto-detected and handled via fallback logic.
- Keys are held in
TopMemoryContextas a versioned in-memory keyring and are securely zeroed withsecure_memsetwhen removed from session memory. - An
emit_log_hookmasks known sensitive key-management function calls in log messages (query text, detail, context, internal query) to reduce accidental key leakage. - Administrative and runtime functions are gated by dedicated extension roles instead of
PUBLICexecution. - Binary protocol
SEND/RECEIVEis intentionally rejected so clients cannot bypass the text I/O encryption path. - Equality and hash behavior are defined on decrypted plaintext, not raw ciphertext bytes.
- Range ordering on encrypted values is intentionally unsupported; use companion blind indexes for scalable equality lookups instead.
- Because equality/hash behavior depends on decrypted plaintext and session key availability, blind-index columns are the recommended pattern for indexed equality at scale.
- When upgrading from releases that used ciphertext-based hash semantics, rebuild any existing hash indexes on encrypted columns before relying on them again.
-
Clone the repository:
git clone https://github.com/vibhorkum/column_encrypt.git cd column_encrypt -
Ensure
pg_configis inPATH:export PATH=/usr/pgsql-<version>/bin:$PATH
-
Build and install:
make make install
-
Add the extension to
shared_preload_librariesinpostgresql.conf:shared_preload_libraries = '$libdir/column_encrypt' -
Restart PostgreSQL:
pg_ctl restart -D $PGDATA # or systemctl restart postgresql
-
Create the encrypt schema (required before installing the extension):
CREATE SCHEMA IF NOT EXISTS encrypt;
-
Create the extension in the target database (requires superuser).
pgcryptowill be installed automatically as a dependency if it is not already present:CREATE EXTENSION column_encrypt;
-
Configure search_path to include the
encryptschema. The extension types (encrypted_text,encrypted_bytea) and functions live in theencryptschema. To use unqualified type names, addencryptto your search_path:-- For the current database (recommended) ALTER DATABASE your_database SET search_path TO public, encrypt, pg_catalog; -- Or for a specific role ALTER ROLE your_role SET search_path TO public, encrypt, pg_catalog; -- Or just for the current session SET search_path TO public, encrypt, pg_catalog;
Alternatively, use schema-qualified names:
encrypt.encrypted_text.
Upgrading to v4.0:
v4.0 removes all deprecated functions. You must migrate through v3.3 first:
-- Step 1: Upgrade to v3.3 (deprecation release)
ALTER EXTENSION column_encrypt UPDATE TO '3.3';
-- Step 2: Update your code to use encrypt.* API (see MIGRATION.md)
-- Step 3: Upgrade to v4.0
ALTER EXTENSION column_encrypt UPDATE TO '4.0';See MIGRATION.md for the complete migration guide.
General notes:
- After upgrading from v2.x, rebuild any existing hash indexes on
encrypted_textorencrypted_byteacolumns. - For scalable equality lookups, prefer a companion blind-index column instead of direct encrypted-column hash indexing.
If you do not want to install PostgreSQL development packages on your host, you can build and run the extension tests in Docker instead.
Run the local regression harness against PostgreSQL 18:
./run-docker-regression.shRun against a specific supported version (14, 15, 16, 17, or 18):
./run-docker-regression.sh 14
./run-docker-regression.sh 15
./run-docker-regression.sh 16
./run-docker-regression.sh 17
./run-docker-regression.sh 18What the Docker harness does:
- builds a throwaway Ubuntu-based test image
- installs PostgreSQL server, contrib, and dev packages for the selected version
- runs
makeandmake install - initializes a temporary cluster with
shared_preload_libraries = 'column_encrypt' - creates a test database and runs
make installcheck
Files involved:
docker/Dockerfiledocker/run-regression.shdocker/docker-compose.test.ymlrun-docker-regression.sh
This path mirrors the GitHub Actions CI workflow closely, which makes local failures easier to compare with CI results.
After installation, the following GUC parameters are available (see GUC Parameters for full details):
| Parameter | Default | Description |
|---|---|---|
encrypt.enable |
on |
Enable/disable column encryption (superuser only) |
encrypt.mask_key_log |
on |
Mask known sensitive key-management calls in PostgreSQL logs (superuser only) |
encrypt.mask_query_literals |
off |
Mask string literals in PostgreSQL log messages (superuser only) |
encrypt.key_version |
1 |
Key version written into ciphertext header (superuser only) |
Connect as a user with column_encrypt_user role and register your Data Encryption Key (DEK) wrapped by a master passphrase (KEK). Log masking is automatic:
-- Log masking is automatic, no ceremony needed
SELECT encrypt.register_key('my-secret-data-key', 'my-master-passphrase');In each new session, load the DEK into session memory before accessing encrypted columns:
SELECT encrypt.load_key('my-master-passphrase');By default, INSERT/UPDATE statements with plaintext values will appear in PostgreSQL logs. To prevent sensitive data leakage, enable literal masking:
-- Enable for the session (requires superuser)
SET encrypt.mask_query_literals = on;
-- Or enable globally in postgresql.conf (recommended):
-- encrypt.mask_query_literals = onWhen enabled, INSERT INTO t(ssn) VALUES('123-45-6789') appears in logs as INSERT INTO t(ssn) VALUES('***').
Note: This masks ALL string literals in logs, which reduces observability. For production systems handling sensitive data, this trade-off is usually acceptable.
CREATE TABLE secure_data(id SERIAL, ssn ENCRYPTED_TEXT);
INSERT INTO secure_data(ssn) VALUES('888-999-2045');
INSERT INTO secure_data(ssn) VALUES('888-999-2046');
INSERT INTO secure_data(ssn) VALUES('888-999-2047');test=# SELECT * FROM secure_data;
id | ssn
----+--------------
1 | 888-999-2045
2 | 888-999-2046
3 | 888-999-2047
(3 rows)test=# SELECT * FROM secure_data;
ERROR: cannot decrypt data, because key was not setThis is the expected result — without loading the key, the ciphertext cannot be decrypted.
test=# SELECT encrypt.load_key('wrong-passphrase');
ERROR: incorrect passphraseThis is also expected — an incorrect passphrase is rejected (SQLSTATE 28P01).
All encryption functions are in the encrypt schema:
| Function | Returns | Description |
|---|---|---|
encrypt.register_key(dek text, passphrase text, activate boolean DEFAULT true) |
integer |
Wraps the DEK with the passphrase using AES-256/S2K and stores it. Returns the assigned key_id. |
encrypt.load_key(passphrase text, all_versions boolean DEFAULT false) |
boolean |
Loads key(s) into session memory. Use all_versions => true during rotation. |
encrypt.unload_key() |
void |
Securely removes all loaded keys from session memory. |
encrypt.activate_key(key_id integer) |
boolean |
Sets a key as active for new encryptions (retires previous active key). |
encrypt.revoke_key(key_id integer) |
boolean |
Prevents a key from being loaded. |
encrypt.rotate(schema text, table text, column text, batch_size integer DEFAULT 10000) |
bigint |
Re-encrypts entire column with the active key. batch_size controls internal UPDATE chunk size. Returns total rows processed. |
encrypt.verify(schema text, table text, column text, sample_size integer DEFAULT 100) |
setof record |
Verifies encryption integrity by sampling rows. |
encrypt.keys() |
setof record |
Lists all registered keys with state and metadata. |
encrypt.status() |
record |
Returns quick status: key_loaded, active_key_version, encrypted_column_count. |
encrypt.blind_index(value text, hmac_key text) |
text |
Returns a SHA-256 HMAC blind index for searchable lookups. |
encrypt.loaded_cipher_key_versions() |
integer[] |
Returns key versions currently loaded in session (metadata only). |
All parameters require superuser (PGC_SUSET) to change.
| Parameter | Type | Default | Range | Description |
|---|---|---|---|---|
encrypt.enable |
bool |
on |
— | Enables or disables column encryption globally. This is independent from the log masking helper functions. |
encrypt.mask_key_log |
bool |
on |
— | When enabled, masks known sensitive key-management function calls, including case-variant and schema-qualified forms, in PostgreSQL log messages as a defense-in-depth control. |
encrypt.mask_query_literals |
bool |
off |
— | When enabled, masks string literals in PostgreSQL log messages including single-quoted ('value' → '***'), empty-tag dollar-quoted ($$value$$ → $$***$$), and tagged dollar-quoted ($tag$value$tag$ → $tag$***$tag$) strings. Useful for environments where query logs must not contain any sensitive data. |
encrypt.key_version |
int |
1 |
1–32767 |
Key version number written into the 2-byte ciphertext header. Increment this when rotating keys to track which version encrypted each value. |
cipher_key_tablestores versioned wrapped keys, activation metadata, and timestamps. Direct table access is revoked fromPUBLIC.- All
encrypt.*functions useSECURITY DEFINERand are revoked fromPUBLIC. column_encrypt_useris the single role for all encryption operations (key management, loading, rotation).encrypt.enableisPGC_SUSET— only superusers can enable or disable encryption. This prevents unprivileged users from bypassing encryption by toggling the GUC.- Encryption keys stored in C session memory are zeroed with
secure_memsetwhen removed, preventing key material from lingering in process memory. - The
emit_log_hookautomatically masks known sensitive key-management function calls in PostgreSQL logs. This is a defense-in-depth measure, not a substitute for cautious operational handling of passphrases. loaded_cipher_key_versions()reveals version metadata only, not DEKs or passphrases.
Example grants:
-- Grant encryption privileges to application roles
GRANT column_encrypt_user TO app_user;
GRANT column_encrypt_user TO key_manager;| Key | Role | Storage |
|---|---|---|
| Data Encryption Key (DEK) | Encrypts/decrypts column data | Stored wrapped (encrypted) in cipher_key_table; loaded into session memory on encrypt.load_key() |
| Key Encryption Key (KEK) / Master Passphrase | Wraps the DEK before storage | Never stored in the database; managed externally |
encrypt.register_key() encrypts the DEK with the passphrase using AES-256 with iterated-salted S2K (cipher-algo=aes256, s2k-mode=3) via pgcrypto's pgp_sym_encrypt. encrypt.load_key() reverses this with pgp_sym_decrypt.
To rotate the encryption key, follow these steps:
-- Step 1: Register a new key version (inactive by default)
SELECT encrypt.register_key('new-data-key', 'my-master-passphrase', false);
-- Step 2: Load all key versions for rotation
SELECT encrypt.load_key('my-master-passphrase', all_versions => true);
-- Step 3: Activate the new version and re-encrypt
-- (activate_key sets encrypt.key_version internally)
SELECT encrypt.activate_key(2);
SELECT encrypt.rotate('public', 'secure_data', 'ssn');
-- For large tables, use a smaller batch_size to reduce per-UPDATE lock duration.
-- Note: rotate() processes the entire column in one call, using batch_size as
-- the internal chunk size for UPDATE operations. Returns total rows processed.
SELECT encrypt.rotate('public', 'secure_data', 'ssn', 5000);
-- Step 4: Clear the session keyring
SELECT encrypt.unload_key();After rotation, all new sessions must call encrypt.load_key('my-master-passphrase') to use the active key version.
column_encrypt replicates ciphertext, not plaintext. That means:
- subscribers must have the extension installed
- subscribers must manage and load keys independently
- wrapped keys in
cipher_key_tableare not replicated as usable session keys - replication/apply roles should run with
encrypt.enable = offso replication transports ciphertext rather than decrypted plaintext
For local end-to-end testing, use the Docker-based logical replication harness to exercise publisher/subscriber behavior.
For scalable equality lookups, prefer a companion blind-index column over direct comparisons on encrypted columns.
Do not rely on <, <=, >, or >= semantics for encrypted values; plaintext range ordering is intentionally unsupported.
Example:
ALTER TABLE secure_data ADD COLUMN ssn_blind_index text;
UPDATE secure_data
SET ssn_blind_index = encrypt.blind_index('888-999-2045', 'blind-index-secret');The blind-index key should be managed separately from the DEK/KEK used for encryption.
See Logical Replication above for the conceptual model; this section documents the Docker-based integration harness that validates it end to end.
Recommended pattern:
- Use a dedicated publisher-side replication user with
ALTER ROLE ... SET encrypt.enable = offso logical decoding emits raw ciphertext instead of decrypted plaintext. - Use a dedicated subscriber-side subscription owner/apply-worker role with
ALTER ROLE ... SET encrypt.enable = offso replicated ciphertext is stored directly without requiring a backend-local loaded key. - Keep normal application sessions on the default
encrypt.enable = onpath and load keys only in roles that were explicitly grantedcolumn_encrypt_user. - Load the key only in interactive/application sessions that need to read decrypted values; replication workers should replicate ciphertext, not plaintext.
Run the local integration harness:
./run-docker-logical-replication.sh 18Files involved:
docker/docker-compose.replication.ymldocker/run-logical-replication.shrun-docker-logical-replication.sh
The extension uses AES encryption via pgcrypto.
| Algorithm | Description |
|---|---|
| AES-256 | Advanced Encryption Standard with 256-bit key |
The DEK is wrapped using AES-256/S2K (iterated-salted string-to-key) via pgp_sym_encrypt.
The v4.0 API uses standard PostgreSQL error messages with SQLSTATE codes:
| SQLSTATE | Condition Name | Message | Cause |
|---|---|---|---|
22023 |
invalid_parameter_value |
encryption key cannot be null or empty | DEK is NULL or empty |
22023 |
invalid_parameter_value |
encryption key must be at least 16 bytes | DEK is too short (use 32 bytes for AES-256) |
22023 |
invalid_parameter_value |
passphrase cannot be null or empty | Passphrase is NULL or empty |
28P01 |
invalid_password |
incorrect passphrase | Master passphrase failed to decrypt wrapped key |
28P01 |
invalid_password |
failed to decrypt key version N | Passphrase incorrect for specific key version |
42602 |
invalid_name |
invalid identifier | Schema/table/column name contains invalid characters |
42703 |
undefined_column |
column not found | Target column does not exist |
42809 |
wrong_object_type |
not an encrypted column | Target is not encrypted_text or encrypted_bytea |
0A000 |
feature_not_supported |
encryption must be enabled | Operation attempted with encrypt.enable = off |
22000 |
data_exception |
cannot activate expired key | Key's expires_at is in the past |
54000 |
program_limit_exceeded |
maximum key version (32767) exceeded | Key version exceeds ciphertext header limit |
Note: Legacy error codes (EDB-ENC*) from the C layer may still appear for type I/O errors (e.g., "cannot decrypt data, because key was not set").
This extension is licensed under the PostgreSQL License, a permissive open-source license similar to the BSD 2-Clause license.