Python interface for PFC-JSONL — high-performance compression for structured log files (JSONL), with block-level timestamp filtering.
pip install pfc-jsonl
Requires the pfc_jsonl binary. Install it separately (see below).
PFC-JSONL produces files 25–37% smaller than gzip or zstd on typical JSONL log data. It stores a timestamp index alongside each file, which enables fast time-range queries without full decompression.
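To illustrate why a per-block timestamp index avoids full decompression, here is a minimal sketch. It assumes the index records a minimum and maximum timestamp per block; the `BlockRange` type and `blocks_overlapping` function are hypothetical and do not reflect the actual .pfc index layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRange:
    """Hypothetical index entry: one block and the time span it covers."""
    index: int
    min_ts: int  # earliest timestamp in the block (Unix epoch seconds)
    max_ts: int  # latest timestamp in the block (Unix epoch seconds)


def blocks_overlapping(index, from_ts, to_ts):
    """Return the indices of blocks whose time span overlaps [from_ts, to_ts]."""
    return [b.index for b in index if b.max_ts >= from_ts and b.min_ts <= to_ts]


# Four blocks covering consecutive time spans.
index = [
    BlockRange(0, 1000, 1999),
    BlockRange(1, 2000, 2999),
    BlockRange(2, 3000, 3999),
    BlockRange(3, 4000, 4999),
]

# Only blocks 1 and 2 can contain records in the requested range,
# so the other two never need to be decompressed.
print(blocks_overlapping(index, 2500, 3200))  # [1, 2]
```

A list of block indices like this is the kind of input that a block-level operation such as seek_blocks takes.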
| Operation | Description |
|---|---|
| compress | JSONL → .pfc (with timestamp index) |
| decompress | .pfc → JSONL |
| query | Decompress only blocks matching a time range |
| seek_blocks | Decompress specific blocks by index (DuckDB primitive) |
import pfc
# Compress
pfc.compress("logs/app.jsonl", "logs/app.pfc")
# Decompress
pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")
# Query by time range — only decompresses matching blocks
pfc.query("logs/app.pfc",
          from_ts="2026-01-15T08:00:00",
          to_ts="2026-01-15T09:00:00",
          output_path="logs/morning.jsonl")

The Python package is a thin wrapper; the compression engine is the pfc_jsonl binary.
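For readers unfamiliar with the pattern, a thin wrapper around a command-line binary can be sketched roughly as follows. This is an illustration under assumptions, not the package's source: `find_binary`, `run_binary`, and `BinaryError` are hypothetical names, and the lookup order (the PFC_BINARY environment variable first, then PATH) is a guess based on the install notes in this README.

```python
import os
import shutil
import subprocess
import sys


class BinaryError(Exception):
    """Hypothetical error carrying the exit code and stderr of a failed run."""

    def __init__(self, returncode, stderr):
        super().__init__(f"binary exited with status {returncode}")
        self.returncode = returncode
        self.stderr = stderr


def find_binary(name="pfc_jsonl", env_var="PFC_BINARY"):
    """Prefer an explicit path from the environment, then fall back to PATH."""
    explicit = os.environ.get(env_var)
    if explicit:
        return explicit
    found = shutil.which(name)
    if found is None:
        raise FileNotFoundError(f"{name} not found; set {env_var} or add it to PATH")
    return found


def run_binary(binary, args):
    """Run the binary with the given arguments and return its stdout as text."""
    result = subprocess.run([binary, *args], capture_output=True, text=True)
    if result.returncode != 0:
        raise BinaryError(result.returncode, result.stderr)
    return result.stdout


# The Python interpreter stands in for the real binary here, because the
# pfc_jsonl command-line arguments are not documented in this README.
print(run_binary(sys.executable, ["-c", "print('ok')"]))
```

Keeping the exit code and stderr on the exception lets callers report exactly what the underlying process said when it failed.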
Linux (x64):
curl -L https://github.com/ImpossibleForge/pfc-jsonl/releases/latest/download/pfc_jsonl-linux-x64 \
  -o pfc_jsonl && chmod +x pfc_jsonl && sudo mv pfc_jsonl /usr/local/bin/

macOS: Coming soon.
Windows: No native binary available. Use WSL2 or a Linux machine.
Custom location: Set the PFC_BINARY environment variable:
export PFC_BINARY=/opt/tools/pfc_jsonl

Verify:
pfc_jsonl --help

pfc.compress(input_path, output_path, *, level="default", block_size_mb=None, workers=None, verbose=False)
Compress a JSONL file to PFC format.
pfc.compress("logs/app.jsonl", "logs/app.pfc")
pfc.compress("big.jsonl", "big.pfc", level="max", workers=4)

| Parameter | Default | Description |
|---|---|---|
| level | "default" | "fast", "default", or "max" (also accepts 1-5) |
| block_size_mb | auto | Block size in MiB (power of 2, e.g. 16, 32) |
| workers | auto | Parallel compression workers |
| verbose | False | Print progress from binary |
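The constraints in the table can be checked before calling the wrapper. The helper below is a hypothetical sketch, not part of the pfc package, and it assumes the numeric levels 1-5 are passed as Python integers.

```python
VALID_LEVEL_NAMES = {"fast", "default", "max"}


def check_compress_options(level="default", block_size_mb=None):
    """Raise ValueError if the options fall outside the documented ranges."""
    if isinstance(level, str):
        if level not in VALID_LEVEL_NAMES:
            raise ValueError(f"unknown level name: {level!r}")
    elif isinstance(level, int) and not isinstance(level, bool):
        if not 1 <= level <= 5:
            raise ValueError(f"numeric level must be 1-5, got {level}")
    else:
        raise ValueError(f"level must be a name or an integer, got {level!r}")

    if block_size_mb is not None:
        # A positive integer is a power of two exactly when it has one set bit.
        is_power_of_two = (
            isinstance(block_size_mb, int)
            and block_size_mb >= 1
            and block_size_mb & (block_size_mb - 1) == 0
        )
        if not is_power_of_two:
            raise ValueError(f"block_size_mb must be a power of 2, got {block_size_mb!r}")


check_compress_options(level="max", block_size_mb=16)  # passes silently
check_compress_options(level=3)                        # passes silently
```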
pfc.decompress restores a PFC file to JSONL.
pfc.decompress("logs/app.pfc", "logs/app_restored.jsonl")

pfc.query decompresses only the blocks matching a timestamp range.
pfc.query("logs/app.pfc",
          from_ts="2026-01-15T08:00:00",
          to_ts="2026-01-15T09:00:00",
          output_path="logs/morning.jsonl")

Timestamps can be ISO 8601 strings or Unix epoch integers (as strings).
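Both accepted forms can be produced from a Python datetime. The helpers below are a hypothetical sketch; they assume that offset-less ISO 8601 strings are interpreted as UTC, which this README does not state.

```python
from datetime import datetime, timezone


def to_iso(dt):
    """Format an aware datetime as offset-less ISO 8601 in UTC."""
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S")


def to_epoch_string(dt):
    """Format an aware datetime as Unix epoch seconds, as a string."""
    return str(int(dt.timestamp()))


start = datetime(2026, 1, 15, 8, 0, 0, tzinfo=timezone.utc)
print(to_iso(start))           # 2026-01-15T08:00:00
print(to_epoch_string(start))  # 1768464000
```

Either string could then be passed as from_ts or to_ts.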
pfc.seek_blocks decompresses specific blocks by index. It is used internally by the DuckDB extension.
pfc.seek_blocks("logs/app.pfc", [0, 3, 7], "logs/selected.jsonl")

pfc.get_binary returns the path to the pfc_jsonl binary being used.
print(pfc.get_binary())  # /usr/local/bin/pfc_jsonl

import pfc
from pfc import PFCError

try:
    pfc.compress("missing.jsonl", "out.pfc")
except FileNotFoundError as e:
    print(f"Binary not found: {e}")
except PFCError as e:
    print(f"Compression failed (exit {e.returncode}): {e.stderr}")

Use pfc-fluentbit to receive logs from Fluent Bit and compress them automatically.
Use the pfc DuckDB extension to query .pfc files directly with SQL:
Status: submitted, pending review (PR #1679).
-- Once available in DuckDB community extensions:
INSTALL pfc FROM community;
LOAD pfc;
LOAD json;
SELECT line->>'$.level' AS level, line->>'$.message' AS msg
FROM read_pfc_jsonl('logs/app.pfc')
WHERE line->>'$.level' = 'ERROR';

License: MIT (see LICENSE)
The PFC-JSONL binary is proprietary software — free for personal and open-source use. Commercial use requires a license: impossibleforge@gmail.com