A minimal demo project showing FlameIQ catching a real performance regression in a Python library.
- Benchmarks TextCraft — a simple text processing library
- Sets a baseline with the fast, correct implementation
- Introduces a regression (careless refactor that recompiles regex on every call)
- Runs FlameIQ to catch the regression automatically
- Generates an HTML report showing exactly which metrics degraded
```bash
# Clone the demo
git clone https://github.com/flameiq/demo-flameiq
cd demo-flameiq

# Install FlameIQ from PyPI
pip install flameiq-core

# Verify install
flameiq --version
```

Initialise FlameIQ in the project:

```bash
flameiq init
```

This creates `.flameiq/` and `flameiq.yaml` in the current directory.
(A pre-configured flameiq.yaml is already included in this repo.)
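The exact configuration schema is defined by FlameIQ itself (see the bundled `flameiq.yaml` for the real thing). As a rough illustrative sketch only — the key names below are assumptions, not the documented schema — a config might declare per-metric thresholds like the ones that show up later in the compare output:

```yaml
# Hypothetical sketch — key names are assumptions, not FlameIQ's
# documented schema; consult the flameiq.yaml shipped in this repo.
baseline:
  branch: main
thresholds:
  latency.mean: 10.0   # percent, matches the ±10.0% in the compare table
  latency.p99: 15.0    # looser bound for the noisier tail percentile
  memory_mb: 8.0
  throughput: 10.0
```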
```bash
python benchmarks/run_benchmark.py --output metrics_baseline.json
```

You will see output like:

```text
Running benchmarks...
  → clean()
  → word_frequency()
  → summarise()
✓ Metrics written to metrics_baseline.json
  commit: abc1234
  latency p95: 2.45 ms
  throughput: 412.3 calls/sec
```
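The benchmark harness itself lives in `benchmarks/run_benchmark.py`; as a minimal sketch (not the demo's actual code), metrics like these — latency percentiles in milliseconds and throughput in calls per second — can be derived from raw per-call timings with nothing but the standard library:

```python
import time
import statistics


def time_calls(fn, arg, n=1000):
    """Time n calls of fn and summarise them the way a benchmark
    harness might: latency percentiles (ms) and throughput (calls/sec)."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        fn(arg)
        samples.append((time.perf_counter() - start) * 1000.0)  # ms
    # quantiles(n=100) returns the 1st..99th percentile cut points
    q = statistics.quantiles(samples, n=100)
    return {
        "latency.mean": statistics.fmean(samples),
        "latency.p50": q[49],
        "latency.p95": q[94],
        "latency.p99": q[98],
        "throughput": n / (sum(samples) / 1000.0),  # calls per second
    }


metrics = time_calls(len, "hello world" * 100)
print(sorted(metrics))
```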
Store these metrics as the baseline:

```bash
flameiq baseline set --metrics metrics_baseline.json
```

```text
✓ Baseline set
  Commit: abc1234
  Branch: main
  Metrics: 7 value(s) stored
```

You can inspect the stored baseline at any time:

```bash
flameiq baseline show
```

A careless developer refactors `clean()` and accidentally recompiles the regex on every call. Run the regressed benchmark:
```bash
python benchmarks/run_benchmark_regressed.py --output metrics_regressed.json
```

Compare the new metrics against the baseline:

```bash
flameiq compare --metrics metrics_regressed.json --fail-on-regression
```

FlameIQ will output something like:
```text
Metric          Baseline    Current     Change    Threshold    Status
──────────────────────────────────────────────────────────────────────
latency.mean      2.1200     3.8900    +83.49%      ±10.0%    REGRESSION
latency.p50       2.0500     3.7200    +81.46%      ±10.0%    REGRESSION
latency.p95       2.4500     4.5100    +84.08%      ±10.0%    REGRESSION
latency.p99       2.8900     5.2300    +80.97%      ±15.0%    REGRESSION
memory_mb         0.0001     0.0001     +0.00%       ±8.0%    PASS
throughput      412.3000   231.5000    -43.84%      ±10.0%    REGRESSION

✗ REGRESSION — 5 metric(s) exceeded threshold.
```
The exit code is 1, which fails the CI pipeline.
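The Change column is the percentage delta from baseline, flagged when its magnitude exceeds the metric's threshold. A minimal sketch of that comparison logic — not FlameIQ's actual implementation — looks like this:

```python
def pct_change(baseline, current):
    """Percentage change from baseline to current."""
    return (current - baseline) / baseline * 100.0


def check(baseline, current, threshold_pct):
    """Return (change, status) the way the compare table reports it.
    abs() flags deviations in either direction, so a throughput *drop*
    is caught just like a latency *rise*."""
    change = pct_change(baseline, current)
    status = "REGRESSION" if abs(change) > threshold_pct else "PASS"
    return round(change, 2), status


# latency.mean from the table: 2.12 ms -> 3.89 ms against a ±10% threshold
print(check(2.12, 3.89, 10.0))  # (83.49, 'REGRESSION')
```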
Generate an HTML report:

```bash
flameiq report --metrics metrics_regressed.json --output report.html
```

Open `report.html` in your browser to see the full visual diff.
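To gate merges on this check, a CI job can run the benchmark and comparison on every push. A hypothetical GitHub Actions sketch — the workflow layout and action versions are assumptions; only the `pip install` and `flameiq` commands come from this demo:

```yaml
# Hypothetical CI sketch — not shipped with this repo.
name: perf
on: [push]
jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install flameiq-core
      - run: python benchmarks/run_benchmark.py --output metrics.json
      # Exits 1 when any metric exceeds its threshold, failing the job
      - run: flameiq compare --metrics metrics.json --fail-on-regression
```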
Other useful commands:

```bash
flameiq validate metrics_baseline.json
flameiq baseline show
```

Project layout:

```text
flameiq-demo/
├── textcraft/
│   ├── __init__.py
│   ├── processor.py                ← fast, correct implementation
│   └── processor_regressed.py      ← slow, regressed implementation
├── benchmarks/
│   ├── run_benchmark.py            ← benchmark the fast version
│   └── run_benchmark_regressed.py  ← benchmark the regressed version
├── flameiq.yaml                    ← FlameIQ configuration
└── README.md
```
In `processor_regressed.py`, `clean()` recompiles two regex patterns on every single call:

```python
# FAST (correct)
def clean(text):
    text = re.sub(r"[^\w\s]", "", text)  # regex compiled and cached by Python
    ...

# SLOW (regressed)
def clean(text):
    punct_re = re.compile(r"[^\w\s]")  # recompiled every call!
    space_re = re.compile(r"\s+")      # recompiled every call!
    ...
```

This is a classic, easy-to-miss Python performance mistake. FlameIQ catches it automatically.
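The cost is easy to reproduce with `timeit` in a standalone script — this sketch does not import textcraft itself but re-creates the two variants inline:

```python
import re
import timeit

# Compiled once at import time, as in the fast implementation
# (re.sub also caches patterns internally, so the one-liner stays fast).
PUNCT_RE = re.compile(r"[^\w\s]")


def clean_fast(text):
    return PUNCT_RE.sub("", text)


def clean_slow(text):
    punct_re = re.compile(r"[^\w\s]")  # recompiled every call!
    return punct_re.sub("", text)


text = "Hello, world! FlameIQ catches this..." * 10
assert clean_fast(text) == clean_slow(text)  # same output, different cost

fast = timeit.timeit(lambda: clean_fast(text), number=20_000)
slow = timeit.timeit(lambda: clean_slow(text), number=20_000)
print(f"fast: {fast:.3f}s  slow: {slow:.3f}s")
```

One nuance: `re.compile` consults the `re` module's internal pattern cache, so the per-call penalty is cache lookup plus call overhead rather than a full recompilation — still enough to show up clearly in a benchmark, and the cache is bounded, so it can also be evicted under pattern-heavy workloads.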
- FlameIQ on PyPI: https://pypi.org/project/flameiq-core/
- FlameIQ docs: https://flameiq-core.readthedocs.io
- FlameIQ source: https://github.com/flameiq/flameiq-core