Pre-Decision AI Risk Intelligence
Intervene Before Impact.
Jomex is an open-source framework that detects unsafe AI outputs before they reach end users — without requiring ground truth data.
It works by querying multiple LLMs, measuring their disagreement and internal instability, and making real-time Block / Escalate / Flag / Pass decisions with regulatory-aligned risk weights and tamper-evident audit trails.
Existing tools measure uncertainty after generation. Jomex is different:
| Feature | UQLM | MUSE | Jomex |
|---|---|---|---|
| Multi-model disagreement | Partial | Yes | Yes |
| Single-model instability | No | No | Yes (IIS) |
| Reasoning divergence | No | No | Yes |
| Pre-decision blocking | No | No | Yes |
| Regulatory-aligned weights | No | No | Yes (EU AI Act) |
| Audit trail (ProofSlip) | No | No | Yes |
| Multi-turn risk (MTAR) | No | No | Yes |
```bash
pip install jomex
```

```python
import asyncio
from jomex import JomexEngine
from jomex.adapters import OpenAIAdapter, AnthropicAdapter, OllamaAdapter

engine = JomexEngine(
    models=[
        OpenAIAdapter("gpt-4o-mini"),
        AnthropicAdapter("claude-sonnet-4-20250514"),
        OllamaAdapter("llama3.1:8b"),  # local model for architectural diversity
    ],
)

async def check():
    result = await engine.evaluate(
        "Is ibuprofen safe during pregnancy?",
        domain="medical",
    )
    print(f"Decision: {result.decision}")  # BLOCK / ESCALATE / FLAG / PASS
    print(f"Risk: {result.risk_score:.3f}")
    print(f"D_ext: {result.d_ext:.3f}")
    print(f"IIS: {result.iis:.3f}")
    print(f"R_struct: {result.r_struct:.3f}")
    print(f"W_reg: {result.w_reg}")

asyncio.run(check())
```

Instead of tuning weights manually, use pre-configured profiles for your domain:
```python
# Safe defaults for medical applications — stricter thresholds, more paraphrases
engine = JomexEngine.with_profile(
    models=[...],
    profile_name="medical",
)
```

| Profile | Domain | t_block | Framework | Notes |
|---|---|---|---|---|
| `medical` | Healthcare, clinical | 0.60 | EU AI Act + FDA | 5 IIS paraphrases, conservative |
| `legal` | Legal reasoning | 0.65 | EU AI Act | Higher reasoning weight (β=0.3) |
| `financial` | Finance, credit | 0.70 | Basel III | Higher disagreement weight (α=0.5) |
| `education` | Educational content | 0.75 | EU AI Act | Moderate thresholds |
| `customer_service` | Support chatbots | 0.75 | EU AI Act | IIS disabled for speed |
| `default` | General purpose | 0.70 | EU AI Act | Balanced defaults |
```
Query → [Model A] ─┐
        [Model B] ─┼→  D_ext     (disagreement)
        [Model C] ─┘   IIS       (instability)
                       R_struct  (reasoning divergence)
                     × W_reg     (regulatory weight)
                     ─────────────────────
                     = Risk(x)
                          │
                ┌─────────┼─────────┐
                ▼         ▼         ▼
              PASS      FLAG      BLOCK

              + ProofSlip audit record
```

```
Risk(x) = W_reg × (α · D_ext + β · R_struct + γ · IIS)
```
| Component | What it measures |
|---|---|
| D_ext | How much do different models disagree? (cosine distance) |
| IIS | How unstable is a single model across paraphrases? (Tr(Cov)) |
| R_struct | How different are the reasoning paths? (embedding distance) |
| W_reg | How critical is this domain? (EU AI Act / FDA / Basel III) |
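The formula above is a straightforward weighted sum. A minimal standalone sketch (the function name and sample values are illustrative, not jomex's internal API):

```python
def risk_score(d_ext: float, r_struct: float, iis: float, w_reg: float,
               alpha: float = 0.5, beta: float = 0.2, gamma: float = 0.3) -> float:
    """Risk(x) = W_reg * (alpha * D_ext + beta * R_struct + gamma * IIS).

    Default weights mirror the configuration defaults shown in this README.
    """
    return w_reg * (alpha * d_ext + beta * r_struct + gamma * iis)

# A creative domain (W_reg = 0) zeroes out the score entirely,
# no matter how much the models disagree.
print(risk_score(0.8, 0.6, 0.7, w_reg=0.0))  # 0.0
print(risk_score(0.8, 0.6, 0.7, w_reg=1.0))  # 0.73
```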
| Risk Level | Action | Description |
|---|---|---|
| < t_flag | PASS | Output delivered normally |
| t_flag — t_escalate | FLAG | Output delivered with risk warning |
| t_escalate — t_block | ESCALATE | Routed to human reviewer |
| > t_block | BLOCK | Output prevented entirely |
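The threshold ladder above maps directly to a comparison chain; a sketch using the default thresholds from the configuration section (boundary handling at exact threshold values is a choice here, not documented jomex behavior):

```python
def decide(risk: float, t_flag: float = 0.3, t_escalate: float = 0.5,
           t_block: float = 0.7) -> str:
    """Map a risk score to a decision band, checking the strictest band first."""
    if risk > t_block:
        return "BLOCK"
    if risk > t_escalate:
        return "ESCALATE"
    if risk > t_flag:
        return "FLAG"
    return "PASS"

print(decide(0.73))  # BLOCK
print(decide(0.10))  # PASS
```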
Every decision generates a ProofSlip — a SHA256 chain-linked audit record.
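The chain-linking idea can be illustrated in a few lines. This is a standalone sketch of SHA256 hash chaining and tamper detection, not jomex's actual ProofSlip record format:

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first slip in the chain

def make_slip(record: dict, prev_hash: str) -> dict:
    """Hash the record together with the previous slip's hash."""
    body = {"record": record, "prev_hash": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    return {**body, "hash": digest}

def verify_chain(slips) -> bool:
    """Recompute every hash and check each link points at its predecessor."""
    prev = GENESIS
    for slip in slips:
        body = {"record": slip["record"], "prev_hash": slip["prev_hash"]}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if slip["prev_hash"] != prev or slip["hash"] != expected:
            return False
        prev = slip["hash"]
    return True

chain = []
for decision in ["PASS", "FLAG", "BLOCK"]:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(make_slip({"decision": decision}, prev))

assert verify_chain(chain)
chain[1]["record"]["decision"] = "PASS"  # any edit breaks verification
assert not verify_chain(chain)
```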
Jomex tracks cumulative risk across conversation turns using CUSUM (Cumulative Sum Control Chart):
```python
result = await engine.evaluate(
    "What drug interactions should I worry about?",
    domain="medical",
    session_id="patient-session-42",
)

# MTAR state is now embedded directly in the result
print(f"Cumulative risk: {result.mtar_cumulative:.3f}")
print(f"Turn count: {result.mtar_turn_count}")
print(f"Alert triggered: {result.mtar_alert}")

# Also available via engine for full history
mtar = engine.get_mtar_state("patient-session-42")
print(f"Risk history: {mtar.history}")
```

MTAR automatically overrides individual PASS/FLAG decisions to ESCALATE when cumulative risk crosses the alert threshold — even if the current turn looks safe on its own.
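The CUSUM accumulation behind this behavior can be sketched as follows. The drift and alert parameters here are illustrative assumptions, not jomex's MTAR defaults:

```python
class MtarCusum:
    """Sketch of CUSUM-style cumulative risk tracking across turns."""

    def __init__(self, drift: float = 0.3, alert_threshold: float = 1.0):
        self.drift = drift                  # per-turn "allowance" subtracted from risk
        self.alert_threshold = alert_threshold
        self.cumulative = 0.0
        self.history = []

    def update(self, turn_risk: float) -> bool:
        # S_t = max(0, S_{t-1} + (x_t - k)): only above-drift risk accumulates
        self.cumulative = max(0.0, self.cumulative + turn_risk - self.drift)
        self.history.append(turn_risk)
        return self.cumulative > self.alert_threshold

tracker = MtarCusum()
for risk in [0.6, 0.6, 0.6, 0.6]:  # each turn is individually modest
    alert = tracker.update(risk)
print(alert)  # True: the drift across turns triggered the alert
```

Low-risk streams never accumulate (the `max(0, ...)` clamp resets negative drift), which is why a single alert means sustained elevated risk rather than one noisy turn.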
PAR lets you replay historical decisions under new configurations — answering "what if we changed our risk policy?" without exposing users to risk.
```python
from jomex import ReplayEngine, ReplayConfig

replay = ReplayEngine()

# What if we applied stricter thresholds?
summary = await replay.run(
    slips=engine.proof_gen.get_slips(),
    config=ReplayConfig(
        name="stricter_medical_policy",
        t_block=0.5,      # was 0.7
        t_escalate=0.35,  # was 0.5
    ),
)

print(f"Decisions changed: {summary.decisions_changed}/{summary.total_slips}")
print(f"Change rate: {summary.change_rate:.1%}")
print(f"Avg risk drift: {summary.avg_risk_drift:+.4f}")
print(f"Migration: {summary.migration}")
# e.g., {"PASS": {"FLAG": 12, "ESCALATE": 3}, "FLAG": {"BLOCK": 5}}
```

PAR never modifies the original ProofSlip chain — it operates on a read-only copy.
Built-in support for:
- EU AI Act (default) — Unacceptable / High / Limited / Minimal
- FDA — Class III / II / I / Exempt
- Basel III — Systemic / Material / Standard / Low
Creative domains (writing, poetry, brainstorming) automatically get W_reg = 0, disabling risk assessment where it doesn't apply.
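The tier-to-weight mapping can be sketched as a small lookup. The numeric weights and domain-to-tier assignments below are illustrative assumptions, not jomex's actual regulatory tables:

```python
# EU AI Act tiers mapped to hypothetical weights (illustrative values only)
EU_AI_ACT_TIERS = {
    "unacceptable": 1.0,
    "high": 0.9,
    "limited": 0.6,
    "minimal": 0.3,
}

# Hypothetical domain-to-tier assignments for this sketch
DOMAIN_TIERS = {
    "medical": "high",
    "legal": "high",
    "customer_service": "limited",
}

def w_reg(domain: str) -> float:
    """Resolve a domain to a regulatory weight under the EU AI Act framework."""
    if domain == "creative":
        return 0.0  # W_reg = 0: risk assessment disabled for creative domains
    tier = DOMAIN_TIERS.get(domain, "limited")  # assumed fallback tier
    return EU_AI_ACT_TIERS[tier]
```

Because `W_reg` multiplies the entire risk sum, a zero weight short-circuits blocking for domains where hallucination is the point, not a failure mode.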
Deploy Jomex as an HTTP gateway in front of any LLM:
```bash
pip install fastapi uvicorn
uvicorn jomex.server:app --host 0.0.0.0 --port 8000
```

| Endpoint | Method | Description |
|---|---|---|
| `/evaluate` | POST | Evaluate a query for risk |
| `/audit/verify` | GET | Verify ProofSlip chain integrity |
| `/audit/slips` | GET | List ProofSlips (filterable by decision) |
| `/replay` | POST | Run PAR scenario analysis |
| `/profiles` | GET | List available risk profiles |
| `/health` | GET | Health check |
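A client call to `/evaluate` might look like the sketch below. The JSON field names are assumptions — check the interactive FastAPI docs (`/docs` on a running server) for the actual request schema:

```python
import json
from urllib import request

def build_evaluate_request(base_url: str, query: str, domain: str) -> request.Request:
    """Assemble a POST to /evaluate (payload field names are hypothetical)."""
    payload = {"query": query, "domain": domain}
    return request.Request(
        f"{base_url}/evaluate",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_evaluate_request(
    "http://localhost:8000", "Is ibuprofen safe during pregnancy?", "medical"
)
# Send with request.urlopen(req) once the gateway is running
print(req.full_url, req.method)
```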
```python
# Custom server with real models
from jomex.server import create_app
from jomex.adapters import OpenAIAdapter, AnthropicAdapter

app = create_app(
    models=[OpenAIAdapter("gpt-4o-mini"), AnthropicAdapter("claude-sonnet-4-20250514")],
    profile="medical",
    proof_storage="audit/slips.jsonl",
)
```

By default, Jomex uses English embeddings. For multilingual deployments:
```python
from jomex import set_default_embedding_model

# Switch to multilingual embeddings (Arabic, Turkish, etc.)
set_default_embedding_model("paraphrase-multilingual-MiniLM-L12-v2")
```

```python
engine = JomexEngine(
    models=[...],

    # Risk component weights (must sum to ~1.0)
    alpha=0.5,  # weight for D_ext (disagreement)
    beta=0.2,   # weight for R_struct (reasoning divergence)
    gamma=0.3,  # weight for IIS (instability)

    # Decision thresholds
    t_flag=0.3,      # above this: FLAG
    t_escalate=0.5,  # above this: ESCALATE
    t_block=0.7,     # above this: BLOCK

    # Optional toggles
    iis_enabled=True,       # disable IIS to reduce latency
    r_struct_enabled=True,  # disable R_struct for simpler assessment
    iis_model_index=0,      # which model to test for internal stability
    iis_paraphrases=3,      # number of paraphrases for IIS measurement
)
```

Every evaluation generates a ProofSlip — a tamper-evident, SHA256 chain-linked audit record. Each slip references the hash of the previous slip, forming a verifiable sequential chain.
```python
from jomex.proof_slip import ProofSlipGenerator  # import path per module layout below

# Verify chain integrity
assert engine.proof_gen.verify_chain()

# Get all blocked decisions
blocked = engine.proof_gen.get_slips(decision_filter="BLOCK")

# Persist to file (JSON Lines, append-only)
engine = JomexEngine(
    models=[...],
    proof_generator=ProofSlipGenerator(storage_path="audit/jomex_slips.jsonl"),
)
```

The ProofSlip generator is concurrency-safe — multiple async evaluations can run simultaneously without chain corruption.
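The concurrency guarantee comes down to serializing the read-previous-hash/write-new-hash step. A hypothetical sketch of the idea using `asyncio.Lock` (this is not jomex's `ProofSlipGenerator`, just the pattern it describes):

```python
import asyncio
import hashlib
import json

class LockedChain:
    """Hash chain whose append is protected by an asyncio.Lock."""

    def __init__(self):
        self._lock = asyncio.Lock()
        self._prev = "0" * 64  # genesis placeholder
        self.slips = []

    async def append(self, record: dict) -> dict:
        # Without the lock, two concurrent appends could read the same
        # prev hash and fork the chain.
        async with self._lock:
            body = {"record": record, "prev_hash": self._prev}
            digest = hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest()
            slip = {**body, "hash": digest}
            self._prev = digest
            self.slips.append(slip)
            return slip

async def main():
    chain = LockedChain()
    await asyncio.gather(*(chain.append({"turn": i}) for i in range(50)))
    # Every slip links to its predecessor despite 50 concurrent appends
    assert all(
        b["prev_hash"] == a["hash"]
        for a, b in zip(chain.slips, chain.slips[1:])
    )

asyncio.run(main())
```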
```
jomex/
├── engine.py      # JomexEngine orchestrator (async, parallel, with_profile)
├── scorers.py     # D_ext, IIS, R_struct (multilingual embedding support)
├── regulatory.py  # W_reg mapper (EU AI Act, FDA, Basel III)
├── proof_slip.py  # ProofSlip generator (async, lock-protected hash chain)
├── profiles.py    # Pre-configured risk profiles per domain
├── replay.py      # PAR — Proof-of-Alignment Replay engine
├── server.py      # FastAPI REST gateway
├── adapters.py    # LLM provider adapters
└── models.py      # Decision, RiskResult, MTARState
```
| Provider | Adapter | Install |
|---|---|---|
| OpenAI | `OpenAIAdapter("gpt-4o-mini")` | `pip install jomex[openai]` |
| Anthropic | `AnthropicAdapter("claude-sonnet-4-20250514")` | `pip install jomex[anthropic]` |
| Google | `GoogleAdapter("gemini-2.0-flash")` | `pip install jomex[google]` |
| Ollama (local) | `OllamaAdapter("llama3.1:8b")` | Requires Ollama |
| Testing | `MockAdapter("test", ["response"])` | Built-in |
```bash
# 140 tests across 18 sections, 0 failures
# Includes: unit, property, integration, concurrency (50 parallel),
# monotonicity invariants, tamper detection, and PAR replay validation
python tests/test_all.py
```

Jomex is a risk assessment tool, not a decision-maker. It does NOT provide medical diagnoses, legal opinions, or financial advice. A "PASS" decision does NOT guarantee output safety. Human oversight is required for all high-risk deployments. See LICENSE for full liability disclaimers.
Apache 2.0 — Oplogica Inc.
- Website: jomex.ai
- Repository: github.com/oplogica/jomex
- Paper: Coming soon
- Contact: contact@jomex.ai