Change the rules. See why the system changed.
```
npx @neuroverseos/nv-sim visualize
```

Your agents already have a decide → act loop. Insert one call between them:

```
# Before
action = agent.decide()

# After
action = govern(agent.decide())
```

```
my_agent | neuroverse guard --world ./world --trace
```

Every action your agent emits gets evaluated. Blocked actions return `{"status":"BLOCK","reason":"..."}`. Allowed actions pass through. Your agent reads the verdict and adapts.
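Verdicts come back as one JSON object per line, so anything downstream of the guard can act on them with a few lines of code. A minimal sketch (assuming the `status` and `reason` fields shown above):

```python
import json

def triage(lines):
    """Yield the reason for every blocked action in a stream of guard verdicts."""
    for line in lines:
        verdict = json.loads(line)
        if verdict.get("status") == "BLOCK":
            yield verdict.get("reason", "no reason given")
```

For example, `triage(['{"status":"BLOCK","reason":"no sources"}'])` yields `"no sources"`.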
```
npx @neuroverseos/nv-sim serve
```

```python
import requests

for agent in agents:
    action = agent.decide()
    verdict = requests.post("http://localhost:3456/api/evaluate", json={
        "actor": agent.id,
        "action": action,
    }).json()
    if verdict["decision"] == "BLOCK":
        action = "hold"
    elif verdict["decision"] == "MODIFY":
        action = verdict["modified_action"]
    environment.apply(agent, action)
```

```typescript
import { evaluateGuard, loadWorld } from '@neuroverseos/governance';

const world = await loadWorld('./world/');
const verdict = evaluateGuard({ intent, tool, scope }, world);
if (verdict.status === 'BLOCK') throw new Error(`Blocked: ${verdict.reason}`);
```

```
neuroverse mcp --world ./world --plan plan.json
```

One command. Same rules govern your agents whether you're simulating or shipping.
You build a multi-agent system. You run it. You get metrics — loss curves, reward signals, completion rates. Something goes wrong, and you ask:
Why did the agents do that?
Nobody can tell you. Metrics say what happened. Logs say when. Nothing tells you why agents changed their behavior — which rule caused it, which agents shifted first, what strategy they abandoned, and what they replaced it with.
Most multi-agent systems let agents do whatever emerges. NeuroVerse lets you decide what's allowed — and actually stops the rest.
What should agents explore?
"Protein mutations that improve binding affinity for SSTR2"
What should never be published?
"Results based on a single data source. Claims with confidence below 70%."
What makes a result valuable?
"Multiple independent lines of evidence converging on the same finding."
How should agents be rewarded or penalized?
```
IF an agent publishes without peer validation → reduce its influence for 3 rounds
IF two agents independently converge on the same finding → boost that finding's priority
```
Click "Build World" and your plain English becomes enforceable logic:
```
BLOCK       Results with confidence below 70%
BLOCK       Results from a single source
PRIORITIZE  Multi-source convergence
PENALIZE    Publishing without validation → reduce influence, 3 rounds
REWARD      Independent convergence → boost priority, 5 rounds
```
Works without AI. The deterministic engine uses heuristics to translate your intent into rules — no API key, no cost, no cloud.
Better with AI. Add your API key (Anthropic, OpenAI, Google, Groq, or any OpenAI-compatible endpoint) and the engine generates smarter, more specific rules. Key stays in your browser — never sent to our servers.
If your policy has conflicts, the inline diagnostics show you exactly what's wrong with one-click fix buttons:
```
ERROR  Conflicting rules: RULE-002 vs RULE-003
Fix:   Resolve conflict — remove one rule, or add a condition
[Merge into single rule]  ← click to fix
```
Upload your agent output (JSONL) or load demo data. The engine evaluates every action against your rules and shows:
- Per-action verdicts — ALLOW, BLOCK, MODIFY, PAUSE, REWARD, or PENALIZE
- Audit trail — per-agent breakdown, rule firing frequency, timeline by cycle
- Behavioral insights — two columns side by side:
| Observed (from your data) | Requires Integration (blind spots) |
|---|---|
| Agent X fails 75% of the time | Did Agent X change strategy after being blocked? |
| "No sources" triggered 8x across 3 agents | Is this systemic or isolated? |
| Agents A and B produced identical output | Independent convergence or echo amplification? |
| Quality degrading over 5 cycles | Drift or deliberate strategy shift? |
The left column is computed from real audit data. The right column tells you what you can only answer by putting governance inside the loop.
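For reference, a single audit-trail record could look like the line below. The field names here are illustrative, not the engine's actual schema; inspect your own JSONL export for the real one:

```json
{"cycle": 3, "actor": "agent_x", "action": "publish_result", "decision": "BLOCK", "rule": "RULE-002", "reason": "single data source"}
```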
Remove the confidence threshold. What breaks? Add a rule penalizing groupthink. Do agents explore more diverse hypotheses?
This is the experiment. Not the simulation — the rules themselves.
```
npm install
npm run dev:full
```

Opens in your browser. Everything runs locally. Light and dark mode included.
```
npm install @neuroverseos/governance
```

The governance engine is a separate open-source package. The simulation UI uses it, but you can use it independently in any system.
The governance package includes validation that runs the same checks as the browser UI:
```
# Initialize a world definition
neuroverse init --name "my-research-agents"

# Validate your world (9 static analysis checks)
neuroverse validate --world ./world

# Run 14 standard guard simulations + fuzz testing
neuroverse test --world ./world

# Red team: 28 adversarial attacks across 6 categories
neuroverse redteam --world ./world
```

Validation checks: structural completeness, referential integrity, guard coverage, gate consistency, kernel alignment, guard shadowing, reachability analysis, state space coverage, and governance health scoring.
```
echo '{"intent":"delete user data"}' | neuroverse guard --world ./world --trace
# → {"status":"BLOCK","reason":"...","ruleId":"..."}
```

Pipe your agent's output through `neuroverse guard`. Every action gets evaluated. Works with Python, Rust, Go, shell scripts — anything that writes to stdout.
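Any agent that can print JSON to stdout can be governed this way. A hypothetical Python agent sketch (the `intent`/`tool`/`scope` field names follow the SDK example in this README; use whatever fields your world expects):

```python
import json

def emit(actions):
    """Serialize actions as JSONL, one per line, for `neuroverse guard` to evaluate."""
    return "\n".join(json.dumps(action) for action in actions)

if __name__ == "__main__":
    print(emit([
        {"intent": "summarize papers", "tool": "search", "scope": "public"},
        {"intent": "delete user data", "tool": "db", "scope": "production"},
    ]))
```

Run it as `python my_agent.py | neuroverse guard --world ./world --trace`.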
```
# Govern a Python agent
python my_agent.py | neuroverse run --world ./world --plan plan.json

# Interactive governed chat
neuroverse run --interactive --world ./world --provider openai --plan plan.json
```

```
npx @neuroverseos/nv-sim serve --port 3456
```

`POST /api/evaluate`

- Body: `{ actor, action, payload?, state?, world? }`
- Returns: `{ decision: ALLOW|BLOCK|MODIFY, reason, evidence }`
```
# Zero-dependency test
curl -X POST http://localhost:3456/api/evaluate \
  -H "Content-Type: application/json" \
  -d '{"actor":"agent_1","action":"panic_sell","world":"trading"}'
```

```typescript
import { evaluateGuard, loadWorld } from '@neuroverseos/governance';

const world = await loadWorld('./world/');
for (const agent of agents) {
  const action = agent.decide();
  const verdict = evaluateGuard({ intent: action.intent, tool: action.tool, scope: action.scope }, world);
  if (verdict.status === 'BLOCK') {
    agent.retry(verdict.reason);
  } else {
    agent.execute(action);
  }
}
```

```
neuroverse mcp --world ./world --plan plan.json
```

Your IDE's AI assistant becomes a governed agent. Same rules, same verdicts.
```
neuroverse plan compile plan.md --output plan.json
neuroverse plan check --plan plan.json
neuroverse plan advance step_id --plan plan.json --evidence type --proof url
```

The simulation UI ships with pre-built profiles for common agent systems:
| Engine | What It Governs | Example |
|---|---|---|
| ScienceClaw | Research agents | Block synthesis with no papers, penalize unsourced claims |
| MiroFish / OASIS | Social simulation | Limit influence concentration, dampen sentiment spirals |
| LangChain / LangGraph | LLM agent chains | Cap tool calls, require validation before output |
| Custom | Any system | Auto-detects field mappings from your JSONL |
Each profile maps your system's output format to the governance engine's action schema automatically.
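To make that mapping concrete, here is the kind of translation a profile performs, sketched in Python. This is illustrative only: the real auto-detection lives inside the engine, and the raw field names (`agent_id`, `message`, `tool_call`, `channel`) are hypothetical.

```python
def map_record(raw: dict) -> dict:
    """Map one raw JSONL record onto the governance engine's action fields."""
    return {
        "actor": raw.get("agent_id") or raw.get("name", "unknown"),
        "intent": raw.get("message") or raw.get("output", ""),
        "tool": raw.get("tool_call", "none"),
        "scope": raw.get("channel", "default"),
    }
```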
AI is optional. The deterministic engine runs on math, not tokens. When you bring your own model, AI becomes a governed actor — subject to the same rules as every other agent.
| Provider | Key / Env Var | Auto-detected |
|---|---|---|
| Anthropic (Claude) | `sk-ant-*` / `ANTHROPIC_API_KEY` | Yes |
| OpenAI | `sk-*` / `OPENAI_API_KEY` | Yes |
| Google (Gemini) | `AIza*` | Yes |
| Groq | `gsk_*` / `GROQ_API_KEY` | Yes |
| Together | `TOGETHER_API_KEY` | Yes |
| Mistral | `MISTRAL_API_KEY` | Yes |
| Deepseek | `DEEPSEEK_API_KEY` | Yes |
| Fireworks | `FIREWORKS_API_KEY` | Yes |
| Ollama | `OLLAMA_BASE_URL` | Yes |
| Local LLM | `LOCAL_LLM_URL` | Yes |
| (none) | — | Deterministic fallback |
In the browser UI, click the sparkle icon in the header and paste your key. It's stored in localStorage only.
Any endpoint that speaks the OpenAI chat completions format (POST /v1/chat/completions) works.
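Concretely, an endpoint qualifies if it accepts a request shaped like the one below. This sketch uses only the Python standard library; the URL is just an example (Ollama's default local port) and `llama3` is a placeholder model name.

```python
import json
import urllib.request

# Minimal OpenAI-compatible chat completions request body.
body = json.dumps({
    "model": "llama3",
    "messages": [{"role": "user", "content": "Summarize the policy in one line."}],
}).encode()

request = urllib.request.Request(
    "http://localhost:11434/v1/chat/completions",
    data=body,
    headers={"Content-Type": "application/json"},
)
# response = urllib.request.urlopen(request)  # requires a running endpoint
```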
Most simulation tools answer: "What will happen?"
NV-SIM answers: "What changes when I change the rules — and why?"
| What You Get | What It Proves |
|---|---|
| Behavioral shifts | Before → after for every agent, with percentages |
| Causal explanation | Why agents changed — traced to specific rules |
| Behavioral insights | Output tendencies, echo detection, drift, pattern clustering |
| Blind spot analysis | What you can observe vs. what requires loop integration |
| Full audit trail | Every decision, every rule, every adaptation — JSONL |
The output is narrative, not metrics. Not "40% adjusted actions" — but "40% shifted from aggressive to conservative strategies after early attempts failed."
| Command | What It Does |
|---|---|
| `nv-sim visualize` | Interactive control platform |
| `nv-sim enforce [preset] [rules.txt]` | Policy enforcement lab |
| `nv-sim compare [preset]` | Baseline vs governed simulation |
| `nv-sim scenario <id>` | Run a named stress scenario |
| `nv-sim serve --port N` | Governance runtime (HTTP API) |
| `nv-sim world-from-doc rules.txt` | Generate world from plain English |
| `nv-sim chaos --runs N` | Stress test (randomized scenarios) |
| `neuroverse guard --world ./world` | Pipe-mode evaluation |
| `neuroverse validate --world ./world` | 9 static analysis checks |
| `neuroverse test --world ./world` | 14 guard simulations + fuzz |
| `neuroverse redteam --world ./world` | 28 adversarial attacks |
| `neuroverse playground --world ./world` | Interactive web UI (localhost:4242) |
| `neuroverse mcp --world ./world` | MCP server for IDE integration |
```
@neuroverseos/governance   ← deterministic rule engine (npm, open source)
          ↓
nv-sim engine              ← world rules + narrative injection + swarm simulation
          ↓
behavioral analysis        ← shift detection, echo detection, drift tracking
          ↓
behavioral insights        ← observed signals vs. integration blind spots
          ↓
audit trail                ← append-only evidence chain (JSONL)
          ↓
nv-sim CLI + UI            ← scenarios, comparison, governance runtime, control platform
          ↓
AI providers (optional)    ← BYOM: Anthropic, OpenAI, Groq, local LLMs, or none
```
Everything runs locally. No cloud. No accounts. No cost.
Apache 2.0
Change the rules. See why the system changed.