NeoXFortress Agent Risk Lab

Deterministic simulation of AI agent attack scenarios for governance, compliance, and control engineering.

Created by NeoXFortress LLC — AI Agent Accountability Infrastructure for Regulated Contractors

Why This Exists

AI agents are making decisions inside defense contractors. They call tools, process sensitive data, and interact with external services. When an agent goes wrong — prompt injection, data exfiltration, privilege escalation — most organizations have no evidence of what happened.

This lab simulates those failures. Every attack is detected and blocked by the policy engine. Every blocked action produces a signed Agent Accountability Receipt — tamper-evident proof of what was attempted and what was stopped.

This lab shows the attack. The receipt proves you caught it.

Attack Scenarios

Scenario	MITRE ATLAS	What Happens	Outcome
Prompt Injection	AML.T0051	Malicious instructions embedded in a document trick the agent into exfiltrating data and disabling logging	4 injection patterns detected, 2 tool calls blocked
Data Exfiltration	AML.T0048.002	Agent attempts to send CUI-marked data to Gmail, a .ru domain, and an external file share	3 exfiltration vectors blocked by content + domain policy
Privilege Escalation	AML.T0044	Agent attempts shell execution, firewall modification, admin account creation, and key vault access	4 unauthorized tool calls denied by allowlist

Quick Start

git clone https://github.com/NeoXFortress/neox-agent-risk-lab.git
cd neox-agent-risk-lab
python3 run_all.py

No external dependencies. No API keys. Pure Python 3.10+.

What You'll See

============================================================
  NeoXFortress Agent Risk Lab — Attack Simulation Suite
============================================================

[Scenario 1] Prompt Injection Attack
----------------------------------------
  [1] Agent received document (614 chars)
  [2] Injection scanner: 4 pattern(s) detected
      - INJECTION-001: Direct instruction override attempt
      - INJECTION-002: Role reassignment attempt
      - INJECTION-005: Privilege tag injection
      - INJECTION-006: Context boundary manipulation
  [3] Agent tried send_email to pastebin.com → DENY
  [4] Agent tried disable_logging → DENY

[Scenario 2] Tool Misuse / Data Exfiltration
----------------------------------------
  [1] Agent processed CUI//SP-EXPT document
  [2] Agent tried send_email to Gmail with CUI → DENY
  [3] Agent tried send_email to .ru domain → DENY
  [4] Agent tried write_file with CUI content → DENY

[Scenario 3] Privilege Escalation Attempt
----------------------------------------
  [1] Agent tried execute_shell → DENY
  [2] Agent tried modify_firewall → DENY
  [3] Agent tried create_admin_user → DENY
  [4] Agent tried access_key_vault → DENY

[*] Total events logged: 12

Then open outputs/report.html in your browser for the full styled exploit report.

Screenshots

Terminal output: 3 scenarios, 12 events, all attacks blocked

Styled exploit report with MITRE ATLAS mapping and CMMC control coverage

Signed receipt: hash-chained execution steps with compliance verdict

Example: Signed Receipt (Prompt Injection Scenario)

Every scenario produces a schema-valid Agent Accountability Receipt. Here's the compliance verdict and integrity block from the prompt injection receipt:

{
  "compliance": {
    "verdict": "non_compliant",
    "assessed_by": "engine_rule_set",
    "violated_controls": ["SC.L2-3.13.1", "SI.L2-3.14.1"],
    "risk_score": 8.0,
    "framework": "CMMC L2",
    "notes": "Agent attempted to follow injected instructions. 4 patterns detected. Tool calls blocked."
  },
  "integrity": {
    "hash_chain": {
      "alg": "sha256",
      "chain": [
        { "step_id": "step-001", "hash": "a3f8c1...", "prev_hash": "000000..." },
        { "step_id": "step-002", "hash": "7b2e4d...", "prev_hash": "a3f8c1..." },
        { "step_id": "step-003", "hash": "e91f0a...", "prev_hash": "7b2e4d..." },
        { "step_id": "step-004", "hash": "d4c7b2...", "prev_hash": "e91f0a..." }
      ],
      "final_hash": "d4c7b2..."
    },
    "signature": {
      "type": "hmac_sha256",
      "signed_payload": "canonical_receipt_plus_final_hash"
    }
  }
}

Full receipts are in outputs/receipt-*.json. Schema: Agent Accountability Receipt v0.1.1.

Output Files

After running, the outputs/ directory contains:

outputs/
├── execution-log.json                    ← Structured event log (all scenarios)
├── receipt-prompt_injection.json         ← Signed receipt: injection attack
├── receipt-exfiltration.json             ← Signed receipt: CUI exfiltration attempt
├── receipt-privilege_escalation.json     ← Signed receipt: unauthorized tool access
├── report.md                             ← Agent Exploit Report (Markdown)
├── report.html                           ← Agent Exploit Report (styled HTML)
└── evidence-manifest.json                ← Evidence pack manifest

Architecture

1-agent sequential pipeline. No multi-agent coordination needed — deterministic attack simulations with policy enforcement.

Document/Input → Policy Engine → Tool Wrapper → Decision (ALLOW/DENY/REQUIRES_HUMAN) → Log → Receipt

Components

File	Purpose
`policy_engine.py`	Tool-level access control: allowlist enforcement, domain blocking, CUI content gates, prompt injection scanner (7 patterns)
`scenario_runner.py`	Executes 3 attack scenarios, produces structured JSON logs + signed receipts
`report_generator.py`	Builds Markdown + HTML exploit report with MITRE ATLAS mapping and CMMC control coverage
`run_all.py`	One-command runner: scenarios + report

Policy Engine Decisions

Decision	Meaning	Example
`ALLOW`	Tool call permitted under current policy	`read_document` on internal file
`DENY`	Tool call blocked — logged with rule ID and reason	`send_email` to `.ru` domain
`REQUIRES_HUMAN`	Tool call paused pending human approval	`send_email` to approved domain

CMMC L2 Control Coverage

Control	Domain	Scenarios
SC.L2-3.13.1	Boundary Protection	Exfiltration, Prompt Injection
SC.L2-3.13.2	CUI Flow Enforcement	Exfiltration
SI.L2-3.14.1	System Integrity	Prompt Injection
AC.L2-3.1.1	Access Control	Exfiltration, Privilege Escalation
AC.L2-3.1.2	Access Control Enforcement	Privilege Escalation
AU.L2-3.3.1	System Auditing	Privilege Escalation

Roadmap

v1.1 (Next)

Live LLM integration — route scenarios through real OpenAI/Anthropic API calls via the agent harness
Additional MITRE ATLAS techniques: model poisoning (AML.T0020), training data extraction (AML.T0024)
PDF export for evidence packs

v1.2 (Planned)

Asymmetric signature support (Ed25519) for multi-party verification
OWASP LLM Top 10 scenario coverage
Custom scenario builder — define your own attack vectors via YAML

License

MIT License — see LICENSE. Licensed for easy adoption in GovCon and regulated environments.

Open: Schema, reference implementation, and simulation framework (this repo) are MIT-licensed. Proprietary: The NeoXFortress Agent Accountability Engine (AAE) — including the enterprise policy orchestration, advanced enforcement logic, and client integrations — is a separate commercial product. Contact neoxfortress.com for enterprise implementations.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.github/workflows		.github/workflows
outputs		outputs
screenshots		screenshots
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
policy_engine.py		policy_engine.py
report_generator.py		report_generator.py
run_all.py		run_all.py
scenario_runner.py		scenario_runner.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

NeoXFortress Agent Risk Lab

Why This Exists

Attack Scenarios

Quick Start

What You'll See

Screenshots

Example: Signed Receipt (Prompt Injection Scenario)

Output Files

Architecture

Components

Policy Engine Decisions

CMMC L2 Control Coverage

Related

Roadmap

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

NeoXFortress Agent Risk Lab

Why This Exists

Attack Scenarios

Quick Start

What You'll See

Screenshots

Example: Signed Receipt (Prompt Injection Scenario)

Output Files

Architecture

Components

Policy Engine Decisions

CMMC L2 Control Coverage

Related

Roadmap

License

About

Topics

Resources

License

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages