bluzir/radius

RADIUS

Draw the boundary.
No LLM in the decision loop.


"The more data & control you give to the AI agent: (A) the more it can help you AND (B) the more it can hurt you." — Lex Fridman

The problem

Your agent has root access to your machine. Your security layer is a system prompt that says "please be careful." Think about that for a second.

Intelligence is scaling. Access is scaling. Security is not. One bad prompt and your agent reads ~/.ssh/id_rsa or runs rm -rf /. One malicious skill and your credentials are gone before you notice.

You could review every action manually, but then why have an agent?

Why this matters

Numbers from 78 validated research sources (114 analyzed), Feb 2026:

  • 13.4%: Of 3,984 marketplace skills scanned, 534 had critical issues. 76 were confirmed malicious, with install-time scripts that stole credentials.
  • 6 / 6: Researchers tested six coding agents for tool injection. All six gave up remote code execution through poisoned tool metadata.
  • 85%+: A 78-study survey of prompt-based guardrails found most break under adaptive red-team attacks. The LLM can't reliably police itself.

A regex match on rm -rf is true or false. The agent can't talk its way past it.
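To make the point concrete, here is a minimal sketch of a deterministic deny-pattern check. It is not the agentradius implementation; `isDenied` and `denyPatterns` are hypothetical names, and the two patterns are borrowed from the `command_guard` example in the Configuration section.

```typescript
// Hypothetical helper, not the agentradius API.
// Patterns mirror the command_guard.denyPatterns example in this README.
const denyPatterns: RegExp[] = [/^sudo\s/, /rm\s+-rf/];

// Returns true when any pattern matches. No model is consulted,
// so the wording of the surrounding prompt cannot change the outcome.
function isDenied(command: string, patterns: RegExp[] = denyPatterns): boolean {
  return patterns.some((pattern) => pattern.test(command));
}

console.log(isDenied("sudo id"));      // true
console.log(isDenied("ls -la /tmp"));  // false
console.log(isDenied("echo 'this is totally safe' && rm -rf ~")); // true
```

The same input always produces the same verdict, which is the property the rest of the pipeline relies on.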

What RADIUS does

RADIUS sits between the agent and every tool call. Before anything executes, it runs through a pipeline of modules and gets a verdict: allow, deny, modify (patch arguments), challenge (ask a human), or alert (log and continue).

You pick which modules run and how strict each one is. Blocking ~/.ssh reads but allowing /tmp is one line in fs_guard. Requiring Telegram approval for Bash but not for Read is one rule in approval_gate. Modules are independent; configure them separately.

You start with a profile (local, standard, or unbounded) that sets sensible defaults, then adjust whatever you want in radius.yaml:

npm install agentradius
npx agentradius init --framework openclaw --profile standard

Two commands and you have filesystem locks, shell blocking, secret redaction, rate limits, and an audit log.

Modules

Deterministic modules. Enable only what you need.

| Module | What it does |
| --- | --- |
| kill_switch | Emergency stop. Set an env var or drop a file, all risky actions halt. |
| self_defense | Locks control-plane files (config/hooks) as immutable and detects tampering. |
| tripwire_guard | Optional honeytokens. Touching a tripwire can deny immediately or trigger kill switch. |
| tool_policy | Allow or deny by tool name. Optional argument schema validation. Default deny. |
| fs_guard | Blocks file access outside allowed paths. ~/.ssh, ~/.aws, /etc are unreachable. |
| command_guard | Matches shell patterns: sudo, rm -rf, pipe chains. Blocked before execution. |
| exec_sandbox | Wraps commands in bwrap. Restricted filesystem and network access. |
| egress_guard | Outbound network filter. Allowlist by domain, IP, port. Everything else is dropped. |
| output_dlp | Catches secrets in output: AWS keys, tokens, private certs. Redacts or blocks. |
| rate_budget | Caps tool calls per minute. Stops runaway loops. |
| repetition_guard | Optional loop brake for identical tool calls repeated N times in a row. |
| skill_scanner | Inspects skills at load time for injection payloads: zero-width chars, base64 blobs, exfil URLs. |
| approval_gate | Routes risky operations to Telegram or an HTTP endpoint for human approval. |
| verdict_provider | Optional external verdict provider integration (deterministic adapter contract). |
| audit | Append-only log of every decision. Every action, every timestamp. |
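As an illustration of what an output_dlp-style pass could look like, here is a small redaction sketch. The two patterns (AWS access key IDs starting with AKIA, GitHub tokens starting with ghp_) are commonly used shapes and are assumptions here; the module's real rule set is not shown in this README, and `redactSecrets` is a hypothetical name.

```typescript
// Illustrative patterns only; the real output_dlp rules may differ.
const secretPatterns: { label: string; pattern: RegExp }[] = [
  { label: "aws_access_key_id", pattern: /AKIA[0-9A-Z]{16}/g },
  { label: "github_token", pattern: /ghp_[A-Za-z0-9]{36}/g },
];

// Replaces every match with a labeled placeholder and records which rules fired.
function redactSecrets(text: string): { text: string; hits: string[] } {
  const hits: string[] = [];
  let redacted = text;
  for (const { label, pattern } of secretPatterns) {
    redacted = redacted.replace(pattern, () => {
      hits.push(label);
      return `[REDACTED:${label}]`;
    });
  }
  return { text: redacted, hits };
}

const sample = redactSecrets("export AWS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE");
console.log(sample.text); // export AWS_ACCESS_KEY_ID=[REDACTED:aws_access_key_id]
```

A caller could choose between redacting and blocking by checking whether `hits` is empty.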

New in 0.5.x:

  • self_defense for immutable control-plane protection (opt-in)
  • tripwire_guard for honeytoken tripwires (opt-in)
  • repetition_guard for repeated identical tool-call loops (opt-in)

Three postures

One config change. Pick the containment level that matches your context.

local -- production, billing, credentials. Default deny. Sandbox required. 30 calls/min.

standard -- development, staging, daily work. Default deny. Secrets redacted. 60 calls/min.

unbounded -- research, brainstorming, migration. Logs everything, blocks nothing. 120 calls/min.

Install

npm install agentradius

Get running

npx agentradius init --framework openclaw --profile standard
npx agentradius doctor    # verify setup
npx agentradius pentest   # test your defenses

This creates radius.yaml and wires the adapter for your orchestrator.

Supported frameworks: openclaw, nanobot, claude-telegram, generic.

What gets generated:

  • openclaw: .radius/openclaw-hook.command.sh, .radius/openclaw-hooks.json
  • claude-telegram: .radius/claude-telegram.module.yaml, .radius/claude-tool-hook.command.sh, auto-patched .claude/settings.local.json

Hook scripts resolve config via $SCRIPT_DIR so they work regardless of shell working directory.

Usage

As a library

import { RadiusRuntime, GuardPhase } from 'agentradius';

const guard = new RadiusRuntime({
  configPath: './radius.yaml',
  framework: 'openclaw'
});

const result = await guard.evaluateEvent({
  phase: GuardPhase.PRE_TOOL,
  framework: 'openclaw',
  sessionId: 'session-1',
  toolCall: {
    name: 'Bash',
    arguments: { command: 'cat ~/.ssh/id_rsa' },
  },
  metadata: {},
});

// result.finalAction === 'deny'
// result.reason === 'fs_guard: path ~/.ssh/id_rsa is outside allowed paths'

As a hook (stdin/stdout)

echo '{"tool_name":"Bash","tool_input":{"command":"sudo rm -rf /"}}' | npx agentradius hook

As a server

npx agentradius serve --port 3000

Configuration

global:
  profile: standard
  workspace: ${CWD}
  defaultAction: deny

modules:
  - kill_switch
  - tool_policy
  - fs_guard
  - command_guard
  - output_dlp
  - rate_budget
  - audit

moduleConfig:
  kill_switch:
    enabled: true
    envVar: RADIUS_KILL_SWITCH
    filePath: ./.radius/KILL_SWITCH

  fs_guard:
    allowedPaths:
      - ${workspace}
      - /tmp
    blockedPaths:
      - ~/.ssh
      - ~/.aws
    blockedBasenames:
      - .env
      - .env.local
      - .envrc

  command_guard:
    denyPatterns:
      - "^sudo\\s"
      - "rm\\s+-rf"

  rate_budget:
    windowSec: 60
    maxCallsPerWindow: 60
    store:
      engine: sqlite
      path: ./.radius/state.db
      required: false # set true when node:sqlite is available (Node 22+)
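For readers who want to see how the fs_guard keys above might combine, here is a rough sketch of a path check. It assumes blocked basenames and blocked paths take precedence over allowed paths, and that anything matching neither list is denied; the real module's precedence rules are not documented here. `checkPath` and `exampleConfig` are hypothetical, `/workspace` stands in for the resolved `${workspace}`, and the sketch does not resolve symlinks.

```typescript
import * as os from "node:os";
import * as path from "node:path";

interface FsGuardConfig {
  allowedPaths: string[];
  blockedPaths: string[];
  blockedBasenames: string[];
}

// Expands a leading "~" to the current user's home directory.
function expandHome(p: string): string {
  return p === "~" || p.startsWith("~/") ? path.join(os.homedir(), p.slice(1)) : p;
}

// True when target is root itself or lies underneath it.
function isInside(target: string, root: string): boolean {
  const rel = path.relative(root, target);
  return rel === "" || (rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel));
}

function checkPath(rawPath: string, config: FsGuardConfig): { allowed: boolean; reason: string } {
  // path.resolve normalizes ".." segments, so /workspace/../etc becomes /etc.
  const target = path.resolve(expandHome(rawPath));
  if (config.blockedBasenames.includes(path.basename(target))) {
    return { allowed: false, reason: "blocked basename" };
  }
  if (config.blockedPaths.some((b) => isInside(target, path.resolve(expandHome(b))))) {
    return { allowed: false, reason: "blocked path" };
  }
  if (config.allowedPaths.some((a) => isInside(target, path.resolve(expandHome(a))))) {
    return { allowed: true, reason: "inside allowed path" };
  }
  return { allowed: false, reason: "outside allowed paths" };
}

const exampleConfig: FsGuardConfig = {
  allowedPaths: ["/workspace", "/tmp"],
  blockedPaths: ["~/.ssh", "~/.aws"],
  blockedBasenames: [".env", ".env.local", ".envrc"],
};

console.log(checkPath("~/.ssh/id_rsa", exampleConfig).allowed); // false
```

Normalizing before comparing matters: a raw string-prefix check would accept `/workspace/../etc/passwd`.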

Optional hardening modules (all opt-in):

modules:
  - self_defense
  - tripwire_guard
  - repetition_guard
  - exec_sandbox

moduleConfig:
  self_defense:
    immutablePaths:
      - ./radius.yaml
      - ./.radius/**
    onWriteAttempt: deny
    onHashMismatch: kill_switch

  tripwire_guard:
    fileTokens:
      - /workspace/.tripwire/salary_2026.csv
    envTokens:
      - RADIUS_TRIPWIRE_SECRET
    onTrip: kill_switch

  repetition_guard:
    threshold: 3
    cooldownSec: 60
    onRepeat: deny
    store:
      engine: sqlite
      path: ./.radius/state.db
      required: false # set true when node:sqlite is available (Node 22+)

  exec_sandbox:
    engine: bwrap
    shareNetwork: true
    childPolicy:
      network: deny
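The repetition_guard settings above (`threshold`, `cooldownSec`, `onRepeat`) suggest a streak counter. The sketch below is one possible reading, not the module's actual logic: it assumes the Nth identical call in a row is denied and that the same call stays denied until the cooldown has passed. `RepetitionTracker` is a hypothetical name, and state lives in memory rather than in the SQLite store.

```typescript
interface ToolCall {
  name: string;
  arguments: Record<string, unknown>;
}

class RepetitionTracker {
  private lastKey = "";
  private streak = 0;
  private blockedUntilMs = 0;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownSec: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownSec * 1000;
  }

  check(call: ToolCall, nowMs: number): "allow" | "deny" {
    // Note: JSON.stringify is sensitive to key order, so this is a simple identity check.
    const key = `${call.name}:${JSON.stringify(call.arguments)}`;
    if (key === this.lastKey) {
      this.streak += 1;
    } else {
      // A different call breaks the streak and clears any active cooldown.
      this.lastKey = key;
      this.streak = 1;
      this.blockedUntilMs = 0;
    }
    if (nowMs < this.blockedUntilMs) return "deny";
    if (this.streak >= this.threshold) {
      this.blockedUntilMs = nowMs + this.cooldownMs;
      this.streak = 0;
      return "deny";
    }
    return "allow";
  }
}

const tracker = new RepetitionTracker(3, 60);
const call: ToolCall = { name: "Bash", arguments: { command: "npm test" } };
console.log(tracker.check(call, 0), tracker.check(call, 1000), tracker.check(call, 2000));
// allow allow deny
```

In subprocess mode the three private fields would need to be loaded from and saved to a persistent store on every call.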

Template variables: ${workspace}, ${HOME}, ${CWD}, and any environment variable.
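A minimal sketch of `${NAME}` substitution follows, to show how template variables of this shape can be expanded. It is an illustration under assumptions: `expandTemplate` is a hypothetical name, and leaving unknown placeholders untouched is a choice made here, not documented RADIUS behavior.

```typescript
// Replaces ${NAME} with vars[NAME]; unknown names are left as-is (assumption).
function expandTemplate(input: string, vars: Record<string, string | undefined>): string {
  return input.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match: string, name: string) => {
    const value = vars[name];
    return value === undefined ? match : value;
  });
}

const vars = { workspace: "/srv/app", HOME: "/home/dev" };
console.log(expandTemplate("${workspace}/logs", vars)); // /srv/app/logs
console.log(expandTemplate("${HOME}/.cache", vars));    // /home/dev/.cache
console.log(expandTemplate("${UNSET}/x", vars));        // ${UNSET}/x
```

In a real setup the `vars` map would presumably be built from the resolved workspace, the working directory, and the process environment.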

Approvals

approval_gate routes risky tools to Telegram or HTTP for human confirmation. Both support sync_wait mode.

Telegram callbacks: Approve (one action) · Allow 30m (temporary lease) · Deny

HTTP expects a POST returning {"status":"approved"}, {"status":"denied"}, {"status":"approved_temporary","ttlSec":1800}, or {"status":"error","reason":"..."}.
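The four response shapes listed above can be turned into a decision with a small function. This is a sketch under assumptions: `interpretApproval` and `ApprovalOutcome` are hypothetical names, and denying on any malformed or unknown body is a fail-closed choice made for the example.

```typescript
type ApprovalOutcome =
  | { action: "allow"; leaseSec: number }
  | { action: "deny"; reason: string };

// Maps an approval endpoint's JSON body to allow/deny.
// leaseSec is 0 for a one-off approval and ttlSec for a temporary one.
function interpretApproval(body: unknown): ApprovalOutcome {
  if (typeof body !== "object" || body === null) {
    return { action: "deny", reason: "malformed approval response" };
  }
  const r = body as Record<string, unknown>;
  switch (r.status) {
    case "approved":
      return { action: "allow", leaseSec: 0 };
    case "approved_temporary":
      return typeof r.ttlSec === "number" && r.ttlSec > 0
        ? { action: "allow", leaseSec: r.ttlSec }
        : { action: "deny", reason: "approved_temporary without a valid ttlSec" };
    case "denied":
      return { action: "deny", reason: "denied by approver" };
    case "error":
      return { action: "deny", reason: typeof r.reason === "string" ? r.reason : "approval error" };
    default:
      return { action: "deny", reason: "unknown approval status" };
  }
}

console.log(interpretApproval({ status: "approved_temporary", ttlSec: 1800 }));
```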

Pending workflow is also supported for bridge architectures:

  • initial POST may return {"status":"pending","pollUrl":"https://.../status/<id>","retryAfterMs":500}
  • RADIUS polls pollUrl until final status or timeout

approval:
  channels:
    telegram:
      enabled: true
      transport: polling
      botToken: ${TELEGRAM_BOT_TOKEN}
      allowedChatIds: []
      approverUserIds: []
    http:
      enabled: false
      url: http://127.0.0.1:3101/approvals/resolve
      timeoutMs: 10000
  store:
    engine: sqlite
    path: ./.radius/state.db
    required: true

moduleConfig:
  approval_gate:
    autoRouting:
      defaultChannel: telegram
      frameworkDefaults:
        openclaw: telegram
        generic: http
    rules:
      - tool: "Bash"
        channel: auto
        prompt: 'Approve execution of "Bash"?'
        timeoutSec: 90
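The pending workflow described earlier in this section (initial `pending` response, then polling `pollUrl`) can be sketched as a loop. Everything here is illustrative: `waitForFinalStatus`, the `Fetcher` type, the 500 ms fallback delay, and the timeout handling are assumptions, and the HTTP call is injected so the sketch stays independent of any particular client.

```typescript
interface ApprovalStatus {
  status: string;
  pollUrl?: string;
  retryAfterMs?: number;
  reason?: string;
}

type Fetcher = (url: string) => Promise<ApprovalStatus>;
type Sleeper = (ms: number) => Promise<void>;

const realSleep: Sleeper = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Polls until the status is no longer "pending" or the deadline would be exceeded.
async function waitForFinalStatus(
  first: ApprovalStatus,
  fetchStatus: Fetcher,
  timeoutMs: number,
  sleep: Sleeper = realSleep,
): Promise<ApprovalStatus> {
  const deadline = Date.now() + timeoutMs;
  let current = first;
  while (current.status === "pending") {
    const url = current.pollUrl;
    if (!url) return { status: "error", reason: "pending response without pollUrl" };
    const waitMs = current.retryAfterMs ?? 500; // fallback delay is an assumption
    if (Date.now() + waitMs > deadline) return { status: "error", reason: "approval timed out" };
    await sleep(waitMs);
    const next = await fetchStatus(url);
    // Keep polling the same URL if a follow-up pending response omits it.
    current = next.status === "pending" && !next.pollUrl ? { ...next, pollUrl: url } : next;
  }
  return current;
}
```

A caller would pass a fetcher that performs the GET against the bridge service and then treat a returned `error` status as a denial.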

Allow 30m only bypasses repeated approval prompts. All other modules still enforce normally.

Single-bot topology note (Telegram):

  • If your orchestrator already consumes Telegram updates for the same bot token, avoid running two polling consumers.
  • For one-bot setups, prefer approval.channel=http and bridge approvals through your existing bot service.

OpenClaw subprocess compatibility

OpenClaw hooks run as subprocesses, so in-memory state resets on every tool call. Anything that needs to persist across calls requires SQLite:

approval:
  store:
    engine: sqlite
    path: ./.radius/state.db
    required: false # set true when node:sqlite is available (Node 22+)

moduleConfig:
  rate_budget:
    store:
      engine: sqlite
      path: ./.radius/state.db
      required: false # set true when node:sqlite is available (Node 22+)

| Module | Subprocess mode | Note |
| --- | --- | --- |
| kill_switch, tool_policy, fs_guard, command_guard, audit | Works | Stateless or file/env based |
| self_defense, tripwire_guard | Works | File-system tripwire and immutable checks are subprocess-safe |
| approval_gate + Allow 30m | Works | SQLite lease store persists across processes |
| rate_budget | Works | SQLite store keeps counters across processes |
| repetition_guard | Works | Use SQLite store for cross-process streak tracking |
| output_dlp | Partial | Requires PostToolUse hook wiring |
| egress_guard | Works | Preflight policy; kernel egress needs OS firewall |
| exec_sandbox | Platform dependent | Linux bwrap; non-Linux needs equivalent |
| skill_scanner | Not triggered by PreToolUse | Run via npx agentradius scan or CI |
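To show why the store matters in subprocess mode, here is a sketch of a fixed-window call counter written against a small store interface. The window algorithm, `CounterStore`, `MemoryStore`, and `consumeCall` are all assumptions for illustration; a cross-process setup would swap `MemoryStore` for something backed by the SQLite file, since an in-memory map starts empty in every new hook process.

```typescript
interface WindowState {
  windowStartMs: number;
  count: number;
}

// Anything that can persist a counter per key; SQLite would implement this in practice.
interface CounterStore {
  load(key: string): WindowState | undefined;
  save(key: string, state: WindowState): void;
}

class MemoryStore implements CounterStore {
  private readonly data = new Map<string, WindowState>();
  load(key: string): WindowState | undefined {
    return this.data.get(key);
  }
  save(key: string, state: WindowState): void {
    this.data.set(key, state);
  }
}

// Fixed-window budget: at most maxCallsPerWindow calls per windowSec seconds.
function consumeCall(
  store: CounterStore,
  key: string,
  nowMs: number,
  windowSec = 60,
  maxCallsPerWindow = 60,
): "allow" | "deny" {
  const prev = store.load(key);
  const state: WindowState =
    prev !== undefined && nowMs - prev.windowStartMs < windowSec * 1000
      ? prev
      : { windowStartMs: nowMs, count: 0 };
  if (state.count >= maxCallsPerWindow) return "deny";
  store.save(key, { windowStartMs: state.windowStartMs, count: state.count + 1 });
  return "allow";
}
```

With `MemoryStore`, every subprocess would see `count: 0` and the budget would never trip, which is the failure mode the SQLite store avoids.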

Custom adapter

For Claude Code-based orchestrators with custom runtime/protocol, see examples/claude-custom-adapter-runner.mjs.

The runner maps the Claude hook payload to a canonical GuardEvent, runs the pipeline, and maps the result back to Claude response JSON.

echo '{"hook_event_name":"PreToolUse","tool_name":"Bash","tool_input":{"command":"sudo id"}}' \
  | node ./examples/claude-custom-adapter-runner.mjs --config ./radius.yaml
# {"decision":"block","reason":"command_guard: denied by pattern ..."}
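The two mapping steps can be sketched as a pair of pure functions. This is not the code in examples/claude-custom-adapter-runner.mjs: the field names on `GuardEvent` follow the library example earlier in this README, while the phase strings, the `session_id` fallback, the `approve` response for allowed calls, and treating every non-allow verdict as `block` are assumptions.

```typescript
interface ClaudeHookPayload {
  hook_event_name: string;
  tool_name: string;
  tool_input: Record<string, unknown>;
  session_id?: string;
}

interface GuardEvent {
  phase: string;
  framework: string;
  sessionId: string;
  toolCall: { name: string; arguments: Record<string, unknown> };
  metadata: Record<string, unknown>;
}

interface GuardResult {
  finalAction: string;
  reason?: string;
}

// Claude hook payload -> canonical event. Phase strings are placeholders.
function toGuardEvent(payload: ClaudeHookPayload): GuardEvent {
  return {
    phase: payload.hook_event_name === "PreToolUse" ? "pre_tool" : "post_tool",
    framework: "generic",
    sessionId: payload.session_id ?? "unknown-session",
    toolCall: { name: payload.tool_name, arguments: payload.tool_input },
    metadata: { hookEvent: payload.hook_event_name },
  };
}

// Pipeline result -> Claude response JSON. Anything other than allow blocks.
function toClaudeResponse(result: GuardResult): { decision: "approve" | "block"; reason?: string } {
  if (result.finalAction === "allow") return { decision: "approve" };
  return { decision: "block", reason: result.reason ?? "blocked by policy" };
}

const event = toGuardEvent({
  hook_event_name: "PreToolUse",
  tool_name: "Bash",
  tool_input: { command: "sudo id" },
});
console.log(event.toolCall.name); // Bash
```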

Threat coverage

Covered

| Attack | What stops it |
| --- | --- |
| Credential theft (cat ~/.ssh/id_rsa) | fs_guard |
| System file access (/etc/shadow) | fs_guard |
| Privilege escalation (sudo ...) | command_guard |
| Destructive shell (rm -rf /) | command_guard |
| Secret leakage in output (AKIA..., ghp_...) | output_dlp |
| Runaway loops (500 calls/min) | rate_budget |
| Emergency freeze | kill_switch |
| Skill supply chain (hidden instructions) | skill_scanner |
| Unsigned skill installs | skill_scanner provenance policy |
| Dotenv harvest (.env reads) | fs_guard + command_guard |
| Network exfiltration | egress_guard |
| Sandbox escape | exec_sandbox (bwrap) |
| Unapproved tool use | tool_policy |

Not covered (v0.4)

  • Prompt injection at model level. Jailbreaks that produce harmful text without tool calls. RADIUS only sees tool calls and outputs, not the model's reasoning.
  • Semantic attacks via allowed tools. Reading an allowed file, then sending its contents via an allowed API. Modules check independently; they don't reason about intent.
  • Token/cost budgets. Rate limiting counts calls, not tokens or dollars.
  • Multi-tenant isolation. One config per runtime, no user-level policy separation.
  • OS-level exploits. exec_sandbox uses bwrap, not a VM. A kernel exploit bypasses it.

Tests

92 tests across 10 suites. ~500ms.

npm test

CI regression

.github/workflows/security-regression.yml runs build, tests, and pentest on every push:

npx agentradius init --framework generic --profile standard --output /tmp/radius-ci.yaml
npx agentradius pentest --config /tmp/radius-ci.yaml

Built-in pentest

npx agentradius pentest

  [OK  ] fs_guard blocks /etc/passwd
  [OK  ] command_guard blocks sudo chain
  [OK  ] fs_guard blocks dotenv file reads
  [OK  ] output_dlp detects tool-output secret
  [OK  ] output_dlp detects response secret
  [OK  ] skill_scanner catches malicious skill
  [OK  ] skill_scanner catches tool metadata poisoning
  [OK  ] rate_budget blocks runaway loop
  [WARN] egress_guard blocks outbound exfiltration
  [OK  ] adapters handle malformed payloads

Audit metrics

npx agentradius audit --json

Reports intervention rate, detection latency, kill-switch activations, sandbox coverage, provenance coverage, and dotenv exposure posture.

How it works

Orchestrator event
  -> Adapter (converts to canonical format)
    -> Pipeline (modules run in config order)
      -> first DENY or CHALLENGE wins, patches compose, alerts accumulate
    -> Adapter (converts back to orchestrator format)
  -> Response

Modules run in config order. If any module returns DENY or CHALLENGE, the pipeline stops. MODIFY patches are deep-merged. If an enforce-mode module throws, it fails closed (denies). Observe-mode errors log and continue.
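The rules in the paragraph above can be sketched as a short loop. This is an illustration, not the RADIUS source: `GuardModule`, `Verdict`, and `runPipeline` are hypothetical names, and the sketch assumes later modules see arguments already patched by earlier ones.

```typescript
type Verdict =
  | { action: "allow" }
  | { action: "deny" | "challenge"; reason: string }
  | { action: "modify"; patch: Record<string, unknown> }
  | { action: "alert"; message: string };

interface GuardModule {
  name: string;
  mode: "enforce" | "observe";
  evaluate(args: Record<string, unknown>): Verdict;
}

interface PipelineResult {
  finalAction: "allow" | "deny" | "challenge";
  reason?: string;
  arguments: Record<string, unknown>;
  alerts: string[];
}

function isPlainObject(v: unknown): v is Record<string, unknown> {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Recursively merges patch into base; non-object values in patch overwrite.
function deepMerge(base: Record<string, unknown>, patch: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = { ...base };
  for (const [key, value] of Object.entries(patch)) {
    const existing = out[key];
    out[key] = isPlainObject(existing) && isPlainObject(value) ? deepMerge(existing, value) : value;
  }
  return out;
}

function runPipeline(modules: GuardModule[], args: Record<string, unknown>): PipelineResult {
  let current = args;
  const alerts: string[] = [];
  for (const mod of modules) {
    let verdict: Verdict;
    try {
      verdict = mod.evaluate(current);
    } catch {
      // Enforce mode fails closed; observe mode logs and continues.
      if (mod.mode === "enforce") {
        return { finalAction: "deny", reason: `${mod.name}: module error`, arguments: current, alerts };
      }
      alerts.push(`${mod.name}: error in observe mode`);
      continue;
    }
    if (verdict.action === "deny" || verdict.action === "challenge") {
      // First DENY or CHALLENGE stops the pipeline.
      return { finalAction: verdict.action, reason: `${mod.name}: ${verdict.reason}`, arguments: current, alerts };
    }
    if (verdict.action === "modify") current = deepMerge(current, verdict.patch);
    if (verdict.action === "alert") alerts.push(`${mod.name}: ${verdict.message}`);
  }
  return { finalAction: "allow", arguments: current, alerts };
}
```

Because the loop returns on the first blocking verdict, module order in radius.yaml decides which reason the caller sees.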

Requirements

  • Node.js >= 20
  • Node.js 22+ for persistent state (node:sqlite for approval leases, rate budgets)
  • bwrap (optional, exec_sandbox on Linux)

Credits

Security philosophy and threat model based on research by Dima Matskevich.

License

MIT

About

AI agents run shell commands on your machine. One hallucination = rm -rf or leaked ~/.ssh. RADIUS intercepts every tool call and runs it through regex, path checks, and rate limits before it executes. Code, not prompts.
