From 7095420694a49cb2deb1e53ee2ddd17812ab25db Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:23:13 +0530 Subject: [PATCH 01/23] Add skill-doctor: diagnose silent skill discovery failures MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Runs 6 diagnostic checks per skill (YAML parse, required fields, path conventions, cron format, stateful coherence, schema validity). Exits 1 when FAILs are present — suitable as a post-install gate in install.sh. Co-Authored-By: Claude Sonnet 4.6 --- skills/openclaw-native/skill-doctor/SKILL.md | 97 +++++ .../skill-doctor/STATE_SCHEMA.yaml | 33 ++ skills/openclaw-native/skill-doctor/doctor.py | 374 ++++++++++++++++++ .../skill-doctor/example-state.yaml | 57 +++ 4 files changed, 561 insertions(+) create mode 100644 skills/openclaw-native/skill-doctor/SKILL.md create mode 100644 skills/openclaw-native/skill-doctor/STATE_SCHEMA.yaml create mode 100755 skills/openclaw-native/skill-doctor/doctor.py create mode 100644 skills/openclaw-native/skill-doctor/example-state.yaml diff --git a/skills/openclaw-native/skill-doctor/SKILL.md b/skills/openclaw-native/skill-doctor/SKILL.md new file mode 100644 index 0000000..c269529 --- /dev/null +++ b/skills/openclaw-native/skill-doctor/SKILL.md @@ -0,0 +1,97 @@ +--- +name: skill-doctor +version: "1.0" +category: openclaw-native +description: Diagnoses silent skill discovery failures — YAML parse errors, path violations, schema mismatches — so broken skills don't disappear without a trace. +stateful: true +--- + +# Skill Doctor + +## What it does + +OpenClaw loads skills at startup. When a skill fails to load — corrupt frontmatter, bad cron expression, mismatched STATE_SCHEMA — it silently disappears from the registry. There is no error surfaced to the agent. + +Skill Doctor runs a full diagnostic pass over all installed skills and reports every failure that would cause silent non-loading, so you can fix problems before they become invisible gaps. + +## When to invoke + +- After installing new skills or upgrading openclaw-superpowers +- When a skill you expect to find is missing from the registry +- As a post-install gate inside `install.sh` +- Manually, any time something feels off with skill behaviour + +## Diagnostic checks + +Skill Doctor runs 6 checks per skill: + +| Check | Failure condition | +|---|---| +| YAML parse | Frontmatter cannot be parsed by a YAML parser | +| Required fields | `name` or `description` absent from frontmatter | +| Path conventions | Skill directory name does not match `name:` field | +| Cron format | `cron:` present but not a valid 5-field cron expression | +| Stateful coherence | `stateful: true` but `STATE_SCHEMA.yaml` missing | +| Schema validity | `STATE_SCHEMA.yaml` present but missing `version:` or `fields:` | + +## Output levels + +- **PASS** — skill will load correctly +- **WARN** — skill loads but has a non-critical issue (e.g. schema present but `stateful:` missing) +- **FAIL** — skill will not load; must fix before use + +## How to use + +``` +python3 doctor.py --scan # Full diagnostic pass +python3 doctor.py --scan --only-failures # Show FAILs only +python3 doctor.py --scan --skill cron-hygiene # Single skill +python3 doctor.py --fix-hint cron-hygiene # Print actionable fix suggestion +python3 doctor.py --status # Summary of last scan +python3 doctor.py --format json # Machine-readable output +``` + +## Procedure + +**Step 1 — Run the scan** + +``` +python3 doctor.py --scan +``` + +Review the output. Each skill gets a one-line verdict: PASS / WARN / FAIL. + +**Step 2 — Triage FAILs first** + +For each FAIL, run `--fix-hint ` to get an actionable repair suggestion. Skill Doctor never modifies skill files itself — it tells you exactly what to change. + +**Step 3 — Review WARNs** + +WARNs do not block loading but indicate drift from conventions. Common WARN: `STATE_SCHEMA.yaml` exists without `stateful: true` in frontmatter. Fix by adding the frontmatter field. + +**Step 4 — Re-scan to confirm** + +After applying fixes, re-run `--scan` and verify no FAILs remain. + +**Step 5 — Write scan result to state** + +After a clean pass, the scan summary is automatically written to state. Use `--status` in future sessions to surface the last known health without re-scanning. + +## State + +Skill Doctor persists scan results in `~/.openclaw/skill-state/skill-doctor/state.yaml`. + +Fields: `last_scan_at`, `skills_scanned`, `fail_count`, `warn_count`, `violations` list. + +After a clean install, `fail_count` and `warn_count` should both be 0. + +## Integration + +Add to the end of `install.sh`: + +```bash +echo "Running Skill Doctor post-install check..." +python3 ~/.openclaw/extensions/superpowers/skills/openclaw-native/skill-doctor/doctor.py --scan --only-failures +``` + +This surfaces any broken skills immediately after install rather than letting them silently disappear. diff --git a/skills/openclaw-native/skill-doctor/STATE_SCHEMA.yaml b/skills/openclaw-native/skill-doctor/STATE_SCHEMA.yaml new file mode 100644 index 0000000..fb3701a --- /dev/null +++ b/skills/openclaw-native/skill-doctor/STATE_SCHEMA.yaml @@ -0,0 +1,33 @@ +version: "1.0" +description: Diagnostic scan results and per-skill health ledger. +fields: + last_scan_at: + type: datetime + skills_scanned: + type: integer + default: 0 + fail_count: + type: integer + default: 0 + warn_count: + type: integer + default: 0 + violations: + type: list + description: All FAILs and WARNs from the most recent scan + items: + skill_name: { type: string } + level: { type: enum, values: [FAIL, WARN] } + check: { type: string } + message: { type: string } + fix_hint: { type: string } + detected_at: { type: datetime } + resolved: { type: boolean } + scan_history: + type: list + description: Rolling log of past scans (last 10) + items: + scanned_at: { type: datetime } + skills_scanned: { type: integer } + fail_count: { type: integer } + warn_count: { type: integer } diff --git a/skills/openclaw-native/skill-doctor/doctor.py b/skills/openclaw-native/skill-doctor/doctor.py new file mode 100755 index 0000000..ce430f7 --- /dev/null +++ b/skills/openclaw-native/skill-doctor/doctor.py @@ -0,0 +1,374 @@ +#!/usr/bin/env python3 +""" +Skill Doctor for openclaw-superpowers. + +Diagnoses silent skill discovery failures: YAML parse errors, path +violations, schema mismatches, cron format problems. Reports every +issue that would cause a skill to silently disappear from the registry. + +Usage: + python3 doctor.py --scan # Full diagnostic pass + python3 doctor.py --scan --only-failures # FAILs only + python3 doctor.py --scan --skill cron-hygiene # Single skill + python3 doctor.py --fix-hint cron-hygiene # Actionable fix hint + python3 doctor.py --status # Summary of last scan + python3 doctor.py --format json # Machine-readable output +""" + +import argparse +import json +import os +import re +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "skill-doctor" / "state.yaml" +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +SKILLS_DIRS = [ + SUPERPOWERS_DIR / "skills" / "core", + SUPERPOWERS_DIR / "skills" / "openclaw-native", + SUPERPOWERS_DIR / "skills" / "community", +] +CRON_RE = re.compile( + r'^(\*|[0-9,\-\/]+)\s+' + r'(\*|[0-9,\-\/]+)\s+' + r'(\*|[0-9,\-\/]+)\s+' + r'(\*|[0-9,\-\/]+)\s+' + r'(\*|[0-9,\-\/]+)$' +) +MAX_HISTORY = 10 + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"violations": [], "scan_history": [], "skills_scanned": 0, + "fail_count": 0, "warn_count": 0} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Frontmatter parser ──────────────────────────────────────────────────────── + +def parse_frontmatter(skill_md: Path) -> tuple[dict, str]: + """ + Returns (fields_dict, error_message). + error_message is empty string on success. + """ + try: + text = skill_md.read_text() + except Exception as e: + return {}, f"Cannot read file: {e}" + + lines = text.splitlines() + if not lines or lines[0].strip() != "---": + return {}, "No frontmatter block found (file must start with ---)" + + end = None + for i, line in enumerate(lines[1:], start=1): + if line.strip() == "---": + end = i + break + + if end is None: + return {}, "Frontmatter block not closed (missing closing ---)" + + fm_text = "\n".join(lines[1:end]) + if not HAS_YAML: + # Minimal key:value parser + fields = {} + for line in fm_text.splitlines(): + if ":" in line: + k, _, v = line.partition(":") + fields[k.strip()] = v.strip().strip('"').strip("'") + return fields, "" + + try: + fields = yaml.safe_load(fm_text) or {} + return fields, "" + except Exception as e: + return {}, f"YAML parse error: {e}" + + +# ── Per-skill diagnostic ────────────────────────────────────────────────────── + +def diagnose_skill(skill_dir: Path) -> list[dict]: + """Returns list of violation dicts (empty = all clear).""" + violations = [] + skill_name = skill_dir.name + skill_md = skill_dir / "SKILL.md" + now = datetime.now().isoformat() + + def violation(level, check, message, fix_hint): + return { + "skill_name": skill_name, + "level": level, + "check": check, + "message": message, + "fix_hint": fix_hint, + "detected_at": now, + "resolved": False, + } + + if not skill_md.exists(): + violations.append(violation( + "FAIL", "SKILL_MD_MISSING", + f"No SKILL.md in {skill_dir}", + "Create SKILL.md with ---frontmatter--- block." + )) + return violations + + fm, parse_err = parse_frontmatter(skill_md) + + # Check 1: YAML parse + if parse_err: + violations.append(violation( + "FAIL", "YAML_PARSE", + f"Frontmatter unparseable: {parse_err}", + "Fix YAML syntax in the --- block at the top of SKILL.md." + )) + return violations # Can't continue without parseable frontmatter + + # Check 2: Required fields + for field in ("name", "description"): + if not fm.get(field): + violations.append(violation( + "FAIL", "REQUIRED_FIELD", + f"Missing required frontmatter field: `{field}`", + f"Add `{field}: ` to the frontmatter block." + )) + + # Check 3: Path convention + fm_name = fm.get("name", "") + if fm_name and fm_name != skill_name: + violations.append(violation( + "FAIL", "PATH_MISMATCH", + f"Directory name `{skill_name}` does not match `name: {fm_name}`", + f"Rename directory to `{fm_name}` or update `name:` in frontmatter." + )) + + # Check 4: Cron format + cron_val = fm.get("cron", "") + if cron_val: + cron_str = str(cron_val).strip() + if not CRON_RE.match(cron_str): + violations.append(violation( + "FAIL", "CRON_FORMAT", + f"Invalid cron expression: `{cron_str}`", + "Use a valid 5-field cron: `minute hour day month weekday` (e.g. `0 9 * * 1-5`)." + )) + + # Check 5: Stateful coherence + schema_file = skill_dir / "STATE_SCHEMA.yaml" + is_stateful = str(fm.get("stateful", "")).lower() == "true" + + if is_stateful and not schema_file.exists(): + violations.append(violation( + "FAIL", "STATEFUL_NO_SCHEMA", + "`stateful: true` in frontmatter but STATE_SCHEMA.yaml is missing", + "Create STATE_SCHEMA.yaml with `version:` and `fields:` keys." + )) + elif schema_file.exists() and not is_stateful: + violations.append(violation( + "WARN", "SCHEMA_NO_STATEFUL", + "STATE_SCHEMA.yaml exists but `stateful: true` is absent from frontmatter", + "Add `stateful: true` to the frontmatter block." + )) + + # Check 6: Schema validity + if schema_file.exists() and HAS_YAML: + try: + schema = yaml.safe_load(schema_file.read_text()) or {} + if "version" not in schema: + violations.append(violation( + "WARN", "SCHEMA_NO_VERSION", + "STATE_SCHEMA.yaml missing `version:` key", + "Add `version: \"1.0\"` to STATE_SCHEMA.yaml." + )) + if "fields" not in schema: + violations.append(violation( + "WARN", "SCHEMA_NO_FIELDS", + "STATE_SCHEMA.yaml missing `fields:` key", + "Add a `fields:` block defining your state shape." + )) + except Exception as e: + violations.append(violation( + "FAIL", "SCHEMA_PARSE", + f"STATE_SCHEMA.yaml unparseable: {e}", + "Fix YAML syntax in STATE_SCHEMA.yaml." + )) + + return violations + + +# ── Scan ────────────────────────────────────────────────────────────────────── + +def scan(only_failures=False, single_skill=None) -> tuple[list, int, int, int]: + """Returns (violations, skills_scanned, fail_count, warn_count).""" + all_violations = [] + skills_scanned = 0 + + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for skill_dir in sorted(skills_root.iterdir()): + if not skill_dir.is_dir(): + continue + if single_skill and skill_dir.name != single_skill: + continue + viols = diagnose_skill(skill_dir) + skills_scanned += 1 + if only_failures: + viols = [v for v in viols if v["level"] == "FAIL"] + all_violations.extend(viols) + + fail_count = sum(1 for v in all_violations if v["level"] == "FAIL") + warn_count = sum(1 for v in all_violations if v["level"] == "WARN") + return all_violations, skills_scanned, fail_count, warn_count + + +# ── Fix hints ───────────────────────────────────────────────────────────────── + +def print_fix_hints(skill_name: str, state: dict) -> None: + violations = [v for v in (state.get("violations") or []) + if v.get("skill_name") == skill_name] + if not violations: + print(f"No recorded violations for '{skill_name}'. Run --scan first.") + return + print(f"\nFix hints for: {skill_name}") + print("─" * 40) + for v in violations: + print(f" [{v['level']}] {v['check']}") + print(f" Problem : {v['message']}") + print(f" Fix : {v['fix_hint']}") + print() + + +# ── Output formatting ───────────────────────────────────────────────────────── + +def print_report(violations: list, skills_scanned: int, + fail_count: int, warn_count: int, fmt: str = "text") -> None: + if fmt == "json": + print(json.dumps({ + "skills_scanned": skills_scanned, + "fail_count": fail_count, + "warn_count": warn_count, + "violations": violations, + }, indent=2)) + return + + print(f"\nSkill Doctor Report — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("─" * 48) + print(f" Skills scanned : {skills_scanned}") + print(f" FAILs : {fail_count}") + print(f" WARNs : {warn_count}") + print() + + if not violations: + print(" ✓ All skills healthy — no issues detected.") + else: + # Group by skill + by_skill: dict = {} + for v in violations: + by_skill.setdefault(v["skill_name"], []).append(v) + for skill_name, viols in sorted(by_skill.items()): + for v in viols: + icon = "✗" if v["level"] == "FAIL" else "⚠" + print(f" {icon} [{v['level']:4s}] {skill_name}: {v['check']}") + print(f" {v['message']}") + print() + + +# ── Status ──────────────────────────────────────────────────────────────────── + +def print_status(state: dict) -> None: + last = state.get("last_scan_at", "never") + scanned = state.get("skills_scanned", 0) + fails = state.get("fail_count", 0) + warns = state.get("warn_count", 0) + print(f"\nSkill Doctor — Last scan: {last}") + print(f" {scanned} skills | {fails} FAILs | {warns} WARNs") + active = [v for v in (state.get("violations") or []) if not v.get("resolved")] + if active: + print(f"\n Active issues ({len(active)}):") + for v in active: + print(f" [{v['level']}] {v['skill_name']}: {v['check']}") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Skill Doctor — diagnose skill loading failures") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", action="store_true", help="Run full diagnostic scan") + group.add_argument("--fix-hint", metavar="SKILL", help="Print fix hint for a skill") + group.add_argument("--status", action="store_true", help="Show last scan summary") + parser.add_argument("--only-failures", action="store_true", + help="With --scan, show FAILs only") + parser.add_argument("--skill", metavar="SKILL", + help="With --scan, scan a single skill only") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + + if args.status: + print_status(state) + return + + if args.fix_hint: + print_fix_hints(args.fix_hint, state) + return + + if args.scan: + violations, scanned, fails, warns = scan( + only_failures=args.only_failures, + single_skill=args.skill, + ) + print_report(violations, scanned, fails, warns, fmt=args.format) + + # Persist + now = datetime.now().isoformat() + history = state.get("scan_history") or [] + history.insert(0, { + "scanned_at": now, + "skills_scanned": scanned, + "fail_count": fails, + "warn_count": warns, + }) + state["scan_history"] = history[:MAX_HISTORY] + state["last_scan_at"] = now + state["skills_scanned"] = scanned + state["fail_count"] = fails + state["warn_count"] = warns + state["violations"] = violations + save_state(state) + + sys.exit(1 if fails > 0 else 0) + + +if __name__ == "__main__": + main() diff --git a/skills/openclaw-native/skill-doctor/example-state.yaml b/skills/openclaw-native/skill-doctor/example-state.yaml new file mode 100644 index 0000000..19bbe59 --- /dev/null +++ b/skills/openclaw-native/skill-doctor/example-state.yaml @@ -0,0 +1,57 @@ +# Example runtime state for skill-doctor +last_scan_at: "2026-03-15T09:12:44.003000" +skills_scanned: 32 +fail_count: 2 +warn_count: 1 +violations: + - skill_name: my-custom-skill + level: FAIL + check: YAML_PARSE + message: "Frontmatter unparseable: mapping values are not allowed here" + fix_hint: "Fix YAML syntax in the --- block at the top of SKILL.md." + detected_at: "2026-03-15T09:12:44.000000" + resolved: false + - skill_name: expense-tracker + level: FAIL + check: STATEFUL_NO_SCHEMA + message: "`stateful: true` in frontmatter but STATE_SCHEMA.yaml is missing" + fix_hint: "Create STATE_SCHEMA.yaml with `version:` and `fields:` keys." + detected_at: "2026-03-15T09:12:44.000000" + resolved: false + - skill_name: obsidian-sync + level: WARN + check: SCHEMA_NO_STATEFUL + message: "STATE_SCHEMA.yaml exists but `stateful: true` is absent from frontmatter" + fix_hint: "Add `stateful: true` to the frontmatter block." + detected_at: "2026-03-15T09:12:44.000000" + resolved: false +scan_history: + - scanned_at: "2026-03-15T09:12:44.000000" + skills_scanned: 32 + fail_count: 2 + warn_count: 1 + - scanned_at: "2026-03-14T08:00:11.000000" + skills_scanned: 31 + fail_count: 0 + warn_count: 0 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Run: python3 doctor.py --scan +# Skill Doctor Report — 2026-03-15 09:12 +# ──────────────────────────────────────────────────────────────── +# Skills scanned : 32 +# FAILs : 2 +# WARNs : 1 +# +# ✗ [FAIL] my-custom-skill: YAML_PARSE +# Frontmatter unparseable: mapping values are not allowed here +# ✗ [FAIL] expense-tracker: STATEFUL_NO_SCHEMA +# `stateful: true` in frontmatter but STATE_SCHEMA.yaml is missing +# ⚠ [WARN] obsidian-sync: SCHEMA_NO_STATEFUL +# STATE_SCHEMA.yaml exists but `stateful: true` is absent +# +# Run: python3 doctor.py --fix-hint expense-tracker +# Fix hints for: expense-tracker +# ──────────────────────────────────────────────────────────────── +# [FAIL] STATEFUL_NO_SCHEMA +# Problem : `stateful: true` in frontmatter but STATE_SCHEMA.yaml is missing +# Fix : Create STATE_SCHEMA.yaml with `version:` and `fields:` keys. From e9b3cc58b56f447bf8bec4d244302d327e225a63 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:30:49 +0530 Subject: [PATCH 02/23] Add installed-skill-auditor: weekly post-install security audit Detects INJECTION, CREDENTIAL, EXFILTRATION, DRIFT, and ORPHAN issues in all installed skills. Maintains content baselines for drift detection. Cron: Mondays 9am. Exits 1 on CRITICAL findings. Co-Authored-By: Claude Sonnet 4.6 --- .../installed-skill-auditor/SKILL.md | 90 +++++ .../installed-skill-auditor/STATE_SCHEMA.yaml | 28 ++ .../installed-skill-auditor/audit.py | 374 ++++++++++++++++++ .../example-state.yaml | 56 +++ 4 files changed, 548 insertions(+) create mode 100644 skills/openclaw-native/installed-skill-auditor/SKILL.md create mode 100644 skills/openclaw-native/installed-skill-auditor/STATE_SCHEMA.yaml create mode 100755 skills/openclaw-native/installed-skill-auditor/audit.py create mode 100644 skills/openclaw-native/installed-skill-auditor/example-state.yaml diff --git a/skills/openclaw-native/installed-skill-auditor/SKILL.md b/skills/openclaw-native/installed-skill-auditor/SKILL.md new file mode 100644 index 0000000..b2a4f54 --- /dev/null +++ b/skills/openclaw-native/installed-skill-auditor/SKILL.md @@ -0,0 +1,90 @@ +--- +name: installed-skill-auditor +version: "1.0" +category: openclaw-native +description: Weekly audit of all installed third-party and community skills for malicious patterns, stale credentials, and drift from last-known-good state. +stateful: true +cron: "0 9 * * 1" +--- + +# Installed Skill Auditor + +## What it does + +`skill-vetting` scans before install. `installed-skill-auditor` scans after — continuously. + +Skills can be modified after installation. A community skill that was safe on Monday can be compromised by Tuesday if the source repo is pushed to and your agent auto-pulls. This skill runs weekly to catch post-install drift: injected payloads, hardcoded credentials, and pattern changes that weren't there at install time. + +It maintains a content hash of every skill file at the time it was first audited. On each weekly run it re-hashes and flags anything that changed unexpectedly. + +## When to invoke + +- Automatically, every Monday at 9am (cron) +- Manually after any `git pull` that touches skill directories +- After any agent action that writes to the skills tree + +## Audit checks + +| Check | What it detects | +|---|---| +| INJECTION | Instruction-override patterns in SKILL.md prose | +| CREDENTIAL | Hardcoded tokens, API keys, or secrets in any file | +| EXFILTRATION | URLs + data-sending patterns suggesting exfil | +| DRIFT | File content changed since last known-good baseline | +| ORPHAN | Skill directory present but not in install manifest | + +Severity: CRITICAL (INJECTION, EXFILTRATION) · HIGH (CREDENTIAL) · MEDIUM (DRIFT, ORPHAN) + +## Output + +``` +Installed Skill Audit — 2026-03-16 +──────────────────────────────────────────── +32 skills audited | 0 CRITICAL | 1 HIGH | 2 MEDIUM + +HIGH community/my-custom-skill — CREDENTIAL + Hardcoded token pattern detected in run.py (line 14) + +MEDIUM community/expense-tracker — DRIFT + SKILL.md hash changed since 2026-03-10 baseline + Run: python3 audit.py --diff expense-tracker +``` + +## How to use + +``` +python3 audit.py --scan # Full audit pass +python3 audit.py --scan --critical-only # CRITICAL findings only +python3 audit.py --baseline # Record current state as trusted +python3 audit.py --diff # Show changed lines since baseline +python3 audit.py --resolve # Mark finding resolved after review +python3 audit.py --status # Summary of last run +python3 audit.py --format json # Machine-readable output +``` + +## Procedure + +**Step 1 — Review the report** + +The cron run generates a report automatically. Open it via `--status` or check state. Any CRITICAL finding requires immediate action. + +**Step 2 — Triage by severity** + +- **CRITICAL**: Do not run the skill. Inspect the file, remove or quarantine the skill. +- **HIGH**: Rotate the exposed credential immediately; investigate how it got there. +- **MEDIUM (DRIFT)**: Use `--diff` to see what changed. If the change is expected (you updated the skill), run `--baseline` to accept it. If unexpected, treat as CRITICAL. +- **MEDIUM (ORPHAN)**: A skill directory exists with no install record. Either re-install through the vetting process or remove the directory. + +**Step 3 — Resolve or escalate** + +Run `--resolve ` after reviewing a finding. This marks it acknowledged in state. Unresolved CRITICAL findings are surfaced again on next cron run. + +**Step 4 — Update baseline after intentional changes** + +When you intentionally update a skill (e.g., upgrading to a new version), run `--baseline` so future drift detection has an accurate reference point. + +## State + +Results and content hashes stored in `~/.openclaw/skill-state/installed-skill-auditor/state.yaml`. + +Fields: `last_audit_at`, `baselines` (hash map), `findings`, `audit_history`. diff --git a/skills/openclaw-native/installed-skill-auditor/STATE_SCHEMA.yaml b/skills/openclaw-native/installed-skill-auditor/STATE_SCHEMA.yaml new file mode 100644 index 0000000..d7f3d41 --- /dev/null +++ b/skills/openclaw-native/installed-skill-auditor/STATE_SCHEMA.yaml @@ -0,0 +1,28 @@ +version: "1.0" +description: Content baselines, weekly findings, and audit history for installed skill audits. +fields: + last_audit_at: + type: datetime + baselines: + type: object + description: Map of skill_name -> {file_path -> sha256_hex, recorded_at} + findings: + type: list + description: Active findings awaiting resolution + items: + skill_name: { type: string } + check: { type: enum, values: [INJECTION, CREDENTIAL, EXFILTRATION, DRIFT, ORPHAN] } + severity: { type: enum, values: [CRITICAL, HIGH, MEDIUM] } + file_path: { type: string } + detail: { type: string } + detected_at: { type: datetime } + resolved: { type: boolean } + audit_history: + type: list + description: Rolling summary of weekly audits (last 12) + items: + audited_at: { type: datetime } + skills_audited: { type: integer } + critical_count: { type: integer } + high_count: { type: integer } + medium_count: { type: integer } diff --git a/skills/openclaw-native/installed-skill-auditor/audit.py b/skills/openclaw-native/installed-skill-auditor/audit.py new file mode 100755 index 0000000..bcdbda3 --- /dev/null +++ b/skills/openclaw-native/installed-skill-auditor/audit.py @@ -0,0 +1,374 @@ +#!/usr/bin/env python3 +""" +Installed Skill Auditor for openclaw-superpowers. + +Weekly audit of all installed skills for malicious patterns, +credential leaks, and post-install content drift. + +Usage: + python3 audit.py --scan # Full audit + python3 audit.py --scan --critical-only # CRITICAL only + python3 audit.py --baseline # Record current state + python3 audit.py --diff # Show changes since baseline + python3 audit.py --resolve # Mark finding resolved + python3 audit.py --status # Last audit summary + python3 audit.py --format json # Machine-readable +""" + +import argparse +import hashlib +import json +import os +import re +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "installed-skill-auditor" / "state.yaml" +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +SKILLS_DIRS = [ + SUPERPOWERS_DIR / "skills" / "core", + SUPERPOWERS_DIR / "skills" / "openclaw-native", + SUPERPOWERS_DIR / "skills" / "community", +] +MAX_HISTORY = 12 + +# ── Detection patterns ──────────────────────────────────────────────────────── + +INJECTION_PATTERNS = [ + re.compile(r'ignore (?:all )?(?:previous|prior|above) instructions', re.I), + re.compile(r'you are now (?:a|an|in)', re.I), + re.compile(r'act as (?:a|an)', re.I), + re.compile(r'disregard (?:your|all) (?:rules|instructions|constraints)', re.I), + re.compile(r'new (?:system|developer|admin) (?:instructions?|prompt)', re.I), + re.compile(r'jailbreak', re.I), +] + +CREDENTIAL_PATTERNS = [ + re.compile(r'(?:api[_\-]?key|apikey)\s*[=:]\s*["\']?[A-Za-z0-9_\-]{16,}', re.I), + re.compile(r'(?:secret|token|password|passwd|pwd)\s*[=:]\s*["\']?[A-Za-z0-9_\-]{8,}', re.I), + re.compile(r'sk-[A-Za-z0-9]{20,}'), + re.compile(r'(?:ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9]{36}'), + re.compile(r'AKIA[0-9A-Z]{16}'), # AWS access key + re.compile(r'Bearer [A-Za-z0-9\-_\.]{20,}'), +] + +EXFILTRATION_PATTERNS = [ + re.compile(r'(?:requests?|urllib|curl|wget).*(?:post|put|send).*http', re.I | re.S), + re.compile(r'webhook\.site', re.I), + re.compile(r'requestbin', re.I), + re.compile(r'ngrok\.io', re.I), + re.compile(r'pastebin\.com', re.I), +] + +SEVERITY = { + "INJECTION": "CRITICAL", + "EXFILTRATION": "CRITICAL", + "CREDENTIAL": "HIGH", + "DRIFT": "MEDIUM", + "ORPHAN": "MEDIUM", +} + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"baselines": {}, "findings": [], "audit_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Hashing ─────────────────────────────────────────────────────────────────── + +def file_sha256(path: Path) -> str: + h = hashlib.sha256() + h.update(path.read_bytes()) + return h.hexdigest() + + +def skill_hashes(skill_dir: Path) -> dict: + hashes = {} + for f in sorted(skill_dir.rglob("*")): + if f.is_file(): + rel = str(f.relative_to(skill_dir)) + try: + hashes[rel] = file_sha256(f) + except Exception: + pass + return hashes + + +# ── Pattern scanning ────────────────────────────────────────────────────────── + +def scan_file_patterns(path: Path) -> list[dict]: + findings = [] + try: + text = path.read_text(errors="replace") + except Exception: + return findings + + for pattern in INJECTION_PATTERNS: + if pattern.search(text): + findings.append({"check": "INJECTION", "file": str(path), "detail": + f"Injection pattern matched: {pattern.pattern[:60]}"}) + + for pattern in CREDENTIAL_PATTERNS: + if pattern.search(text): + findings.append({"check": "CREDENTIAL", "file": str(path), "detail": + f"Credential pattern matched: {pattern.pattern[:60]}"}) + + for pattern in EXFILTRATION_PATTERNS: + if pattern.search(text): + findings.append({"check": "EXFILTRATION", "file": str(path), "detail": + f"Exfiltration pattern matched: {pattern.pattern[:60]}"}) + + return findings + + +# ── Audit core ──────────────────────────────────────────────────────────────── + +def audit_skill(skill_dir: Path, baselines: dict) -> list[dict]: + skill_name = skill_dir.name + now = datetime.now().isoformat() + findings = [] + + def finding(check, file_path, detail): + return { + "skill_name": skill_name, + "check": check, + "severity": SEVERITY[check], + "file_path": str(file_path), + "detail": detail, + "detected_at": now, + "resolved": False, + } + + # Pattern scans on all files + for f in sorted(skill_dir.rglob("*")): + if f.is_file() and f.suffix in (".md", ".py", ".sh", ".yaml", ".yml", ".txt"): + for hit in scan_file_patterns(f): + findings.append(finding(hit["check"], hit["file"], hit["detail"])) + + # Drift detection + bl = baselines.get(skill_name, {}).get("hashes", {}) + if bl: + current = skill_hashes(skill_dir) + for rel_path, old_hash in bl.items(): + new_hash = current.get(rel_path) + if new_hash is None: + findings.append(finding("DRIFT", skill_dir / rel_path, + f"File deleted since baseline: {rel_path}")) + elif new_hash != old_hash: + findings.append(finding("DRIFT", skill_dir / rel_path, + f"Content changed since baseline ({rel_path})")) + for rel_path in current: + if rel_path not in bl: + findings.append(finding("DRIFT", skill_dir / rel_path, + f"New file added since baseline: {rel_path}")) + + return findings + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_scan(state: dict, critical_only: bool, fmt: str) -> None: + baselines = state.get("baselines") or {} + all_findings = [] + skills_audited = 0 + + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for skill_dir in sorted(skills_root.iterdir()): + if not skill_dir.is_dir(): + continue + findings = audit_skill(skill_dir, baselines) + all_findings.extend(findings) + skills_audited += 1 + + if critical_only: + all_findings = [f for f in all_findings if f["severity"] == "CRITICAL"] + + critical = sum(1 for f in all_findings if f["severity"] == "CRITICAL") + high = sum(1 for f in all_findings if f["severity"] == "HIGH") + medium = sum(1 for f in all_findings if f["severity"] == "MEDIUM") + now = datetime.now().isoformat() + + if fmt == "json": + print(json.dumps({ + "audited_at": now, + "skills_audited": skills_audited, + "critical_count": critical, + "high_count": high, + "medium_count": medium, + "findings": all_findings, + }, indent=2)) + else: + print(f"\nInstalled Skill Audit — {datetime.now().strftime('%Y-%m-%d')}") + print("─" * 48) + print(f" {skills_audited} skills audited | " + f"{critical} CRITICAL | {high} HIGH | {medium} MEDIUM") + print() + if not all_findings: + print(" ✓ No issues detected.") + else: + by_skill: dict = {} + for f in all_findings: + by_skill.setdefault(f["skill_name"], []).append(f) + for sname, flist in sorted(by_skill.items()): + for f in flist: + print(f" {f['severity']:8s} {sname} — {f['check']}") + print(f" {f['detail']}") + print() + + history = state.get("audit_history") or [] + history.insert(0, { + "audited_at": now, + "skills_audited": skills_audited, + "critical_count": critical, + "high_count": high, + "medium_count": medium, + }) + state["audit_history"] = history[:MAX_HISTORY] + state["last_audit_at"] = now + state["findings"] = all_findings + save_state(state) + + sys.exit(1 if critical > 0 else 0) + + +def cmd_baseline(state: dict) -> None: + baselines = state.get("baselines") or {} + now = datetime.now().isoformat() + count = 0 + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for skill_dir in sorted(skills_root.iterdir()): + if not skill_dir.is_dir(): + continue + baselines[skill_dir.name] = { + "hashes": skill_hashes(skill_dir), + "recorded_at": now, + } + count += 1 + state["baselines"] = baselines + save_state(state) + print(f"✓ Baseline recorded for {count} skills at {now[:19]}") + + +def cmd_diff(state: dict, skill_name: str) -> None: + baselines = state.get("baselines") or {} + bl = baselines.get(skill_name, {}).get("hashes", {}) + if not bl: + print(f"No baseline recorded for '{skill_name}'. Run --baseline first.") + return + + skill_dir = None + for skills_root in SKILLS_DIRS: + candidate = skills_root / skill_name + if candidate.exists(): + skill_dir = candidate + break + + if skill_dir is None: + print(f"Skill '{skill_name}' not found in skills directories.") + return + + current = skill_hashes(skill_dir) + changed = [(p, "CHANGED") for p, h in bl.items() if current.get(p) != h] + added = [(p, "ADDED") for p in current if p not in bl] + deleted = [(p, "DELETED") for p in bl if p not in current] + all_diffs = changed + added + deleted + + if not all_diffs: + print(f"✓ {skill_name}: no drift detected from baseline.") + return + + print(f"\nDrift for: {skill_name}") + print("─" * 40) + for path, status in sorted(all_diffs): + print(f" {status:8s} {path}") + print() + + +def cmd_resolve(state: dict, skill_name: str) -> None: + findings = state.get("findings") or [] + count = 0 + for f in findings: + if f.get("skill_name") == skill_name and not f.get("resolved"): + f["resolved"] = True + count += 1 + save_state(state) + print(f"✓ Resolved {count} finding(s) for '{skill_name}'.") + + +def cmd_status(state: dict) -> None: + last = state.get("last_audit_at", "never") + print(f"\nInstalled Skill Auditor — Last run: {last}") + history = state.get("audit_history") or [] + if history: + latest = history[0] + print(f" {latest.get('skills_audited',0)} skills | " + f"{latest.get('critical_count',0)} CRITICAL | " + f"{latest.get('high_count',0)} HIGH | " + f"{latest.get('medium_count',0)} MEDIUM") + active = [f for f in (state.get("findings") or []) if not f.get("resolved")] + if active: + print(f"\n Unresolved findings ({len(active)}):") + for f in active[:5]: + print(f" [{f['severity']}] {f['skill_name']}: {f['check']}") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Installed Skill Auditor") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", action="store_true") + group.add_argument("--baseline", action="store_true") + group.add_argument("--diff", metavar="SKILL") + group.add_argument("--resolve", metavar="SKILL") + group.add_argument("--status", action="store_true") + parser.add_argument("--critical-only", action="store_true") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + + if args.status: + cmd_status(state) + elif args.baseline: + cmd_baseline(state) + elif args.diff: + cmd_diff(state, args.diff) + elif args.resolve: + cmd_resolve(state, args.resolve) + elif args.scan: + cmd_scan(state, critical_only=args.critical_only, fmt=args.format) + + +if __name__ == "__main__": + main() diff --git a/skills/openclaw-native/installed-skill-auditor/example-state.yaml b/skills/openclaw-native/installed-skill-auditor/example-state.yaml new file mode 100644 index 0000000..aeb3d8d --- /dev/null +++ b/skills/openclaw-native/installed-skill-auditor/example-state.yaml @@ -0,0 +1,56 @@ +# Example runtime state for installed-skill-auditor +last_audit_at: "2026-03-16T09:00:14.227000" +baselines: + obsidian-sync: + recorded_at: "2026-03-10T08:00:00.000000" + hashes: + SKILL.md: "a3f2c1e8d4b5f901..." + sync.py: "7b8e2d3a1f0c4e59..." + STATE_SCHEMA.yaml: "c9d0e1f2a3b4c5d6..." + my-custom-skill: + recorded_at: "2026-03-10T08:00:00.000000" + hashes: + SKILL.md: "1a2b3c4d5e6f7890..." + run.py: "9f8e7d6c5b4a3210..." +findings: + - skill_name: my-custom-skill + check: CREDENTIAL + severity: HIGH + file_path: "skills/community/my-custom-skill/run.py" + detail: "Credential pattern matched: (?:secret|token|password)" + detected_at: "2026-03-16T09:00:14.000000" + resolved: false + - skill_name: obsidian-sync + check: DRIFT + severity: MEDIUM + file_path: "skills/community/obsidian-sync/sync.py" + detail: "Content changed since baseline (sync.py)" + detected_at: "2026-03-16T09:00:14.000000" + resolved: false +audit_history: + - audited_at: "2026-03-16T09:00:14.000000" + skills_audited: 32 + critical_count: 0 + high_count: 1 + medium_count: 1 + - audited_at: "2026-03-09T09:00:00.000000" + skills_audited: 31 + critical_count: 0 + high_count: 0 + medium_count: 0 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Weekly cron (Monday 09:00) runs: +# python3 audit.py --scan +# +# Installed Skill Audit — 2026-03-16 +# ──────────────────────────────────────────────────────────────── +# 32 skills audited | 0 CRITICAL | 1 HIGH | 1 MEDIUM +# +# HIGH my-custom-skill — CREDENTIAL +# Credential pattern matched in run.py +# MEDIUM obsidian-sync — DRIFT +# Content changed since baseline (sync.py) +# +# Fix CREDENTIAL: Inspect run.py line 14, rotate the exposed token. +# Accept DRIFT: python3 audit.py --diff obsidian-sync +# (review changes, then) python3 audit.py --resolve obsidian-sync From fa28ed004f392267e43b2611a888341e02f11ca0 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:32:33 +0530 Subject: [PATCH 03/23] Add skill-trigger-tester: validate description trigger quality before publish MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Scores a skill's description against should-fire/should-not-fire prompt sets. Computes precision, recall, F1, and assigns a grade A–F. Exits 1 on grade C or lower, suitable as a pre-publish gate. Co-Authored-By: Claude Sonnet 4.6 --- skills/core/skill-trigger-tester/SKILL.md | 126 +++++++++++ skills/core/skill-trigger-tester/test.py | 250 ++++++++++++++++++++++ 2 files changed, 376 insertions(+) create mode 100644 skills/core/skill-trigger-tester/SKILL.md create mode 100755 skills/core/skill-trigger-tester/test.py diff --git a/skills/core/skill-trigger-tester/SKILL.md b/skills/core/skill-trigger-tester/SKILL.md new file mode 100644 index 0000000..c8f8698 --- /dev/null +++ b/skills/core/skill-trigger-tester/SKILL.md @@ -0,0 +1,126 @@ +--- +name: skill-trigger-tester +version: "1.0" +category: core +description: Scores a skill's description field against sample user prompts to predict whether OpenClaw will correctly trigger it — before you publish or install. +--- + +# Skill Trigger Tester + +## What it does + +OpenClaw maps user intent to skills by matching the user's message against each skill's `description:` field. A weak description means your skill silently never fires. A description that's too broad means it fires when it shouldn't. + +Skill Trigger Tester helps you validate the trigger quality of a skill's description before publishing. You give it: +- The description string you're testing +- A set of "should fire" prompts (true positives) +- A set of "should not fire" prompts (true negatives) + +It scores precision, recall, and gives an overall trigger quality grade (A–F) plus actionable suggestions. + +## When to invoke + +- Before publishing any new skill to ClawHub +- When a skill you expect to trigger isn't firing +- When a skill keeps firing on irrelevant prompts +- Inside `create-skill` workflow (Step 5: validation) + +## Scoring model + +The tool uses a keyword + semantic overlap heuristic against the description field: + +| Metric | Meaning | +|---|---| +| **Recall** | % of "should fire" prompts that would match | +| **Precision** | % of matches that are actually "should fire" | +| **F1** | Harmonic mean of recall and precision | + +Grade thresholds: + +| Grade | F1 | +|---|---| +| A | ≥ 0.85 | +| B | ≥ 0.70 | +| C | ≥ 0.55 | +| D | ≥ 0.40 | +| F | < 0.40 | + +## How to use + +```bash +python3 test.py --description "Diagnoses skill discovery failures" \ + --should-fire "why isn't my skill loading" \ + "my skill disappeared from the registry" \ + "check if my skills are healthy" \ + --should-not-fire "write a skill" \ + "install superpowers" \ + "review my code" + +python3 test.py --file skill-spec.yaml # Load test cases from YAML file +python3 test.py --format json # Machine-readable output +``` + +## Test spec file format + +```yaml +description: "Diagnoses skill discovery failures — YAML parse errors, path violations" +should_fire: + - "why isn't my skill loading" + - "my skill disappeared" + - "check skill health" +should_not_fire: + - "write a new skill" + - "install openclaw" +``` + +## Procedure + +**Step 1 — Write your test cases** + +For each skill you're testing, list 3–5 prompts that should trigger it and 3–5 that should not. Be honest about edge cases. + +**Step 2 — Run the scorer** + +```bash +python3 test.py --description "" \ + --should-fire "..." --should-not-fire "..." +``` + +**Step 3 — Interpret results** + +- Grade A/B: description is well-calibrated. Publish. +- Grade C: borderline — add more specific keywords to the description, or narrow the wording. +- Grade D/F: description is too vague or uses jargon the user won't say. Rewrite and retest. + +**Step 4 — Iterate** + +Try alternative descriptions and compare scores side-by-side using `--compare`. + +**Step 5 — Add test file to the skill directory** + +Commit the `trigger-tests.yaml` spec alongside the skill. Future contributors can run it to verify trigger quality hasn't regressed. + +## Common mistakes + +- **Too generic**: `"Helps with skills"` — will either never fire or fire on everything +- **Technical jargon**: `"Validates SKILL.md frontmatter schema coherence"` — users don't say this +- **Action + object only**: `"Creates skills"` — add when/why context +- **Missing synonyms**: If users might say "check" or "verify" or "test", the description needs to capture the semantic range + +## Output example + +``` +Skill Trigger Quality — skill-doctor +───────────────────────────────────────────── +Description: "Diagnoses silent skill discovery failures..." + +Should fire (5 prompts): 4 / 5 matched recall = 0.80 +Should not fire (5 prompts): 1 / 5 matched precision = 0.80 +F1 score: 0.80 Grade: B + +⚠ 1 false negative: "check if skills are healthy" + Suggestion: Add "healthy", "health", or "check" to the description. + +⚠ 1 false positive: "check my code" + Suggestion: Narrow description to avoid generic "check" overlap. +``` diff --git a/skills/core/skill-trigger-tester/test.py b/skills/core/skill-trigger-tester/test.py new file mode 100755 index 0000000..d37c4f7 --- /dev/null +++ b/skills/core/skill-trigger-tester/test.py @@ -0,0 +1,250 @@ +#!/usr/bin/env python3 +""" +Skill Trigger Tester for openclaw-superpowers. + +Scores a skill description's trigger quality against sample prompts. +Predicts whether OpenClaw will correctly fire the skill — before publish. + +Usage: + python3 test.py --description "Diagnoses skill failures" \\ + --should-fire "why isn't my skill loading" "skill disappeared" \\ + --should-not-fire "write a skill" "install openclaw" + + python3 test.py --file trigger-tests.yaml + python3 test.py --file spec.yaml --compare "Alternative description here" + python3 test.py --format json --file spec.yaml +""" + +import argparse +import json +import re +import sys +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +# ── Tokeniser ───────────────────────────────────────────────────────────────── + +_STOPWORDS = { + "a", "an", "the", "and", "or", "but", "in", "on", "at", "to", "for", + "of", "with", "by", "from", "up", "is", "are", "was", "were", "be", + "been", "being", "have", "has", "had", "do", "does", "did", "will", + "would", "could", "should", "may", "might", "shall", "can", "it", + "its", "this", "that", "these", "those", "my", "your", "our", "their", + "i", "you", "we", "they", "he", "she", "not", "no", "so", "if", +} + + +def tokenise(text: str) -> set[str]: + tokens = re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower()) + return {t for t in tokens if t not in _STOPWORDS and len(t) > 1} + + +# ── Synonyms ────────────────────────────────────────────────────────────────── + +_SYNONYMS: list[set[str]] = [ + {"check", "verify", "validate", "test", "inspect", "audit", "scan"}, + {"fix", "repair", "resolve", "remediate", "correct"}, + {"find", "detect", "discover", "locate", "identify"}, + {"create", "write", "build", "make", "generate", "add"}, + {"run", "execute", "launch", "start", "trigger", "invoke", "fire"}, + {"skill", "skills", "extension", "extensions", "plugin", "plugins"}, + {"install", "installed", "setup", "configure"}, + {"error", "errors", "failure", "failures", "broken", "issue", "issues", "problem"}, + {"load", "loading", "loads"}, + {"missing", "gone", "disappeared", "absent"}, + {"memory", "remember", "recall", "stored"}, + {"schedule", "scheduled", "cron", "recurring", "automatic"}, + {"cost", "spend", "budget", "expensive", "usage"}, +] + + +def expand_synonyms(tokens: set[str]) -> set[str]: + expanded = set(tokens) + for group in _SYNONYMS: + if tokens & group: + expanded |= group + return expanded + + +# ── Match scoring ───────────────────────────────────────────────────────────── + +def description_tokens(description: str) -> set[str]: + return expand_synonyms(tokenise(description)) + + +def prompt_matches(prompt: str, desc_tokens: set[str]) -> bool: + ptokens = expand_synonyms(tokenise(prompt)) + overlap = ptokens & desc_tokens + if not ptokens: + return False + score = len(overlap) / len(ptokens) + return score >= 0.25 # 25% token overlap threshold + + +# ── Grading ─────────────────────────────────────────────────────────────────── + +def grade(f1: float) -> str: + if f1 >= 0.85: + return "A" + if f1 >= 0.70: + return "B" + if f1 >= 0.55: + return "C" + if f1 >= 0.40: + return "D" + return "F" + + +# ── Analysis ────────────────────────────────────────────────────────────────── + +def analyse(description: str, should_fire: list[str], + should_not_fire: list[str]) -> dict: + desc_tokens = description_tokens(description) + + tp_matches = [p for p in should_fire if prompt_matches(p, desc_tokens)] + fp_matches = [p for p in should_not_fire if prompt_matches(p, desc_tokens)] + fn_misses = [p for p in should_fire if not prompt_matches(p, desc_tokens)] + tn_correct = [p for p in should_not_fire if not prompt_matches(p, desc_tokens)] + + recall = len(tp_matches) / len(should_fire) if should_fire else 1.0 + precision = (len(tp_matches) / (len(tp_matches) + len(fp_matches)) + if (tp_matches or fp_matches) else 1.0) + f1 = (2 * precision * recall / (precision + recall) + if (precision + recall) > 0 else 0.0) + + suggestions = [] + for miss in fn_misses: + miss_tokens = tokenise(miss) - _STOPWORDS + missing = miss_tokens - desc_tokens + if missing: + suggestions.append( + f"False negative: \"{miss}\" — consider adding: " + + ", ".join(f'"{t}"' for t in sorted(missing)[:3]) + ) + else: + suggestions.append(f"False negative: \"{miss}\" — check synonym coverage") + + for fp in fp_matches: + suggestions.append( + f"False positive: \"{fp}\" — narrow description to reduce overlap" + ) + + return { + "description": description, + "recall": round(recall, 3), + "precision": round(precision, 3), + "f1": round(f1, 3), + "grade": grade(f1), + "should_fire_total": len(should_fire), + "should_fire_matched": len(tp_matches), + "should_not_fire_total": len(should_not_fire), + "should_not_fire_mismatched": len(fp_matches), + "false_negatives": fn_misses, + "false_positives": fp_matches, + "suggestions": suggestions, + } + + +# ── Output ──────────────────────────────────────────────────────────────────── + +def print_result(result: dict, label: str = "") -> None: + header = f"Skill Trigger Quality{' — ' + label if label else ''}" + print(f"\n{header}") + print("─" * 48) + desc = result["description"] + print(f"Description: \"{desc[:80]}{'...' if len(desc) > 80 else ''}\"") + print() + print(f" Should fire ({result['should_fire_total']:2d} prompts): " + f"{result['should_fire_matched']:2d} matched " + f"recall = {result['recall']:.2f}") + print(f" Should not fire({result['should_not_fire_total']:2d} prompts): " + f"{result['should_not_fire_mismatched']:2d} matched " + f"precision = {result['precision']:.2f}") + print() + print(f" F1 score: {result['f1']:.2f} Grade: {result['grade']}") + + if result["suggestions"]: + print() + for s in result["suggestions"]: + print(f" ⚠ {s}") + print() + + +def print_comparison(r1: dict, r2: dict) -> None: + print("\nDescription Comparison") + print("─" * 48) + for label, r in [("Option A (original)", r1), ("Option B (alternative)", r2)]: + print(f" {label}: F1={r['f1']:.2f} Grade={r['grade']} " + f"(recall={r['recall']:.2f}, precision={r['precision']:.2f})") + winner = "A" if r1["f1"] >= r2["f1"] else "B" + print(f"\n → Option {winner} scores higher.") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def load_spec(path: str) -> dict: + if not HAS_YAML: + print("ERROR: pyyaml required for --file. Install with: pip install pyyaml") + sys.exit(1) + text = Path(path).read_text() + return yaml.safe_load(text) or {} + + +def main(): + parser = argparse.ArgumentParser(description="Skill Trigger Tester") + src = parser.add_mutually_exclusive_group(required=True) + src.add_argument("--description", metavar="TEXT", + help="Description string to test") + src.add_argument("--file", metavar="YAML", + help="Load test spec from YAML file") + parser.add_argument("--should-fire", nargs="+", metavar="PROMPT", default=[], + help="Prompts that should trigger the skill") + parser.add_argument("--should-not-fire", nargs="+", metavar="PROMPT", default=[], + help="Prompts that should NOT trigger the skill") + parser.add_argument("--compare", metavar="ALT_DESC", + help="Alternative description to compare against") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + if args.file: + spec = load_spec(args.file) + description = spec.get("description", "") + should_fire = spec.get("should_fire", []) + should_not_fire = spec.get("should_not_fire", []) + else: + description = args.description + should_fire = args.should_fire + should_not_fire = args.should_not_fire + + if not should_fire and not should_not_fire: + print("ERROR: Provide at least one --should-fire or --should-not-fire prompt.") + sys.exit(1) + + result = analyse(description, should_fire, should_not_fire) + + alt_result = None + if args.compare: + alt_result = analyse(args.compare, should_fire, should_not_fire) + + if args.format == "json": + output = {"primary": result} + if alt_result: + output["alternative"] = alt_result + print(json.dumps(output, indent=2)) + else: + print_result(result, label=description[:40]) + if alt_result: + print_result(alt_result, label="Alternative") + print_comparison(result, alt_result) + + sys.exit(0 if result["grade"] in ("A", "B") else 1) + + +if __name__ == "__main__": + main() From 4cd5c511e72f87880f56bdcba722f2979232fe57 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:39:26 +0530 Subject: [PATCH 04/23] Add skill-loadout-manager: named skill profiles to manage context bloat Defines and switches between curated skill subsets (loadouts). Ships 4 presets (minimal, coding, research, ops) and estimates token savings per loadout vs. all-skills-active mode. Co-Authored-By: Claude Sonnet 4.6 --- .../skill-loadout-manager/SKILL.md | 106 +++++ .../skill-loadout-manager/STATE_SCHEMA.yaml | 29 ++ .../skill-loadout-manager/example-state.yaml | 71 ++++ .../skill-loadout-manager/loadout.py | 377 ++++++++++++++++++ 4 files changed, 583 insertions(+) create mode 100644 skills/openclaw-native/skill-loadout-manager/SKILL.md create mode 100644 skills/openclaw-native/skill-loadout-manager/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/skill-loadout-manager/example-state.yaml create mode 100755 skills/openclaw-native/skill-loadout-manager/loadout.py diff --git a/skills/openclaw-native/skill-loadout-manager/SKILL.md b/skills/openclaw-native/skill-loadout-manager/SKILL.md new file mode 100644 index 0000000..1d00455 --- /dev/null +++ b/skills/openclaw-native/skill-loadout-manager/SKILL.md @@ -0,0 +1,106 @@ +--- +name: skill-loadout-manager +version: "1.0" +category: openclaw-native +description: Manages named skill profiles (loadouts) so you can switch between focused skill sets and prevent system prompt bloat from too many active skills. +stateful: true +--- + +# Skill Loadout Manager + +## What it does + +Installing more skills increases OpenClaw's system prompt size. Every installed skill contributes its description to the context window on every session start — even skills you haven't used in months. + +Skill Loadout Manager lets you define named loadouts: curated subsets of skills for specific contexts. You switch to a loadout and only those skills are active. Everything else is installed but dormant. + +Examples: +- `coding` — tools for writing, testing, reviewing code +- `research` — browsing, fact-checking, note synthesis +- `ops` — monitoring, cron hygiene, spend tracking +- `minimal` — just the essentials: memory, handoff, recovery + +## When to invoke + +- When you notice system prompt bloat slowing context initialisation +- When switching between focused work modes (deep coding vs. research) +- When you want to test a single skill in isolation +- After adding many new skills that aren't always relevant + +## Loadout structure + +A loadout is a named list of skill names stored in state. Activating a loadout signals to OpenClaw's skill loader which skills to surface in the system prompt. Skills not in the active loadout remain installed but excluded from description injection. + +```yaml +# Example loadout definition +name: coding +skills: + - systematic-debugging + - test-driven-development + - verification-before-completion + - skill-doctor + - dangerous-action-guard +``` + +## How to use + +```bash +python3 loadout.py --list # Show all loadouts and active one +python3 loadout.py --create coding # Create new loadout (interactive) +python3 loadout.py --add coding skill-doctor # Add skill to loadout +python3 loadout.py --remove coding skill-doctor # Remove skill +python3 loadout.py --activate coding # Switch to loadout +python3 loadout.py --activate --all # Activate all skills +python3 loadout.py --show coding # List skills in a loadout +python3 loadout.py --status # Current active loadout +python3 loadout.py --estimate coding # Estimate token savings +``` + +## Procedure + +**Step 1 — Assess current footprint** + +```bash +python3 loadout.py --estimate --all +``` + +This shows the estimated description token count for all installed skills and highlights candidates for loadout pruning. + +**Step 2 — Define your loadouts** + +Think in contexts: What skills do you actually need when writing code? When doing research? During maintenance windows? Create one loadout per context, aiming for 5–10 skills each. + +```bash +python3 loadout.py --create coding +python3 loadout.py --add coding systematic-debugging test-driven-development +python3 loadout.py --add coding verification-before-completion dangerous-action-guard +``` + +**Step 3 — Activate a loadout** + +```bash +python3 loadout.py --activate coding +``` + +OpenClaw reads the active loadout from state on next session start and only injects those skill descriptions. + +**Step 4 — Switch as needed** + +Switching is instant and takes effect on the next session. No restart required. + +**Step 5 — Return to full mode** + +```bash +python3 loadout.py --activate --all +``` + +## State + +Active loadout name and all loadout definitions stored in `~/.openclaw/skill-state/skill-loadout-manager/state.yaml`. + +Fields: `active_loadout`, `loadouts` map, `switch_history`. + +## Notes + +- Always-on skills (e.g., `dangerous-action-guard`, `prompt-injection-guard`) can be marked `pinned: true` so they're included in every loadout automatically. +- The `minimal` loadout is pre-seeded at install time with only safety and recovery skills. diff --git a/skills/openclaw-native/skill-loadout-manager/STATE_SCHEMA.yaml b/skills/openclaw-native/skill-loadout-manager/STATE_SCHEMA.yaml new file mode 100644 index 0000000..300d34f --- /dev/null +++ b/skills/openclaw-native/skill-loadout-manager/STATE_SCHEMA.yaml @@ -0,0 +1,29 @@ +version: "1.0" +description: Named skill loadouts, active loadout pointer, and switch history. +fields: + active_loadout: + type: string + default: "all" + description: Name of the currently active loadout ("all" means all skills active) + pinned_skills: + type: list + description: Skills always included regardless of active loadout + items: + type: string + loadouts: + type: object + description: Map of loadout_name -> loadout definition + items: + name: { type: string } + skills: { type: list, items: { type: string } } + description: { type: string } + created_at: { type: datetime } + last_used: { type: datetime } + switch_history: + type: list + description: Log of loadout switches (last 20) + items: + switched_at: { type: datetime } + from_loadout: { type: string } + to_loadout: { type: string } + skill_count: { type: integer } diff --git a/skills/openclaw-native/skill-loadout-manager/example-state.yaml b/skills/openclaw-native/skill-loadout-manager/example-state.yaml new file mode 100644 index 0000000..63e794a --- /dev/null +++ b/skills/openclaw-native/skill-loadout-manager/example-state.yaml @@ -0,0 +1,71 @@ +# Example runtime state for skill-loadout-manager +active_loadout: coding +pinned_skills: + - dangerous-action-guard + - prompt-injection-guard + - agent-self-recovery +loadouts: + minimal: + name: minimal + description: Safety and recovery essentials only + skills: + - agent-self-recovery + - context-window-management + - dangerous-action-guard + - prompt-injection-guard + - task-handoff + created_at: "2026-03-10T08:00:00.000000" + last_used: null + coding: + name: coding + description: Software development workflow + skills: + - dangerous-action-guard + - skill-doctor + - subagent-driven-development + - systematic-debugging + - test-driven-development + - verification-before-completion + created_at: "2026-03-10T08:00:00.000000" + last_used: "2026-03-15T09:00:00.000000" + ops: + name: ops + description: Operations, monitoring, and cost management + skills: + - cron-hygiene + - installed-skill-auditor + - loop-circuit-breaker + - secrets-hygiene + - spend-circuit-breaker + - workspace-integrity-guardian + created_at: "2026-03-12T10:00:00.000000" + last_used: "2026-03-13T14:00:00.000000" +switch_history: + - switched_at: "2026-03-15T09:00:00.000000" + from_loadout: research + to_loadout: coding + skill_count: 6 + - switched_at: "2026-03-14T14:30:00.000000" + from_loadout: all + to_loadout: research + skill_count: 5 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# python3 loadout.py --list +# Skill Loadouts (active: coding) +# ──────────────────────────────────────── +# all (all installed skills) +# minimal 5 skills Safety and recovery essentials only +# ▶ coding 6 skills Software development workflow +# ops 6 skills Operations, monitoring, and cost management +# +# python3 loadout.py --estimate coding +# Token estimate for loadout 'coding' +# ──────────────────────────────────────── +# Skills in loadout : 6 +# Est. tokens : ~320 +# All skills total : ~1,800 +# Token savings : ~1,480 (82% reduction) +# +# python3 loadout.py --activate ops +# ✓ Activated loadout 'ops' (6 skills). +# Takes effect on next OpenClaw session start. diff --git a/skills/openclaw-native/skill-loadout-manager/loadout.py b/skills/openclaw-native/skill-loadout-manager/loadout.py new file mode 100755 index 0000000..e049e4e --- /dev/null +++ b/skills/openclaw-native/skill-loadout-manager/loadout.py @@ -0,0 +1,377 @@ +#!/usr/bin/env python3 +""" +Skill Loadout Manager for openclaw-superpowers. + +Manages named skill profiles to control which skills are active, +preventing system prompt bloat from too many installed skills. + +Usage: + python3 loadout.py --list + python3 loadout.py --create + python3 loadout.py --add [ ...] + python3 loadout.py --remove + python3 loadout.py --activate + python3 loadout.py --activate --all + python3 loadout.py --show + python3 loadout.py --status + python3 loadout.py --pin + python3 loadout.py --estimate [|--all] +""" + +import argparse +import os +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "skill-loadout-manager" / "state.yaml" +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +SKILLS_DIRS = [ + SUPERPOWERS_DIR / "skills" / "core", + SUPERPOWERS_DIR / "skills" / "openclaw-native", + SUPERPOWERS_DIR / "skills" / "community", +] +MAX_HISTORY = 20 + +# Skills that should always be active (security + recovery) +DEFAULT_PINNED = [ + "dangerous-action-guard", + "prompt-injection-guard", + "agent-self-recovery", + "task-handoff", +] + +# Pre-built loadout seeds +PRESET_LOADOUTS = { + "minimal": { + "description": "Safety and recovery essentials only", + "skills": [ + "dangerous-action-guard", + "prompt-injection-guard", + "agent-self-recovery", + "task-handoff", + "context-window-management", + ], + }, + "coding": { + "description": "Software development workflow", + "skills": [ + "systematic-debugging", + "test-driven-development", + "verification-before-completion", + "subagent-driven-development", + "skill-doctor", + "dangerous-action-guard", + ], + }, + "research": { + "description": "Research and knowledge synthesis", + "skills": [ + "fact-check-before-trust", + "brainstorming", + "writing-plans", + "channel-context-bridge", + "persistent-memory-hygiene", + ], + }, + "ops": { + "description": "Operations, monitoring, and cost management", + "skills": [ + "cron-hygiene", + "spend-circuit-breaker", + "loop-circuit-breaker", + "workspace-integrity-guardian", + "secrets-hygiene", + "installed-skill-auditor", + ], + }, +} + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return { + "active_loadout": "all", + "pinned_skills": list(DEFAULT_PINNED), + "loadouts": {}, + "switch_history": [], + } + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Skill discovery ─────────────────────────────────────────────────────────── + +def all_installed_skills() -> list[str]: + names = [] + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for d in sorted(skills_root.iterdir()): + if d.is_dir() and (d / "SKILL.md").exists(): + names.append(d.name) + return names + + +def skill_description_tokens(skill_name: str) -> int: + """Rough estimate of tokens contributed by this skill's description.""" + for skills_root in SKILLS_DIRS: + skill_md = skills_root / skill_name / "SKILL.md" + if skill_md.exists(): + try: + text = skill_md.read_text() + # Extract description from frontmatter + lines = text.splitlines() + for line in lines[1:20]: + if line.strip() == "---": + break + if line.startswith("description:"): + desc = line.split(":", 1)[1].strip().strip('"').strip("'") + return max(1, len(desc.split()) * 4 // 3) + except Exception: + pass + return 20 # default estimate + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_list(state: dict) -> None: + active = state.get("active_loadout", "all") + loadouts = state.get("loadouts") or {} + print(f"\nSkill Loadouts (active: {active})") + print("─" * 40) + print(f" {'all':20s} (all installed skills)") + for name, ldef in sorted(loadouts.items()): + marker = "▶" if name == active else " " + skill_count = len(ldef.get("skills") or []) + desc = ldef.get("description", "")[:40] + print(f" {marker} {name:20s} {skill_count:2d} skills {desc}") + print() + + +def cmd_create(state: dict, name: str, description: str = "") -> None: + loadouts = state.get("loadouts") or {} + if name in loadouts: + print(f"Loadout '{name}' already exists. Use --add to add skills.") + return + preset = PRESET_LOADOUTS.get(name, {}) + loadouts[name] = { + "name": name, + "skills": list(preset.get("skills", [])), + "description": description or preset.get("description", ""), + "created_at": datetime.now().isoformat(), + "last_used": None, + } + state["loadouts"] = loadouts + save_state(state) + seed_count = len(loadouts[name]["skills"]) + seed_msg = f" (seeded with {seed_count} preset skills)" if seed_count else "" + print(f"✓ Created loadout '{name}'{seed_msg}.") + + +def cmd_add(state: dict, loadout_name: str, skills: list[str]) -> None: + loadouts = state.get("loadouts") or {} + if loadout_name not in loadouts: + print(f"Loadout '{loadout_name}' not found. Run --create {loadout_name} first.") + return + existing = set(loadouts[loadout_name].get("skills") or []) + added = [] + for skill in skills: + if skill not in existing: + existing.add(skill) + added.append(skill) + loadouts[loadout_name]["skills"] = sorted(existing) + state["loadouts"] = loadouts + save_state(state) + if added: + print(f"✓ Added to '{loadout_name}': {', '.join(added)}") + else: + print(f"All skills already in '{loadout_name}'.") + + +def cmd_remove(state: dict, loadout_name: str, skill: str) -> None: + loadouts = state.get("loadouts") or {} + if loadout_name not in loadouts: + print(f"Loadout '{loadout_name}' not found.") + return + skills = loadouts[loadout_name].get("skills") or [] + if skill in skills: + skills.remove(skill) + loadouts[loadout_name]["skills"] = skills + state["loadouts"] = loadouts + save_state(state) + print(f"✓ Removed '{skill}' from '{loadout_name}'.") + else: + print(f"'{skill}' not in loadout '{loadout_name}'.") + + +def cmd_activate(state: dict, name: str) -> None: + loadouts = state.get("loadouts") or {} + if name != "all" and name not in loadouts: + print(f"Loadout '{name}' not found. Create it first with --create {name}.") + return + + previous = state.get("active_loadout", "all") + state["active_loadout"] = name + + if name != "all": + loadouts[name]["last_used"] = datetime.now().isoformat() + state["loadouts"] = loadouts + + history = state.get("switch_history") or [] + skill_count = (len(loadouts.get(name, {}).get("skills") or []) + if name != "all" else len(all_installed_skills())) + history.insert(0, { + "switched_at": datetime.now().isoformat(), + "from_loadout": previous, + "to_loadout": name, + "skill_count": skill_count, + }) + state["switch_history"] = history[:MAX_HISTORY] + save_state(state) + print(f"✓ Activated loadout '{name}' ({skill_count} skills).") + print(f" Takes effect on next OpenClaw session start.") + + +def cmd_show(state: dict, name: str) -> None: + loadouts = state.get("loadouts") or {} + pinned = set(state.get("pinned_skills") or []) + + if name == "all": + skills = all_installed_skills() + print(f"\nLoadout: all ({len(skills)} skills)") + elif name in loadouts: + ldef = loadouts[name] + skills = sorted(ldef.get("skills") or []) + print(f"\nLoadout: {name} ({len(skills)} skills)") + if ldef.get("description"): + print(f" {ldef['description']}") + else: + print(f"Loadout '{name}' not found.") + return + + print("─" * 40) + for skill in skills: + pin_marker = " [pinned]" if skill in pinned else "" + print(f" - {skill}{pin_marker}") + print() + + +def cmd_pin(state: dict, skill: str) -> None: + pinned = set(state.get("pinned_skills") or []) + pinned.add(skill) + state["pinned_skills"] = sorted(pinned) + save_state(state) + print(f"✓ Pinned '{skill}' — will be included in all loadouts.") + + +def cmd_estimate(state: dict, name: str) -> None: + if name == "all": + skills = all_installed_skills() + else: + loadouts = state.get("loadouts") or {} + if name not in loadouts: + print(f"Loadout '{name}' not found.") + return + skills = loadouts[name].get("skills") or [] + + total_tokens = sum(skill_description_tokens(s) for s in skills) + all_skills = all_installed_skills() + all_tokens = sum(skill_description_tokens(s) for s in all_skills) + savings = all_tokens - total_tokens + + print(f"\nToken estimate for loadout '{name}'") + print("─" * 40) + print(f" Skills in loadout : {len(skills)}") + print(f" Est. tokens : ~{total_tokens}") + if name != "all": + print(f" All skills total : ~{all_tokens}") + pct = int(100 * savings / all_tokens) if all_tokens else 0 + print(f" Token savings : ~{savings} ({pct}% reduction)") + print() + + +def cmd_status(state: dict) -> None: + active = state.get("active_loadout", "all") + pinned = state.get("pinned_skills") or [] + history = state.get("switch_history") or [] + print(f"\nActive loadout: {active}") + if pinned: + print(f"Pinned skills : {', '.join(pinned)}") + if history: + last = history[0] + print(f"Last switch : {last.get('switched_at','')[:16]} " + f"({last.get('from_loadout','')} → {last.get('to_loadout','')})") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Skill Loadout Manager") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--list", action="store_true") + group.add_argument("--create", metavar="NAME") + group.add_argument("--add", nargs="+", metavar="ARG", + help=" [ ...]") + group.add_argument("--remove", nargs=2, metavar=("LOADOUT", "SKILL")) + group.add_argument("--activate", metavar="NAME") + group.add_argument("--show", metavar="NAME") + group.add_argument("--pin", metavar="SKILL") + group.add_argument("--estimate", metavar="NAME") + group.add_argument("--status", action="store_true") + parser.add_argument("--all", action="store_true", + help="With --activate or --estimate: use all skills") + parser.add_argument("--description", metavar="TEXT", default="", + help="With --create: loadout description") + args = parser.parse_args() + + state = load_state() + + if args.list: + cmd_list(state) + elif args.create: + cmd_create(state, args.create, args.description) + elif args.add: + if len(args.add) < 2: + print("Usage: --add [ ...]") + sys.exit(1) + cmd_add(state, args.add[0], args.add[1:]) + elif args.remove: + cmd_remove(state, args.remove[0], args.remove[1]) + elif args.activate: + cmd_activate(state, "all" if args.all else args.activate) + elif args.show: + cmd_show(state, "all" if args.all else args.show) + elif args.pin: + cmd_pin(state, args.pin) + elif args.estimate: + cmd_estimate(state, "all" if args.all else args.estimate) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From 9a2e00c07670d47a587fe0cbaf826c26d27169ff Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:45:23 +0530 Subject: [PATCH 05/23] Add skill-compatibility-checker: detect version/feature incompatibilities Reads requires_openclaw + requires_features frontmatter fields and compares against detected (or overridden) OpenClaw version. Ships feature registry with 5 runtime capabilities and their introduction versions. Co-Authored-By: Claude Sonnet 4.6 --- .../skill-compatibility-checker/SKILL.md | 94 +++++ .../STATE_SCHEMA.yaml | 27 ++ .../skill-compatibility-checker/check.py | 363 ++++++++++++++++++ .../example-state.yaml | 47 +++ 4 files changed, 531 insertions(+) create mode 100644 skills/openclaw-native/skill-compatibility-checker/SKILL.md create mode 100644 skills/openclaw-native/skill-compatibility-checker/STATE_SCHEMA.yaml create mode 100755 skills/openclaw-native/skill-compatibility-checker/check.py create mode 100644 skills/openclaw-native/skill-compatibility-checker/example-state.yaml diff --git a/skills/openclaw-native/skill-compatibility-checker/SKILL.md b/skills/openclaw-native/skill-compatibility-checker/SKILL.md new file mode 100644 index 0000000..affe4de --- /dev/null +++ b/skills/openclaw-native/skill-compatibility-checker/SKILL.md @@ -0,0 +1,94 @@ +--- +name: skill-compatibility-checker +version: "1.0" +category: openclaw-native +description: Checks whether installed skills are compatible with the current OpenClaw version and flags skills that require runtime features not present in your installation. +stateful: true +--- + +# Skill Compatibility Checker + +## What it does + +Skills can declare minimum OpenClaw version requirements and depend on specific runtime features (cron engine, session isolation, state storage, context compaction). When you upgrade or downgrade OpenClaw, or move a skill to a different environment, compatibility silently breaks. + +Skill Compatibility Checker reads skill frontmatter for version constraints and feature requirements, then compares them against the currently running OpenClaw version. It reports incompatibilities before they cause confusing silent failures. + +## When to invoke + +- After upgrading OpenClaw +- Before deploying a skill to a new environment +- When a skill that previously worked stops working after an update +- As a post-upgrade gate in automated deployment pipelines + +## Frontmatter fields checked + +```yaml +--- +name: my-skill +requires_openclaw: ">=1.4.0" # optional semver constraint +requires_features: # optional list of runtime features + - cron + - session_isolation + - state_storage + - context_compaction + - sessions_send +--- +``` + +If these fields are absent the skill is treated as version-agnostic. + +## Feature registry + +| Feature | Introduced | Description | +|---|---|---| +| `cron` | 1.0.0 | Cron-scheduled skill wakeups | +| `state_storage` | 1.0.0 | Persistent skill state at `~/.openclaw/skill-state/` | +| `session_isolation` | 1.2.0 | Skills run in isolated sessions (not main session) | +| `context_compaction` | 1.3.0 | Native context compaction API | +| `sessions_send` | 1.4.0 | Cross-session message passing | + +## Output + +``` +Skill Compatibility Check — OpenClaw 1.3.2 +──────────────────────────────────────────────── +32 skills checked | 0 incompatible | 2 warnings + +WARN channel-context-bridge: requires sessions_send (≥1.4.0, have 1.3.2) +WARN multi-agent-coordinator: requires sessions_send (≥1.4.0, have 1.3.2) +``` + +## How to use + +```bash +python3 check.py --check # Full compatibility scan +python3 check.py --check --skill my-skill # Single skill +python3 check.py --openclaw-version 1.5.0 # Override detected version +python3 check.py --features # List all known features + versions +python3 check.py --status # Last check summary +python3 check.py --format json +``` + +## Procedure + +**Step 1 — Run the check** + +```bash +python3 check.py --check +``` + +**Step 2 — Triage incompatibilities** + +- **FAIL**: The skill cannot run on this version. Either upgrade OpenClaw or disable the skill. +- **WARN**: The skill declares a feature that may be available but wasn't present in the stated minimum. Functionality may be degraded. + +**Step 3 — Document your constraints** + +If you're writing a new skill that uses `sessions_send` or `session_isolation`, add the appropriate `requires_openclaw:` and `requires_features:` frontmatter so future users know immediately if their version supports it. + +## State + +Last check results stored in `~/.openclaw/skill-state/skill-compatibility-checker/state.yaml`. + +Fields: `last_check_at`, `openclaw_version`, `incompatibilities`, `check_history`. diff --git a/skills/openclaw-native/skill-compatibility-checker/STATE_SCHEMA.yaml b/skills/openclaw-native/skill-compatibility-checker/STATE_SCHEMA.yaml new file mode 100644 index 0000000..9270bfe --- /dev/null +++ b/skills/openclaw-native/skill-compatibility-checker/STATE_SCHEMA.yaml @@ -0,0 +1,27 @@ +version: "1.0" +description: Compatibility check results against the current OpenClaw version. +fields: + last_check_at: + type: datetime + openclaw_version: + type: string + description: OpenClaw version at time of last check + incompatibilities: + type: list + description: Skills that failed or warned during the last check + items: + skill_name: { type: string } + level: { type: enum, values: [FAIL, WARN] } + constraint: { type: string } + feature: { type: string } + detail: { type: string } + detected_at: { type: datetime } + check_history: + type: list + description: Rolling log of past checks (last 10) + items: + checked_at: { type: datetime } + openclaw_version: { type: string } + skills_checked: { type: integer } + fail_count: { type: integer } + warn_count: { type: integer } diff --git a/skills/openclaw-native/skill-compatibility-checker/check.py b/skills/openclaw-native/skill-compatibility-checker/check.py new file mode 100755 index 0000000..023c95e --- /dev/null +++ b/skills/openclaw-native/skill-compatibility-checker/check.py @@ -0,0 +1,363 @@ +#!/usr/bin/env python3 +""" +Skill Compatibility Checker for openclaw-superpowers. + +Checks installed skills against the current OpenClaw version for +version constraints and feature availability. + +Usage: + python3 check.py --check + python3 check.py --check --skill channel-context-bridge + python3 check.py --openclaw-version 1.5.0 # override detected version + python3 check.py --features # list feature registry + python3 check.py --status # last check summary + python3 check.py --format json +""" + +import argparse +import json +import os +import re +import subprocess +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "skill-compatibility-checker" / "state.yaml" +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +SKILLS_DIRS = [ + SUPERPOWERS_DIR / "skills" / "core", + SUPERPOWERS_DIR / "skills" / "openclaw-native", + SUPERPOWERS_DIR / "skills" / "community", +] +MAX_HISTORY = 10 + +# Feature availability: feature_name -> introduced_in (as tuple) +FEATURE_REGISTRY = { + "cron": (1, 0, 0), + "state_storage": (1, 0, 0), + "session_isolation": (1, 2, 0), + "context_compaction": (1, 3, 0), + "sessions_send": (1, 4, 0), +} + + +# ── Version helpers ─────────────────────────────────────────────────────────── + +def parse_version(v: str) -> tuple[int, ...]: + """Parse '1.2.3' → (1, 2, 3). Returns (0,) on failure.""" + try: + return tuple(int(x) for x in re.findall(r'\d+', str(v))) + except Exception: + return (0,) + + +def version_str(t: tuple) -> str: + return ".".join(str(x) for x in t) + + +def satisfies_constraint(version: tuple, constraint: str) -> tuple[bool, str]: + """ + Check semver constraint like '>=1.2.0', '==1.3.0', '>1.0'. + Returns (ok, explanation). + """ + constraint = constraint.strip() + m = re.match(r'^(>=|<=|==|!=|>|<)\s*(.+)$', constraint) + if not m: + return True, "" # can't parse → assume compatible + + op, ver_str = m.group(1), m.group(2).strip() + required = parse_version(ver_str) + + ops = { + ">=": lambda a, b: a >= b, + "<=": lambda a, b: a <= b, + "==": lambda a, b: a == b, + "!=": lambda a, b: a != b, + ">": lambda a, b: a > b, + "<": lambda a, b: a < b, + } + ok = ops[op](version, required) + explanation = f"{op}{ver_str}" + return ok, explanation + + +def detect_openclaw_version() -> tuple[int, ...]: + """Try to detect OpenClaw version from CLI or version file.""" + # Try CLI + try: + result = subprocess.run( + ["openclaw", "--version"], + capture_output=True, text=True, timeout=5 + ) + output = result.stdout.strip() or result.stderr.strip() + m = re.search(r'(\d+\.\d+(?:\.\d+)?)', output) + if m: + return parse_version(m.group(1)) + except Exception: + pass + + # Try version file + for candidate in [ + OPENCLAW_DIR / "VERSION", + OPENCLAW_DIR / "version.txt", + Path("/usr/local/lib/openclaw/VERSION"), + ]: + if candidate.exists(): + try: + return parse_version(candidate.read_text().strip()) + except Exception: + pass + + return (1, 0, 0) # safe fallback + + +# ── Frontmatter parser ──────────────────────────────────────────────────────── + +def parse_frontmatter(skill_md: Path) -> dict: + try: + text = skill_md.read_text() + lines = text.splitlines() + if not lines or lines[0].strip() != "---": + return {} + end = None + for i, line in enumerate(lines[1:], 1): + if line.strip() == "---": + end = i + break + if end is None: + return {} + fm_text = "\n".join(lines[1:end]) + if HAS_YAML: + return yaml.safe_load(fm_text) or {} + # Minimal fallback + fields = {} + for line in fm_text.splitlines(): + if ":" in line and not line.startswith(" "): + k, _, v = line.partition(":") + fields[k.strip()] = v.strip().strip('"').strip("'") + return fields + except Exception: + return {} + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"incompatibilities": [], "check_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Check logic ─────────────────────────────────────────────────────────────── + +def check_skill(skill_dir: Path, oc_version: tuple) -> list[dict]: + skill_name = skill_dir.name + skill_md = skill_dir / "SKILL.md" + if not skill_md.exists(): + return [] + + fm = parse_frontmatter(skill_md) + now = datetime.now().isoformat() + issues = [] + + def issue(level, constraint, feature, detail): + return { + "skill_name": skill_name, + "level": level, + "constraint": constraint, + "feature": feature, + "detail": detail, + "detected_at": now, + } + + # Check requires_openclaw version constraint + req_ver = fm.get("requires_openclaw", "") + if req_ver: + ok, explanation = satisfies_constraint(oc_version, str(req_ver)) + if not ok: + issues.append(issue( + "FAIL", + str(req_ver), + "openclaw_version", + f"Requires OpenClaw {explanation}, have {version_str(oc_version)}" + )) + + # Check requires_features list + req_features = fm.get("requires_features") or [] + if isinstance(req_features, str): + req_features = [req_features] + + for feature in req_features: + feature = str(feature).strip() + if feature not in FEATURE_REGISTRY: + issues.append(issue( + "WARN", + "", + feature, + f"Unknown feature '{feature}' — not in feature registry" + )) + continue + introduced = FEATURE_REGISTRY[feature] + if oc_version < introduced: + level = "FAIL" if oc_version < introduced else "WARN" + issues.append(issue( + level, + f">={version_str(introduced)}", + feature, + f"Requires feature '{feature}' (≥{version_str(introduced)}), " + f"have {version_str(oc_version)}" + )) + + return issues + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_check(state: dict, oc_version: tuple, single_skill: str, + fmt: str) -> None: + all_issues = [] + skills_checked = 0 + + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for skill_dir in sorted(skills_root.iterdir()): + if not skill_dir.is_dir(): + continue + if single_skill and skill_dir.name != single_skill: + continue + issues = check_skill(skill_dir, oc_version) + all_issues.extend(issues) + skills_checked += 1 + + fails = sum(1 for i in all_issues if i["level"] == "FAIL") + warns = sum(1 for i in all_issues if i["level"] == "WARN") + now = datetime.now().isoformat() + + if fmt == "json": + print(json.dumps({ + "checked_at": now, + "openclaw_version": version_str(oc_version), + "skills_checked": skills_checked, + "fail_count": fails, + "warn_count": warns, + "incompatibilities": all_issues, + }, indent=2)) + else: + print(f"\nSkill Compatibility Check — OpenClaw {version_str(oc_version)}") + print("─" * 50) + print(f" {skills_checked} skills checked | " + f"{fails} incompatible | {warns} warnings") + print() + if not all_issues: + print(" ✓ All skills compatible.") + else: + for issue in all_issues: + icon = "✗" if issue["level"] == "FAIL" else "⚠" + print(f" {icon} {issue['level']:4s} {issue['skill_name']}: " + f"{issue['detail']}") + print() + + # Persist + history = state.get("check_history") or [] + history.insert(0, { + "checked_at": now, + "openclaw_version": version_str(oc_version), + "skills_checked": skills_checked, + "fail_count": fails, + "warn_count": warns, + }) + state["check_history"] = history[:MAX_HISTORY] + state["last_check_at"] = now + state["openclaw_version"] = version_str(oc_version) + state["incompatibilities"] = all_issues + save_state(state) + + sys.exit(1 if fails > 0 else 0) + + +def cmd_features(fmt: str) -> None: + if fmt == "json": + print(json.dumps({k: version_str(v) for k, v in FEATURE_REGISTRY.items()}, + indent=2)) + return + print("\nOpenClaw Feature Registry") + print("─" * 40) + for feature, introduced in sorted(FEATURE_REGISTRY.items(), + key=lambda x: x[1]): + print(f" {feature:25s} introduced in {version_str(introduced)}") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_check_at", "never") + ver = state.get("openclaw_version", "unknown") + history = state.get("check_history") or [] + print(f"\nSkill Compatibility Checker — Last run: {last} (OpenClaw {ver})") + if history: + h = history[0] + print(f" {h['skills_checked']} checked | " + f"{h['fail_count']} incompatible | {h['warn_count']} warnings") + active = state.get("incompatibilities") or [] + if active: + print(f"\n Active incompatibilities ({len(active)}):") + for i in active[:5]: + print(f" [{i['level']}] {i['skill_name']}: {i['detail'][:60]}") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Skill Compatibility Checker") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--check", action="store_true") + group.add_argument("--features", action="store_true") + group.add_argument("--status", action="store_true") + parser.add_argument("--skill", metavar="NAME") + parser.add_argument("--openclaw-version", metavar="VER", + help="Override detected OpenClaw version") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + + if args.features: + cmd_features(args.format) + return + + if args.status: + cmd_status(state) + return + + oc_version = (parse_version(args.openclaw_version) + if args.openclaw_version + else detect_openclaw_version()) + + cmd_check(state, oc_version, single_skill=args.skill, fmt=args.format) + + +if __name__ == "__main__": + main() diff --git a/skills/openclaw-native/skill-compatibility-checker/example-state.yaml b/skills/openclaw-native/skill-compatibility-checker/example-state.yaml new file mode 100644 index 0000000..bd7b64a --- /dev/null +++ b/skills/openclaw-native/skill-compatibility-checker/example-state.yaml @@ -0,0 +1,47 @@ +# Example runtime state for skill-compatibility-checker +last_check_at: "2026-03-15T09:30:00.000000" +openclaw_version: "1.3.2" +incompatibilities: + - skill_name: channel-context-bridge + level: WARN + constraint: ">=1.4.0" + feature: sessions_send + detail: "Requires feature 'sessions_send' (≥1.4.0), have 1.3.2" + detected_at: "2026-03-15T09:30:00.000000" + - skill_name: multi-agent-coordinator + level: WARN + constraint: ">=1.4.0" + feature: sessions_send + detail: "Requires feature 'sessions_send' (≥1.4.0), have 1.3.2" + detected_at: "2026-03-15T09:30:00.000000" +check_history: + - checked_at: "2026-03-15T09:30:00.000000" + openclaw_version: "1.3.2" + skills_checked: 32 + fail_count: 0 + warn_count: 2 + - checked_at: "2026-03-01T10:00:00.000000" + openclaw_version: "1.3.0" + skills_checked: 31 + fail_count: 0 + warn_count: 2 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# python3 check.py --check +# +# Skill Compatibility Check — OpenClaw 1.3.2 +# ────────────────────────────────────────────────────────────── +# 32 skills checked | 0 incompatible | 2 warnings +# +# ⚠ WARN channel-context-bridge: Requires feature 'sessions_send' (≥1.4.0), have 1.3.2 +# ⚠ WARN multi-agent-coordinator: Requires feature 'sessions_send' (≥1.4.0), have 1.3.2 +# +# Upgrade OpenClaw to 1.4.0+ to fully enable these skills. +# +# python3 check.py --features +# OpenClaw Feature Registry +# ──────────────────────────────────────── +# cron introduced in 1.0.0 +# state_storage introduced in 1.0.0 +# session_isolation introduced in 1.2.0 +# context_compaction introduced in 1.3.0 +# sessions_send introduced in 1.4.0 From cc01630866bee004b310fb001c9530ba43b4de42 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:56:54 +0530 Subject: [PATCH 06/23] Add skill-conflict-detector: detect name shadowing and description overlap (#21) Detects NAME_SHADOW (CRITICAL), EXACT_DUPLICATE (CRITICAL), HIGH_OVERLAP (HIGH), and MEDIUM_OVERLAP (MEDIUM) conflicts between installed skills using Jaccard similarity on description tokens. Co-authored-by: Claude Sonnet 4.6 --- skills/core/skill-conflict-detector/SKILL.md | 85 +++++ skills/core/skill-conflict-detector/detect.py | 293 ++++++++++++++++++ 2 files changed, 378 insertions(+) create mode 100644 skills/core/skill-conflict-detector/SKILL.md create mode 100755 skills/core/skill-conflict-detector/detect.py diff --git a/skills/core/skill-conflict-detector/SKILL.md b/skills/core/skill-conflict-detector/SKILL.md new file mode 100644 index 0000000..4dfb400 --- /dev/null +++ b/skills/core/skill-conflict-detector/SKILL.md @@ -0,0 +1,85 @@ +--- +name: skill-conflict-detector +version: "1.0" +category: core +description: Detects skill name shadowing and description-overlap conflicts that cause OpenClaw to trigger the wrong skill or silently ignore one when two skills compete for the same intent. +--- + +# Skill Conflict Detector + +## What it does + +Two types of conflict cause skills to misbehave silently: + +**1. Name shadowing** — Two installed skills have the same `name:` field. OpenClaw loads the last one lexicographically; the other silently disappears. No warning. + +**2. Description overlap** — Two skills' descriptions are so semantically similar that OpenClaw can't reliably distinguish them. The wrong skill fires. You think one skill is broken; actually the other is intercepting it. + +Skill Conflict Detector scans all installed skills for both types and reports them with overlap scores and resolution suggestions. + +## When to invoke + +- After installing a new skill from ClawHub +- When a skill fires inconsistently or triggers on unexpected prompts +- Before publishing a new skill (ensure it doesn't shadow an existing one) +- As part of `install.sh` post-install validation + +## Conflict types + +| Type | Severity | Effect | +|---|---|---| +| NAME_SHADOW | CRITICAL | One skill completely hidden | +| EXACT_DUPLICATE | CRITICAL | Identical description — both fire or neither does | +| HIGH_OVERLAP | HIGH | >75% semantic similarity — unreliable trigger routing | +| MEDIUM_OVERLAP | MEDIUM | 50–75% similarity — possible confusion | + +## Output + +``` +Skill Conflict Report — 32 skills +──────────────────────────────────────────────── +0 CRITICAL | 1 HIGH | 0 MEDIUM + +HIGH skill-vetting ↔ installed-skill-auditor overlap: 0.81 + Both describe "scanning skills for security issues" + Suggestion: Differentiate — skill-vetting is pre-install, + installed-skill-auditor is post-install ongoing audit. +``` + +## How to use + +```bash +python3 detect.py --scan # Full conflict scan +python3 detect.py --scan --skill my-skill # Check one skill vs all others +python3 detect.py --scan --threshold 0.6 # Custom similarity threshold +python3 detect.py --names # Check name shadowing only +python3 detect.py --format json +``` + +## Procedure + +**Step 1 — Run the scan** + +```bash +python3 detect.py --scan +``` + +**Step 2 — Resolve CRITICAL conflicts first** + +NAME_SHADOW: Rename one skill's `name:` field and its directory. Run `bash scripts/validate-skills.sh` to confirm. + +EXACT_DUPLICATE: One skill is redundant. Remove or differentiate it. + +**Step 3 — Assess HIGH_OVERLAP pairs** + +Read both descriptions. Ask: could a user's natural-language request unambiguously route to one and not the other? If no, differentiate. Common fix: add the scope or timing to the description (e.g., "before install" vs. "after install"). + +**Step 4 — Accept or suppress MEDIUM_OVERLAP** + +Medium overlaps are informational. If the two skills serve genuinely different contexts and users would naturally phrase requests differently, they can coexist. Document why in the skill's SKILL.md if it's non-obvious. + +## Similarity model + +Token-overlap Jaccard similarity between description strings after stop-word removal. Fast and deterministic — no external dependencies. + +Threshold defaults: HIGH ≥ 0.75, MEDIUM ≥ 0.50. diff --git a/skills/core/skill-conflict-detector/detect.py b/skills/core/skill-conflict-detector/detect.py new file mode 100755 index 0000000..7f6d484 --- /dev/null +++ b/skills/core/skill-conflict-detector/detect.py @@ -0,0 +1,293 @@ +#!/usr/bin/env python3 +""" +Skill Conflict Detector for openclaw-superpowers. + +Detects name shadowing and description-overlap conflicts between +installed skills that cause silent trigger routing failures. + +Usage: + python3 detect.py --scan + python3 detect.py --scan --skill my-skill + python3 detect.py --scan --threshold 0.6 + python3 detect.py --names # Name shadowing only + python3 detect.py --format json +""" + +import argparse +import json +import os +import re +import sys +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +SKILLS_DIRS = [ + SUPERPOWERS_DIR / "skills" / "core", + SUPERPOWERS_DIR / "skills" / "openclaw-native", + SUPERPOWERS_DIR / "skills" / "community", +] + +DEFAULT_HIGH_THRESHOLD = 0.75 +DEFAULT_MEDIUM_THRESHOLD = 0.50 + +_STOPWORDS = { + "a", "an", "the", "and", "or", "but", "in", "on", "at", "to", "for", + "of", "with", "by", "from", "is", "are", "was", "were", "be", "been", + "it", "its", "this", "that", "so", "not", "no", "all", "any", "each", + "more", "most", "has", "have", "had", "do", "does", "did", "will", + "would", "could", "should", "may", "can", "which", "when", "where", + "how", "what", "who", "i", "you", "we", "they", "he", "she", +} + + +# ── Frontmatter parser ──────────────────────────────────────────────────────── + +def parse_frontmatter(skill_md: Path) -> dict: + try: + text = skill_md.read_text() + lines = text.splitlines() + if not lines or lines[0].strip() != "---": + return {} + end = None + for i, line in enumerate(lines[1:], 1): + if line.strip() == "---": + end = i + break + if end is None: + return {} + fm_text = "\n".join(lines[1:end]) + if HAS_YAML: + return yaml.safe_load(fm_text) or {} + fields = {} + for line in fm_text.splitlines(): + if ":" in line and not line.startswith(" "): + k, _, v = line.partition(":") + fields[k.strip()] = v.strip().strip('"').strip("'") + return fields + except Exception: + return {} + + +# ── Tokeniser + similarity ──────────────────────────────────────────────────── + +def tokenise(text: str) -> set[str]: + tokens = re.findall(r"[a-z0-9]+", text.lower()) + return {t for t in tokens if t not in _STOPWORDS and len(t) > 2} + + +def jaccard(a: set, b: set) -> float: + if not a and not b: + return 1.0 + inter = len(a & b) + union = len(a | b) + return inter / union if union > 0 else 0.0 + + +# ── Skill loader ────────────────────────────────────────────────────────────── + +def load_all_skills() -> list[dict]: + skills = [] + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for skill_dir in sorted(skills_root.iterdir()): + if not skill_dir.is_dir(): + continue + skill_md = skill_dir / "SKILL.md" + if not skill_md.exists(): + continue + fm = parse_frontmatter(skill_md) + skills.append({ + "dir_name": skill_dir.name, + "name": fm.get("name", skill_dir.name), + "description": fm.get("description", ""), + "path": str(skill_md), + }) + return skills + + +# ── Conflict detection ──────────────────────────────────────────────────────── + +def detect_conflicts(skills: list[dict], high_threshold: float, + medium_threshold: float, + single_skill: str = None) -> list[dict]: + conflicts = [] + + # Name shadowing: same name field, different directories + by_name: dict = {} + for s in skills: + by_name.setdefault(s["name"], []).append(s) + + for name, group in by_name.items(): + if len(group) > 1: + for i in range(len(group)): + for j in range(i + 1, len(group)): + a, b = group[i], group[j] + if single_skill and single_skill not in (a["dir_name"], b["dir_name"]): + continue + conflicts.append({ + "type": "NAME_SHADOW", + "severity": "CRITICAL", + "skill_a": a["dir_name"], + "skill_b": b["dir_name"], + "overlap_score": 1.0, + "detail": f"Both have `name: {name}` — one will be hidden", + "suggestion": f"Rename one skill's `name:` field and its directory.", + }) + + # Description overlap + for i in range(len(skills)): + for j in range(i + 1, len(skills)): + a, b = skills[i], skills[j] + if single_skill and single_skill not in (a["dir_name"], b["dir_name"]): + continue + + ta = tokenise(a["description"]) + tb = tokenise(b["description"]) + + if not ta or not tb: + continue + + score = jaccard(ta, tb) + + if score >= high_threshold: + # Check for exact duplicate + severity = "CRITICAL" if a["description"] == b["description"] else "HIGH" + ctype = "EXACT_DUPLICATE" if severity == "CRITICAL" else "HIGH_OVERLAP" + common = ta & tb + conflicts.append({ + "type": ctype, + "severity": severity, + "skill_a": a["dir_name"], + "skill_b": b["dir_name"], + "overlap_score": round(score, 3), + "detail": ( + f"Descriptions share key terms: " + + ", ".join(f'"{t}"' for t in sorted(common)[:5]) + ), + "suggestion": ( + "Differentiate descriptions — add scope, timing, or " + "context that distinguishes when each skill fires." + ), + }) + elif score >= medium_threshold: + common = ta & tb + conflicts.append({ + "type": "MEDIUM_OVERLAP", + "severity": "MEDIUM", + "skill_a": a["dir_name"], + "skill_b": b["dir_name"], + "overlap_score": round(score, 3), + "detail": ( + "Moderate description overlap — " + + ", ".join(f'"{t}"' for t in sorted(common)[:4]) + ), + "suggestion": ( + "Acceptable if use-cases are clearly distinct. " + "Consider adding differentiating context to each description." + ), + }) + + return conflicts + + +# ── Output ──────────────────────────────────────────────────────────────────── + +def print_report(conflicts: list, skills_count: int, fmt: str) -> None: + criticals = [c for c in conflicts if c["severity"] == "CRITICAL"] + highs = [c for c in conflicts if c["severity"] == "HIGH"] + mediums = [c for c in conflicts if c["severity"] == "MEDIUM"] + + if fmt == "json": + print(json.dumps({ + "skills_scanned": skills_count, + "critical_count": len(criticals), + "high_count": len(highs), + "medium_count": len(mediums), + "conflicts": conflicts, + }, indent=2)) + return + + print(f"\nSkill Conflict Report — {skills_count} skills") + print("─" * 50) + print(f" {len(criticals)} CRITICAL | {len(highs)} HIGH | {len(mediums)} MEDIUM") + print() + + if not conflicts: + print(" ✓ No conflicts detected.") + else: + for c in conflicts: + icon = "✗" if c["severity"] in ("CRITICAL",) else ( + "!" if c["severity"] == "HIGH" else "⚠" + ) + score_str = f" overlap: {c['overlap_score']:.2f}" if c["type"] != "NAME_SHADOW" else "" + print(f" {icon} {c['severity']:8s} {c['skill_a']} ↔ {c['skill_b']}" + f"{score_str}") + print(f" {c['detail']}") + print(f" → {c['suggestion']}") + print() + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_scan(high_threshold: float, medium_threshold: float, + single_skill: str, fmt: str) -> None: + skills = load_all_skills() + conflicts = detect_conflicts(skills, high_threshold, medium_threshold, single_skill) + print_report(conflicts, len(skills), fmt) + critical_count = sum(1 for c in conflicts if c["severity"] == "CRITICAL") + sys.exit(1 if critical_count > 0 else 0) + + +def cmd_names(fmt: str) -> None: + skills = load_all_skills() + conflicts = detect_conflicts(skills, high_threshold=2.0, medium_threshold=2.0) + name_conflicts = [c for c in conflicts if c["type"] == "NAME_SHADOW"] + if fmt == "json": + print(json.dumps(name_conflicts, indent=2)) + else: + if not name_conflicts: + print("✓ No name shadowing detected.") + else: + for c in name_conflicts: + print(f"✗ SHADOW: {c['skill_a']} ↔ {c['skill_b']} {c['detail']}") + sys.exit(1 if name_conflicts else 0) + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Skill Conflict Detector") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", action="store_true") + group.add_argument("--names", action="store_true", + help="Check name shadowing only") + parser.add_argument("--skill", metavar="NAME", + help="Check one skill against all others") + parser.add_argument("--threshold", type=float, default=DEFAULT_HIGH_THRESHOLD, + help=f"HIGH similarity threshold (default: {DEFAULT_HIGH_THRESHOLD})") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + if args.names: + cmd_names(args.format) + elif args.scan: + cmd_scan( + high_threshold=args.threshold, + medium_threshold=DEFAULT_MEDIUM_THRESHOLD, + single_skill=args.skill, + fmt=args.format, + ) + + +if __name__ == "__main__": + main() From 292f707e0f91776cd6db76ad60e955a2ff6125d5 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:56:56 +0530 Subject: [PATCH 07/23] Add heartbeat-governor: per-skill execution budgets for cron skills (#22) Tracks 30-day rolling spend and wall-clock time per scheduled skill. Auto-pauses skills that exceed monthly/per-run budgets. Cron: every hour. Supports manual pause/resume and per-skill budget overrides. Co-authored-by: Claude Sonnet 4.6 --- .../heartbeat-governor/SKILL.md | 105 ++++++ .../heartbeat-governor/STATE_SCHEMA.yaml | 41 +++ .../heartbeat-governor/example-state.yaml | 64 ++++ .../heartbeat-governor/governor.py | 333 ++++++++++++++++++ 4 files changed, 543 insertions(+) create mode 100644 skills/openclaw-native/heartbeat-governor/SKILL.md create mode 100644 skills/openclaw-native/heartbeat-governor/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/heartbeat-governor/example-state.yaml create mode 100755 skills/openclaw-native/heartbeat-governor/governor.py diff --git a/skills/openclaw-native/heartbeat-governor/SKILL.md b/skills/openclaw-native/heartbeat-governor/SKILL.md new file mode 100644 index 0000000..5d5dc52 --- /dev/null +++ b/skills/openclaw-native/heartbeat-governor/SKILL.md @@ -0,0 +1,105 @@ +--- +name: heartbeat-governor +version: "1.0" +category: openclaw-native +description: Enforces per-skill execution budgets for scheduled cron skills — pauses runaway skills that exceed their token or wall-clock budget before they drain your monthly API allowance. +stateful: true +cron: "0 * * * *" +--- + +# Heartbeat Governor + +## What it does + +Cron skills run autonomously. A skill with a bug — an infinite retry, an unexpectedly large context, a model call inside a loop — can silently consume hundreds of dollars before you notice. + +Heartbeat Governor tracks cumulative execution cost and wall-clock time per scheduled skill on a rolling 30-day basis. When a skill exceeds its budget, the governor pauses it and sends an alert. The skill won't fire again until you explicitly review and resume it. + +It runs every hour to catch budget overruns within one cron cycle. + +## When to invoke + +- Automatically, every hour (cron) +- Manually after noticing an unexpected API bill spike +- When a cron skill has been running unusually long + +## Budget types + +| Budget type | Default | Configurable | +|---|---|---| +| `max_usd_monthly` | $5.00 | Yes, per skill | +| `max_usd_per_run` | $0.50 | Yes, per skill | +| `max_wall_minutes` | 30 | Yes, per skill | +| `max_runs_daily` | 48 | Yes, per skill | + +## Actions on budget breach + +| Breach type | Action | +|---|---| +| `monthly_usd` exceeded | Pause skill, log breach, alert | +| `per_run_usd` exceeded | Abort current run, log breach | +| `wall_clock` exceeded | Abort current run, log breach | +| `daily_runs` exceeded | Skip remaining runs today, log | + +## How to use + +```bash +python3 governor.py --status # Show all skills and budget utilisation +python3 governor.py --record --usd 0.12 --minutes 4 # Record a run +python3 governor.py --pause # Manually pause a skill +python3 governor.py --resume # Resume a paused skill after review +python3 governor.py --set-budget --monthly 10.00 # Override budget +python3 governor.py --check # Run hourly check (called by cron) +python3 governor.py --report # Full monthly spend report +python3 governor.py --format json +``` + +## Cron wakeup behaviour + +Every hour the governor runs `--check`: + +1. Load all skill ledgers from state +2. For each skill with `paused: false`: + - If 30-day rolling spend exceeds `max_usd_monthly` → `paused: true`, log + - If runs today exceed `max_runs_daily` → skip, log +3. Print summary of paused skills and budget utilisation +4. Save updated state + +## Procedure + +**Step 1 — Set sensible budgets** + +After installing any new cron skill, set its monthly budget: + +```bash +python3 governor.py --set-budget daily-review --monthly 2.00 +python3 governor.py --set-budget morning-briefing --monthly 3.00 +``` + +Defaults are conservative ($5/month) but explicit is better. + +**Step 2 — Monitor utilisation** + +```bash +python3 governor.py --status +``` + +Review the utilisation column. Any skill above 80% monthly budget warrants investigation. + +**Step 3 — Respond to pause alerts** + +When the governor pauses a skill, investigate why it's over budget: +- Was there a one-time expensive run (large context)? +- Is there a bug causing repeated expensive calls? +- Does the budget simply need to be raised? + +Resume after investigating: +```bash +python3 governor.py --resume +``` + +## State + +Per-skill ledgers and pause flags stored in `~/.openclaw/skill-state/heartbeat-governor/state.yaml`. + +Fields: `skill_ledgers` map, `paused_skills` list, `breach_log`, `monthly_summary`. diff --git a/skills/openclaw-native/heartbeat-governor/STATE_SCHEMA.yaml b/skills/openclaw-native/heartbeat-governor/STATE_SCHEMA.yaml new file mode 100644 index 0000000..535e4cf --- /dev/null +++ b/skills/openclaw-native/heartbeat-governor/STATE_SCHEMA.yaml @@ -0,0 +1,41 @@ +version: "1.0" +description: Per-skill execution budgets, spend ledgers, pause flags, and breach log. +fields: + skill_ledgers: + type: object + description: Map of skill_name -> budget + rolling spend ledger + items: + budget: + type: object + properties: + max_usd_monthly: { type: float, default: 5.0 } + max_usd_per_run: { type: float, default: 0.5 } + max_wall_minutes: { type: integer, default: 30 } + max_runs_daily: { type: integer, default: 48 } + paused: { type: boolean, default: false } + pause_reason: { type: string } + paused_at: { type: datetime } + runs: + type: list + description: Rolling 30-day run log + items: + ran_at: { type: datetime } + usd_spent: { type: float } + wall_minutes: { type: float } + breach_log: + type: list + description: All budget breach events + items: + skill_name: { type: string } + breach_type: { type: enum, values: [monthly_usd, per_run_usd, wall_clock, daily_runs] } + value: { type: float } + limit: { type: float } + breached_at: { type: datetime } + resolved: { type: boolean } + monthly_summary: + type: object + description: Aggregated spend by skill for current calendar month + items: + skill_name: { type: string } + total_usd: { type: float } + total_runs: { type: integer } diff --git a/skills/openclaw-native/heartbeat-governor/example-state.yaml b/skills/openclaw-native/heartbeat-governor/example-state.yaml new file mode 100644 index 0000000..6a93eb7 --- /dev/null +++ b/skills/openclaw-native/heartbeat-governor/example-state.yaml @@ -0,0 +1,64 @@ +# Example runtime state for heartbeat-governor +skill_ledgers: + morning-briefing: + budget: + max_usd_monthly: 4.00 + max_usd_per_run: 0.30 + max_wall_minutes: 15 + max_runs_daily: 1 + paused: false + pause_reason: null + paused_at: null + runs: + - ran_at: "2026-03-15T07:00:05.000000" + usd_spent: 0.18 + wall_minutes: 6.2 + - ran_at: "2026-03-14T07:00:03.000000" + usd_spent: 0.21 + wall_minutes: 7.1 + long-running-task-management: + budget: + max_usd_monthly: 5.00 + max_usd_per_run: 0.50 + max_wall_minutes: 30 + max_runs_daily: 96 + paused: true + pause_reason: "30-day spend $5.12 reached monthly limit $5.00" + paused_at: "2026-03-15T08:00:00.000000" + runs: [] + cron-hygiene: + budget: + max_usd_monthly: 1.00 + max_usd_per_run: 0.10 + max_wall_minutes: 10 + max_runs_daily: 2 + paused: false + pause_reason: null + paused_at: null + runs: + - ran_at: "2026-03-10T09:00:07.000000" + usd_spent: 0.07 + wall_minutes: 2.1 +breach_log: + - skill_name: long-running-task-management + breach_type: monthly_usd + value: 5.12 + limit: 5.00 + breached_at: "2026-03-15T08:00:00.000000" + resolved: false +monthly_summary: {} +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Hourly cron runs: python3 governor.py --check +# +# Heartbeat Governor — 2026-03-15 08:00 +# ────────────────────────────────────────────────────────────── +# ⏸ Paused: long-running-task-management +# +# python3 governor.py --status +# Skill Spend Budget % Status +# cron-hygiene $0.07 $1.00 7% ✓ +# long-running-task-management $5.12 $5.00 102% ⏸ PAUSED +# morning-briefing $0.39 $4.00 10% ✓ +# +# python3 governor.py --resume long-running-task-management +# ✓ Resumed 'long-running-task-management'. Will fire on next scheduled run. diff --git a/skills/openclaw-native/heartbeat-governor/governor.py b/skills/openclaw-native/heartbeat-governor/governor.py new file mode 100755 index 0000000..39664a6 --- /dev/null +++ b/skills/openclaw-native/heartbeat-governor/governor.py @@ -0,0 +1,333 @@ +#!/usr/bin/env python3 +""" +Heartbeat Governor for openclaw-superpowers. + +Enforces per-skill execution budgets for cron skills. +Pauses runaway skills before they drain your monthly API allowance. + +Usage: + python3 governor.py --check # Hourly cron check + python3 governor.py --status # Current utilisation + python3 governor.py --record --usd 0.12 --minutes 4 + python3 governor.py --pause # Manual pause + python3 governor.py --resume # Resume after review + python3 governor.py --set-budget --monthly 10.00 [--per-run 1.00] + python3 governor.py --report # Monthly spend report + python3 governor.py --format json +""" + +import argparse +import json +import os +from datetime import datetime, timedelta +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "heartbeat-governor" / "state.yaml" + +DEFAULT_BUDGET = { + "max_usd_monthly": 5.0, + "max_usd_per_run": 0.50, + "max_wall_minutes": 30, + "max_runs_daily": 48, +} +ROLLING_DAYS = 30 +MAX_BREACH_LOG = 200 + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"skill_ledgers": {}, "breach_log": [], "monthly_summary": {}} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Ledger helpers ──────────────────────────────────────────────────────────── + +def get_ledger(state: dict, skill_name: str) -> dict: + ledgers = state.setdefault("skill_ledgers", {}) + if skill_name not in ledgers: + ledgers[skill_name] = { + "budget": dict(DEFAULT_BUDGET), + "paused": False, + "pause_reason": None, + "paused_at": None, + "runs": [], + } + return ledgers[skill_name] + + +def prune_old_runs(runs: list) -> list: + cutoff = datetime.now() - timedelta(days=ROLLING_DAYS) + return [r for r in runs if _parse_dt(r.get("ran_at", "")) >= cutoff] + + +def _parse_dt(s: str) -> datetime: + try: + return datetime.fromisoformat(s) + except Exception: + return datetime.min + + +def rolling_usd(runs: list) -> float: + return sum(r.get("usd_spent", 0) for r in runs) + + +def runs_today(runs: list) -> int: + today = datetime.now().date() + return sum(1 for r in runs if _parse_dt(r.get("ran_at", "")).date() == today) + + +def add_breach(state: dict, skill_name: str, breach_type: str, + value: float, limit: float) -> None: + breach_log = state.setdefault("breach_log", []) + breach_log.append({ + "skill_name": skill_name, + "breach_type": breach_type, + "value": round(value, 4), + "limit": round(limit, 4), + "breached_at": datetime.now().isoformat(), + "resolved": False, + }) + state["breach_log"] = breach_log[-MAX_BREACH_LOG:] + + +def pause_skill(state: dict, skill_name: str, reason: str) -> None: + ledger = get_ledger(state, skill_name) + ledger["paused"] = True + ledger["pause_reason"] = reason + ledger["paused_at"] = datetime.now().isoformat() + print(f" ⏸ PAUSED: {skill_name} — {reason}") + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_record(state: dict, skill_name: str, usd: float, minutes: float) -> None: + ledger = get_ledger(state, skill_name) + ledger["runs"] = prune_old_runs(ledger.get("runs") or []) + + now = datetime.now().isoformat() + run = {"ran_at": now, "usd_spent": usd, "wall_minutes": minutes} + + # Per-run checks + budget = ledger.get("budget") or DEFAULT_BUDGET + per_run_limit = budget.get("max_usd_per_run", DEFAULT_BUDGET["max_usd_per_run"]) + wall_limit = budget.get("max_wall_minutes", DEFAULT_BUDGET["max_wall_minutes"]) + + if usd > per_run_limit: + add_breach(state, skill_name, "per_run_usd", usd, per_run_limit) + print(f"⚠ {skill_name}: per-run spend ${usd:.2f} exceeds limit ${per_run_limit:.2f}") + + if minutes > wall_limit: + add_breach(state, skill_name, "wall_clock", minutes, wall_limit) + print(f"⚠ {skill_name}: wall-clock {minutes:.1f}m exceeds limit {wall_limit}m") + + ledger["runs"].append(run) + save_state(state) + print(f"✓ Recorded run for '{skill_name}': ${usd:.4f} in {minutes:.1f}m") + + +def cmd_check(state: dict, fmt: str) -> None: + """Hourly cron check — evaluate all skill budgets.""" + ledgers = state.get("skill_ledgers") or {} + paused_now = [] + alerts = [] + + for skill_name, ledger in ledgers.items(): + if ledger.get("paused"): + continue + + budget = ledger.get("budget") or DEFAULT_BUDGET + ledger["runs"] = prune_old_runs(ledger.get("runs") or []) + + # Monthly budget check + monthly_limit = budget.get("max_usd_monthly", DEFAULT_BUDGET["max_usd_monthly"]) + total = rolling_usd(ledger["runs"]) + if total >= monthly_limit: + reason = f"30-day spend ${total:.2f} reached monthly limit ${monthly_limit:.2f}" + pause_skill(state, skill_name, reason) + add_breach(state, skill_name, "monthly_usd", total, monthly_limit) + paused_now.append(skill_name) + alerts.append({"skill": skill_name, "breach": "monthly_usd", + "value": total, "limit": monthly_limit}) + continue + + # Daily runs check + daily_limit = budget.get("max_runs_daily", DEFAULT_BUDGET["max_runs_daily"]) + today_runs = runs_today(ledger["runs"]) + if today_runs >= daily_limit: + alerts.append({"skill": skill_name, "breach": "daily_runs", + "value": today_runs, "limit": daily_limit}) + + now = datetime.now().isoformat() + if fmt == "json": + print(json.dumps({ + "checked_at": now, + "paused_this_run": paused_now, + "alerts": alerts, + }, indent=2)) + else: + print(f"\nHeartbeat Governor — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("─" * 48) + if paused_now: + for name in paused_now: + print(f" ⏸ Paused: {name}") + if not paused_now and not alerts: + print(" ✓ All skills within budget.") + for a in alerts: + if a["breach"] == "daily_runs": + print(f" ⚠ {a['skill']}: {int(a['value'])} runs today " + f"(limit {int(a['limit'])})") + print() + + save_state(state) + + +def cmd_status(state: dict, fmt: str) -> None: + ledgers = state.get("skill_ledgers") or {} + rows = [] + for skill_name, ledger in sorted(ledgers.items()): + budget = ledger.get("budget") or DEFAULT_BUDGET + ledger["runs"] = prune_old_runs(ledger.get("runs") or []) + total = rolling_usd(ledger["runs"]) + monthly_limit = budget.get("max_usd_monthly", DEFAULT_BUDGET["max_usd_monthly"]) + pct = int(100 * total / monthly_limit) if monthly_limit else 0 + rows.append({ + "skill": skill_name, + "paused": ledger.get("paused", False), + "monthly_usd": round(total, 4), + "monthly_limit": monthly_limit, + "pct": pct, + }) + + if fmt == "json": + print(json.dumps(rows, indent=2)) + return + + print(f"\nHeartbeat Governor — Skill Budget Status") + print("─" * 55) + print(f" {'Skill':30s} {'Spend':>7s} {'Budget':>7s} {'%':>4s} Status") + for r in rows: + status = "⏸ PAUSED" if r["paused"] else ("⚠" if r["pct"] >= 80 else "✓") + print(f" {r['skill']:30s} ${r['monthly_usd']:>6.2f} " + f"${r['monthly_limit']:>6.2f} {r['pct']:>3d}% {status}") + print() + + +def cmd_pause(state: dict, skill_name: str) -> None: + pause_skill(state, skill_name, "Manual pause") + save_state(state) + + +def cmd_resume(state: dict, skill_name: str) -> None: + ledger = get_ledger(state, skill_name) + ledger["paused"] = False + ledger["pause_reason"] = None + ledger["paused_at"] = None + save_state(state) + print(f"✓ Resumed '{skill_name}'. Will fire on next scheduled run.") + + +def cmd_set_budget(state: dict, skill_name: str, monthly: float, + per_run: float, wall_minutes: int, daily_runs: int) -> None: + ledger = get_ledger(state, skill_name) + budget = ledger.setdefault("budget", dict(DEFAULT_BUDGET)) + if monthly is not None: + budget["max_usd_monthly"] = monthly + if per_run is not None: + budget["max_usd_per_run"] = per_run + if wall_minutes is not None: + budget["max_wall_minutes"] = wall_minutes + if daily_runs is not None: + budget["max_runs_daily"] = daily_runs + save_state(state) + print(f"✓ Budget updated for '{skill_name}': {budget}") + + +def cmd_report(state: dict, fmt: str) -> None: + ledgers = state.get("skill_ledgers") or {} + month_start = datetime.now().replace(day=1, hour=0, minute=0, second=0) + rows = [] + for skill_name, ledger in sorted(ledgers.items()): + runs = [r for r in (ledger.get("runs") or []) + if _parse_dt(r.get("ran_at", "")) >= month_start] + total = sum(r.get("usd_spent", 0) for r in runs) + rows.append({"skill": skill_name, "runs": len(runs), + "total_usd": round(total, 4)}) + + grand_total = sum(r["total_usd"] for r in rows) + + if fmt == "json": + print(json.dumps({"rows": rows, "grand_total_usd": round(grand_total, 4)}, + indent=2)) + return + + print(f"\nMonthly Spend Report — {datetime.now().strftime('%B %Y')}") + print("─" * 48) + for r in rows: + print(f" {r['skill']:35s} {r['runs']:3d} runs ${r['total_usd']:.4f}") + print(f" {'TOTAL':35s} ${grand_total:.4f}") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Heartbeat Governor") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--check", action="store_true", + help="Hourly budget check (cron entry point)") + group.add_argument("--status", action="store_true") + group.add_argument("--record", metavar="SKILL") + group.add_argument("--pause", metavar="SKILL") + group.add_argument("--resume", metavar="SKILL") + group.add_argument("--set-budget", metavar="SKILL") + group.add_argument("--report", action="store_true") + parser.add_argument("--usd", type=float, default=0.0) + parser.add_argument("--minutes", type=float, default=0.0) + parser.add_argument("--monthly", type=float) + parser.add_argument("--per-run", type=float) + parser.add_argument("--wall-minutes", type=int) + parser.add_argument("--daily-runs", type=int) + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + + if args.check: + cmd_check(state, args.format) + elif args.status: + cmd_status(state, args.format) + elif args.record: + cmd_record(state, args.record, args.usd, args.minutes) + elif args.pause: + cmd_pause(state, args.pause) + elif args.resume: + cmd_resume(state, args.resume) + elif args.set_budget: + cmd_set_budget(state, args.set_budget, args.monthly, args.per_run, + args.wall_minutes, args.daily_runs) + elif args.report: + cmd_report(state, args.format) + + +if __name__ == "__main__": + main() From 206fcbf062b69761cea6712efb0ddc430ffdf6fe Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Sun, 15 Mar 2026 23:57:09 +0530 Subject: [PATCH 08/23] Add skill-portability-checker: validate OS/binary dependencies in scripts (#23) Detects OS_SPECIFIC_CALL, MISSING_BINARY, BREW_ONLY, PYTHON_IMPORT, and HARDCODED_PATH issues in companion scripts. Cross-checks against os_filter: frontmatter field. No external dependencies. Co-authored-by: Claude Sonnet 4.6 --- .../core/skill-portability-checker/SKILL.md | 92 +++++ .../core/skill-portability-checker/check.py | 369 ++++++++++++++++++ 2 files changed, 461 insertions(+) create mode 100644 skills/core/skill-portability-checker/SKILL.md create mode 100755 skills/core/skill-portability-checker/check.py diff --git a/skills/core/skill-portability-checker/SKILL.md b/skills/core/skill-portability-checker/SKILL.md new file mode 100644 index 0000000..41037da --- /dev/null +++ b/skills/core/skill-portability-checker/SKILL.md @@ -0,0 +1,92 @@ +--- +name: skill-portability-checker +version: "1.0" +category: core +description: Validates that a skill's companion scripts declare their OS and binary dependencies correctly, and checks whether those dependencies are actually present on the current machine. +--- + +# Skill Portability Checker + +## What it does + +Skills with companion scripts (`.py`, `.sh`) can silently fail on machines where their dependencies aren't installed. A skill written on macOS may call `brew`, `pbcopy`, or use `/usr/local/bin` paths that don't exist on Linux. A Python script may `import pandas` on a system without it. + +Skill Portability Checker: +1. Scans companion scripts for OS-specific patterns and external binary calls +2. Checks whether those binaries are present on the current system (`PATH` lookup + `which`) +3. Cross-checks against the skill's declared `os_filter:` frontmatter field (if any) +4. Reports portability issues before the skill fails at runtime + +## Frontmatter field checked + +```yaml +--- +name: my-skill +os_filter: [macos] # optional: ["macos", "linux", "windows"] +--- +``` + +If `os_filter:` is absent the skill is treated as cross-platform. The checker then warns if OS-specific calls are detected without a corresponding `os_filter:`. + +## Checks performed + +| Check | Description | +|---|---| +| OS_SPECIFIC_CALL | Script calls macOS/Linux/Windows-only binary without `os_filter:` | +| MISSING_BINARY | Required binary not found on current system PATH | +| BREW_ONLY | Script uses `brew` (macOS-only) but `os_filter:` includes non-macOS | +| PYTHON_IMPORT | Script imports a non-stdlib module; checks if importable | +| HARDCODED_PATH | Absolute path that doesn't exist on this machine (`/usr/local`, `C:\`) | + +## How to use + +```bash +python3 check.py --check # Full portability scan +python3 check.py --check --skill my-skill # Single skill +python3 check.py --fix-hints my-skill # Print fix suggestions +python3 check.py --format json +``` + +## Procedure + +**Step 1 — Run the scan** + +```bash +python3 check.py --check +``` + +**Step 2 — Triage FAILs first** + +- **MISSING_BINARY**: The script calls a binary that isn't installed. Either install it or add a graceful fallback in the script. +- **OS_SPECIFIC_CALL without os_filter**: Add `os_filter: [macos]` (or whichever OS applies) to the frontmatter so users on other platforms know the skill won't work. + +**Step 3 — Review WARNs** + +- **PYTHON_IMPORT**: Install the missing module or add a `try/except ImportError` with a graceful degradation path (like `HAS_MODULE = False`). +- **HARDCODED_PATH**: Replace with `Path.home()` or environment-variable-based paths. + +**Step 4 — Add os_filter when needed** + +If a skill genuinely only works on one OS, declare it: + +```yaml +os_filter: [macos] +``` + +This prevents the skill from being shown as broken on other platforms — it simply won't be loaded there. + +## Output example + +``` +Skill Portability Report — linux (Python 3.11) +──────────────────────────────────────────────── +32 skills checked | 1 FAIL | 2 WARN + +FAIL obsidian-sync: MISSING_BINARY + sync.py calls `osascript` — not found on this system + Fix: add os_filter: [macos] to frontmatter (osascript is macOS-only) + +WARN morning-briefing: PYTHON_IMPORT + run.py imports `pync` — not importable on this system + Fix: wrap in try/except ImportError and degrade gracefully +``` diff --git a/skills/core/skill-portability-checker/check.py b/skills/core/skill-portability-checker/check.py new file mode 100755 index 0000000..673940b --- /dev/null +++ b/skills/core/skill-portability-checker/check.py @@ -0,0 +1,369 @@ +#!/usr/bin/env python3 +""" +Skill Portability Checker for openclaw-superpowers. + +Validates companion script OS/binary dependencies and checks whether +they are present on the current machine. + +Usage: + python3 check.py --check + python3 check.py --check --skill obsidian-sync + python3 check.py --fix-hints + python3 check.py --format json +""" + +import argparse +import importlib +import json +import os +import platform +import re +import shutil +import sys +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +SKILLS_DIRS = [ + SUPERPOWERS_DIR / "skills" / "core", + SUPERPOWERS_DIR / "skills" / "openclaw-native", + SUPERPOWERS_DIR / "skills" / "community", +] + +# Known OS-specific binaries +MACOS_ONLY_BINARIES = { + "osascript", "pbcopy", "pbpaste", "open", "launchctl", "caffeinate", + "defaults", "plutil", "say", "afplay", "mdfind", "mdls", +} +LINUX_ONLY_BINARIES = { + "systemctl", "journalctl", "apt", "apt-get", "dpkg", "yum", "dnf", + "pacman", "snap", "xclip", "xdotool", "notify-send", "xdg-open", +} +BREW_BINARY = "brew" + +# Stdlib modules (not exhaustive, covers common ones) +STDLIB_MODULES = { + "os", "sys", "re", "json", "yaml", "pathlib", "datetime", "time", + "collections", "itertools", "functools", "typing", "io", "math", + "random", "hashlib", "hmac", "base64", "struct", "copy", "enum", + "abc", "dataclasses", "contextlib", "threading", "subprocess", + "shutil", "tempfile", "glob", "fnmatch", "stat", "socket", "http", + "urllib", "email", "csv", "sqlite3", "logging", "unittest", "argparse", + "configparser", "getpass", "platform", "importlib", "inspect", + "traceback", "warnings", "weakref", "gc", "signal", "textwrap", + "string", "difflib", "html", "xml", "pprint", "decimal", "fractions", +} + +# Patterns for detecting binary calls in scripts +BINARY_CALL_RE = re.compile( + r'(?:subprocess\.(?:run|Popen|call|check_output|check_call)\s*\(\s*[\[\(]?\s*["\'])([a-z_\-]+)', + re.I +) +SHELL_CALL_RE = re.compile(r'(?:os\.system|os\.popen)\s*\(\s*["\']([a-z_\-]+)', re.I) +SHUTIL_WHICH_RE = re.compile(r'shutil\.which\s*\(\s*["\']([a-z_\-]+)', re.I) +IMPORT_RE = re.compile(r'^(?:import|from)\s+([a-zA-Z_][a-zA-Z0-9_]*)', re.M) +HARDCODED_PATH_RE = re.compile( + r'["\'](?:/usr/local/|/opt/homebrew/|/home/[a-z]+/|C:\\\\)', re.I +) + + +def current_os() -> str: + s = platform.system().lower() + if s == "darwin": + return "macos" + if s == "windows": + return "windows" + return "linux" + + +# ── Frontmatter ─────────────────────────────────────────────────────────────── + +def parse_frontmatter(skill_md: Path) -> dict: + try: + text = skill_md.read_text() + lines = text.splitlines() + if not lines or lines[0].strip() != "---": + return {} + end = None + for i, line in enumerate(lines[1:], 1): + if line.strip() == "---": + end = i + break + if end is None: + return {} + fm_text = "\n".join(lines[1:end]) + if HAS_YAML: + return yaml.safe_load(fm_text) or {} + fields = {} + for line in fm_text.splitlines(): + if ":" in line and not line.startswith(" "): + k, _, v = line.partition(":") + fields[k.strip()] = v.strip().strip('"').strip("'") + return fields + except Exception: + return {} + + +# ── Script analysis ─────────────────────────────────────────────────────────── + +def extract_binary_calls(text: str) -> set[str]: + binaries = set() + for pattern in (BINARY_CALL_RE, SHELL_CALL_RE, SHUTIL_WHICH_RE): + for m in pattern.finditer(text): + binaries.add(m.group(1).lower()) + return binaries + + +def extract_imports(text: str) -> set[str]: + imports = set() + for m in IMPORT_RE.finditer(text): + mod = m.group(1).split(".")[0] + if mod not in STDLIB_MODULES: + imports.add(mod) + return imports + + +def is_importable(module: str) -> bool: + try: + importlib.import_module(module) + return True + except ImportError: + return False + + +def binary_present(name: str) -> bool: + return shutil.which(name) is not None + + +# ── Skill checker ───────────────────────────────────────────────────────────── + +def check_skill(skill_dir: Path, os_name: str) -> list[dict]: + skill_name = skill_dir.name + skill_md = skill_dir / "SKILL.md" + if not skill_md.exists(): + return [] + + fm = parse_frontmatter(skill_md) + os_filter = fm.get("os_filter") or [] + if isinstance(os_filter, str): + os_filter = [os_filter] + os_filter = [str(o).lower() for o in os_filter] + + issues = [] + + def issue(level, check, file_path, detail, fix_hint): + return { + "skill_name": skill_name, + "level": level, + "check": check, + "file": str(file_path), + "detail": detail, + "fix_hint": fix_hint, + } + + # Scan companion scripts + for script in skill_dir.iterdir(): + if not script.is_file(): + continue + if script.suffix not in (".py", ".sh"): + continue + + try: + text = script.read_text(errors="replace") + except Exception: + continue + + # Binary call analysis + binaries = extract_binary_calls(text) + + for binary in binaries: + # macOS-only binary + if binary in MACOS_ONLY_BINARIES: + if os_filter and "macos" not in os_filter: + issues.append(issue( + "FAIL", "OS_SPECIFIC_CALL", script, + f"Calls `{binary}` (macOS-only) but os_filter excludes macOS", + f"Remove macOS calls or set `os_filter: [macos]` in frontmatter." + )) + elif not os_filter: + issues.append(issue( + "WARN", "OS_SPECIFIC_CALL", script, + f"Calls `{binary}` (macOS-only) but no os_filter declared", + f"Add `os_filter: [macos]` to frontmatter." + )) + if os_name != "macos" and binary not in os_filter: + if not binary_present(binary): + issues.append(issue( + "FAIL", "MISSING_BINARY", script, + f"`{binary}` not found on this system", + f"Install `{binary}` or add `os_filter: [macos]` to frontmatter." + )) + + # Linux-only binary + elif binary in LINUX_ONLY_BINARIES: + if os_filter and "linux" not in os_filter: + issues.append(issue( + "FAIL", "OS_SPECIFIC_CALL", script, + f"Calls `{binary}` (Linux-only) but os_filter excludes Linux", + "Remove Linux-specific calls or add `linux` to os_filter." + )) + elif not os_filter: + issues.append(issue( + "WARN", "OS_SPECIFIC_CALL", script, + f"Calls `{binary}` (Linux-only) but no os_filter declared", + "Add `os_filter: [linux]` to frontmatter." + )) + + # brew special case + elif binary == BREW_BINARY: + issues.append(issue( + "WARN", "BREW_ONLY", script, + "Script calls `brew` (Homebrew/macOS-only)", + "Add `os_filter: [macos]` or use a cross-platform alternative." + )) + + # General binary — check if present + else: + if not binary_present(binary) and binary not in ( + "python3", "python", "bash", "sh", "openclaw" + ): + issues.append(issue( + "WARN", "MISSING_BINARY", script, + f"`{binary}` not found on PATH", + f"Install `{binary}` or add a fallback when it's missing." + )) + + # Hardcoded paths + if HARDCODED_PATH_RE.search(text): + issues.append(issue( + "WARN", "HARDCODED_PATH", script, + "Script contains hardcoded absolute paths that may not exist on all systems", + "Replace with `Path.home()` or environment-variable-based paths." + )) + + # Python imports (only for .py files) + if script.suffix == ".py": + imports = extract_imports(text) + for mod in imports: + if not is_importable(mod): + issues.append(issue( + "WARN", "PYTHON_IMPORT", script, + f"imports `{mod}` which is not installed on this system", + f"Install with `pip install {mod}` or add try/except ImportError." + )) + + # os_filter correctness: if os_filter present, check it's valid values + valid_os_values = {"macos", "linux", "windows"} + for os_val in os_filter: + if os_val not in valid_os_values: + issues.append(issue( + "WARN", "INVALID_OS_FILTER", skill_md, + f"os_filter contains unknown value: `{os_val}`", + f"Valid values: {sorted(valid_os_values)}" + )) + + return issues + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_check(single_skill: str, fmt: str) -> None: + os_name = current_os() + all_issues = [] + skills_checked = 0 + + for skills_root in SKILLS_DIRS: + if not skills_root.exists(): + continue + for skill_dir in sorted(skills_root.iterdir()): + if not skill_dir.is_dir(): + continue + if single_skill and skill_dir.name != single_skill: + continue + issues = check_skill(skill_dir, os_name) + all_issues.extend(issues) + skills_checked += 1 + + fails = sum(1 for i in all_issues if i["level"] == "FAIL") + warns = sum(1 for i in all_issues if i["level"] == "WARN") + py_ver = f"Python {sys.version_info.major}.{sys.version_info.minor}" + + if fmt == "json": + print(json.dumps({ + "os": os_name, + "python_version": py_ver, + "skills_checked": skills_checked, + "fail_count": fails, + "warn_count": warns, + "issues": all_issues, + }, indent=2)) + else: + print(f"\nSkill Portability Report — {os_name} ({py_ver})") + print("─" * 50) + print(f" {skills_checked} skills checked | {fails} FAIL | {warns} WARN") + print() + if not all_issues: + print(" ✓ All skills portable on this system.") + else: + by_skill: dict = {} + for iss in all_issues: + by_skill.setdefault(iss["skill_name"], []).append(iss) + for sname, issues in sorted(by_skill.items()): + for iss in issues: + icon = "✗" if iss["level"] == "FAIL" else "⚠" + print(f" {icon} {sname}: {iss['check']}") + print(f" {iss['detail']}") + print(f" Fix: {iss['fix_hint']}") + print() + print() + + sys.exit(1 if fails > 0 else 0) + + +def cmd_fix_hints(skill_name: str) -> None: + os_name = current_os() + for skills_root in SKILLS_DIRS: + skill_dir = skills_root / skill_name + if skill_dir.exists(): + issues = check_skill(skill_dir, os_name) + if not issues: + print(f"✓ No portability issues found for '{skill_name}'.") + return + print(f"\nFix hints for: {skill_name}") + print("─" * 40) + for iss in issues: + print(f" [{iss['level']}] {iss['check']}") + print(f" {iss['detail']}") + print(f" → {iss['fix_hint']}") + print() + return + print(f"Skill '{skill_name}' not found.") + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Skill Portability Checker") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--check", action="store_true") + group.add_argument("--fix-hints", metavar="SKILL") + parser.add_argument("--skill", metavar="NAME", help="Check single skill only") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + if args.fix_hints: + cmd_fix_hints(args.fix_hints) + elif args.check: + cmd_check(single_skill=args.skill, fmt=args.format) + + +if __name__ == "__main__": + main() From 1d2ed21b81c2619aaf801b1935b747c120c3a0ce Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:01:51 +0530 Subject: [PATCH 09/23] Update README: document all 39 skills across 3 categories Adds 8 new integration-focused skills to the tables: - Core: skill-trigger-tester, skill-conflict-detector, skill-portability-checker - OpenClaw-native: skill-doctor, installed-skill-auditor, skill-loadout-manager, skill-compatibility-checker, heartbeat-governor Expands security section from 3 to 5 skills (adds installed-skill-auditor, skill-doctor). Updates companion script list and total skill counts. Co-Authored-By: Claude Sonnet 4.6 --- README.md | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/README.md b/README.md index 157ff02..cfb7349 100644 --- a/README.md +++ b/README.md @@ -42,7 +42,7 @@ That's it. Your agent now has superpowers. ## Skills Included -### Core (12 skills) +### Core (15 skills) Methodology skills that work in any runtime. Adapted from [obra/superpowers](https://github.com/obra/superpowers) plus OpenClaw-specific additions. @@ -60,8 +60,11 @@ Methodology skills that work in any runtime. Adapted from [obra/superpowers](htt | `skill-vetting` | Security scanner for ClawHub skills before installing | `vet.sh` | | `project-onboarding` | Crawls a new codebase to generate a `PROJECT.md` context file | `onboard.py` | | `fact-check-before-trust` | Secondary verification pass for factual claims before acting on them | — | +| `skill-trigger-tester` | Scores a skill's description against sample prompts to predict trigger reliability | `test.py` | +| `skill-conflict-detector` | Detects name shadowing and description-overlap conflicts between installed skills | `detect.py` | +| `skill-portability-checker` | Validates OS/binary dependencies in companion scripts; catches non-portable calls | `check.py` | -### OpenClaw-Native (18 skills) +### OpenClaw-Native (23 skills) Skills that require OpenClaw's persistent runtime — cron scheduling, session state, or long-running execution. Not useful in session-based tools. @@ -85,6 +88,11 @@ Skills that require OpenClaw's persistent runtime — cron scheduling, session s | `multi-agent-coordinator` | Manages parallel agent fleets: health checks, consistency, handoffs | — | ✓ | `run.py` | | `cron-hygiene` | Audits cron skills for session mode waste and token efficiency | Mondays 9am | ✓ | `audit.py` | | `channel-context-bridge` | Writes a resumé card at session end for seamless channel switching | — | ✓ | `bridge.py` | +| `skill-doctor` | Diagnoses silent skill discovery failures — YAML errors, path violations, schema mismatches | — | ✓ | `doctor.py` | +| `installed-skill-auditor` | Weekly post-install audit of all skills for injection, credentials, and drift | Mondays 9am | ✓ | `audit.py` | +| `skill-loadout-manager` | Named skill profiles to manage active skill sets and prevent system prompt bloat | — | ✓ | `loadout.py` | +| `skill-compatibility-checker` | Checks installed skills against the current OpenClaw version for feature compatibility | — | ✓ | `check.py` | +| `heartbeat-governor` | Enforces per-skill execution budgets for cron skills; auto-pauses runaway skills | every hour | ✓ | `governor.py` | ### Community (1 skill) @@ -104,7 +112,7 @@ Stateful skills commit a `STATE_SCHEMA.yaml` defining the shape of their runtime Skills marked with a script in the table above ship a small executable alongside their `SKILL.md`: -- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. +- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. - **`vet.sh`** — Pure bash scanner; runs on any system with grep. - Each script supports `--help` and prints a human-readable summary. JSON output available where useful (`--format json`). Dry-run mode available on scripts that make changes. - See the `example-state.yaml` in each skill directory for sample state and a commented walkthrough of the skill's cron behaviour. @@ -113,13 +121,15 @@ Skills marked with a script in the table above ship a small executable alongside ## Security skills at a glance -Three skills address the documented top security risks for OpenClaw agents: +Five skills address the documented top security risks for OpenClaw agents: | Threat | Skill | How | |---|---|---| | Malicious skill install (36% of ClawHub skills contain injection payloads) | `skill-vetting` | Scans before install — 6 security flags, SAFE / CAUTION / DO NOT INSTALL | | Runtime injection from emails, web pages, scraped data | `prompt-injection-guard` | Detects 6 signal types at runtime; blocks on 2+ signals | | Agent takes destructive action without confirmation | `dangerous-action-guard` | Pre-execution gate with 5-min expiry window and full audit trail | +| Post-install skill tampering or credential injection | `installed-skill-auditor` | Weekly content-hash drift detection; INJECTION / CREDENTIAL / EXFILTRATION checks | +| Silent skill loading failures hiding broken skills | `skill-doctor` | 6 diagnostic checks per skill; surfaces every load-time failure before it disappears | --- @@ -129,6 +139,7 @@ obra/superpowers was built for session-based tools (Claude Code, Cursor, Codex). - Runs **24/7**, not just per-session - Handles tasks that take **hours, not minutes** +- Has **native cron scheduling** — skills wake up automatically on a schedule - Needs skills around **handoff, memory persistence, and self-recovery** that session tools don't require The OpenClaw-native skills in this repo exist because of that difference. From 93664bf549fd6f76cc4e191474ccf5574ce1cdc6 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:18:42 +0530 Subject: [PATCH 10/23] Add community-skill-radar: scan Reddit for skill ideas every 3 days (#26) Searches 5 subreddits (openclaw, LocalLLaMA, ClaudeAI, MachineLearning, AIAgents) via Reddit's public JSON API. Scores candidates by upvotes, comments, recurrence, and keyword density. Writes prioritized PROPOSALS.md for review. Cron: every 3 days at 9am. Co-authored-by: Claude Sonnet 4.6 --- .../community-skill-radar/SKILL.md | 138 ++++ .../community-skill-radar/STATE_SCHEMA.yaml | 46 ++ .../community-skill-radar/example-state.yaml | 84 +++ .../community-skill-radar/radar.py | 602 ++++++++++++++++++ 4 files changed, 870 insertions(+) create mode 100644 skills/openclaw-native/community-skill-radar/SKILL.md create mode 100644 skills/openclaw-native/community-skill-radar/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/community-skill-radar/example-state.yaml create mode 100755 skills/openclaw-native/community-skill-radar/radar.py diff --git a/skills/openclaw-native/community-skill-radar/SKILL.md b/skills/openclaw-native/community-skill-radar/SKILL.md new file mode 100644 index 0000000..8a798e7 --- /dev/null +++ b/skills/openclaw-native/community-skill-radar/SKILL.md @@ -0,0 +1,138 @@ +--- +name: community-skill-radar +version: "1.0" +category: openclaw-native +description: Searches Reddit communities for OpenClaw pain points and feature requests, scores them by signal strength, and writes a prioritized PROPOSALS.md for you to review and act on. +stateful: true +cron: "0 9 */3 * *" +--- + +# Community Skill Radar + +## What it does + +Your best skill ideas don't come from guessing — they come from what the community is actually struggling with. Community Skill Radar scans Reddit every 3 days for posts and comments mentioning OpenClaw pain points, feature requests, and skill gaps. It scores them by signal strength (upvotes, comment depth, recurrence) and writes a prioritized `PROPOSALS.md` in the repo root. + +You review the proposals. You decide what to build. The radar just makes sure you never miss a signal. + +## When to invoke + +- Automatically, every 3 days (cron) +- Manually when you want a fresh pulse-check on community needs +- Before planning a new batch of skills + +## Subreddits searched + +| Subreddit | Why | +|---|---| +| `openclaw` | Primary OpenClaw community | +| `LocalLLaMA` | Local AI users — many run OpenClaw | +| `ClaudeAI` | Claude ecosystem — overlaps with OpenClaw users | +| `MachineLearning` | Broader AI practitioners | +| `AIAgents` | Agent-specific discussions | + +Custom subreddits can be configured via `--subreddits`. + +## Signal scoring + +Each candidate is scored on 5 dimensions: + +| Signal | Weight | Source | +|---|---|---| +| Upvotes | 2x | Post/comment score | +| Comment depth | 1.5x | Number of replies — more discussion = stronger signal | +| Recurrence | 3x | Same pain point appearing across multiple posts | +| Keyword density | 1x | Concentration of problem/request keywords | +| Recency | 1.5x | Newer posts score higher (7-day decay) | + +## How to use + +```bash +python3 radar.py --scan # Full scan, write PROPOSALS.md +python3 radar.py --scan --lookback 7 # Scan last 7 days (default: 3) +python3 radar.py --scan --subreddits openclaw,LocalLLaMA +python3 radar.py --scan --min-score 5.0 # Only proposals scoring ≥5.0 +python3 radar.py --status # Last scan summary from state +python3 radar.py --history # Show past scan results +python3 radar.py --format json # Machine-readable output +``` + +## Cron wakeup behaviour + +Every 3 days at 9am: + +1. Fetch recent posts from each configured subreddit via Reddit's public JSON API (no auth required) +2. Filter for posts/comments containing OpenClaw-related keywords +3. Extract pain points and feature request signals +4. Score each candidate +5. Deduplicate against previously seen proposals (stored in state) +6. Write `PROPOSALS.md` to the repo root +7. Print summary to stdout + +## PROPOSALS.md format + +```markdown +# Skill Proposals — Community Radar + +*Last scanned: 2026-03-16 09:00 | 5 subreddits | 14 candidates* + +## High Signal (score ≥ 8.0) + +### 1. Skill auto-update mechanism (score: 12.4) +- **Source:** r/openclaw — "Anyone else manually pulling skill updates?" +- **Signal:** 47 upvotes, 23 comments, seen 3 times across 2 subreddits +- **Pain point:** No way to update installed skills without manual git pull +- **Potential skill:** `skill-auto-updater` — checks upstream repos for new versions + +### 2. Context window usage dashboard (score: 9.1) +- **Source:** r/LocalLLaMA — "My openclaw agent keeps losing context mid-task" +- **Signal:** 31 upvotes, 18 comments +- **Pain point:** No visibility into how much context each skill consumes +- **Potential skill:** `context-usage-dashboard` — real-time token budget display + +## Medium Signal (score 4.0–8.0) + +... + +## Previously Seen (already in state — not re-proposed) + +... +``` + +## Procedure + +**Step 1 — Let the cron run (or trigger manually)** + +```bash +python3 radar.py --scan +``` + +**Step 2 — Review PROPOSALS.md** + +Open `PROPOSALS.md` in the repo root. High-signal proposals are the ones the community is loudest about. + +**Step 3 — Act on proposals you want to build** + +For each proposal you decide to build, either: +- Ask your agent to create it: `"Build a skill for using create-skill"` +- Open a GitHub issue for the community + +**Step 4 — Mark proposals as actioned** + +```bash +python3 radar.py --mark-actioned "skill-auto-updater" +``` + +This moves the proposal to the "actioned" list in state so it won't be re-proposed on future scans. + +## State + +Scan results, seen proposals, and actioned items stored in `~/.openclaw/skill-state/community-skill-radar/state.yaml`. + +Fields: `last_scan_at`, `subreddits`, `proposals` list, `actioned` list, `scan_history`. + +## Notes + +- Uses Reddit's public JSON API at `reddit.com//search.json`. No authentication required. Rate-limited to 1 request per 2 seconds to respect Reddit's guidelines. +- Does not post, comment, or interact with Reddit in any way — read-only scanning. +- `PROPOSALS.md` is gitignored by default (local working document). Add to `.gitignore` if not already present. diff --git a/skills/openclaw-native/community-skill-radar/STATE_SCHEMA.yaml b/skills/openclaw-native/community-skill-radar/STATE_SCHEMA.yaml new file mode 100644 index 0000000..d5a2ace --- /dev/null +++ b/skills/openclaw-native/community-skill-radar/STATE_SCHEMA.yaml @@ -0,0 +1,46 @@ +version: "1.0" +description: Community radar scan results, proposal ledger, and actioned tracking. +fields: + last_scan_at: + type: datetime + subreddits: + type: list + description: Subreddits included in the last scan + items: + type: string + proposals: + type: list + description: All proposals from the most recent scan (newest first) + items: + id: { type: string, description: "slug derived from title" } + title: { type: string } + pain_point: { type: string } + potential_skill: { type: string } + score: { type: float } + sources: + type: list + items: + subreddit: { type: string } + post_title: { type: string } + url: { type: string } + upvotes: { type: integer } + comments: { type: integer } + fetched_at: { type: datetime } + first_seen_at: { type: datetime } + times_seen: { type: integer } + actioned: + type: list + description: Proposal IDs that have been acted on (built, filed as issues) + items: + id: { type: string } + actioned_at: { type: datetime } + action: { type: string, description: "built, issue-filed, rejected" } + scan_history: + type: list + description: Rolling log of past scans (last 20) + items: + scanned_at: { type: datetime } + subreddits: { type: integer } + posts_fetched: { type: integer } + candidates_found: { type: integer } + proposals_written: { type: integer } diff --git a/skills/openclaw-native/community-skill-radar/example-state.yaml b/skills/openclaw-native/community-skill-radar/example-state.yaml new file mode 100644 index 0000000..f880f33 --- /dev/null +++ b/skills/openclaw-native/community-skill-radar/example-state.yaml @@ -0,0 +1,84 @@ +# Example runtime state for community-skill-radar +last_scan_at: "2026-03-16T09:00:22.441000" +subreddits: + - openclaw + - LocalLLaMA + - ClaudeAI + - MachineLearning + - AIAgents +proposals: + - id: skill-auto-update-mechanism + title: "Anyone else manually pulling skill updates every time?" + pain_point: "No way to update installed skills without manual git pull" + potential_skill: skill-auto-updater + category: integration + score: 12.4 + sources: + - subreddit: openclaw + post_title: "Anyone else manually pulling skill updates every time?" + url: "https://reddit.com/r/openclaw/comments/abc123/..." + upvotes: 47 + comments: 23 + score: 8.2 + fetched_at: "2026-03-16T09:00:20.000000" + - subreddit: LocalLLaMA + post_title: "OpenClaw skills need an update mechanism" + url: "https://reddit.com/r/LocalLLaMA/comments/def456/..." + upvotes: 18 + comments: 9 + score: 4.2 + fetched_at: "2026-03-16T09:00:21.000000" + first_seen_at: "2026-03-13T09:00:00.000000" + times_seen: 3 + - id: context-window-usage-dashboard + title: "My openclaw agent keeps losing context mid-task" + pain_point: "No visibility into how much context each skill consumes" + potential_skill: context-usage-dashboard + category: context + score: 9.1 + sources: + - subreddit: LocalLLaMA + post_title: "My openclaw agent keeps losing context mid-task" + url: "https://reddit.com/r/LocalLLaMA/comments/ghi789/..." + upvotes: 31 + comments: 18 + score: 9.1 + fetched_at: "2026-03-16T09:00:22.000000" + first_seen_at: "2026-03-16T09:00:22.000000" + times_seen: 1 +actioned: + - id: skill-load-failure-detection + actioned_at: "2026-03-15T10:00:00.000000" + action: built +scan_history: + - scanned_at: "2026-03-16T09:00:22.000000" + subreddits: 5 + posts_fetched: 142 + candidates_found: 8 + proposals_written: 8 + - scanned_at: "2026-03-13T09:00:00.000000" + subreddits: 5 + posts_fetched: 118 + candidates_found: 5 + proposals_written: 5 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Every 3 days cron runs: python3 radar.py --scan +# +# Community Skill Radar — scanning 5 subreddits (last 3 days) +# ────────────────────────────────────────────────────────────── +# Fetching r/openclaw... 28 posts +# Fetching r/LocalLLaMA... 42 posts +# Fetching r/ClaudeAI... 35 posts +# Fetching r/MachineLearning... 22 posts +# Fetching r/AIAgents... 15 posts +# +# Posts fetched : 142 +# Candidates found: 8 +# High signal : 2 +# Medium signal : 4 +# Low signal : 2 +# +# Written to: ~/.openclaw/extensions/superpowers/PROPOSALS.md +# +# python3 radar.py --mark-actioned skill-auto-update-mechanism --action built +# ✓ Marked 'skill-auto-update-mechanism' as built. Won't be re-proposed. diff --git a/skills/openclaw-native/community-skill-radar/radar.py b/skills/openclaw-native/community-skill-radar/radar.py new file mode 100755 index 0000000..a816904 --- /dev/null +++ b/skills/openclaw-native/community-skill-radar/radar.py @@ -0,0 +1,602 @@ +#!/usr/bin/env python3 +""" +Community Skill Radar for openclaw-superpowers. + +Scans Reddit communities for OpenClaw pain points and feature requests. +Scores candidates by signal strength and writes a prioritized PROPOSALS.md. + +Usage: + python3 radar.py --scan # Full scan, write PROPOSALS.md + python3 radar.py --scan --lookback 7 # Scan last 7 days + python3 radar.py --scan --subreddits openclaw,LocalLLaMA + python3 radar.py --scan --min-score 5.0 + python3 radar.py --mark-actioned # Mark proposal as actioned + python3 radar.py --status # Last scan summary + python3 radar.py --history # Past scan results + python3 radar.py --format json +""" + +import argparse +import json +import math +import os +import re +import sys +import time +import urllib.error +import urllib.request +from datetime import datetime, timedelta +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "community-skill-radar" / "state.yaml" +SUPERPOWERS_DIR = Path(os.environ.get( + "SUPERPOWERS_DIR", + Path.home() / ".openclaw" / "extensions" / "superpowers" +)) +PROPOSALS_FILE = SUPERPOWERS_DIR / "PROPOSALS.md" + +DEFAULT_SUBREDDITS = ["openclaw", "LocalLLaMA", "ClaudeAI", "MachineLearning", "AIAgents"] +DEFAULT_LOOKBACK_DAYS = 3 +RATE_LIMIT_SECONDS = 2.0 +MAX_POSTS_PER_SUB = 50 +MAX_HISTORY = 20 +USER_AGENT = "openclaw-superpowers:community-skill-radar:v1.0 (by /u/openclaw-bot)" + +# ── Keywords ────────────────────────────────────────────────────────────────── + +# Keywords that signal an OpenClaw-relevant post +OPENCLAW_KEYWORDS = [ + "openclaw", "open claw", "open-claw", + "skill", "skills", "superpowers", + "agent", "ai agent", "ai assistant", + "cron", "scheduled task", +] + +# Keywords that signal a pain point or feature request +SIGNAL_KEYWORDS = [ + "wish", "want", "need", "missing", "broken", "frustrat", + "bug", "issue", "problem", "annoying", "doesn't work", + "feature request", "would be nice", "someone should", + "why can't", "how do i", "is there a way", + "pain point", "struggle", "stuck", "help", + "workaround", "hack", "janky", "ugly", + "silent", "silently", "no error", "no warning", + "expensive", "cost", "budget", "bill", + "context window", "context limit", "overflow", + "memory", "forget", "lost context", +] + +# Keywords that suggest a potential skill category +SKILL_CATEGORY_KEYWORDS = { + "security": ["injection", "malicious", "credential", "secret", "vulnerability", "attack"], + "cost": ["expensive", "cost", "budget", "spend", "bill", "token", "usage", "price"], + "reliability": ["crash", "loop", "stuck", "hang", "timeout", "retry", "fail", "broken"], + "context": ["context", "memory", "forget", "window", "overflow", "limit", "compaction"], + "workflow": ["workflow", "chain", "pipeline", "orchestrat", "automat", "schedule", "cron"], + "integration": ["install", "load", "config", "setup", "compati", "version", "portab"], + "ux": ["confusing", "unclear", "verbose", "noisy", "silent", "dashboard", "status"], +} + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"proposals": [], "actioned": [], "scan_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Reddit fetcher ──────────────────────────────────────────────────────────── + +def fetch_subreddit(subreddit: str, lookback_days: int) -> list[dict]: + """Fetch recent posts from a subreddit via Reddit's public JSON API.""" + url = (f"https://www.reddit.com/r/{subreddit}/new.json" + f"?limit={MAX_POSTS_PER_SUB}&t=week") + + req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT}) + posts = [] + + try: + with urllib.request.urlopen(req, timeout=15) as resp: + data = json.loads(resp.read().decode()) + + cutoff = datetime.now() - timedelta(days=lookback_days) + + for child in data.get("data", {}).get("children", []): + post = child.get("data", {}) + created = datetime.fromtimestamp(post.get("created_utc", 0)) + if created < cutoff: + continue + + posts.append({ + "subreddit": subreddit, + "post_id": post.get("id", ""), + "title": post.get("title", ""), + "selftext": post.get("selftext", "")[:2000], + "url": f"https://reddit.com{post.get('permalink', '')}", + "upvotes": post.get("score", 0), + "comments": post.get("num_comments", 0), + "created_utc": post.get("created_utc", 0), + "created_at": created.isoformat(), + }) + except (urllib.error.URLError, urllib.error.HTTPError, json.JSONDecodeError, + OSError) as e: + print(f" ⚠ Failed to fetch r/{subreddit}: {e}") + + return posts + + +def fetch_search(query: str, subreddit: str, lookback_days: int) -> list[dict]: + """Search a subreddit for a specific query.""" + encoded_q = urllib.parse.quote(query) + url = (f"https://www.reddit.com/r/{subreddit}/search.json" + f"?q={encoded_q}&restrict_sr=on&sort=new&t=week&limit=25") + + req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT}) + posts = [] + + try: + with urllib.request.urlopen(req, timeout=15) as resp: + data = json.loads(resp.read().decode()) + + cutoff = datetime.now() - timedelta(days=lookback_days) + + for child in data.get("data", {}).get("children", []): + post = child.get("data", {}) + created = datetime.fromtimestamp(post.get("created_utc", 0)) + if created < cutoff: + continue + + posts.append({ + "subreddit": subreddit, + "post_id": post.get("id", ""), + "title": post.get("title", ""), + "selftext": post.get("selftext", "")[:2000], + "url": f"https://reddit.com{post.get('permalink', '')}", + "upvotes": post.get("score", 0), + "comments": post.get("num_comments", 0), + "created_utc": post.get("created_utc", 0), + "created_at": created.isoformat(), + }) + except (urllib.error.URLError, urllib.error.HTTPError, json.JSONDecodeError, + OSError): + pass # search failures are non-critical + + return posts + + +# ── Signal analysis ─────────────────────────────────────────────────────────── + +def is_relevant(post: dict) -> bool: + """Check if a post is relevant to OpenClaw/agent skills.""" + text = (post.get("title", "") + " " + post.get("selftext", "")).lower() + return any(kw in text for kw in OPENCLAW_KEYWORDS) + + +def has_signal(post: dict) -> bool: + """Check if a post contains a pain point or feature request signal.""" + text = (post.get("title", "") + " " + post.get("selftext", "")).lower() + return any(kw in text for kw in SIGNAL_KEYWORDS) + + +def classify_category(text: str) -> str: + """Classify the likely skill category from text.""" + text_lower = text.lower() + scores = {} + for category, keywords in SKILL_CATEGORY_KEYWORDS.items(): + scores[category] = sum(1 for kw in keywords if kw in text_lower) + best = max(scores, key=scores.get) + return best if scores[best] > 0 else "general" + + +def slugify(text: str) -> str: + """Create a slug from text for proposal IDs.""" + slug = re.sub(r'[^a-z0-9\s-]', '', text.lower()) + slug = re.sub(r'[\s-]+', '-', slug).strip('-') + return slug[:60] + + +def extract_pain_point(post: dict) -> str: + """Extract a concise pain point summary from a post.""" + title = post.get("title", "") + text = post.get("selftext", "") + + # Use the title as primary signal + if len(title) > 20: + return title[:120] + + # Fall back to first sentence of selftext + sentences = re.split(r'[.!?\n]', text) + for s in sentences: + s = s.strip() + if len(s) > 20 and any(kw in s.lower() for kw in SIGNAL_KEYWORDS): + return s[:120] + + return title[:120] if title else "(no summary available)" + + +def suggest_skill_name(pain_point: str) -> str: + """Suggest a potential skill name from a pain point.""" + words = re.findall(r'[a-z]+', pain_point.lower()) + # Remove very common words + stopwords = {"the", "a", "an", "is", "are", "was", "to", "for", "in", + "on", "it", "my", "i", "how", "do", "can", "does", "when", + "not", "with", "that", "this", "have", "has", "and", "or", + "but", "be", "been", "any", "there", "just", "way", "get"} + meaningful = [w for w in words if w not in stopwords and len(w) > 2] + if len(meaningful) >= 2: + return "-".join(meaningful[:3]) + return "unnamed-skill" + + +# ── Scoring ─────────────────────────────────────────────────────────────────── + +def score_post(post: dict, lookback_days: int) -> float: + """Score a post by signal strength.""" + score = 0.0 + + # Upvotes (weight 2x) + upvotes = max(0, post.get("upvotes", 0)) + score += min(upvotes, 100) * 0.2 # cap at 100 upvotes = 20 points + + # Comment depth (weight 1.5x) + comments = max(0, post.get("comments", 0)) + score += min(comments, 50) * 0.3 # cap at 50 comments = 15 points + + # Keyword density (weight 1x) + text = (post.get("title", "") + " " + post.get("selftext", "")).lower() + signal_hits = sum(1 for kw in SIGNAL_KEYWORDS if kw in text) + score += min(signal_hits, 8) * 1.0 # cap at 8 hits = 8 points + + # Recency (weight 1.5x, 7-day decay) + age_seconds = time.time() - post.get("created_utc", time.time()) + age_days = age_seconds / 86400 + recency_factor = max(0, 1.0 - (age_days / (lookback_days + 4))) + score *= (0.5 + 0.5 * recency_factor) # decay to 50% at lookback boundary + + return round(score, 2) + + +def score_proposal(sources: list) -> float: + """Score a proposal based on its aggregated sources.""" + total = sum(s.get("score", 0) for s in sources) + + # Recurrence bonus (weight 3x) + recurrence = len(sources) + if recurrence > 1: + total += recurrence * 3.0 + + # Cross-subreddit bonus + unique_subs = len(set(s.get("subreddit", "") for s in sources)) + if unique_subs > 1: + total += unique_subs * 2.0 + + return round(total, 2) + + +# ── Proposal generation ────────────────────────────────────────────────────── + +def build_proposals(posts: list, lookback_days: int, + actioned_ids: set) -> list[dict]: + """Build scored proposals from relevant posts.""" + # Filter for relevant + signal posts + candidates = [] + for post in posts: + if is_relevant(post) and has_signal(post): + post["score"] = score_post(post, lookback_days) + candidates.append(post) + + if not candidates: + return [] + + # Group by pain point similarity (simple: by shared keywords) + proposals = {} + for post in candidates: + pain = extract_pain_point(post) + slug = slugify(pain) + + if slug in actioned_ids: + continue + + if slug not in proposals: + proposals[slug] = { + "id": slug, + "title": pain, + "pain_point": pain, + "potential_skill": suggest_skill_name(pain), + "category": classify_category(pain + " " + post.get("selftext", "")), + "sources": [], + "first_seen_at": datetime.now().isoformat(), + "times_seen": 0, + } + + proposals[slug]["sources"].append({ + "subreddit": post["subreddit"], + "post_title": post.get("title", ""), + "url": post.get("url", ""), + "upvotes": post.get("upvotes", 0), + "comments": post.get("comments", 0), + "score": post.get("score", 0), + "fetched_at": datetime.now().isoformat(), + }) + proposals[slug]["times_seen"] = len(proposals[slug]["sources"]) + + # Score each proposal + result = [] + for p in proposals.values(): + p["score"] = score_proposal(p["sources"]) + result.append(p) + + result.sort(key=lambda x: x["score"], reverse=True) + return result + + +# ── PROPOSALS.md writer ─────────────────────────────────────────────────────── + +def write_proposals_md(proposals: list, subreddits: list, + posts_fetched: int) -> None: + """Write PROPOSALS.md to the repo root.""" + now = datetime.now().strftime("%Y-%m-%d %H:%M") + lines = [ + "# Skill Proposals — Community Radar", + "", + f"*Last scanned: {now} | {len(subreddits)} subreddits | " + f"{posts_fetched} posts fetched | {len(proposals)} candidates*", + "", + "---", + "", + ] + + high = [p for p in proposals if p["score"] >= 8.0] + medium = [p for p in proposals if 4.0 <= p["score"] < 8.0] + low = [p for p in proposals if p["score"] < 4.0] + + if high: + lines.append("## High Signal (score >= 8.0)") + lines.append("") + for i, p in enumerate(high, 1): + lines.extend(_format_proposal(i, p)) + lines.append("") + + if medium: + lines.append("## Medium Signal (score 4.0-8.0)") + lines.append("") + for i, p in enumerate(medium, 1): + lines.extend(_format_proposal(i, p)) + lines.append("") + + if low: + lines.append("## Low Signal (score < 4.0)") + lines.append("") + for i, p in enumerate(low, 1): + lines.extend(_format_proposal(i, p)) + lines.append("") + + if not proposals: + lines.append("*No new proposals found this scan.*") + lines.append("") + + lines.extend([ + "---", + "", + "*Generated by `community-skill-radar`. " + "Run `python3 radar.py --mark-actioned ` to dismiss a proposal.*", + ]) + + PROPOSALS_FILE.parent.mkdir(parents=True, exist_ok=True) + PROPOSALS_FILE.write_text("\n".join(lines) + "\n") + + +def _format_proposal(idx: int, p: dict) -> list[str]: + lines = [] + lines.append(f"### {idx}. {p['title']} (score: {p['score']})") + lines.append(f"- **Category:** {p.get('category', 'general')}") + lines.append(f"- **Potential skill name:** `{p['potential_skill']}`") + lines.append(f"- **Times seen:** {p['times_seen']}") + + for src in p.get("sources", [])[:3]: + lines.append( + f"- **Source:** r/{src['subreddit']} — " + f"\"{src['post_title'][:80]}\" " + f"({src['upvotes']} upvotes, {src['comments']} comments)" + ) + if src.get("url"): + lines.append(f" - {src['url']}") + + lines.append("") + return lines + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_scan(state: dict, subreddits: list, lookback_days: int, + min_score: float, fmt: str) -> None: + actioned = set(a.get("id", "") for a in (state.get("actioned") or [])) + all_posts = [] + + print(f"\nCommunity Skill Radar — scanning {len(subreddits)} subreddits " + f"(last {lookback_days} days)") + print("─" * 50) + + for sub in subreddits: + print(f" Fetching r/{sub}...", end=" ", flush=True) + posts = fetch_subreddit(sub, lookback_days) + time.sleep(RATE_LIMIT_SECONDS) + + # Also search for "openclaw" specifically in non-openclaw subs + if sub.lower() != "openclaw": + search_posts = fetch_search("openclaw", sub, lookback_days) + time.sleep(RATE_LIMIT_SECONDS) + # Deduplicate by post_id + seen_ids = {p["post_id"] for p in posts} + for sp in search_posts: + if sp["post_id"] not in seen_ids: + posts.append(sp) + + print(f"{len(posts)} posts") + all_posts.extend(posts) + + # Build and score proposals + proposals = build_proposals(all_posts, lookback_days, actioned) + + if min_score > 0: + proposals = [p for p in proposals if p["score"] >= min_score] + + # Merge with existing proposals (bump times_seen for recurring) + existing = {p["id"]: p for p in (state.get("proposals") or [])} + for p in proposals: + if p["id"] in existing: + old = existing[p["id"]] + p["first_seen_at"] = old.get("first_seen_at", p["first_seen_at"]) + p["times_seen"] = old.get("times_seen", 0) + len(p.get("sources", [])) + # Recurrence boost + p["score"] = round(p["score"] + old.get("times_seen", 0) * 1.5, 2) + + # Write PROPOSALS.md + write_proposals_md(proposals, subreddits, len(all_posts)) + + # Print summary + high = sum(1 for p in proposals if p["score"] >= 8.0) + medium = sum(1 for p in proposals if 4.0 <= p["score"] < 8.0) + low = sum(1 for p in proposals if p["score"] < 4.0) + + print() + print(f" Posts fetched : {len(all_posts)}") + print(f" Candidates found: {len(proposals)}") + print(f" High signal : {high}") + print(f" Medium signal : {medium}") + print(f" Low signal : {low}") + print(f"\n Written to: {PROPOSALS_FILE}") + print() + + if fmt == "json": + print(json.dumps({ + "scanned_at": datetime.now().isoformat(), + "subreddits": subreddits, + "posts_fetched": len(all_posts), + "proposals": proposals, + }, indent=2)) + + # Persist state + now = datetime.now().isoformat() + history = state.get("scan_history") or [] + history.insert(0, { + "scanned_at": now, + "subreddits": len(subreddits), + "posts_fetched": len(all_posts), + "candidates_found": len(proposals), + "proposals_written": high + medium + low, + }) + state["scan_history"] = history[:MAX_HISTORY] + state["last_scan_at"] = now + state["subreddits"] = subreddits + state["proposals"] = proposals + save_state(state) + + +def cmd_mark_actioned(state: dict, proposal_id: str, action: str) -> None: + actioned = state.get("actioned") or [] + actioned.append({ + "id": proposal_id, + "actioned_at": datetime.now().isoformat(), + "action": action, + }) + state["actioned"] = actioned + + # Remove from active proposals + proposals = state.get("proposals") or [] + state["proposals"] = [p for p in proposals if p.get("id") != proposal_id] + + save_state(state) + print(f"✓ Marked '{proposal_id}' as {action}. Won't be re-proposed.") + + +def cmd_status(state: dict) -> None: + last = state.get("last_scan_at", "never") + subs = state.get("subreddits") or [] + proposals = state.get("proposals") or [] + actioned = state.get("actioned") or [] + + print(f"\nCommunity Skill Radar — Last scan: {last}") + print(f" Subreddits : {', '.join(subs) if subs else 'none'}") + print(f" Proposals : {len(proposals)} active, {len(actioned)} actioned") + + if proposals: + top = proposals[:3] + print(f"\n Top proposals:") + for p in top: + print(f" [{p['score']:5.1f}] {p['title'][:60]}") + print() + + +def cmd_history(state: dict) -> None: + history = state.get("scan_history") or [] + if not history: + print("No scan history yet.") + return + + print(f"\nScan History ({len(history)} scans)") + print("─" * 50) + for h in history: + print(f" {h.get('scanned_at','')[:16]} " + f"{h.get('subreddits',0)} subs " + f"{h.get('posts_fetched',0)} posts " + f"{h.get('candidates_found',0)} candidates " + f"{h.get('proposals_written',0)} written") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Community Skill Radar") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", action="store_true") + group.add_argument("--mark-actioned", metavar="ID") + group.add_argument("--status", action="store_true") + group.add_argument("--history", action="store_true") + parser.add_argument("--subreddits", default=None, + help="Comma-separated subreddits (default: built-in list)") + parser.add_argument("--lookback", type=int, default=DEFAULT_LOOKBACK_DAYS, + help=f"Days to look back (default: {DEFAULT_LOOKBACK_DAYS})") + parser.add_argument("--min-score", type=float, default=0.0, + help="Minimum score threshold") + parser.add_argument("--action", default="actioned", + help="Action label for --mark-actioned (built/issue-filed/rejected)") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + + if args.status: + cmd_status(state) + elif args.history: + cmd_history(state) + elif args.mark_actioned: + cmd_mark_actioned(state, args.mark_actioned, args.action) + elif args.scan: + subreddits = (args.subreddits.split(",") if args.subreddits + else DEFAULT_SUBREDDITS) + cmd_scan(state, subreddits, args.lookback, args.min_score, args.format) + + +if __name__ == "__main__": + main() From 0edcf462e1ab80fd6ad87c12ebe35622f94fb8a3 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:21:15 +0530 Subject: [PATCH 11/23] Update README: add community-skill-radar (40 skills total) Adds community-skill-radar to the OpenClaw-Native table (24 skills), updates companion script list with radar.py, bumps total to 40 skills. Co-Authored-By: Claude Sonnet 4.6 --- README.md | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index cfb7349..4aebc7e 100644 --- a/README.md +++ b/README.md @@ -64,7 +64,7 @@ Methodology skills that work in any runtime. Adapted from [obra/superpowers](htt | `skill-conflict-detector` | Detects name shadowing and description-overlap conflicts between installed skills | `detect.py` | | `skill-portability-checker` | Validates OS/binary dependencies in companion scripts; catches non-portable calls | `check.py` | -### OpenClaw-Native (23 skills) +### OpenClaw-Native (24 skills) Skills that require OpenClaw's persistent runtime — cron scheduling, session state, or long-running execution. Not useful in session-based tools. @@ -93,6 +93,7 @@ Skills that require OpenClaw's persistent runtime — cron scheduling, session s | `skill-loadout-manager` | Named skill profiles to manage active skill sets and prevent system prompt bloat | — | ✓ | `loadout.py` | | `skill-compatibility-checker` | Checks installed skills against the current OpenClaw version for feature compatibility | — | ✓ | `check.py` | | `heartbeat-governor` | Enforces per-skill execution budgets for cron skills; auto-pauses runaway skills | every hour | ✓ | `governor.py` | +| `community-skill-radar` | Scans Reddit for OpenClaw pain points and feature requests; writes prioritized PROPOSALS.md | every 3 days | ✓ | `radar.py` | ### Community (1 skill) @@ -112,7 +113,7 @@ Stateful skills commit a `STATE_SCHEMA.yaml` defining the shape of their runtime Skills marked with a script in the table above ship a small executable alongside their `SKILL.md`: -- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. +- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. - **`vet.sh`** — Pure bash scanner; runs on any system with grep. - Each script supports `--help` and prints a human-readable summary. JSON output available where useful (`--format json`). Dry-run mode available on scripts that make changes. - See the `example-state.yaml` in each skill directory for sample state and a commented walkthrough of the skill's cron behaviour. @@ -142,7 +143,7 @@ obra/superpowers was built for session-based tools (Claude Code, Cursor, Codex). - Has **native cron scheduling** — skills wake up automatically on a schedule - Needs skills around **handoff, memory persistence, and self-recovery** that session tools don't require -The OpenClaw-native skills in this repo exist because of that difference. +The OpenClaw-native skills in this repo exist because of that difference. And with `community-skill-radar`, the library discovers what to build next by scanning Reddit communities automatically. --- From e7cabf4cd914ec167eebb1f9eae437a5bde80a5d Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:49:43 +0530 Subject: [PATCH 12/23] Add config-encryption-auditor skill (#29) Scans ~/.openclaw/ config files for plaintext API keys, tokens, and world-readable permissions. Suggests environment variable migration. Cron runs Sundays 9am. Companion script: audit.py with --scan, --fix-permissions, --suggest-env, --status commands. Inspired by OpenLobster's AES-GCM config encryption layer. Co-authored-by: Claude Opus 4.6 --- .../config-encryption-auditor/SKILL.md | 81 +++++ .../STATE_SCHEMA.yaml | 27 ++ .../config-encryption-auditor/audit.py | 300 ++++++++++++++++++ .../example-state.yaml | 76 +++++ 4 files changed, 484 insertions(+) create mode 100644 skills/openclaw-native/config-encryption-auditor/SKILL.md create mode 100644 skills/openclaw-native/config-encryption-auditor/STATE_SCHEMA.yaml create mode 100755 skills/openclaw-native/config-encryption-auditor/audit.py create mode 100644 skills/openclaw-native/config-encryption-auditor/example-state.yaml diff --git a/skills/openclaw-native/config-encryption-auditor/SKILL.md b/skills/openclaw-native/config-encryption-auditor/SKILL.md new file mode 100644 index 0000000..7ee5994 --- /dev/null +++ b/skills/openclaw-native/config-encryption-auditor/SKILL.md @@ -0,0 +1,81 @@ +--- +name: config-encryption-auditor +version: "1.0" +category: openclaw-native +description: Scans OpenClaw config directories for plaintext API keys, tokens, and secrets in unencrypted files — flags exposure risks and suggests encryption or environment variable migration. +stateful: true +cron: "0 9 * * 0" +--- + +# Config Encryption Auditor + +## What it does + +OpenClaw stores configuration in `~/.openclaw/` — API keys, channel tokens, provider credentials. By default, these are plaintext YAML or JSON files readable by any process on your machine. + +OpenLobster solved this with AES-GCM encrypted config files. We can't change OpenClaw's config format, but we can audit it — scanning for exposed secrets, flagging unencrypted credential files, and suggesting migrations to environment variables or encrypted vaults. + +## When to invoke + +- Automatically, every Sunday at 9am (cron) +- After initial OpenClaw setup +- Before deploying to shared infrastructure +- After any config change that adds new API keys + +## Checks performed + +| Check | Severity | What it detects | +|---|---|---| +| PLAINTEXT_API_KEY | CRITICAL | API key patterns in config files (sk-, AKIA, ghp_, etc.) | +| PLAINTEXT_TOKEN | HIGH | OAuth tokens, bearer tokens, passwords in config | +| WORLD_READABLE | HIGH | Config files with 644/755 permissions (readable by all users) | +| NO_GITIGNORE | MEDIUM | Config directory not gitignored (risk of committing secrets) | +| ENV_AVAILABLE | INFO | Secret could be migrated to environment variable | + +## How to use + +```bash +python3 audit.py --scan # Full audit +python3 audit.py --scan --critical-only # CRITICAL findings only +python3 audit.py --fix-permissions # chmod 600 on config files +python3 audit.py --suggest-env # Print env var migration guide +python3 audit.py --status # Last audit summary +python3 audit.py --format json +``` + +## Procedure + +**Step 1 — Run the audit** + +```bash +python3 audit.py --scan +``` + +**Step 2 — Fix CRITICAL issues first** + +For each PLAINTEXT_API_KEY finding, migrate the key to an environment variable: + +```bash +# Instead of storing in config.yaml: +# api_key: sk-abc123... +# Use: +export OPENCLAW_API_KEY="sk-abc123..." +``` + +**Step 3 — Fix file permissions** + +```bash +python3 audit.py --fix-permissions +``` + +This sets `chmod 600` on all config files (owner read/write only). + +**Step 4 — Verify gitignore coverage** + +Ensure `~/.openclaw/` or at minimum the config files are in your global `.gitignore`. + +## State + +Audit results and history stored in `~/.openclaw/skill-state/config-encryption-auditor/state.yaml`. + +Fields: `last_audit_at`, `findings`, `files_scanned`, `audit_history`. diff --git a/skills/openclaw-native/config-encryption-auditor/STATE_SCHEMA.yaml b/skills/openclaw-native/config-encryption-auditor/STATE_SCHEMA.yaml new file mode 100644 index 0000000..05bdb54 --- /dev/null +++ b/skills/openclaw-native/config-encryption-auditor/STATE_SCHEMA.yaml @@ -0,0 +1,27 @@ +version: "1.0" +description: Config file audit results — plaintext secrets, permission issues, and migration suggestions. +fields: + last_audit_at: + type: datetime + files_scanned: + type: integer + default: 0 + findings: + type: list + items: + file_path: { type: string } + check: { type: enum, values: [PLAINTEXT_API_KEY, PLAINTEXT_TOKEN, WORLD_READABLE, NO_GITIGNORE, ENV_AVAILABLE] } + severity: { type: enum, values: [CRITICAL, HIGH, MEDIUM, INFO] } + detail: { type: string } + suggestion: { type: string } + detected_at: { type: datetime } + resolved: { type: boolean } + audit_history: + type: list + description: Rolling audit summaries (last 12) + items: + audited_at: { type: datetime } + files_scanned: { type: integer } + critical_count: { type: integer } + high_count: { type: integer } + medium_count: { type: integer } diff --git a/skills/openclaw-native/config-encryption-auditor/audit.py b/skills/openclaw-native/config-encryption-auditor/audit.py new file mode 100755 index 0000000..daf55fe --- /dev/null +++ b/skills/openclaw-native/config-encryption-auditor/audit.py @@ -0,0 +1,300 @@ +#!/usr/bin/env python3 +""" +Config Encryption Auditor for openclaw-superpowers. + +Scans OpenClaw config directories for plaintext API keys, tokens, +and secrets in unencrypted files. + +Usage: + python3 audit.py --scan + python3 audit.py --scan --critical-only + python3 audit.py --fix-permissions + python3 audit.py --suggest-env + python3 audit.py --status + python3 audit.py --format json +""" + +import argparse +import json +import os +import re +import stat +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "config-encryption-auditor" / "state.yaml" +MAX_HISTORY = 12 + +# Scan these directories for config files +SCAN_DIRS = [OPENCLAW_DIR] +SCAN_EXTENSIONS = {".yaml", ".yml", ".json", ".toml", ".env", ".conf", ".cfg", ".ini"} + +# ── Secret patterns ─────────────────────────────────────────────────────────── + +API_KEY_PATTERNS = [ + (re.compile(r'sk-[A-Za-z0-9]{20,}'), "OpenAI/Anthropic API key"), + (re.compile(r'AKIA[0-9A-Z]{16}'), "AWS Access Key ID"), + (re.compile(r'(?:ghp|gho|ghu|ghs|ghr)_[A-Za-z0-9]{36}'), "GitHub token"), + (re.compile(r'xoxb-[0-9A-Za-z\-]{50,}'), "Slack bot token"), + (re.compile(r'xoxp-[0-9A-Za-z\-]{50,}'), "Slack user token"), + (re.compile(r'[0-9]+:AA[A-Za-z0-9_-]{33}'), "Telegram bot token"), + (re.compile(r'AIza[0-9A-Za-z_-]{35}'), "Google API key"), + (re.compile(r'sk_live_[0-9a-zA-Z]{24,}'), "Stripe secret key"), +] + +TOKEN_PATTERNS = [ + (re.compile(r'(?:token|secret|password|passwd|pwd|apikey|api_key)\s*[:=]\s*["\']?[A-Za-z0-9_\-\.]{8,}', re.I), + "Generic secret assignment"), + (re.compile(r'Bearer [A-Za-z0-9\-_\.]{20,}'), "Bearer token"), + (re.compile(r'Basic [A-Za-z0-9+/=]{20,}'), "Basic auth credential"), +] + +# Environment variable mapping suggestions +ENV_SUGGESTIONS = { + "anthropic": "OPENCLAW_ANTHROPIC_API_KEY", + "openai": "OPENCLAW_OPENAI_API_KEY", + "slack": "OPENCLAW_SLACK_TOKEN", + "telegram": "OPENCLAW_TELEGRAM_TOKEN", + "discord": "OPENCLAW_DISCORD_TOKEN", + "github": "OPENCLAW_GITHUB_TOKEN", + "stripe": "OPENCLAW_STRIPE_KEY", + "aws": "OPENCLAW_AWS_ACCESS_KEY", +} + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"findings": [], "audit_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Scanning ────────────────────────────────────────────────────────────────── + +def scan_file(filepath: Path) -> list[dict]: + findings = [] + now = datetime.now().isoformat() + rel = str(filepath.relative_to(OPENCLAW_DIR)) if str(filepath).startswith(str(OPENCLAW_DIR)) else str(filepath) + + try: + text = filepath.read_text(errors="replace") + except (PermissionError, OSError): + return findings + + # Check for API keys + for pattern, label in API_KEY_PATTERNS: + if pattern.search(text): + findings.append({ + "file_path": rel, "check": "PLAINTEXT_API_KEY", + "severity": "CRITICAL", + "detail": f"Found {label} pattern in plaintext", + "suggestion": f"Migrate to environment variable or encrypted vault.", + "detected_at": now, "resolved": False, + }) + + # Check for tokens + for pattern, label in TOKEN_PATTERNS: + if pattern.search(text): + findings.append({ + "file_path": rel, "check": "PLAINTEXT_TOKEN", + "severity": "HIGH", + "detail": f"Found {label} pattern in plaintext", + "suggestion": "Use environment variables instead of inline credentials.", + "detected_at": now, "resolved": False, + }) + + # Check file permissions (Unix only) + try: + mode = filepath.stat().st_mode + if mode & stat.S_IROTH or mode & stat.S_IRGRP: + findings.append({ + "file_path": rel, "check": "WORLD_READABLE", + "severity": "HIGH", + "detail": f"File permissions {oct(mode)[-3:]} — readable by other users", + "suggestion": "Run: chmod 600 " + str(filepath), + "detected_at": now, "resolved": False, + }) + except (OSError, AttributeError): + pass + + return findings + + +def scan_all(critical_only: bool = False) -> tuple[list, int]: + all_findings = [] + files_scanned = 0 + now = datetime.now().isoformat() + + for scan_dir in SCAN_DIRS: + if not scan_dir.exists(): + continue + for filepath in scan_dir.rglob("*"): + if not filepath.is_file(): + continue + if filepath.suffix not in SCAN_EXTENSIONS: + continue + # Skip skill-state (our own state files) + if "skill-state" in str(filepath): + continue + files_scanned += 1 + findings = scan_file(filepath) + all_findings.extend(findings) + + # Check gitignore + gitignore = Path.home() / ".gitignore" + openclaw_gitignored = False + if gitignore.exists(): + try: + gi_text = gitignore.read_text() + if ".openclaw" in gi_text or "openclaw" in gi_text: + openclaw_gitignored = True + except Exception: + pass + if not openclaw_gitignored: + all_findings.append({ + "file_path": str(OPENCLAW_DIR), "check": "NO_GITIGNORE", + "severity": "MEDIUM", + "detail": "~/.openclaw not found in global .gitignore", + "suggestion": "Add '.openclaw/' to ~/.gitignore to prevent accidental commits.", + "detected_at": now, "resolved": False, + }) + + if critical_only: + all_findings = [f for f in all_findings if f["severity"] == "CRITICAL"] + + return all_findings, files_scanned + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_scan(state: dict, critical_only: bool, fmt: str) -> None: + findings, files_scanned = scan_all(critical_only) + now = datetime.now().isoformat() + critical = sum(1 for f in findings if f["severity"] == "CRITICAL") + high = sum(1 for f in findings if f["severity"] == "HIGH") + medium = sum(1 for f in findings if f["severity"] == "MEDIUM") + + if fmt == "json": + print(json.dumps({"files_scanned": files_scanned, "findings": findings}, indent=2)) + else: + print(f"\nConfig Encryption Audit — {datetime.now().strftime('%Y-%m-%d')}") + print("─" * 50) + print(f" {files_scanned} files scanned | " + f"{critical} CRITICAL | {high} HIGH | {medium} MEDIUM") + print() + if not findings: + print(" ✓ No exposed secrets detected.") + else: + for f in findings: + icon = "✗" if f["severity"] == "CRITICAL" else ("!" if f["severity"] == "HIGH" else "⚠") + print(f" {icon} [{f['severity']}] {f['file_path']}: {f['check']}") + print(f" {f['detail']}") + print(f" → {f['suggestion']}") + print() + + # Persist + history = state.get("audit_history") or [] + history.insert(0, { + "audited_at": now, "files_scanned": files_scanned, + "critical_count": critical, "high_count": high, "medium_count": medium, + }) + state["audit_history"] = history[:MAX_HISTORY] + state["last_audit_at"] = now + state["files_scanned"] = files_scanned + state["findings"] = findings + save_state(state) + sys.exit(1 if critical > 0 else 0) + + +def cmd_fix_permissions(state: dict) -> None: + fixed = 0 + for scan_dir in SCAN_DIRS: + if not scan_dir.exists(): + continue + for filepath in scan_dir.rglob("*"): + if not filepath.is_file() or filepath.suffix not in SCAN_EXTENSIONS: + continue + if "skill-state" in str(filepath): + continue + try: + mode = filepath.stat().st_mode + if mode & stat.S_IROTH or mode & stat.S_IRGRP: + filepath.chmod(0o600) + print(f" ✓ chmod 600: {filepath}") + fixed += 1 + except (OSError, AttributeError): + pass + print(f"\n✓ Fixed permissions on {fixed} files.") + + +def cmd_suggest_env() -> None: + print("\nEnvironment Variable Migration Guide") + print("─" * 48) + print("Replace plaintext credentials in config files with environment variables:\n") + for service, env_var in sorted(ENV_SUGGESTIONS.items()): + print(f" {service:12s} → export {env_var}=\"\"") + print(f"\nAdd these to your shell profile (~/.zshrc, ~/.bashrc) or a .env file.") + print("OpenClaw reads OPENCLAW_* environment variables automatically.\n") + + +def cmd_status(state: dict) -> None: + last = state.get("last_audit_at", "never") + print(f"\nConfig Encryption Auditor — Last run: {last}") + history = state.get("audit_history") or [] + if history: + h = history[0] + print(f" {h.get('files_scanned',0)} files | " + f"{h.get('critical_count',0)} CRITICAL | " + f"{h.get('high_count',0)} HIGH | {h.get('medium_count',0)} MEDIUM") + active = [f for f in (state.get("findings") or []) if not f.get("resolved")] + if active: + print(f"\n Unresolved ({len(active)}):") + for f in active[:3]: + print(f" [{f['severity']}] {f['file_path']}: {f['check']}") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Config Encryption Auditor") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", action="store_true") + group.add_argument("--fix-permissions", action="store_true") + group.add_argument("--suggest-env", action="store_true") + group.add_argument("--status", action="store_true") + parser.add_argument("--critical-only", action="store_true") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.scan: + cmd_scan(state, args.critical_only, args.format) + elif args.fix_permissions: + cmd_fix_permissions(state) + elif args.suggest_env: + cmd_suggest_env() + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() diff --git a/skills/openclaw-native/config-encryption-auditor/example-state.yaml b/skills/openclaw-native/config-encryption-auditor/example-state.yaml new file mode 100644 index 0000000..d435292 --- /dev/null +++ b/skills/openclaw-native/config-encryption-auditor/example-state.yaml @@ -0,0 +1,76 @@ +# Example runtime state for config-encryption-auditor +last_audit_at: "2026-03-16T09:00:15.332000" +files_scanned: 14 +findings: + - file_path: "config/providers.yaml" + check: PLAINTEXT_API_KEY + severity: CRITICAL + detail: "Found OpenAI/Anthropic API key pattern in plaintext" + suggestion: "Migrate to environment variable or encrypted vault." + detected_at: "2026-03-16T09:00:15.000000" + resolved: false + - file_path: "config/integrations.yaml" + check: PLAINTEXT_TOKEN + severity: HIGH + detail: "Found Generic secret assignment pattern in plaintext" + suggestion: "Use environment variables instead of inline credentials." + detected_at: "2026-03-16T09:00:15.100000" + resolved: false + - file_path: "config/providers.yaml" + check: WORLD_READABLE + severity: HIGH + detail: "File permissions 644 — readable by other users" + suggestion: "Run: chmod 600 ~/.openclaw/config/providers.yaml" + detected_at: "2026-03-16T09:00:15.200000" + resolved: false + - file_path: "/Users/you/.openclaw" + check: NO_GITIGNORE + severity: MEDIUM + detail: "~/.openclaw not found in global .gitignore" + suggestion: "Add '.openclaw/' to ~/.gitignore to prevent accidental commits." + detected_at: "2026-03-16T09:00:15.300000" + resolved: false +audit_history: + - audited_at: "2026-03-16T09:00:15.332000" + files_scanned: 14 + critical_count: 1 + high_count: 2 + medium_count: 1 + - audited_at: "2026-03-09T09:00:12.000000" + files_scanned: 12 + critical_count: 0 + high_count: 1 + medium_count: 1 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Cron runs every Sunday at 9am: python3 audit.py --scan +# +# Config Encryption Audit — 2026-03-16 +# ────────────────────────────────────────────────── +# 14 files scanned | 1 CRITICAL | 2 HIGH | 1 MEDIUM +# +# ✗ [CRITICAL] config/providers.yaml: PLAINTEXT_API_KEY +# Found OpenAI/Anthropic API key pattern in plaintext +# → Migrate to environment variable or encrypted vault. +# +# ! [HIGH] config/integrations.yaml: PLAINTEXT_TOKEN +# Found Generic secret assignment pattern in plaintext +# → Use environment variables instead of inline credentials. +# +# ! [HIGH] config/providers.yaml: WORLD_READABLE +# File permissions 644 — readable by other users +# → Run: chmod 600 ~/.openclaw/config/providers.yaml +# +# ⚠ [MEDIUM] /Users/you/.openclaw: NO_GITIGNORE +# ~/.openclaw not found in global .gitignore +# → Add '.openclaw/' to ~/.gitignore to prevent accidental commits. +# +# python3 audit.py --fix-permissions +# ✓ chmod 600: /Users/you/.openclaw/config/providers.yaml +# ✓ Fixed permissions on 1 files. +# +# python3 audit.py --suggest-env +# Environment Variable Migration Guide +# ──────────────────────────────────────────────── +# anthropic → export OPENCLAW_ANTHROPIC_API_KEY="" +# aws → export OPENCLAW_AWS_ACCESS_KEY="" +# ... From 42e19317ca20e5d985266d8865d727b092ccb6c7 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:53:07 +0530 Subject: [PATCH 13/23] Add tool-description-optimizer skill (#30) Scores skill descriptions for trigger quality across 5 dimensions: clarity, specificity, keyword density, uniqueness, and length. Grades A-F with concrete rewrite suggestions. Companion script: optimize.py with --scan, --skill, --suggest, --compare, --status. Inspired by OpenLobster's tool-description scoring layer. Co-authored-by: Claude Opus 4.6 --- .../tool-description-optimizer/SKILL.md | 109 ++++ .../STATE_SCHEMA.yaml | 27 + .../example-state.yaml | 94 +++ .../tool-description-optimizer/optimize.py | 549 ++++++++++++++++++ 4 files changed, 779 insertions(+) create mode 100644 skills/openclaw-native/tool-description-optimizer/SKILL.md create mode 100644 skills/openclaw-native/tool-description-optimizer/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/tool-description-optimizer/example-state.yaml create mode 100755 skills/openclaw-native/tool-description-optimizer/optimize.py diff --git a/skills/openclaw-native/tool-description-optimizer/SKILL.md b/skills/openclaw-native/tool-description-optimizer/SKILL.md new file mode 100644 index 0000000..75011c6 --- /dev/null +++ b/skills/openclaw-native/tool-description-optimizer/SKILL.md @@ -0,0 +1,109 @@ +--- +name: tool-description-optimizer +version: "1.0" +category: openclaw-native +description: Analyzes skill descriptions for trigger quality — scores clarity, keyword density, and specificity, then suggests rewrites that improve discovery accuracy. +stateful: true +--- + +# Tool Description Optimizer + +## What it does + +A skill's description is its only discovery mechanism. If the description is vague, overlapping, or keyword-poor, the agent won't trigger it — or worse, will trigger the wrong skill. Tool Description Optimizer analyzes every installed skill's description for trigger quality and suggests concrete rewrites. + +Inspired by OpenLobster's tool-description scoring layer, which penalizes vague descriptions and rewards keyword-rich, action-specific ones. + +## When to invoke + +- After installing new skills — check if descriptions are trigger-ready +- When a skill isn't firing when expected — diagnose whether the description is the problem +- Periodically to audit all descriptions for quality drift +- Before publishing a skill — polish the description for discoverability + +## How it works + +### Scoring dimensions (5 metrics, 0–10 each) + +| Metric | What it measures | Weight | +|---|---|---| +| Clarity | Single clear purpose, no ambiguity | 2x | +| Specificity | Action verbs, concrete nouns vs. vague terms | 2x | +| Keyword density | Trigger-relevant keywords per sentence | 1.5x | +| Uniqueness | Low overlap with other installed skill descriptions | 1.5x | +| Length | Optimal range (15–40 words) — too short = vague, too long = diluted | 1x | + +### Quality grades + +| Grade | Score range | Meaning | +|---|---|---| +| A | 8.0–10.0 | Excellent — high trigger accuracy expected | +| B | 6.0–7.9 | Good — minor improvements possible | +| C | 4.0–5.9 | Fair — likely to miss triggers or overlap | +| D | 2.0–3.9 | Poor — needs rewrite | +| F | 0.0–1.9 | Failing — will not trigger reliably | + +## How to use + +```bash +python3 optimize.py --scan # Score all installed skills +python3 optimize.py --scan --grade C # Only show skills graded C or below +python3 optimize.py --skill # Deep analysis of a single skill +python3 optimize.py --suggest # Generate rewrite suggestions +python3 optimize.py --compare "desc A" "desc B" # Compare two descriptions +python3 optimize.py --status # Last scan summary +python3 optimize.py --format json # Machine-readable output +``` + +## Procedure + +**Step 1 — Run a full scan** + +```bash +python3 optimize.py --scan +``` + +Review the scorecard. Focus on skills graded C or below — these are the ones most likely to cause trigger failures. + +**Step 2 — Get rewrite suggestions for low-scoring skills** + +```bash +python3 optimize.py --suggest +``` + +The optimizer generates 2–3 alternative descriptions with predicted score improvements. + +**Step 3 — Compare alternatives** + +```bash +python3 optimize.py --compare "original description" "suggested rewrite" +``` + +Side-by-side scoring shows exactly which metrics improved. + +**Step 4 — Apply the best rewrite** + +Edit the skill's `SKILL.md` frontmatter `description:` field with the chosen rewrite. + +## Vague word penalties + +These words score 0 on specificity — they say nothing actionable: + +`helps`, `manages`, `handles`, `deals with`, `works with`, `does stuff`, `various`, `things`, `general`, `misc`, `utility`, `tool for`, `assistant for` + +## Strong trigger keywords (examples) + +`scans`, `detects`, `validates`, `generates`, `audits`, `monitors`, `checks`, `reports`, `fixes`, `migrates`, `syncs`, `schedules`, `blocks`, `scores`, `diagnoses` + +## State + +Scan results and per-skill scores stored in `~/.openclaw/skill-state/tool-description-optimizer/state.yaml`. + +Fields: `last_scan_at`, `skill_scores` list, `scan_history`. + +## Notes + +- Does not modify any skill files — analysis and suggestions only +- Uniqueness scoring uses Jaccard similarity against all other installed descriptions +- Length scoring uses a bell curve centered at 25 words (optimal) +- Rewrite suggestions are heuristic-based, not LLM-generated — deterministic and fast diff --git a/skills/openclaw-native/tool-description-optimizer/STATE_SCHEMA.yaml b/skills/openclaw-native/tool-description-optimizer/STATE_SCHEMA.yaml new file mode 100644 index 0000000..2a61879 --- /dev/null +++ b/skills/openclaw-native/tool-description-optimizer/STATE_SCHEMA.yaml @@ -0,0 +1,27 @@ +version: "1.0" +description: Tool description quality scores, rewrite suggestions, and scan history. +fields: + last_scan_at: + type: datetime + skill_scores: + type: list + description: Per-skill quality scores from the most recent scan + items: + skill_name: { type: string } + description: { type: string } + word_count: { type: integer } + clarity: { type: float, description: "0-10 clarity score" } + specificity: { type: float, description: "0-10 specificity score" } + keyword_density: { type: float, description: "0-10 keyword density score" } + uniqueness: { type: float, description: "0-10 uniqueness vs other skills" } + length_score: { type: float, description: "0-10 length optimality score" } + overall: { type: float, description: "Weighted composite score" } + grade: { type: string, description: "A/B/C/D/F" } + scan_history: + type: list + description: Rolling log of past scans (last 20) + items: + scanned_at: { type: datetime } + skills_scanned: { type: integer } + avg_score: { type: float } + grade_distribution: { type: object, description: "Count per grade: A, B, C, D, F" } diff --git a/skills/openclaw-native/tool-description-optimizer/example-state.yaml b/skills/openclaw-native/tool-description-optimizer/example-state.yaml new file mode 100644 index 0000000..2760fd7 --- /dev/null +++ b/skills/openclaw-native/tool-description-optimizer/example-state.yaml @@ -0,0 +1,94 @@ +# Example runtime state for tool-description-optimizer +last_scan_at: "2026-03-16T14:00:05.221000" +skill_scores: + - skill_name: using-superpowers + description: "Bootstrap — teaches the agent how to find and invoke skills" + word_count: 11 + clarity: 7.2 + specificity: 3.8 + keyword_density: 3.3 + uniqueness: 8.1 + length_score: 4.8 + overall: 5.6 + grade: C + - skill_name: config-encryption-auditor + description: "Scans OpenClaw config directories for plaintext API keys, tokens, and secrets in unencrypted files." + word_count: 15 + clarity: 9.2 + specificity: 8.5 + keyword_density: 8.0 + uniqueness: 9.0 + length_score: 7.5 + overall: 8.5 + grade: A + - skill_name: memory-graph-builder + description: "Parses MEMORY.md into a knowledge graph with typed relationships, detects duplicates and contradictions, and generates a compressed memory digest." + word_count: 22 + clarity: 8.8 + specificity: 7.6 + keyword_density: 7.2 + uniqueness: 9.4 + length_score: 9.5 + overall: 8.5 + grade: A +scan_history: + - scanned_at: "2026-03-16T14:00:05.221000" + skills_scanned: 40 + avg_score: 7.2 + grade_distribution: + A: 18 + B: 14 + C: 6 + D: 2 + F: 0 + - scanned_at: "2026-03-13T14:00:00.000000" + skills_scanned: 36 + avg_score: 6.8 + grade_distribution: + A: 14 + B: 12 + C: 7 + D: 3 + F: 0 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# python3 optimize.py --scan +# +# Tool Description Quality Scan — 2026-03-16 +# ──────────────────────────────────────────────────────────── +# 40 skills scanned | avg score: 7.2 +# Grades: 18xA 14xB 6xC 2xD 0xF +# +# ! [D] 3.8 — some-vague-skill +# clarity=2.0 spec=1.5 kw=1.2 uniq=8.0 len=6.5 +# "A helpful utility tool that manages various things..." +# +# ~ [C] 5.6 — using-superpowers +# clarity=7.2 spec=3.8 kw=3.3 uniq=8.1 len=4.8 +# "Bootstrap — teaches the agent how to find and invoke skills" +# +# python3 optimize.py --suggest using-superpowers +# +# Rewrite Suggestions: using-superpowers +# ────────────────────────────────────────────────── +# Current: "Bootstrap — teaches the agent how to find and invoke skills" +# Score: 5.6 (C) +# +# 1. Front-load action verb +# "Teaches the agent how to discover, invoke, and chain installed skills" +# Predicted: 7.4 (B) [+1.8] +# +# python3 optimize.py --compare "A tool that helps manage stuff" "Scans config files for plaintext secrets and suggests env var migration" +# +# Description Comparison +# ────────────────────────────────────────────────── +# A: "A tool that helps manage stuff" +# B: "Scans config files for plaintext secrets and suggests env var migration" +# +# Clarity A=2.0 B=9.5 B +# Specificity A=0.0 B=8.5 B +# Keywords A=0.0 B=7.8 B +# Uniqueness A=7.0 B=7.0 = +# Length A=5.2 B=8.8 B +# OVERALL A=2.8 B=8.4 B +# +# Grade: A=D B=A diff --git a/skills/openclaw-native/tool-description-optimizer/optimize.py b/skills/openclaw-native/tool-description-optimizer/optimize.py new file mode 100755 index 0000000..576ba76 --- /dev/null +++ b/skills/openclaw-native/tool-description-optimizer/optimize.py @@ -0,0 +1,549 @@ +#!/usr/bin/env python3 +""" +Tool Description Optimizer for openclaw-superpowers. + +Scores skill descriptions for trigger quality and suggests rewrites. + +Usage: + python3 optimize.py --scan + python3 optimize.py --scan --grade C + python3 optimize.py --skill + python3 optimize.py --suggest + python3 optimize.py --compare "desc A" "desc B" + python3 optimize.py --status + python3 optimize.py --format json +""" + +import argparse +import json +import math +import os +import re +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "tool-description-optimizer" / "state.yaml" +MAX_HISTORY = 20 + +# Skill directories to scan +SKILL_DIRS = [ + Path(__file__).resolve().parent.parent.parent, # repo skills/ root +] + +# ── Scoring constants ──────────────────────────────────────────────────────── + +VAGUE_WORDS = { + "helps", "manages", "handles", "deals", "works", "does", "stuff", + "various", "things", "general", "misc", "miscellaneous", "utility", + "tool", "assistant", "helper", "processor", "handler", "manager", + "simple", "basic", "easy", "nice", "good", "great", +} + +STRONG_VERBS = { + "scans", "detects", "validates", "generates", "audits", "monitors", + "checks", "reports", "fixes", "migrates", "syncs", "schedules", + "blocks", "scores", "diagnoses", "parses", "extracts", "compiles", + "compacts", "deduplicates", "prunes", "enforces", "breaks", "chains", + "writes", "creates", "builds", "searches", "filters", "tracks", + "prevents", "recovers", "resumes", "verifies", "tests", "measures", +} + +STRONG_NOUNS = { + "api", "key", "token", "secret", "credential", "permission", + "cron", "schedule", "context", "memory", "state", "schema", + "skill", "agent", "session", "task", "workflow", "budget", + "injection", "drift", "conflict", "error", "failure", "loop", + "graph", "node", "edge", "digest", "report", "proposal", + "reddit", "github", "slack", "config", "yaml", "json", +} + +OPTIMAL_LENGTH = 25 # words +LENGTH_SIGMA = 10 # std dev for bell curve + +GRADE_THRESHOLDS = [ + (8.0, "A"), (6.0, "B"), (4.0, "C"), (2.0, "D"), (0.0, "F"), +] + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"skill_scores": [], "scan_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Skill discovery ────────────────────────────────────────────────────────── + +def discover_skills() -> list[dict]: + """Find all installed skills and extract their descriptions.""" + skills = [] + for skill_root in SKILL_DIRS: + if not skill_root.exists(): + continue + for category_dir in sorted(skill_root.iterdir()): + if not category_dir.is_dir(): + continue + for skill_dir in sorted(category_dir.iterdir()): + skill_md = skill_dir / "SKILL.md" + if not skill_md.exists(): + continue + desc = extract_description(skill_md) + if desc: + skills.append({ + "name": skill_dir.name, + "category": category_dir.name, + "description": desc, + }) + return skills + + +def extract_description(skill_md: Path) -> str: + """Extract description from SKILL.md frontmatter.""" + try: + text = skill_md.read_text() + except (PermissionError, OSError): + return "" + # Parse YAML frontmatter + if not text.startswith("---"): + return "" + end = text.find("---", 3) + if end == -1: + return "" + frontmatter = text[3:end].strip() + for line in frontmatter.split("\n"): + if line.startswith("description:"): + desc = line[len("description:"):].strip().strip("\"'") + return desc + return "" + + +# ── Scoring ────────────────────────────────────────────────────────────────── + +def tokenize(text: str) -> list[str]: + return re.findall(r'[a-z0-9]+', text.lower()) + + +def jaccard(a: set, b: set) -> float: + if not a and not b: + return 1.0 + inter = len(a & b) + union = len(a | b) + return inter / union if union > 0 else 0.0 + + +def score_clarity(tokens: list[str]) -> float: + """Score clarity: penalize vague words, reward single clear purpose.""" + if not tokens: + return 0.0 + vague_count = sum(1 for t in tokens if t in VAGUE_WORDS) + vague_ratio = vague_count / len(tokens) + # Penalize heavily for high vague ratio + score = 10.0 * (1.0 - vague_ratio * 2.5) + # Bonus for having a verb early (signals clear purpose) + for t in tokens[:5]: + if t in STRONG_VERBS: + score += 1.0 + break + return max(0.0, min(10.0, score)) + + +def score_specificity(tokens: list[str]) -> float: + """Score specificity: strong verbs and concrete nouns.""" + if not tokens: + return 0.0 + verb_count = sum(1 for t in tokens if t in STRONG_VERBS) + noun_count = sum(1 for t in tokens if t in STRONG_NOUNS) + strong_ratio = (verb_count + noun_count) / len(tokens) + score = min(10.0, strong_ratio * 25.0) + return max(0.0, score) + + +def score_keyword_density(tokens: list[str]) -> float: + """Score keyword density: trigger-relevant terms per token.""" + if not tokens: + return 0.0 + all_keywords = STRONG_VERBS | STRONG_NOUNS + keyword_count = sum(1 for t in tokens if t in all_keywords) + density = keyword_count / len(tokens) + score = min(10.0, density * 30.0) + return max(0.0, score) + + +def score_uniqueness(tokens_set: set, all_other_sets: list[set]) -> float: + """Score uniqueness: low Jaccard similarity to other descriptions.""" + if not all_other_sets: + return 10.0 + max_sim = max(jaccard(tokens_set, other) for other in all_other_sets) + # 0.0 similarity = 10.0 score, 1.0 similarity = 0.0 score + score = 10.0 * (1.0 - max_sim) + return max(0.0, min(10.0, score)) + + +def score_length(word_count: int) -> float: + """Score length: bell curve centered on OPTIMAL_LENGTH.""" + z = (word_count - OPTIMAL_LENGTH) / LENGTH_SIGMA + score = 10.0 * math.exp(-0.5 * z * z) + return max(0.0, min(10.0, score)) + + +def compute_overall(clarity, specificity, keyword_density, uniqueness, length_score) -> float: + """Weighted composite score.""" + weighted = ( + clarity * 2.0 + + specificity * 2.0 + + keyword_density * 1.5 + + uniqueness * 1.5 + + length_score * 1.0 + ) + total_weight = 2.0 + 2.0 + 1.5 + 1.5 + 1.0 + return round(weighted / total_weight, 1) + + +def get_grade(score: float) -> str: + for threshold, grade in GRADE_THRESHOLDS: + if score >= threshold: + return grade + return "F" + + +def score_description(desc: str, all_other_descs: list[str]) -> dict: + """Full scoring of a single description.""" + tokens = tokenize(desc) + tokens_set = set(tokens) + other_sets = [set(tokenize(d)) for d in all_other_descs] + word_count = len(desc.split()) + + clarity = round(score_clarity(tokens), 1) + specificity = round(score_specificity(tokens), 1) + keyword_density = round(score_keyword_density(tokens), 1) + uniqueness = round(score_uniqueness(tokens_set, other_sets), 1) + length = round(score_length(word_count), 1) + overall = compute_overall(clarity, specificity, keyword_density, uniqueness, length) + grade = get_grade(overall) + + return { + "word_count": word_count, + "clarity": clarity, + "specificity": specificity, + "keyword_density": keyword_density, + "uniqueness": uniqueness, + "length_score": length, + "overall": overall, + "grade": grade, + } + + +# ── Suggestion engine ──────────────────────────────────────────────────────── + +def suggest_rewrites(name: str, desc: str, all_other_descs: list[str]) -> list[dict]: + """Generate 2-3 rewrite suggestions with predicted improvements.""" + suggestions = [] + tokens = tokenize(desc) + words = desc.split() + + # Strategy 1: Replace vague words with strong verbs + rewrite1_words = [] + replacements = { + "helps": "assists", "manages": "tracks", "handles": "processes", + "deals": "resolves", "works": "operates", "does": "executes", + } + changed = False + for w in words: + low = w.lower().rstrip(".,;:") + if low in VAGUE_WORDS and low in replacements: + rewrite1_words.append(replacements[low]) + changed = True + else: + rewrite1_words.append(w) + if changed: + rewrite1 = " ".join(rewrite1_words) + s1 = score_description(rewrite1, all_other_descs) + suggestions.append({ + "strategy": "Replace vague words", + "rewrite": rewrite1, + "predicted_score": s1["overall"], + "predicted_grade": s1["grade"], + }) + + # Strategy 2: Trim to optimal length if too long + if len(words) > 40: + trimmed = " ".join(words[:35]) + if not trimmed.endswith("."): + trimmed += "." + s2 = score_description(trimmed, all_other_descs) + suggestions.append({ + "strategy": "Trim to optimal length", + "rewrite": trimmed, + "predicted_score": s2["overall"], + "predicted_grade": s2["grade"], + }) + + # Strategy 3: Front-load with action verb if none in first 3 words + first_tokens = tokenize(" ".join(words[:3])) + has_verb = any(t in STRONG_VERBS for t in first_tokens) + if not has_verb: + # Try to extract the main verb from the description + for t in tokens: + if t in STRONG_VERBS: + verb = t.capitalize() + "s" + rewrite3 = f"{verb} {desc[0].lower()}{desc[1:]}" + s3 = score_description(rewrite3, all_other_descs) + suggestions.append({ + "strategy": "Front-load action verb", + "rewrite": rewrite3, + "predicted_score": s3["overall"], + "predicted_grade": s3["grade"], + }) + break + + if not suggestions: + suggestions.append({ + "strategy": "No automatic rewrites — description already scores well", + "rewrite": desc, + "predicted_score": score_description(desc, all_other_descs)["overall"], + "predicted_grade": score_description(desc, all_other_descs)["grade"], + }) + + return suggestions + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_scan(state: dict, grade_filter: str, fmt: str) -> None: + skills = discover_skills() + now = datetime.now().isoformat() + all_descs = [s["description"] for s in skills] + results = [] + + for i, skill in enumerate(skills): + other_descs = all_descs[:i] + all_descs[i+1:] + scores = score_description(skill["description"], other_descs) + scores["skill_name"] = skill["name"] + scores["description"] = skill["description"] + results.append(scores) + + # Sort by overall score ascending (worst first) + results.sort(key=lambda r: r["overall"]) + + # Apply grade filter + if grade_filter: + grade_order = {"F": 0, "D": 1, "C": 2, "B": 3, "A": 4} + cutoff = grade_order.get(grade_filter.upper(), 2) + results = [r for r in results if grade_order.get(r["grade"], 0) <= cutoff] + + # Grade distribution + dist = {"A": 0, "B": 0, "C": 0, "D": 0, "F": 0} + all_results = [] + for i, skill in enumerate(skills): + other_descs = all_descs[:i] + all_descs[i+1:] + scores = score_description(skill["description"], other_descs) + dist[scores["grade"]] = dist.get(scores["grade"], 0) + 1 + scores["skill_name"] = skill["name"] + scores["description"] = skill["description"] + all_results.append(scores) + + avg_score = round(sum(r["overall"] for r in all_results) / len(all_results), 1) if all_results else 0.0 + + if fmt == "json": + print(json.dumps({"skills_scanned": len(skills), "results": results, "avg_score": avg_score, "grades": dist}, indent=2)) + else: + print(f"\nTool Description Quality Scan — {datetime.now().strftime('%Y-%m-%d')}") + print("-" * 60) + print(f" {len(skills)} skills scanned | avg score: {avg_score}") + print(f" Grades: {dist['A']}xA {dist['B']}xB {dist['C']}xC {dist['D']}xD {dist['F']}xF") + print() + if not results: + print(" All skills above grade threshold.") + else: + for r in results: + icon = {"A": "+", "B": "+", "C": "~", "D": "!", "F": "x"} + print(f" {icon.get(r['grade'], '?')} [{r['grade']}] {r['overall']:>4} — {r['skill_name']}") + print(f" clarity={r['clarity']} spec={r['specificity']} kw={r['keyword_density']} " + f"uniq={r['uniqueness']} len={r['length_score']}") + # Truncate description for display + desc = r["description"] + if len(desc) > 80: + desc = desc[:77] + "..." + print(f" \"{desc}\"") + print() + + # Persist + state["last_scan_at"] = now + state["skill_scores"] = all_results + history = state.get("scan_history") or [] + history.insert(0, { + "scanned_at": now, "skills_scanned": len(skills), + "avg_score": avg_score, "grade_distribution": dist, + }) + state["scan_history"] = history[:MAX_HISTORY] + save_state(state) + + +def cmd_skill(state: dict, name: str, fmt: str) -> None: + skills = discover_skills() + target = None + for s in skills: + if s["name"] == name: + target = s + break + if not target: + print(f"Error: skill '{name}' not found.") + sys.exit(1) + + all_descs = [s["description"] for s in skills if s["name"] != name] + scores = score_description(target["description"], all_descs) + + if fmt == "json": + scores["skill_name"] = name + scores["description"] = target["description"] + print(json.dumps(scores, indent=2)) + else: + print(f"\nDeep Analysis: {name}") + print("-" * 50) + print(f" Description: \"{target['description']}\"") + print(f" Word count: {scores['word_count']}") + print() + print(f" Clarity: {scores['clarity']:>4}/10 {'||' * int(scores['clarity'])}") + print(f" Specificity: {scores['specificity']:>4}/10 {'||' * int(scores['specificity'])}") + print(f" Keyword density: {scores['keyword_density']:>4}/10 {'||' * int(scores['keyword_density'])}") + print(f" Uniqueness: {scores['uniqueness']:>4}/10 {'||' * int(scores['uniqueness'])}") + print(f" Length score: {scores['length_score']:>4}/10 {'||' * int(scores['length_score'])}") + print(f" ─────────────────────────") + print(f" Overall: {scores['overall']:>4}/10 Grade: {scores['grade']}") + print() + + # Show vague words found + tokens = tokenize(target["description"]) + vague_found = [t for t in tokens if t in VAGUE_WORDS] + if vague_found: + print(f" Vague words: {', '.join(set(vague_found))}") + + strong_found = [t for t in tokens if t in STRONG_VERBS | STRONG_NOUNS] + if strong_found: + print(f" Strong keywords: {', '.join(set(strong_found))}") + print() + + +def cmd_suggest(state: dict, name: str, fmt: str) -> None: + skills = discover_skills() + target = None + for s in skills: + if s["name"] == name: + target = s + break + if not target: + print(f"Error: skill '{name}' not found.") + sys.exit(1) + + all_descs = [s["description"] for s in skills if s["name"] != name] + current = score_description(target["description"], all_descs) + suggestions = suggest_rewrites(name, target["description"], all_descs) + + if fmt == "json": + print(json.dumps({"skill": name, "current_score": current["overall"], + "current_grade": current["grade"], "suggestions": suggestions}, indent=2)) + else: + print(f"\nRewrite Suggestions: {name}") + print("-" * 50) + print(f" Current: \"{target['description']}\"") + print(f" Score: {current['overall']} ({current['grade']})") + print() + for i, s in enumerate(suggestions, 1): + delta = s["predicted_score"] - current["overall"] + arrow = "+" if delta > 0 else "" + print(f" {i}. {s['strategy']}") + print(f" \"{s['rewrite']}\"") + print(f" Predicted: {s['predicted_score']} ({s['predicted_grade']}) [{arrow}{delta}]") + print() + + +def cmd_compare(desc_a: str, desc_b: str, fmt: str) -> None: + scores_a = score_description(desc_a, [desc_b]) + scores_b = score_description(desc_b, [desc_a]) + + if fmt == "json": + print(json.dumps({"a": {"description": desc_a, **scores_a}, + "b": {"description": desc_b, **scores_b}}, indent=2)) + else: + print(f"\nDescription Comparison") + print("-" * 50) + print(f" A: \"{desc_a}\"") + print(f" B: \"{desc_b}\"") + print() + metrics = ["clarity", "specificity", "keyword_density", "uniqueness", "length_score", "overall"] + labels = ["Clarity", "Specificity", "Keywords", "Uniqueness", "Length", "OVERALL"] + for label, metric in zip(labels, metrics): + va = scores_a[metric] + vb = scores_b[metric] + winner = "A" if va > vb else ("B" if vb > va else "=") + print(f" {label:12s} A={va:<5} B={vb:<5} {winner}") + print(f"\n Grade: A={scores_a['grade']} B={scores_b['grade']}") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_scan_at", "never") + print(f"\nTool Description Optimizer — Last scan: {last}") + history = state.get("scan_history") or [] + if history: + h = history[0] + print(f" {h.get('skills_scanned', 0)} skills | avg score: {h.get('avg_score', 0)}") + dist = h.get("grade_distribution", {}) + print(f" Grades: {dist.get('A',0)}xA {dist.get('B',0)}xB " + f"{dist.get('C',0)}xC {dist.get('D',0)}xD {dist.get('F',0)}xF") + scores = state.get("skill_scores") or [] + low = [s for s in scores if s.get("grade") in ("D", "F")] + if low: + print(f"\n Low-scoring ({len(low)}):") + for s in low[:5]: + print(f" [{s['grade']}] {s['overall']} — {s['skill_name']}") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Tool Description Optimizer") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", action="store_true", help="Score all installed skill descriptions") + group.add_argument("--skill", type=str, metavar="NAME", help="Deep analysis of a single skill") + group.add_argument("--suggest", type=str, metavar="NAME", help="Generate rewrite suggestions") + group.add_argument("--compare", nargs=2, metavar=("DESC_A", "DESC_B"), help="Compare two descriptions") + group.add_argument("--status", action="store_true", help="Last scan summary") + parser.add_argument("--grade", type=str, metavar="GRADE", help="Only show skills at or below this grade (A-F)") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.scan: + cmd_scan(state, args.grade, args.format) + elif args.skill: + cmd_skill(state, args.skill, args.format) + elif args.suggest: + cmd_suggest(state, args.suggest, args.format) + elif args.compare: + cmd_compare(args.compare[0], args.compare[1], args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From 3630f7d15907540161829a03d9ae5b5adee6f636 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:57:19 +0530 Subject: [PATCH 14/23] Add mcp-health-checker skill (#31) Monitors MCP server connections for health, latency, and availability. Probes stdio servers via JSON-RPC initialize and HTTP servers via GET. Detects stale connections, timeouts, unreachable servers. Cron runs every 6 hours. Companion script: check.py with --ping, --config, --status, --history commands. Inspired by OpenLobster's MCP connection health monitoring. Co-authored-by: Claude Opus 4.6 --- .../mcp-health-checker/SKILL.md | 112 ++++ .../mcp-health-checker/STATE_SCHEMA.yaml | 31 ++ .../mcp-health-checker/check.py | 514 ++++++++++++++++++ .../mcp-health-checker/example-state.yaml | 86 +++ 4 files changed, 743 insertions(+) create mode 100644 skills/openclaw-native/mcp-health-checker/SKILL.md create mode 100644 skills/openclaw-native/mcp-health-checker/STATE_SCHEMA.yaml create mode 100755 skills/openclaw-native/mcp-health-checker/check.py create mode 100644 skills/openclaw-native/mcp-health-checker/example-state.yaml diff --git a/skills/openclaw-native/mcp-health-checker/SKILL.md b/skills/openclaw-native/mcp-health-checker/SKILL.md new file mode 100644 index 0000000..c66abfe --- /dev/null +++ b/skills/openclaw-native/mcp-health-checker/SKILL.md @@ -0,0 +1,112 @@ +--- +name: mcp-health-checker +version: "1.0" +category: openclaw-native +description: Monitors MCP server connections for health, latency, and availability — detects stale connections, timeouts, and unreachable servers before they cause silent tool failures. +stateful: true +cron: "0 */6 * * *" +--- + +# MCP Health Checker + +## What it does + +MCP (Model Context Protocol) servers are how OpenClaw connects to external tools — but connections go stale silently. A crashed MCP server doesn't throw an error until the agent tries to use it, causing confusing mid-task failures. + +MCP Health Checker proactively monitors all configured MCP connections. It pings servers, measures latency, tracks uptime history, and alerts you before a stale connection causes a problem. + +Inspired by OpenLobster's MCP connection health monitoring and OAuth 2.1+PKCE token refresh tracking. + +## When to invoke + +- Automatically every 6 hours (cron) — silent background health check +- Manually before starting a task that depends on MCP tools +- When an MCP tool call fails unexpectedly — diagnose the connection +- After restarting MCP servers — verify all connections restored + +## Health checks performed + +| Check | What it tests | Severity on failure | +|---|---|---| +| REACHABLE | Server responds to connection probe | CRITICAL | +| LATENCY | Response time under threshold (default: 5s) | HIGH | +| STALE | Connection age exceeds max (default: 24h) | HIGH | +| TOOL_COUNT | Server exposes expected number of tools | MEDIUM | +| CONFIG_VALID | MCP config entry has required fields | MEDIUM | +| AUTH_EXPIRY | OAuth/API token approaching expiration | HIGH | + +## How to use + +```bash +python3 check.py --ping # Ping all configured MCP servers +python3 check.py --ping --server # Ping a specific server +python3 check.py --ping --timeout 3 # Custom timeout in seconds +python3 check.py --status # Last check summary from state +python3 check.py --history # Show past check results +python3 check.py --config # Validate MCP config entries +python3 check.py --format json # Machine-readable output +``` + +## Cron wakeup behaviour + +Every 6 hours: + +1. Read MCP server configuration from `~/.openclaw/config/` (YAML/JSON) +2. For each configured server: + - Attempt connection probe (TCP or HTTP depending on transport) + - Measure response latency + - Check connection age against staleness threshold + - Verify tool listing matches expected count (if tracked) + - Check auth token expiry (if applicable) +3. Update state with per-server health records +4. Print summary: healthy / degraded / unreachable counts +5. Exit 1 if any CRITICAL findings + +## Procedure + +**Step 1 — Run a health check** + +```bash +python3 check.py --ping +``` + +Review the output. Healthy servers show a green check. Degraded servers show latency warnings. Unreachable servers show a critical alert. + +**Step 2 — Diagnose a specific server** + +```bash +python3 check.py --ping --server filesystem +``` + +Detailed output for a single server: latency, last seen, tool count, auth status. + +**Step 3 — Validate configuration** + +```bash +python3 check.py --config +``` + +Checks that all MCP config entries have the required fields (`command`, `args` or `url` depending on transport type). + +**Step 4 — Review history** + +```bash +python3 check.py --history +``` + +Shows uptime trends over the last 20 checks. Spot servers that are intermittently failing. + +## State + +Per-server health records and check history stored in `~/.openclaw/skill-state/mcp-health-checker/state.yaml`. + +Fields: `last_check_at`, `servers` list, `check_history`. + +## Notes + +- Does not modify MCP configuration — read-only monitoring +- Connection probes use the same transport as the MCP server (stdio subprocess spawn or HTTP GET) +- For stdio servers: probes verify the process can start and respond to `initialize` +- For HTTP/SSE servers: probes send a health-check HTTP request +- Latency threshold configurable via `--timeout` (default: 5s) +- Staleness threshold configurable via `--max-age` (default: 24h) diff --git a/skills/openclaw-native/mcp-health-checker/STATE_SCHEMA.yaml b/skills/openclaw-native/mcp-health-checker/STATE_SCHEMA.yaml new file mode 100644 index 0000000..a756c7e --- /dev/null +++ b/skills/openclaw-native/mcp-health-checker/STATE_SCHEMA.yaml @@ -0,0 +1,31 @@ +version: "1.0" +description: MCP server health records, per-server status, and check history. +fields: + last_check_at: + type: datetime + servers: + type: list + description: Per-server health status from the most recent check + items: + name: { type: string, description: "Server name from config" } + transport: { type: string, description: "stdio or http" } + status: { type: enum, values: [healthy, degraded, unreachable, unknown] } + latency_ms: { type: integer, description: "Response time in milliseconds" } + last_seen_at: { type: datetime, description: "Last successful probe" } + tool_count: { type: integer, description: "Number of tools exposed" } + findings: + type: list + items: + check: { type: string } + severity: { type: string } + detail: { type: string } + checked_at: { type: datetime } + check_history: + type: list + description: Rolling log of past checks (last 20) + items: + checked_at: { type: datetime } + servers_checked: { type: integer } + healthy: { type: integer } + degraded: { type: integer } + unreachable: { type: integer } diff --git a/skills/openclaw-native/mcp-health-checker/check.py b/skills/openclaw-native/mcp-health-checker/check.py new file mode 100755 index 0000000..57757be --- /dev/null +++ b/skills/openclaw-native/mcp-health-checker/check.py @@ -0,0 +1,514 @@ +#!/usr/bin/env python3 +""" +MCP Health Checker for openclaw-superpowers. + +Monitors MCP server connections for health, latency, and availability. + +Usage: + python3 check.py --ping + python3 check.py --ping --server + python3 check.py --ping --timeout 3 + python3 check.py --config + python3 check.py --status + python3 check.py --history + python3 check.py --format json +""" + +import argparse +import json +import os +import subprocess +import sys +import time +from datetime import datetime, timedelta +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "mcp-health-checker" / "state.yaml" +MAX_HISTORY = 20 + +# MCP config locations to search +MCP_CONFIG_PATHS = [ + OPENCLAW_DIR / "config" / "mcp.yaml", + OPENCLAW_DIR / "config" / "mcp.json", + OPENCLAW_DIR / "mcp.yaml", + OPENCLAW_DIR / "mcp.json", + Path.home() / ".config" / "openclaw" / "mcp.yaml", + Path.home() / ".config" / "openclaw" / "mcp.json", +] + +DEFAULT_TIMEOUT = 5 # seconds +DEFAULT_MAX_AGE = 24 # hours + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"servers": [], "check_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── MCP config discovery ──────────────────────────────────────────────────── + +def find_mcp_config() -> tuple[Path | None, dict]: + """Find and parse MCP configuration.""" + for config_path in MCP_CONFIG_PATHS: + if not config_path.exists(): + continue + try: + text = config_path.read_text() + if config_path.suffix == ".json": + data = json.loads(text) + elif HAS_YAML: + data = yaml.safe_load(text) or {} + else: + continue + return config_path, data + except Exception: + continue + return None, {} + + +def extract_servers(config: dict) -> list[dict]: + """Extract server definitions from MCP config.""" + servers = [] + # Support both flat and nested formats + mcp_servers = config.get("mcpServers") or config.get("servers") or config + if isinstance(mcp_servers, dict): + for name, defn in mcp_servers.items(): + if not isinstance(defn, dict): + continue + transport = "stdio" + if "url" in defn: + transport = "http" + elif "command" in defn: + transport = "stdio" + servers.append({ + "name": name, + "transport": transport, + "command": defn.get("command"), + "args": defn.get("args", []), + "url": defn.get("url"), + "env": defn.get("env", {}), + }) + return servers + + +# ── Health checks ──────────────────────────────────────────────────────────── + +def probe_stdio_server(server: dict, timeout: int) -> dict: + """Probe a stdio MCP server by attempting to start and initialize it.""" + command = server.get("command") + args = server.get("args", []) + if not command: + return { + "status": "unreachable", + "latency_ms": 0, + "findings": [{"check": "CONFIG_VALID", "severity": "MEDIUM", + "detail": "No command specified for stdio server"}], + } + + # Build the initialize JSON-RPC request + init_request = json.dumps({ + "jsonrpc": "2.0", + "id": 1, + "method": "initialize", + "params": { + "protocolVersion": "2024-11-05", + "capabilities": {}, + "clientInfo": {"name": "mcp-health-checker", "version": "1.0"}, + } + }) + "\n" + + start = time.monotonic() + try: + env = os.environ.copy() + env.update(server.get("env", {})) + proc = subprocess.Popen( + [command] + args, + stdin=subprocess.PIPE, + stdout=subprocess.PIPE, + stderr=subprocess.PIPE, + env=env, + ) + stdout, stderr = proc.communicate( + input=init_request.encode(), + timeout=timeout, + ) + elapsed_ms = int((time.monotonic() - start) * 1000) + + if proc.returncode is not None and proc.returncode != 0 and not stdout: + return { + "status": "unreachable", + "latency_ms": elapsed_ms, + "findings": [{"check": "REACHABLE", "severity": "CRITICAL", + "detail": f"Process exited with code {proc.returncode}"}], + } + + # Try to parse response + findings = [] + tool_count = 0 + try: + response = json.loads(stdout.decode().strip().split("\n")[0]) + if "result" in response: + caps = response["result"].get("capabilities", {}) + if "tools" in caps: + tool_count = -1 # Has tools capability but count unknown until list + except (json.JSONDecodeError, IndexError): + findings.append({"check": "REACHABLE", "severity": "HIGH", + "detail": "Server responded but output not valid JSON-RPC"}) + + # Check latency + status = "healthy" + if elapsed_ms > timeout * 1000: + findings.append({"check": "LATENCY", "severity": "HIGH", + "detail": f"Response time {elapsed_ms}ms exceeds {timeout}s threshold"}) + status = "degraded" + elif elapsed_ms > (timeout * 1000) // 2: + findings.append({"check": "LATENCY", "severity": "MEDIUM", + "detail": f"Response time {elapsed_ms}ms approaching threshold"}) + + if findings and status == "healthy": + status = "degraded" + + return { + "status": status, + "latency_ms": elapsed_ms, + "tool_count": tool_count, + "findings": findings, + } + + except subprocess.TimeoutExpired: + elapsed_ms = int((time.monotonic() - start) * 1000) + try: + proc.kill() + proc.wait(timeout=2) + except Exception: + pass + return { + "status": "unreachable", + "latency_ms": elapsed_ms, + "findings": [{"check": "LATENCY", "severity": "CRITICAL", + "detail": f"Server did not respond within {timeout}s"}], + } + except FileNotFoundError: + return { + "status": "unreachable", + "latency_ms": 0, + "findings": [{"check": "REACHABLE", "severity": "CRITICAL", + "detail": f"Command not found: {command}"}], + } + except Exception as e: + return { + "status": "unreachable", + "latency_ms": 0, + "findings": [{"check": "REACHABLE", "severity": "CRITICAL", + "detail": f"Probe failed: {str(e)[:100]}"}], + } + + +def probe_http_server(server: dict, timeout: int) -> dict: + """Probe an HTTP/SSE MCP server via HTTP GET.""" + url = server.get("url") + if not url: + return { + "status": "unreachable", + "latency_ms": 0, + "findings": [{"check": "CONFIG_VALID", "severity": "MEDIUM", + "detail": "No URL specified for HTTP server"}], + } + + start = time.monotonic() + try: + import urllib.request + req = urllib.request.Request(url, method="GET") + req.add_header("User-Agent", "mcp-health-checker/1.0") + with urllib.request.urlopen(req, timeout=timeout) as resp: + elapsed_ms = int((time.monotonic() - start) * 1000) + status_code = resp.status + + findings = [] + if status_code >= 400: + findings.append({"check": "REACHABLE", "severity": "CRITICAL", + "detail": f"HTTP {status_code} response"}) + return {"status": "unreachable", "latency_ms": elapsed_ms, "findings": findings} + + if elapsed_ms > timeout * 1000: + findings.append({"check": "LATENCY", "severity": "HIGH", + "detail": f"Response time {elapsed_ms}ms exceeds threshold"}) + + status = "degraded" if findings else "healthy" + return {"status": status, "latency_ms": elapsed_ms, "findings": findings} + + except Exception as e: + elapsed_ms = int((time.monotonic() - start) * 1000) + return { + "status": "unreachable", + "latency_ms": elapsed_ms, + "findings": [{"check": "REACHABLE", "severity": "CRITICAL", + "detail": f"Connection failed: {str(e)[:100]}"}], + } + + +def check_staleness(server_name: str, state: dict, max_age_hours: int) -> list[dict]: + """Check if a server connection is stale based on last seen time.""" + findings = [] + prev_servers = state.get("servers") or [] + for prev in prev_servers: + if prev.get("name") == server_name and prev.get("last_seen_at"): + try: + last_seen = datetime.fromisoformat(prev["last_seen_at"]) + age = datetime.now() - last_seen + if age > timedelta(hours=max_age_hours): + findings.append({ + "check": "STALE", + "severity": "HIGH", + "detail": f"Last successful probe was {age.total_seconds()/3600:.1f}h ago " + f"(threshold: {max_age_hours}h)", + }) + except (ValueError, TypeError): + pass + return findings + + +def validate_config_entry(server: dict) -> list[dict]: + """Validate a server config entry has required fields.""" + findings = [] + if server["transport"] == "stdio": + if not server.get("command"): + findings.append({"check": "CONFIG_VALID", "severity": "MEDIUM", + "detail": "Missing 'command' field for stdio server"}) + elif server["transport"] == "http": + if not server.get("url"): + findings.append({"check": "CONFIG_VALID", "severity": "MEDIUM", + "detail": "Missing 'url' field for HTTP server"}) + return findings + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_ping(state: dict, server_filter: str, timeout: int, max_age: int, fmt: str) -> None: + config_path, config = find_mcp_config() + now = datetime.now().isoformat() + + if not config_path: + print("No MCP configuration found. Searched:") + for p in MCP_CONFIG_PATHS: + print(f" {p}") + print("\nCreate an MCP config to enable health checking.") + sys.exit(1) + + servers = extract_servers(config) + if server_filter: + servers = [s for s in servers if s["name"] == server_filter] + if not servers: + print(f"Error: server '{server_filter}' not found in config.") + sys.exit(1) + + results = [] + healthy = degraded = unreachable = 0 + + for server in servers: + # Probe based on transport + if server["transport"] == "http": + probe = probe_http_server(server, timeout) + else: + probe = probe_stdio_server(server, timeout) + + # Add staleness check + stale_findings = check_staleness(server["name"], state, max_age) + all_findings = probe.get("findings", []) + stale_findings + + # Determine final status + status = probe["status"] + if status == "healthy" and stale_findings: + status = "degraded" + + last_seen = now if status == "healthy" else None + # Preserve previous last_seen if current probe failed + if not last_seen: + for prev in (state.get("servers") or []): + if prev.get("name") == server["name"]: + last_seen = prev.get("last_seen_at") + break + + result = { + "name": server["name"], + "transport": server["transport"], + "status": status, + "latency_ms": probe.get("latency_ms", 0), + "last_seen_at": last_seen, + "tool_count": probe.get("tool_count", 0), + "findings": all_findings, + "checked_at": now, + } + results.append(result) + + if status == "healthy": + healthy += 1 + elif status == "degraded": + degraded += 1 + else: + unreachable += 1 + + if fmt == "json": + print(json.dumps({ + "config_path": str(config_path), + "servers_checked": len(results), + "healthy": healthy, "degraded": degraded, "unreachable": unreachable, + "servers": results, + }, indent=2)) + else: + print(f"\nMCP Health Check — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("-" * 55) + print(f" Config: {config_path}") + print(f" {len(results)} servers | {healthy} healthy | {degraded} degraded | {unreachable} unreachable") + print() + for r in results: + if r["status"] == "healthy": + icon = "+" + elif r["status"] == "degraded": + icon = "!" + else: + icon = "x" + print(f" {icon} [{r['status'].upper():>11}] {r['name']} ({r['transport']}) — {r['latency_ms']}ms") + for f in r.get("findings", []): + print(f" [{f['severity']}] {f['check']}: {f['detail']}") + print() + + # Persist + state["last_check_at"] = now + state["servers"] = results + history = state.get("check_history") or [] + history.insert(0, { + "checked_at": now, "servers_checked": len(results), + "healthy": healthy, "degraded": degraded, "unreachable": unreachable, + }) + state["check_history"] = history[:MAX_HISTORY] + save_state(state) + + sys.exit(1 if unreachable > 0 else 0) + + +def cmd_config(fmt: str) -> None: + config_path, config = find_mcp_config() + if not config_path: + print("No MCP configuration found.") + sys.exit(1) + + servers = extract_servers(config) + issues = [] + for server in servers: + findings = validate_config_entry(server) + if findings: + issues.append({"server": server["name"], "findings": findings}) + + if fmt == "json": + print(json.dumps({ + "config_path": str(config_path), + "servers": len(servers), + "issues": issues, + }, indent=2)) + else: + print(f"\nMCP Config Validation — {config_path}") + print("-" * 50) + print(f" {len(servers)} servers configured") + print() + if not issues: + print(" All config entries valid.") + else: + for issue in issues: + print(f" ! {issue['server']}:") + for f in issue["findings"]: + print(f" [{f['severity']}] {f['detail']}") + print() + for server in servers: + print(f" {server['name']}: transport={server['transport']}", end="") + if server.get("command"): + print(f" cmd={server['command']}", end="") + if server.get("url"): + print(f" url={server['url']}", end="") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_check_at", "never") + print(f"\nMCP Health Checker — Last check: {last}") + servers = state.get("servers") or [] + if servers: + healthy = sum(1 for s in servers if s.get("status") == "healthy") + degraded = sum(1 for s in servers if s.get("status") == "degraded") + unreachable = sum(1 for s in servers if s.get("status") == "unreachable") + print(f" {len(servers)} servers | {healthy} healthy | {degraded} degraded | {unreachable} unreachable") + for s in servers: + icon = {"healthy": "+", "degraded": "!", "unreachable": "x"}.get(s.get("status", ""), "?") + print(f" {icon} {s['name']}: {s.get('status', 'unknown')} ({s.get('latency_ms', 0)}ms)") + print() + + +def cmd_history(state: dict, fmt: str) -> None: + history = state.get("check_history") or [] + if fmt == "json": + print(json.dumps({"check_history": history}, indent=2)) + else: + print(f"\nMCP Health Check History") + print("-" * 50) + if not history: + print(" No check history yet.") + else: + for h in history[:10]: + total = h.get("servers_checked", 0) + healthy = h.get("healthy", 0) + degraded = h.get("degraded", 0) + unreachable = h.get("unreachable", 0) + pct = round(healthy / total * 100) if total else 0 + ts = h.get("checked_at", "?")[:16] + bar = "=" * (pct // 10) + "-" * (10 - pct // 10) + print(f" {ts} [{bar}] {pct}% healthy ({healthy}/{total})") + print() + + +def main(): + parser = argparse.ArgumentParser(description="MCP Health Checker") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--ping", action="store_true", help="Ping all configured MCP servers") + group.add_argument("--config", action="store_true", help="Validate MCP config entries") + group.add_argument("--status", action="store_true", help="Last check summary from state") + group.add_argument("--history", action="store_true", help="Show past check results") + parser.add_argument("--server", type=str, metavar="NAME", help="Check a specific server only") + parser.add_argument("--timeout", type=int, default=DEFAULT_TIMEOUT, help="Timeout in seconds (default: 5)") + parser.add_argument("--max-age", type=int, default=DEFAULT_MAX_AGE, help="Max connection age in hours (default: 24)") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.ping: + cmd_ping(state, args.server, args.timeout, args.max_age, args.format) + elif args.config: + cmd_config(args.format) + elif args.status: + cmd_status(state) + elif args.history: + cmd_history(state, args.format) + + +if __name__ == "__main__": + main() diff --git a/skills/openclaw-native/mcp-health-checker/example-state.yaml b/skills/openclaw-native/mcp-health-checker/example-state.yaml new file mode 100644 index 0000000..1005202 --- /dev/null +++ b/skills/openclaw-native/mcp-health-checker/example-state.yaml @@ -0,0 +1,86 @@ +# Example runtime state for mcp-health-checker +last_check_at: "2026-03-16T12:00:08.554000" +servers: + - name: filesystem + transport: stdio + status: healthy + latency_ms: 120 + last_seen_at: "2026-03-16T12:00:02.000000" + tool_count: 11 + findings: [] + checked_at: "2026-03-16T12:00:02.000000" + - name: github + transport: stdio + status: healthy + latency_ms: 340 + last_seen_at: "2026-03-16T12:00:04.000000" + tool_count: 18 + findings: [] + checked_at: "2026-03-16T12:00:04.000000" + - name: web-search + transport: http + status: degraded + latency_ms: 4200 + last_seen_at: "2026-03-16T12:00:08.000000" + tool_count: 3 + findings: + - check: LATENCY + severity: MEDIUM + detail: "Response time 4200ms approaching threshold" + checked_at: "2026-03-16T12:00:08.000000" + - name: database + transport: stdio + status: unreachable + latency_ms: 0 + last_seen_at: "2026-03-15T06:00:00.000000" + tool_count: 0 + findings: + - check: REACHABLE + severity: CRITICAL + detail: "Command not found: pg-mcp-server" + - check: STALE + severity: HIGH + detail: "Last successful probe was 30.0h ago (threshold: 24h)" + checked_at: "2026-03-16T12:00:08.000000" +check_history: + - checked_at: "2026-03-16T12:00:08.554000" + servers_checked: 4 + healthy: 2 + degraded: 1 + unreachable: 1 + - checked_at: "2026-03-16T06:00:05.000000" + servers_checked: 4 + healthy: 3 + degraded: 1 + unreachable: 0 + - checked_at: "2026-03-16T00:00:04.000000" + servers_checked: 4 + healthy: 4 + degraded: 0 + unreachable: 0 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Cron runs every 6 hours: python3 check.py --ping +# +# MCP Health Check — 2026-03-16 12:00 +# ─────────────────────────────────────────────────────── +# Config: /Users/you/.openclaw/config/mcp.yaml +# 4 servers | 2 healthy | 1 degraded | 1 unreachable +# +# + [ HEALTHY] filesystem (stdio) — 120ms +# +# + [ HEALTHY] github (stdio) — 340ms +# +# ! [ DEGRADED] web-search (http) — 4200ms +# [MEDIUM] LATENCY: Response time 4200ms approaching threshold +# +# x [UNREACHABLE] database (stdio) — 0ms +# [CRITICAL] REACHABLE: Command not found: pg-mcp-server +# [HIGH] STALE: Last successful probe was 30.0h ago (threshold: 24h) +# +# python3 check.py --history +# +# MCP Health Check History +# ────────────────────────────────────────────────── +# 2026-03-16T12:00 [=====-----] 50% healthy (2/4) +# 2026-03-16T06:00 [=======---] 75% healthy (3/4) +# 2026-03-16T00:00 [==========] 100% healthy (4/4) From 5769116f23a116af503aa370c827778384e05c5b Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 00:58:22 +0530 Subject: [PATCH 15/23] =?UTF-8?q?Update=20README:=2040=20=E2=86=92=2044=20?= =?UTF-8?q?skills=20(4=20OpenLobster-inspired=20additions)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add memory-graph-builder, config-encryption-auditor, tool-description-optimizer, and mcp-health-checker to the OpenClaw-Native table. Update security section to 6 skills. Update companion scripts list. Co-Authored-By: Claude Opus 4.6 --- README.md | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 4aebc7e..c3a7017 100644 --- a/README.md +++ b/README.md @@ -64,7 +64,7 @@ Methodology skills that work in any runtime. Adapted from [obra/superpowers](htt | `skill-conflict-detector` | Detects name shadowing and description-overlap conflicts between installed skills | `detect.py` | | `skill-portability-checker` | Validates OS/binary dependencies in companion scripts; catches non-portable calls | `check.py` | -### OpenClaw-Native (24 skills) +### OpenClaw-Native (28 skills) Skills that require OpenClaw's persistent runtime — cron scheduling, session state, or long-running execution. Not useful in session-based tools. @@ -94,6 +94,10 @@ Skills that require OpenClaw's persistent runtime — cron scheduling, session s | `skill-compatibility-checker` | Checks installed skills against the current OpenClaw version for feature compatibility | — | ✓ | `check.py` | | `heartbeat-governor` | Enforces per-skill execution budgets for cron skills; auto-pauses runaway skills | every hour | ✓ | `governor.py` | | `community-skill-radar` | Scans Reddit for OpenClaw pain points and feature requests; writes prioritized PROPOSALS.md | every 3 days | ✓ | `radar.py` | +| `memory-graph-builder` | Parses MEMORY.md into a knowledge graph; detects duplicates, contradictions, and stale entries; generates compressed digest | daily 10pm | ✓ | `graph.py` | +| `config-encryption-auditor` | Scans config directories for plaintext API keys, tokens, and world-readable permissions | Sundays 9am | ✓ | `audit.py` | +| `tool-description-optimizer` | Scores skill descriptions for trigger quality — clarity, specificity, keyword density — and suggests rewrites | — | ✓ | `optimize.py` | +| `mcp-health-checker` | Monitors MCP server connections for health, latency, and availability; detects stale connections | every 6h | ✓ | `check.py` | ### Community (1 skill) @@ -113,7 +117,7 @@ Stateful skills commit a `STATE_SCHEMA.yaml` defining the shape of their runtime Skills marked with a script in the table above ship a small executable alongside their `SKILL.md`: -- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. +- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`, `graph.py`, `optimize.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. - **`vet.sh`** — Pure bash scanner; runs on any system with grep. - Each script supports `--help` and prints a human-readable summary. JSON output available where useful (`--format json`). Dry-run mode available on scripts that make changes. - See the `example-state.yaml` in each skill directory for sample state and a commented walkthrough of the skill's cron behaviour. @@ -122,7 +126,7 @@ Skills marked with a script in the table above ship a small executable alongside ## Security skills at a glance -Five skills address the documented top security risks for OpenClaw agents: +Six skills address the documented top security risks for OpenClaw agents: | Threat | Skill | How | |---|---|---| @@ -131,6 +135,7 @@ Five skills address the documented top security risks for OpenClaw agents: | Agent takes destructive action without confirmation | `dangerous-action-guard` | Pre-execution gate with 5-min expiry window and full audit trail | | Post-install skill tampering or credential injection | `installed-skill-auditor` | Weekly content-hash drift detection; INJECTION / CREDENTIAL / EXFILTRATION checks | | Silent skill loading failures hiding broken skills | `skill-doctor` | 6 diagnostic checks per skill; surfaces every load-time failure before it disappears | +| Plaintext API keys and tokens in config files | `config-encryption-auditor` | Scans for 8 API key patterns + 3 token patterns; auto-fixes permissions; suggests env var migration | --- From 1bf65287d95e61eb69308ff40e28420bbb52f3b8 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 01:01:42 +0530 Subject: [PATCH 16/23] Add memory-graph-builder: structured knowledge graph from MEMORY.md (#28) Parses flat MEMORY.md into nodes with categories, entities, and typed relationships. Detects duplicates (Jaccard >0.7), contradictions, and stale entries. Generates compressed memory digest saving 30-60% tokens. Inspired by OpenLobster's Neo4j graph memory. Cron: nightly 10pm. Co-authored-by: Claude Sonnet 4.6 --- .../memory-graph-builder/SKILL.md | 144 +++++ .../memory-graph-builder/STATE_SCHEMA.yaml | 54 ++ .../memory-graph-builder/example-state.yaml | 43 ++ .../memory-graph-builder/graph.py | 545 ++++++++++++++++++ 4 files changed, 786 insertions(+) create mode 100644 skills/openclaw-native/memory-graph-builder/SKILL.md create mode 100644 skills/openclaw-native/memory-graph-builder/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/memory-graph-builder/example-state.yaml create mode 100755 skills/openclaw-native/memory-graph-builder/graph.py diff --git a/skills/openclaw-native/memory-graph-builder/SKILL.md b/skills/openclaw-native/memory-graph-builder/SKILL.md new file mode 100644 index 0000000..0868901 --- /dev/null +++ b/skills/openclaw-native/memory-graph-builder/SKILL.md @@ -0,0 +1,144 @@ +--- +name: memory-graph-builder +version: "1.0" +category: openclaw-native +description: Parses OpenClaw's flat MEMORY.md into a structured knowledge graph — detects duplicates, contradictions, and stale entries, then builds a compressed memory digest optimized for system prompt injection. +stateful: true +cron: "0 22 * * *" +--- + +# Memory Graph Builder + +## What it does + +OpenClaw stores agent memory in a flat `MEMORY.md` file — one line per fact, no structure, no relationships. This works until your agent has 200+ memories and half of them are duplicates, three contradict each other, and the whole file costs 4,000 tokens every session. + +Memory Graph Builder treats MEMORY.md as a raw data source and builds a structured knowledge graph on top of it. Each memory becomes a node with typed relationships to other nodes. The graph enables: + +- **Duplicate detection** — "User prefers dark mode" and "User likes dark theme" are the same fact +- **Contradiction detection** — "User uses Python 3.8" vs "User uses Python 3.12" +- **Staleness detection** — Facts older than a configurable threshold that haven't been referenced +- **Memory digest** — A compressed, relationship-aware summary that replaces raw MEMORY.md in the system prompt, saving 30-60% tokens + +Inspired by OpenLobster's Neo4j-backed graph memory system, adapted to work on top of OpenClaw's existing MEMORY.md without requiring a database. + +## When to invoke + +- Automatically, nightly at 10pm (cron) +- After bulk memory additions (e.g., after project-onboarding) +- When the agent's context initialisation feels slow (memory bloat) +- Manually to audit memory quality + +## Graph structure + +Each memory line becomes a node: + +```yaml +nodes: + - id: "mem_001" + text: "User prefers Python for backend work" + category: preference # preference | fact | project | person | tool | config + entities: ["user", "python", "backend"] + added_at: "2026-03-01" + last_referenced: "2026-03-15" + confidence: 0.9 +edges: + - from: "mem_001" + to: "mem_014" + relation: related_to # related_to | contradicts | supersedes | depends_on +``` + +## How to use + +```bash +python3 graph.py --build # Parse MEMORY.md, build graph +python3 graph.py --duplicates # Show duplicate clusters +python3 graph.py --contradictions # Show contradicting pairs +python3 graph.py --stale --days 30 # Show memories not referenced in 30 days +python3 graph.py --digest # Generate compressed memory digest +python3 graph.py --digest --max-tokens 1500 # Digest with token budget +python3 graph.py --prune --dry-run # Show what would be removed +python3 graph.py --prune # Remove duplicates + stale entries +python3 graph.py --stats # Graph statistics +python3 graph.py --status # Last build summary +python3 graph.py --format json +``` + +## Cron wakeup behaviour + +Nightly at 10pm: + +1. Read MEMORY.md +2. Rebuild graph (incremental — only re-processes new/changed lines) +3. Detect duplicates and contradictions +4. Flag stale entries (>30 days unreferenced by default) +5. Generate fresh memory digest +6. Write digest to `~/.openclaw/workspace/memory-digest.md` +7. Log summary to state + +## Memory digest + +The digest is a compressed representation of the knowledge graph optimized for LLM consumption. Instead of dumping every raw line, it: + +- Groups related memories by category +- Merges duplicate facts into single entries +- Marks contradictions with `[CONFLICT]` so the agent can resolve them +- Omits stale entries below a confidence threshold +- Respects a configurable max-token budget + +Example digest output: + +```markdown +## Preferences +- Prefers Python for backend, TypeScript for frontend +- Dark mode everywhere; compact UI layouts +- Commit messages: imperative mood, max 72 chars + +## Active Projects +- openclaw-superpowers: skill library, 40 skills, MIT license +- personal-site: Next.js 14, deployed on Vercel + +## People +- Alice (teammate): works on auth, prefers Go + +## Conflicts (needs resolution) +- [CONFLICT] Python version: "3.8" vs "3.12" — ask user to clarify +``` + +## Procedure + +**Step 1 — Build the graph** + +```bash +python3 graph.py --build +``` + +**Step 2 — Review duplicates and contradictions** + +```bash +python3 graph.py --duplicates +python3 graph.py --contradictions +``` + +Fix contradictions by editing MEMORY.md directly or asking the agent to clarify. + +**Step 3 — Prune stale entries** + +```bash +python3 graph.py --prune --dry-run +python3 graph.py --prune +``` + +**Step 4 — Generate and use the digest** + +```bash +python3 graph.py --digest --max-tokens 1500 +``` + +Point OpenClaw's memory injection at `~/.openclaw/workspace/memory-digest.md` instead of raw MEMORY.md. + +## State + +Graph structure, digest cache, and audit history stored in `~/.openclaw/skill-state/memory-graph-builder/state.yaml`. + +Fields: `last_build_at`, `node_count`, `edge_count`, `duplicate_count`, `contradiction_count`, `stale_count`, `digest_tokens`, `build_history`. diff --git a/skills/openclaw-native/memory-graph-builder/STATE_SCHEMA.yaml b/skills/openclaw-native/memory-graph-builder/STATE_SCHEMA.yaml new file mode 100644 index 0000000..3a09c21 --- /dev/null +++ b/skills/openclaw-native/memory-graph-builder/STATE_SCHEMA.yaml @@ -0,0 +1,54 @@ +version: "1.0" +description: Knowledge graph built from MEMORY.md — nodes, edges, digest, and audit metrics. +fields: + last_build_at: + type: datetime + node_count: + type: integer + default: 0 + edge_count: + type: integer + default: 0 + duplicate_count: + type: integer + default: 0 + contradiction_count: + type: integer + default: 0 + stale_count: + type: integer + default: 0 + digest_tokens: + type: integer + default: 0 + nodes: + type: list + description: All memory nodes in the graph + items: + id: { type: string } + text: { type: string } + category: { type: enum, values: [preference, fact, project, person, tool, config, other] } + entities: { type: list, items: { type: string } } + added_at: { type: string } + last_referenced: { type: string } + confidence: { type: float } + is_duplicate_of: { type: string } + is_stale: { type: boolean } + edges: + type: list + description: Relationships between memory nodes + items: + from: { type: string } + to: { type: string } + relation: { type: enum, values: [related_to, contradicts, supersedes, depends_on, duplicate_of] } + weight: { type: float } + build_history: + type: list + description: Rolling log of graph builds (last 20) + items: + built_at: { type: datetime } + node_count: { type: integer } + duplicates_found: { type: integer } + contradictions_found: { type: integer } + stale_found: { type: integer } + digest_tokens: { type: integer } diff --git a/skills/openclaw-native/memory-graph-builder/example-state.yaml b/skills/openclaw-native/memory-graph-builder/example-state.yaml new file mode 100644 index 0000000..2f6c93d --- /dev/null +++ b/skills/openclaw-native/memory-graph-builder/example-state.yaml @@ -0,0 +1,43 @@ +# Example runtime state for memory-graph-builder +last_build_at: "2026-03-15T22:00:12.000000" +node_count: 48 +edge_count: 15 +duplicate_count: 4 +contradiction_count: 1 +stale_count: 3 +digest_tokens: 420 +build_history: + - built_at: "2026-03-15T22:00:12.000000" + node_count: 48 + duplicates_found: 4 + contradictions_found: 1 + stale_found: 3 + digest_tokens: 420 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Nightly cron runs: python3 graph.py --build +# +# Memory Graph Builder — 2026-03-15 22:00 +# ──────────────────────────────────────────────────────────────── +# Memory lines : 52 +# Nodes : 48 +# Edges : 15 +# Duplicates : 4 +# Contradictions : 1 +# Stale : 3 +# Digest tokens : ~420 +# +# Digest written to: ~/.openclaw/workspace/memory-digest.md +# +# python3 graph.py --duplicates +# DUP: "User prefers dark mode for all applications" +# ORIG: "User likes dark theme everywhere" +# +# python3 graph.py --contradictions +# A: "User uses Python 3.8 for backend services" +# B: "User recently upgraded to Python 3.12" +# → Resolve by editing MEMORY.md +# +# python3 graph.py --prune --dry-run +# Dry run — would prune 7 entries: +# [duplicate] "User prefers dark mode for all applications" +# [stale] "Working on migration to React 17" diff --git a/skills/openclaw-native/memory-graph-builder/graph.py b/skills/openclaw-native/memory-graph-builder/graph.py new file mode 100755 index 0000000..8711cc6 --- /dev/null +++ b/skills/openclaw-native/memory-graph-builder/graph.py @@ -0,0 +1,545 @@ +#!/usr/bin/env python3 +""" +Memory Graph Builder for openclaw-superpowers. + +Parses MEMORY.md into a structured knowledge graph. Detects duplicates, +contradictions, and stale entries. Generates a compressed memory digest. + +Usage: + python3 graph.py --build + python3 graph.py --duplicates + python3 graph.py --contradictions + python3 graph.py --stale [--days 30] + python3 graph.py --digest [--max-tokens 1500] + python3 graph.py --prune [--dry-run] + python3 graph.py --stats + python3 graph.py --status + python3 graph.py --format json +""" + +import argparse +import hashlib +import json +import os +import re +from datetime import datetime, timedelta +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "memory-graph-builder" / "state.yaml" +MEMORY_FILE = OPENCLAW_DIR / "MEMORY.md" +DIGEST_FILE = OPENCLAW_DIR / "workspace" / "memory-digest.md" +MAX_HISTORY = 20 + +# ── Categories ──────────────────────────────────────────────────────────────── + +CATEGORY_KEYWORDS = { + "preference": ["prefer", "like", "want", "always", "never", "favorite", "style", + "mode", "theme", "format", "convention"], + "project": ["project", "repo", "repository", "codebase", "app", "application", + "deploy", "build", "release", "version"], + "person": ["name is", "works on", "teammate", "colleague", "manager", "friend", + "email", "contact"], + "tool": ["uses", "installed", "runs", "tool", "editor", "ide", "framework", + "library", "database", "api"], + "config": ["config", "setting", "path", "directory", "port", "url", "endpoint", + "key", "token", "env"], + "fact": ["is", "has", "located", "lives", "born", "works at", "speaks", + "timezone", "language"], +} + + +def classify_category(text: str) -> str: + text_lower = text.lower() + scores = {cat: sum(1 for kw in kws if kw in text_lower) + for cat, kws in CATEGORY_KEYWORDS.items()} + best = max(scores, key=scores.get) + return best if scores[best] > 0 else "other" + + +def extract_entities(text: str) -> list[str]: + """Extract meaningful entities (nouns, proper names, tools) from text.""" + # Remove markdown formatting + clean = re.sub(r'[*_`#\[\]()]', '', text) + words = clean.split() + entities = [] + for w in words: + w_clean = w.strip(".,;:!?\"'") + if not w_clean: + continue + # Keep capitalized words, technical terms, or words > 3 chars that aren't stopwords + if (w_clean[0].isupper() and len(w_clean) > 1) or \ + re.match(r'^[A-Z][a-z]+', w_clean) or \ + (len(w_clean) > 3 and w_clean.lower() not in _STOPWORDS): + entities.append(w_clean.lower()) + return list(set(entities))[:8] + + +_STOPWORDS = { + "the", "and", "for", "with", "that", "this", "from", "have", "has", + "been", "were", "will", "would", "could", "should", "about", "into", + "when", "where", "which", "their", "there", "then", "than", "they", + "them", "these", "those", "some", "also", "just", "more", "most", + "very", "only", "over", "such", "after", "before", "between", "each", + "does", "doing", "being", "other", "using", +} + + +# ── Similarity ──────────────────────────────────────────────────────────────── + +def tokenize(text: str) -> set[str]: + words = re.findall(r'[a-z0-9]+', text.lower()) + return {w for w in words if w not in _STOPWORDS and len(w) > 2} + + +def jaccard(a: set, b: set) -> float: + if not a and not b: + return 1.0 + inter = len(a & b) + union = len(a | b) + return inter / union if union > 0 else 0.0 + + +def text_hash(text: str) -> str: + return hashlib.md5(text.strip().lower().encode()).hexdigest()[:12] + + +# ── State helpers ───────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"nodes": [], "edges": [], "build_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── MEMORY.md parser ────────────────────────────────────────────────────────── + +def parse_memory_file() -> list[str]: + """Read MEMORY.md and return non-empty, non-header lines.""" + if not MEMORY_FILE.exists(): + return [] + lines = [] + for line in MEMORY_FILE.read_text().splitlines(): + stripped = line.strip() + if not stripped or stripped.startswith("#") or stripped.startswith("---"): + continue + # Remove leading bullet/dash + if stripped.startswith(("- ", "* ", "+ ")): + stripped = stripped[2:].strip() + if len(stripped) > 5: + lines.append(stripped) + return lines + + +# ── Graph building ──────────────────────────────────────────────────────────── + +def build_graph(lines: list[str], stale_days: int = 30) -> tuple[list, list]: + """Build nodes and edges from memory lines.""" + nodes = [] + edges = [] + now = datetime.now() + + for i, line in enumerate(lines): + node_id = f"mem_{text_hash(line)}" + nodes.append({ + "id": node_id, + "text": line, + "category": classify_category(line), + "entities": extract_entities(line), + "added_at": now.strftime("%Y-%m-%d"), + "last_referenced": now.strftime("%Y-%m-%d"), + "confidence": 1.0, + "is_duplicate_of": None, + "is_stale": False, + }) + + # Detect duplicates (jaccard > 0.7) + for i in range(len(nodes)): + ti = tokenize(nodes[i]["text"]) + for j in range(i + 1, len(nodes)): + tj = tokenize(nodes[j]["text"]) + sim = jaccard(ti, tj) + if sim >= 0.7: + nodes[j]["is_duplicate_of"] = nodes[i]["id"] + nodes[j]["confidence"] = round(1.0 - sim, 2) + edges.append({ + "from": nodes[j]["id"], + "to": nodes[i]["id"], + "relation": "duplicate_of", + "weight": round(sim, 3), + }) + + # Detect contradictions (same entities, opposing signals) + contradiction_signals = [ + (r'\b3\.\d+\b', "version"), + (r'\b(true|false|yes|no|never|always)\b', "boolean"), + (r'\b(use|prefer|like|avoid|hate|dislike)\b', "preference"), + ] + for i in range(len(nodes)): + ei = set(nodes[i]["entities"]) + for j in range(i + 1, len(nodes)): + ej = set(nodes[j]["entities"]) + overlap = ei & ej + if len(overlap) < 2: + continue + # Check for opposing signals + ti = nodes[i]["text"].lower() + tj = nodes[j]["text"].lower() + for pattern, sig_type in contradiction_signals: + mi = re.findall(pattern, ti, re.I) + mj = re.findall(pattern, tj, re.I) + if mi and mj and set(mi) != set(mj): + edges.append({ + "from": nodes[i]["id"], + "to": nodes[j]["id"], + "relation": "contradicts", + "weight": round(len(overlap) / max(len(ei), len(ej), 1), 3), + }) + break + + # Detect related nodes (shared entities, not duplicates/contradictions) + existing_pairs = {(e["from"], e["to"]) for e in edges} + for i in range(len(nodes)): + ei = set(nodes[i]["entities"]) + for j in range(i + 1, len(nodes)): + pair = (nodes[i]["id"], nodes[j]["id"]) + rev = (nodes[j]["id"], nodes[i]["id"]) + if pair in existing_pairs or rev in existing_pairs: + continue + ej = set(nodes[j]["entities"]) + overlap = ei & ej + if len(overlap) >= 2: + edges.append({ + "from": nodes[i]["id"], + "to": nodes[j]["id"], + "relation": "related_to", + "weight": round(len(overlap) / max(len(ei | ej), 1), 3), + }) + + return nodes, edges + + +# ── Digest generator ────────────────────────────────────────────────────────── + +def generate_digest(nodes: list, edges: list, max_tokens: int = 2000) -> str: + """Generate compressed memory digest grouped by category.""" + # Filter out duplicates + active = [n for n in nodes if not n.get("is_duplicate_of") and not n.get("is_stale")] + contradictions = [e for e in edges if e["relation"] == "contradicts"] + + # Group by category + by_cat: dict = {} + for node in active: + cat = node.get("category", "other") + by_cat.setdefault(cat, []).append(node) + + # Build digest + lines = [] + cat_order = ["preference", "project", "person", "tool", "config", "fact", "other"] + cat_labels = { + "preference": "Preferences", "project": "Active Projects", + "person": "People", "tool": "Tools & Technologies", + "config": "Configuration", "fact": "Facts", "other": "Other", + } + + for cat in cat_order: + cat_nodes = by_cat.get(cat, []) + if not cat_nodes: + continue + lines.append(f"## {cat_labels.get(cat, cat.title())}") + for node in cat_nodes: + lines.append(f"- {node['text']}") + lines.append("") + + # Add conflicts section + if contradictions: + lines.append("## Conflicts (needs resolution)") + conflict_ids = set() + for edge in contradictions: + conflict_ids.add(edge["from"]) + conflict_ids.add(edge["to"]) + node_map = {n["id"]: n for n in nodes} + shown = set() + for edge in contradictions: + key = (edge["from"], edge["to"]) + if key in shown: + continue + shown.add(key) + a = node_map.get(edge["from"], {}) + b = node_map.get(edge["to"], {}) + lines.append(f"- [CONFLICT] \"{a.get('text','?')[:60]}\" vs \"{b.get('text','?')[:60]}\"") + lines.append("") + + digest = "\n".join(lines) + + # Rough token estimate: ~0.75 tokens per word + est_tokens = int(len(digest.split()) * 1.33) + if est_tokens > max_tokens: + # Truncate from bottom categories first + while est_tokens > max_tokens and lines: + lines.pop() + digest = "\n".join(lines) + est_tokens = int(len(digest.split()) * 1.33) + + return digest + + +def estimate_tokens(text: str) -> int: + return int(len(text.split()) * 1.33) + + +# ── Commands ────────────────────────────────────────────────────────────────── + +def cmd_build(state: dict, stale_days: int, fmt: str) -> None: + lines = parse_memory_file() + if not lines: + print("MEMORY.md not found or empty.") + return + + nodes, edges = build_graph(lines, stale_days) + dups = sum(1 for n in nodes if n.get("is_duplicate_of")) + contras = sum(1 for e in edges if e["relation"] == "contradicts") + stale = sum(1 for n in nodes if n.get("is_stale")) + + now = datetime.now().isoformat() + state["nodes"] = nodes + state["edges"] = edges + state["last_build_at"] = now + state["node_count"] = len(nodes) + state["edge_count"] = len(edges) + state["duplicate_count"] = dups + state["contradiction_count"] = contras + state["stale_count"] = stale + + # Generate and save digest + digest = generate_digest(nodes, edges) + DIGEST_FILE.parent.mkdir(parents=True, exist_ok=True) + DIGEST_FILE.write_text(digest) + state["digest_tokens"] = estimate_tokens(digest) + + history = state.get("build_history") or [] + history.insert(0, { + "built_at": now, "node_count": len(nodes), + "duplicates_found": dups, "contradictions_found": contras, + "stale_found": stale, "digest_tokens": state["digest_tokens"], + }) + state["build_history"] = history[:MAX_HISTORY] + save_state(state) + + if fmt == "json": + print(json.dumps({ + "node_count": len(nodes), "edge_count": len(edges), + "duplicates": dups, "contradictions": contras, + "stale": stale, "digest_tokens": state["digest_tokens"], + }, indent=2)) + else: + print(f"\nMemory Graph Builder — {now[:16]}") + print("─" * 48) + print(f" Memory lines : {len(lines)}") + print(f" Nodes : {len(nodes)}") + print(f" Edges : {len(edges)}") + print(f" Duplicates : {dups}") + print(f" Contradictions : {contras}") + print(f" Stale : {stale}") + print(f" Digest tokens : ~{state['digest_tokens']}") + print(f"\n Digest written to: {DIGEST_FILE}") + print() + + +def cmd_duplicates(state: dict) -> None: + nodes = state.get("nodes") or [] + dups = [n for n in nodes if n.get("is_duplicate_of")] + node_map = {n["id"]: n for n in nodes} + if not dups: + print("✓ No duplicates detected.") + return + print(f"\nDuplicate Clusters ({len(dups)} duplicates)") + print("─" * 48) + for dup in dups: + orig = node_map.get(dup["is_duplicate_of"], {}) + print(f" DUP: \"{dup['text'][:70]}\"") + print(f" ORIG: \"{orig.get('text','?')[:70]}\"") + print() + + +def cmd_contradictions(state: dict) -> None: + edges = state.get("edges") or [] + nodes = state.get("nodes") or [] + contras = [e for e in edges if e["relation"] == "contradicts"] + node_map = {n["id"]: n for n in nodes} + if not contras: + print("✓ No contradictions detected.") + return + print(f"\nContradictions ({len(contras)} pairs)") + print("─" * 48) + for c in contras: + a = node_map.get(c["from"], {}) + b = node_map.get(c["to"], {}) + print(f" A: \"{a.get('text','?')[:70]}\"") + print(f" B: \"{b.get('text','?')[:70]}\"") + print(f" → Resolve by editing MEMORY.md") + print() + + +def cmd_stale(state: dict, days: int) -> None: + nodes = state.get("nodes") or [] + stale = [n for n in nodes if n.get("is_stale")] + if not stale: + print(f"✓ No memories stale beyond {days} days.") + return + print(f"\nStale Memories ({len(stale)} entries, >{days} days)") + print("─" * 48) + for n in stale: + print(f" [{n.get('category','?')}] \"{n['text'][:70]}\"") + + +def cmd_digest(state: dict, max_tokens: int) -> None: + nodes = state.get("nodes") or [] + edges = state.get("edges") or [] + if not nodes: + print("No graph built yet. Run --build first.") + return + digest = generate_digest(nodes, edges, max_tokens) + DIGEST_FILE.parent.mkdir(parents=True, exist_ok=True) + DIGEST_FILE.write_text(digest) + tokens = estimate_tokens(digest) + print(f"✓ Digest written ({tokens} est. tokens) → {DIGEST_FILE}") + print() + print(digest) + + +def cmd_prune(state: dict, dry_run: bool) -> None: + nodes = state.get("nodes") or [] + to_remove = [n for n in nodes if n.get("is_duplicate_of") or n.get("is_stale")] + if not to_remove: + print("✓ Nothing to prune.") + return + if dry_run: + print(f"\nDry run — would prune {len(to_remove)} entries:") + for n in to_remove: + reason = "duplicate" if n.get("is_duplicate_of") else "stale" + print(f" [{reason}] \"{n['text'][:70]}\"") + return + + # Remove from MEMORY.md + if MEMORY_FILE.exists(): + original = MEMORY_FILE.read_text() + remove_texts = {n["text"] for n in to_remove} + kept_lines = [] + for line in original.splitlines(): + stripped = line.strip() + if stripped.startswith(("- ", "* ", "+ ")): + stripped = stripped[2:].strip() + if stripped not in remove_texts: + kept_lines.append(line) + MEMORY_FILE.write_text("\n".join(kept_lines) + "\n") + + # Rebuild graph + lines = parse_memory_file() + nodes, edges = build_graph(lines) + state["nodes"] = nodes + state["edges"] = edges + state["node_count"] = len(nodes) + state["edge_count"] = len(edges) + save_state(state) + print(f"✓ Pruned {len(to_remove)} entries. {len(nodes)} nodes remain.") + + +def cmd_stats(state: dict, fmt: str) -> None: + nodes = state.get("nodes") or [] + edges = state.get("edges") or [] + by_cat = {} + for n in nodes: + by_cat.setdefault(n.get("category", "other"), []).append(n) + + if fmt == "json": + print(json.dumps({ + "nodes": len(nodes), "edges": len(edges), + "categories": {k: len(v) for k, v in by_cat.items()}, + "duplicates": sum(1 for n in nodes if n.get("is_duplicate_of")), + "contradictions": sum(1 for e in edges if e["relation"] == "contradicts"), + }, indent=2)) + return + + print(f"\nMemory Graph Statistics") + print("─" * 40) + print(f" Total nodes : {len(nodes)}") + print(f" Total edges : {len(edges)}") + for cat, cat_nodes in sorted(by_cat.items()): + print(f" {cat:15s}: {len(cat_nodes)}") + dups = sum(1 for n in nodes if n.get("is_duplicate_of")) + contras = sum(1 for e in edges if e["relation"] == "contradicts") + print(f" Duplicates : {dups}") + print(f" Contradictions: {contras}") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_build_at", "never") + print(f"\nMemory Graph Builder — Last build: {last}") + print(f" Nodes: {state.get('node_count',0)} | " + f"Edges: {state.get('edge_count',0)} | " + f"Dups: {state.get('duplicate_count',0)} | " + f"Conflicts: {state.get('contradiction_count',0)} | " + f"Digest: ~{state.get('digest_tokens',0)} tokens") + print() + + +# ── Main ────────────────────────────────────────────────────────────────────── + +def main(): + parser = argparse.ArgumentParser(description="Memory Graph Builder") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--build", action="store_true") + group.add_argument("--duplicates", action="store_true") + group.add_argument("--contradictions", action="store_true") + group.add_argument("--stale", action="store_true") + group.add_argument("--digest", action="store_true") + group.add_argument("--prune", action="store_true") + group.add_argument("--stats", action="store_true") + group.add_argument("--status", action="store_true") + parser.add_argument("--days", type=int, default=30) + parser.add_argument("--max-tokens", type=int, default=2000) + parser.add_argument("--dry-run", action="store_true") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + + if args.build: + cmd_build(state, args.days, args.format) + elif args.duplicates: + cmd_duplicates(state) + elif args.contradictions: + cmd_contradictions(state) + elif args.stale: + cmd_stale(state, args.days) + elif args.digest: + cmd_digest(state, args.max_tokens) + elif args.prune: + cmd_prune(state, args.dry_run) + elif args.stats: + cmd_stats(state, args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From 103b909a095abfc0fb730445eb7db2a8eb0dd17b Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Mon, 16 Mar 2026 01:02:54 +0530 Subject: [PATCH 17/23] Rewrite README for maximum discoverability Restructured with SEO-rich opening, badges, comparison table, architecture diagram, use cases section, and defense-in-depth security overview. Leads with value proposition instead of implementation details. Co-Authored-By: Claude Opus 4.6 --- README.md | 228 ++++++++++++++++++++++++++++++++++-------------------- 1 file changed, 146 insertions(+), 82 deletions(-) diff --git a/README.md b/README.md index c3a7017..35e4d92 100644 --- a/README.md +++ b/README.md @@ -1,28 +1,44 @@ # openclaw-superpowers -Give your OpenClaw agent superpowers — and let it teach itself new ones. +**44 ready-to-use skills that make your AI agent autonomous, self-healing, and self-improving.** -A plug-and-play skill library for [OpenClaw](https://github.com/openclaw/openclaw), inspired by [obra/superpowers](https://github.com/obra/superpowers). +[![Skills](https://img.shields.io/badge/skills-44-blue)](#skills-included) +[![Security](https://img.shields.io/badge/security_skills-6-green)](#security--guardrails) +[![Cron](https://img.shields.io/badge/cron_scheduled-12-orange)](#openclaw-native-28-skills) +[![Scripts](https://img.shields.io/badge/companion_scripts-15-purple)](#companion-scripts) +[![License: MIT](https://img.shields.io/badge/license-MIT-yellow.svg)](LICENSE) + +A plug-and-play skill library for [OpenClaw](https://github.com/openclaw/openclaw) — the open-source AI agent runtime. Gives your agent structured thinking, security guardrails, persistent memory, cron scheduling, self-recovery, and the ability to write its own new skills during conversation. + +Built for developers who want their AI agent to run autonomously 24/7, not just respond to prompts in a chat window. + +> Inspired by [obra/superpowers](https://github.com/obra/superpowers). Extended for agents that never sleep. --- -## The idea that makes this different +## Why this exists -Most AI tools require a developer to add new behaviors. You file an issue, wait for a release, update your config. +Most AI agent frameworks give you a chatbot that forgets everything between sessions. OpenClaw is different — it runs persistently, handles multi-hour tasks, and has native cron scheduling. But out of the box, it doesn't know *how* to use those capabilities well. -**openclaw-superpowers makes your agent self-modifying.** +**openclaw-superpowers bridges that gap.** Install once, and your agent immediately knows how to: + +- **Think before it acts** — brainstorming, planning, and systematic debugging skills prevent the "dive in and break things" failure mode +- **Protect itself** — 6 security skills detect prompt injection, block dangerous actions, audit installed code, and scan for leaked credentials +- **Run unattended** — 12 cron-scheduled skills handle memory cleanup, health checks, budget tracking, and community monitoring while you sleep +- **Recover from failures** — self-recovery, loop-breaking, and task handoff skills keep long-running work alive across crashes and restarts +- **Improve itself** — the agent can write new skills during normal conversation using `create-skill`, encoding your preferences as permanent behaviors + +--- -> *"Every time I ask for a code review, always check for security issues first."* +## The self-modifying agent -Your agent invokes `create-skill`, writes a new `SKILL.md`, and that behavior is live — immediately, permanently, no restart needed. +This is what makes openclaw-superpowers different from every other plugin library: -The agent can encode your preferences as durable skills during normal conversation. You describe what you want. It teaches itself. +> *"Every time I do a code review, check for security issues first."* -- It runs **persistently (24/7)**, not just per-session -- It handles **long-running tasks** across hours, not minutes -- It has **native cron scheduling** — skills can wake up automatically on a schedule -- It has its own tool naming conventions -- It benefits from skills around **task handoff, memory persistence, and agent recovery** that session-based tools don't need +Your agent invokes `create-skill`, writes a new `SKILL.md`, and that behavior is live — immediately, permanently, no restart needed. The agent encodes your preferences as durable skills. You describe what you want. It teaches itself. + +The `community-skill-radar` skill takes this further: it scans Reddit every 3 days for pain points and feature requests from the OpenClaw community, scores them by signal strength, and writes a prioritized `PROPOSALS.md` — so the agent (or you) always knows what to build next. --- @@ -34,19 +50,17 @@ cd ~/.openclaw/extensions/superpowers && ./install.sh openclaw gateway restart ``` -`install.sh` symlinks skills, creates state directories for stateful skills, and registers cron jobs — everything in one step. - -That's it. Your agent now has superpowers. +`install.sh` symlinks all 44 skills, creates state directories for stateful skills, and registers cron jobs — everything in one step. That's it. Your agent now has superpowers. --- -## Skills Included +## Skills included ### Core (15 skills) -Methodology skills that work in any runtime. Adapted from [obra/superpowers](https://github.com/obra/superpowers) plus OpenClaw-specific additions. +Methodology skills that work in any AI agent runtime. Adapted from [obra/superpowers](https://github.com/obra/superpowers) plus new additions for skill quality assurance. -| Skill | Purpose | Script | +| Skill | What it does | Script | |---|---|---| | `using-superpowers` | Bootstrap — teaches the agent how to find and invoke skills | — | | `brainstorming` | Structured ideation before any implementation | — | @@ -66,89 +80,138 @@ Methodology skills that work in any runtime. Adapted from [obra/superpowers](htt ### OpenClaw-Native (28 skills) -Skills that require OpenClaw's persistent runtime — cron scheduling, session state, or long-running execution. Not useful in session-based tools. - -| Skill | Purpose | Cron | Stateful | Script | -|---|---|---|---|---| -| `long-running-task-management` | Breaks multi-hour tasks into checkpointed stages with resume | every 15 min | ✓ | — | -| `persistent-memory-hygiene` | Keeps OpenClaw's memory store clean and useful over time | daily 11pm | ✓ | — | -| `task-handoff` | Gracefully hands off incomplete tasks across agent restarts | — | ✓ | — | -| `agent-self-recovery` | Detects when the agent is stuck in a loop and escapes | — | ✓ | — | -| `context-window-management` | Prevents context overflow on long-running sessions | — | ✓ | — | -| `daily-review` | End-of-day structured summary and next-session prep | weekdays 6pm | ✓ | — | -| `morning-briefing` | Daily briefing: priorities, active tasks, pending handoffs | weekdays 7am | ✓ | `run.py` | -| `secrets-hygiene` | Audits installed skills for stale credentials and orphaned secrets | Mondays 9am | ✓ | `audit.py` | -| `workflow-orchestration` | Chains skills into resumable named workflows with on-failure conditions | — | ✓ | `run.py` | -| `context-budget-guard` | Estimates context usage and triggers compaction before overflow | — | ✓ | `check.py` | -| `prompt-injection-guard` | Detects injection attempts in external content before the agent acts | — | ✓ | `guard.py` | -| `spend-circuit-breaker` | Tracks API spend against a monthly budget; pauses crons at 100% | every 4h | ✓ | `check.py` | -| `dangerous-action-guard` | Requires explicit user confirmation before irreversible actions | — | ✓ | `audit.py` | -| `loop-circuit-breaker` | Detects infinite retry loops from deterministic errors and breaks them | — | ✓ | `check.py` | -| `workspace-integrity-guardian` | Detects drift or tampering in SOUL.md, AGENTS.md, MEMORY.md | Sundays 3am | ✓ | `guard.py` | -| `multi-agent-coordinator` | Manages parallel agent fleets: health checks, consistency, handoffs | — | ✓ | `run.py` | -| `cron-hygiene` | Audits cron skills for session mode waste and token efficiency | Mondays 9am | ✓ | `audit.py` | -| `channel-context-bridge` | Writes a resumé card at session end for seamless channel switching | — | ✓ | `bridge.py` | -| `skill-doctor` | Diagnoses silent skill discovery failures — YAML errors, path violations, schema mismatches | — | ✓ | `doctor.py` | -| `installed-skill-auditor` | Weekly post-install audit of all skills for injection, credentials, and drift | Mondays 9am | ✓ | `audit.py` | -| `skill-loadout-manager` | Named skill profiles to manage active skill sets and prevent system prompt bloat | — | ✓ | `loadout.py` | -| `skill-compatibility-checker` | Checks installed skills against the current OpenClaw version for feature compatibility | — | ✓ | `check.py` | -| `heartbeat-governor` | Enforces per-skill execution budgets for cron skills; auto-pauses runaway skills | every hour | ✓ | `governor.py` | -| `community-skill-radar` | Scans Reddit for OpenClaw pain points and feature requests; writes prioritized PROPOSALS.md | every 3 days | ✓ | `radar.py` | -| `memory-graph-builder` | Parses MEMORY.md into a knowledge graph; detects duplicates, contradictions, and stale entries; generates compressed digest | daily 10pm | ✓ | `graph.py` | -| `config-encryption-auditor` | Scans config directories for plaintext API keys, tokens, and world-readable permissions | Sundays 9am | ✓ | `audit.py` | -| `tool-description-optimizer` | Scores skill descriptions for trigger quality — clarity, specificity, keyword density — and suggests rewrites | — | ✓ | `optimize.py` | -| `mcp-health-checker` | Monitors MCP server connections for health, latency, and availability; detects stale connections | every 6h | ✓ | `check.py` | +Skills that require OpenClaw's persistent runtime — cron scheduling, session state, or long-running execution. These are the skills that make a 24/7 autonomous agent actually work reliably. + +| Skill | What it does | Cron | Script | +|---|---|---|---| +| `long-running-task-management` | Breaks multi-hour tasks into checkpointed stages with resume | every 15 min | — | +| `persistent-memory-hygiene` | Keeps the agent's memory store clean and useful over time | daily 11pm | — | +| `task-handoff` | Gracefully hands off incomplete tasks across agent restarts | — | — | +| `agent-self-recovery` | Detects when the agent is stuck in a loop and escapes | — | — | +| `context-window-management` | Prevents context overflow on long-running sessions | — | — | +| `daily-review` | End-of-day structured summary and next-session prep | weekdays 6pm | — | +| `morning-briefing` | Daily briefing: priorities, active tasks, pending handoffs | weekdays 7am | `run.py` | +| `secrets-hygiene` | Audits installed skills for stale credentials and orphaned secrets | Mondays 9am | `audit.py` | +| `workflow-orchestration` | Chains skills into resumable named workflows with on-failure conditions | — | `run.py` | +| `context-budget-guard` | Estimates context usage and triggers compaction before overflow | — | `check.py` | +| `prompt-injection-guard` | Detects injection attempts in external content before the agent acts | — | `guard.py` | +| `spend-circuit-breaker` | Tracks API spend against a monthly budget; pauses crons at 100% | every 4h | `check.py` | +| `dangerous-action-guard` | Requires explicit user confirmation before irreversible actions | — | `audit.py` | +| `loop-circuit-breaker` | Detects infinite retry loops from deterministic errors and breaks them | — | `check.py` | +| `workspace-integrity-guardian` | Detects drift or tampering in SOUL.md, AGENTS.md, MEMORY.md | Sundays 3am | `guard.py` | +| `multi-agent-coordinator` | Manages parallel agent fleets: health checks, consistency, handoffs | — | `run.py` | +| `cron-hygiene` | Audits cron skills for session mode waste and token efficiency | Mondays 9am | `audit.py` | +| `channel-context-bridge` | Writes a context card at session end for seamless channel switching | — | `bridge.py` | +| `skill-doctor` | Diagnoses silent skill discovery failures — YAML errors, path violations, schema mismatches | — | `doctor.py` | +| `installed-skill-auditor` | Weekly post-install audit of all skills for injection, credentials, and drift | Mondays 9am | `audit.py` | +| `skill-loadout-manager` | Named skill profiles to manage active skill sets and prevent system prompt bloat | — | `loadout.py` | +| `skill-compatibility-checker` | Checks installed skills against the current OpenClaw version for feature compatibility | — | `check.py` | +| `heartbeat-governor` | Enforces per-skill execution budgets for cron skills; auto-pauses runaway skills | every hour | `governor.py` | +| `community-skill-radar` | Scans Reddit for OpenClaw pain points and feature requests; writes prioritized PROPOSALS.md | every 3 days | `radar.py` | +| `memory-graph-builder` | Parses MEMORY.md into a knowledge graph; detects duplicates, contradictions, stale entries | daily 10pm | `graph.py` | +| `config-encryption-auditor` | Scans config directories for plaintext API keys, tokens, and world-readable permissions | Sundays 9am | `audit.py` | +| `tool-description-optimizer` | Scores skill descriptions for trigger quality — clarity, specificity, keyword density — and suggests rewrites | — | `optimize.py` | +| `mcp-health-checker` | Monitors MCP server connections for health, latency, and availability; detects stale connections | every 6h | `check.py` | ### Community (1 skill) -Skills written by agents and contributors. Lives in `skills/community/`. Any agent can add a community skill via `create-skill`. Community skills default to stateless but may use `STATE_SCHEMA.yaml` when persistence is genuinely needed. +Skills written by agents and contributors. Any agent can add a community skill via `create-skill`. -| Skill | Purpose | Cron | Stateful | Script | -|---|---|---|---|---| -| `obsidian-sync` | Syncs OpenClaw memory to an Obsidian vault nightly | daily 10pm | ✓ | `sync.py` | +| Skill | What it does | Cron | Script | +|---|---|---|---| +| `obsidian-sync` | Syncs OpenClaw memory to an Obsidian vault nightly | daily 10pm | `sync.py` | --- -## How State Works +## Security & guardrails -Stateful skills commit a `STATE_SCHEMA.yaml` defining the shape of their runtime data. At install time, `install.sh` creates `~/.openclaw/skill-state//state.yaml` on your local machine. The agent reads and writes this file during execution — enabling reliable resume, handoff, and cron-based wakeups without relying on prose instructions. The schema is portable and versioned; the runtime state is local-only and never committed. +Six skills form a defense-in-depth security layer for autonomous agents: -## Companion Scripts +| Threat | Skill | How it works | +|---|---|---| +| Malicious skill installs | `skill-vetting` | Pre-install scanner with 6 security flags — rates SAFE / CAUTION / DO NOT INSTALL | +| Prompt injection from external content | `prompt-injection-guard` | Detects 6 injection signal types at runtime; blocks on 2+ signals | +| Agent takes destructive action without asking | `dangerous-action-guard` | Pre-execution confirmation gate with 5-min expiry and full audit trail | +| Post-install tampering or credential injection | `installed-skill-auditor` | Weekly SHA-256 drift detection; checks for INJECTION / CREDENTIAL / EXFILTRATION | +| Silent skill loading failures | `skill-doctor` | 6 diagnostic checks per skill; surfaces every load-time failure | +| Plaintext secrets in config files | `config-encryption-auditor` | Scans for 8 API key patterns + 3 token patterns; auto-fixes permissions | -Skills marked with a script in the table above ship a small executable alongside their `SKILL.md`: +--- -- **Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`, `graph.py`, `optimize.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies required; `pyyaml` is optional but recommended. -- **`vet.sh`** — Pure bash scanner; runs on any system with grep. -- Each script supports `--help` and prints a human-readable summary. JSON output available where useful (`--format json`). Dry-run mode available on scripts that make changes. -- See the `example-state.yaml` in each skill directory for sample state and a commented walkthrough of the skill's cron behaviour. +## How it compares + +| Feature | openclaw-superpowers | obra/superpowers | Custom prompts | +|---|---|---|---| +| Skills included | **44** | 8 | 0 | +| Self-modifying (agent writes new skills) | Yes | No | No | +| Cron scheduling | **12 scheduled skills** | No | No | +| Persistent state across sessions | **YAML state schemas** | No | No | +| Security guardrails | **6 defense-in-depth skills** | No | No | +| Companion scripts with CLI | **15 scripts** | No | No | +| Memory graph / knowledge graph | Yes | No | No | +| MCP server health monitoring | Yes | No | No | +| API spend tracking & budget enforcement | Yes | No | No | +| Community feature radar (Reddit scanning) | Yes | No | No | +| Multi-agent coordination | Yes | No | No | +| Works with 24/7 persistent agents | **Built for it** | Session-only | Session-only | --- -## Security skills at a glance +## Architecture -Six skills address the documented top security risks for OpenClaw agents: +``` +~/.openclaw/extensions/superpowers/ +├── skills/ +│ ├── core/ # 15 methodology skills (any runtime) +│ │ ├── brainstorming/ +│ │ │ └── SKILL.md +│ │ ├── create-skill/ +│ │ │ ├── SKILL.md +│ │ │ └── TEMPLATE.md +│ │ └── ... +│ ├── openclaw-native/ # 28 persistent-runtime skills +│ │ ├── memory-graph-builder/ +│ │ │ ├── SKILL.md # Skill definition + YAML frontmatter +│ │ │ ├── STATE_SCHEMA.yaml # State shape (committed, versioned) +│ │ │ ├── graph.py # Companion script +│ │ │ └── example-state.yaml # Annotated example +│ │ └── ... +│ └── community/ # Agent-written and contributed skills +├── scripts/ +│ └── validate-skills.sh # CI validation +├── tests/ +│ └── test-runner.sh +└── install.sh # One-command setup +``` -| Threat | Skill | How | -|---|---|---| -| Malicious skill install (36% of ClawHub skills contain injection payloads) | `skill-vetting` | Scans before install — 6 security flags, SAFE / CAUTION / DO NOT INSTALL | -| Runtime injection from emails, web pages, scraped data | `prompt-injection-guard` | Detects 6 signal types at runtime; blocks on 2+ signals | -| Agent takes destructive action without confirmation | `dangerous-action-guard` | Pre-execution gate with 5-min expiry window and full audit trail | -| Post-install skill tampering or credential injection | `installed-skill-auditor` | Weekly content-hash drift detection; INJECTION / CREDENTIAL / EXFILTRATION checks | -| Silent skill loading failures hiding broken skills | `skill-doctor` | 6 diagnostic checks per skill; surfaces every load-time failure before it disappears | -| Plaintext API keys and tokens in config files | `config-encryption-auditor` | Scans for 8 API key patterns + 3 token patterns; auto-fixes permissions; suggests env var migration | +**State model:** Each stateful skill commits a `STATE_SCHEMA.yaml` defining the shape of its runtime data. At install time, `install.sh` creates `~/.openclaw/skill-state//state.yaml` on your machine. The schema is portable and versioned; the runtime state is local-only and never committed. + +--- + +## Companion scripts + +Skills marked with a script ship a small executable alongside their `SKILL.md`: + +- **15 Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`, `graph.py`, `optimize.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies; `pyyaml` is optional but recommended. +- **`vet.sh`** — Pure bash scanner; runs on any system with grep. +- Every script supports `--help` and `--format json`. Dry-run mode available on scripts that make changes. +- See the `example-state.yaml` in each skill directory for sample state and a commented walkthrough of cron behaviour. --- -## Why OpenClaw-specific? +## Use cases + +**Solo developer with a persistent AI agent** +> Install superpowers, and your agent handles memory cleanup, security audits, and daily briefings on autopilot. You focus on building; the agent maintains itself. -obra/superpowers was built for session-based tools (Claude Code, Cursor, Codex). OpenClaw is different: +**Team running multiple OpenClaw agents** +> Use `multi-agent-coordinator` for fleet health checks, `skill-loadout-manager` to keep system prompts lean per agent role, and `heartbeat-governor` to prevent runaway cron costs. -- Runs **24/7**, not just per-session -- Handles tasks that take **hours, not minutes** -- Has **native cron scheduling** — skills wake up automatically on a schedule -- Needs skills around **handoff, memory persistence, and self-recovery** that session tools don't require +**Open-source maintainer** +> `community-skill-radar` scans Reddit for pain points automatically. `skill-vetting` catches malicious community contributions before they're installed. `installed-skill-auditor` detects post-install tampering. -The OpenClaw-native skills in this repo exist because of that difference. And with `community-skill-radar`, the library discovers what to build next by scanning Reddit communities automatically. +**Security-conscious deployment** +> Six defense-in-depth skills: pre-install vetting, runtime injection detection, destructive action gates, post-install drift detection, credential scanning, and silent failure diagnosis. --- @@ -163,5 +226,6 @@ The OpenClaw-native skills in this repo exist because of that difference. And wi ## Credits -- **[openclaw/openclaw](https://github.com/openclaw/openclaw)** — the personal AI runtime that makes this possible -- **[obra/superpowers](https://github.com/obra/superpowers)** — Jesse Vincent's skills framework; core skills adapted here under MIT license +- **[openclaw/openclaw](https://github.com/openclaw/openclaw)** — the open-source AI agent runtime +- **[obra/superpowers](https://github.com/obra/superpowers)** — Jesse Vincent's skills framework; core skills adapted under MIT license +- **[OpenLobster](https://github.com/Neirth/OpenLobster)** — inspiration for memory graph, config encryption auditing, tool-description scoring, and MCP health monitoring From 91c4318594085d23e78345d5feba60248af5672a Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Tue, 17 Mar 2026 00:43:17 +0530 Subject: [PATCH 18/23] Add memory-dag-compactor skill (#35) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Builds hierarchical summary DAGs from MEMORY.md with depth-aware prompts (d0 leaf → d3+ durable). Supports search, tree visualization, inspect, and dissolve. Cron nightly 11pm. Companion script: compact.py. Inspired by lossless-claw's DAG-based summarization hierarchy. Co-authored-by: Claude Opus 4.6 --- .../memory-dag-compactor/SKILL.md | 114 ++++ .../memory-dag-compactor/STATE_SCHEMA.yaml | 45 ++ .../memory-dag-compactor/compact.py | 637 ++++++++++++++++++ .../memory-dag-compactor/example-state.yaml | 90 +++ 4 files changed, 886 insertions(+) create mode 100644 skills/openclaw-native/memory-dag-compactor/SKILL.md create mode 100644 skills/openclaw-native/memory-dag-compactor/STATE_SCHEMA.yaml create mode 100755 skills/openclaw-native/memory-dag-compactor/compact.py create mode 100644 skills/openclaw-native/memory-dag-compactor/example-state.yaml diff --git a/skills/openclaw-native/memory-dag-compactor/SKILL.md b/skills/openclaw-native/memory-dag-compactor/SKILL.md new file mode 100644 index 0000000..b321308 --- /dev/null +++ b/skills/openclaw-native/memory-dag-compactor/SKILL.md @@ -0,0 +1,114 @@ +--- +name: memory-dag-compactor +version: "1.0" +category: openclaw-native +description: Builds hierarchical summary DAGs from MEMORY.md with depth-aware prompts — leaf summaries preserve detail, higher depths condense to durable arcs, preventing information loss during compaction. +stateful: true +cron: "0 23 * * *" +--- + +# Memory DAG Compactor + +## What it does + +Standard memory compaction is lossy — older entries get truncated and details disappear forever. Memory DAG Compactor replaces flat compaction with a **directed acyclic graph (DAG)** of hierarchical summaries inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw)'s Lossless Context Management approach. + +Each depth in the DAG uses a purpose-built prompt tuned for that abstraction level: + +| Depth | Name | What it preserves | Timeline granularity | +|---|---|---|---| +| d0 | Leaf | File operations, timestamps, specific actions, errors | Hours | +| d1 | Condensed | What changed vs. previous context, decisions made | Sessions | +| d2 | Arc | Goal → outcome → carries forward | Days | +| d3+ | Durable | Long-term context that survives weeks of inactivity | Date ranges | + +The raw MEMORY.md entries are never deleted — only organized into a searchable, multi-level summary hierarchy. + +## When to invoke + +- Automatically nightly at 11pm (cron) — compacts the day's memory entries +- When MEMORY.md grows beyond a configurable threshold (default: 200 entries) +- Before a long-running task — ensures memory is compact and searchable +- When the agent reports "I don't remember" for something that should be in memory + +## How to use + +```bash +python3 compact.py --compact # Run leaf + condensation passes +python3 compact.py --compact --depth 0 # Only leaf summaries (d0) +python3 compact.py --compact --depth 2 # Condense up to d2 arcs +python3 compact.py --status # Show DAG stats and health +python3 compact.py --tree # Print the summary DAG as a tree +python3 compact.py --search "deployment issue" # Search across all depths +python3 compact.py --inspect # Show a summary with its children +python3 compact.py --dissolve # Reverse a condensation +python3 compact.py --format json # Machine-readable output +``` + +## Procedure + +**Step 1 — Run compaction** + +```bash +python3 compact.py --compact +``` + +The compactor: +1. Reads all entries from MEMORY.md +2. Groups entries into chunks (default: 20 entries per leaf) +3. Generates d0 leaf summaries preserving operational detail +4. When leaf count exceeds fanout (default: 5), condenses into d1 summaries +5. Repeats condensation at each depth until DAG is within budget +6. Writes the summary DAG to state + +**Step 2 — Search memory across depths** + +```bash +python3 compact.py --search "API migration" +``` + +Searches raw entries and all summary depths. Results ranked by relevance and depth — deeper summaries (d0) are more detailed, shallower (d3+) give the big picture. + +**Step 3 — Inspect and repair** + +```bash +python3 compact.py --tree # Visualize the full DAG +python3 compact.py --inspect s-003 # Show summary with lineage +python3 compact.py --dissolve s-007 # Reverse a bad condensation +``` + +## Depth-aware prompt design + +### d0 (Leaf) — Operational detail +Preserves: timestamps, file paths, commands run, error messages, specific values. Drops: conversational filler, repeated attempts, verbose tool output. + +### d1 (Condensed) — Session context +Preserves: what changed vs. previous state, decisions made and why, blockers encountered. Drops: per-file details, exact timestamps, intermediate steps. + +### d2 (Arc) — Goal-to-outcome arcs +Preserves: goal definition, final outcome, what carries forward, open questions. Drops: session-level detail, individual decisions, specific tools used. + +### d3+ (Durable) — Long-term context +Preserves: project identity, architectural decisions, user preferences, recurring patterns. Drops: anything that wouldn't matter after 2 weeks of inactivity. + +## Configuration + +| Parameter | Default | Description | +|---|---|---| +| `chunk_size` | 20 | Entries per leaf summary | +| `fanout` | 5 | Max children before condensation triggers | +| `max_depth` | 4 | Maximum DAG depth | +| `token_budget` | 8000 | Target token count for assembled context | + +## State + +DAG structure, summary content, and lineage stored in `~/.openclaw/skill-state/memory-dag-compactor/state.yaml`. + +Fields: `last_compact_at`, `dag_nodes`, `dag_edges`, `entry_count`, `compact_history`. + +## Notes + +- Never modifies or deletes MEMORY.md — the DAG is an overlay +- Each summary includes a `[Expand for details about: ...]` footer listing what was compressed +- Dissolve reverses a condensation, restoring child summaries to the active set +- Inspired by lossless-claw's DAG-based summarization hierarchy and depth-aware prompt system diff --git a/skills/openclaw-native/memory-dag-compactor/STATE_SCHEMA.yaml b/skills/openclaw-native/memory-dag-compactor/STATE_SCHEMA.yaml new file mode 100644 index 0000000..44a9761 --- /dev/null +++ b/skills/openclaw-native/memory-dag-compactor/STATE_SCHEMA.yaml @@ -0,0 +1,45 @@ +version: "1.0" +description: DAG-based memory summary hierarchy with depth-aware nodes and lineage tracking. +fields: + last_compact_at: + type: datetime + config: + type: object + description: Compaction configuration + fields: + chunk_size: { type: integer, default: 20 } + fanout: { type: integer, default: 5 } + max_depth: { type: integer, default: 4 } + token_budget: { type: integer, default: 8000 } + dag_nodes: + type: list + description: All summary nodes in the DAG + items: + id: { type: string, description: "Unique summary ID (e.g. s-001)" } + depth: { type: integer, description: "0=leaf, 1+=condensed" } + content: { type: string, description: "Summary text" } + expand_footer: { type: string, description: "What was compressed away" } + token_count: { type: integer } + created_at: { type: datetime } + source_type: { type: enum, values: [entries, summaries] } + source_range: { type: string, description: "Entry range or child summary IDs" } + is_active: { type: boolean, description: "Part of current assembled context" } + dag_edges: + type: list + description: Parent-child relationships in the DAG + items: + parent_id: { type: string } + child_id: { type: string } + entry_count: + type: integer + description: Total MEMORY.md entries tracked + compact_history: + type: list + description: Rolling log of compaction runs (last 20) + items: + compacted_at: { type: datetime } + entries_processed: { type: integer } + leaves_created: { type: integer } + condensations: { type: integer } + max_depth_reached: { type: integer } + total_nodes: { type: integer } diff --git a/skills/openclaw-native/memory-dag-compactor/compact.py b/skills/openclaw-native/memory-dag-compactor/compact.py new file mode 100755 index 0000000..73b4142 --- /dev/null +++ b/skills/openclaw-native/memory-dag-compactor/compact.py @@ -0,0 +1,637 @@ +#!/usr/bin/env python3 +""" +Memory DAG Compactor for openclaw-superpowers. + +Builds hierarchical summary DAGs from MEMORY.md with depth-aware +prompts. Leaf summaries preserve detail; higher depths condense +to durable arcs. + +Usage: + python3 compact.py --compact + python3 compact.py --compact --depth 2 + python3 compact.py --tree + python3 compact.py --search "query" + python3 compact.py --inspect + python3 compact.py --dissolve + python3 compact.py --status + python3 compact.py --format json +""" + +import argparse +import hashlib +import json +import os +import re +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "memory-dag-compactor" / "state.yaml" +MEMORY_FILE = OPENCLAW_DIR / "workspace" / "MEMORY.md" +MAX_HISTORY = 20 + +# Default config +DEFAULT_CONFIG = { + "chunk_size": 20, + "fanout": 5, + "max_depth": 4, + "token_budget": 8000, +} + +# ── Depth-aware prompt templates ───────────────────────────────────────────── + +DEPTH_PROMPTS = { + 0: { + "name": "Leaf (d0)", + "instruction": ( + "Summarize these memory entries preserving operational detail.\n" + "KEEP: timestamps, file paths, commands run, error messages, " + "specific values, tool outputs, decisions made.\n" + "DROP: conversational filler, repeated failed attempts (keep final " + "outcome), verbose intermediate steps.\n" + "Timeline granularity: hours.\n" + "End with: [Expand for details about: ]" + ), + }, + 1: { + "name": "Condensed (d1)", + "instruction": ( + "Condense these summaries into a session-level overview.\n" + "KEEP: what changed vs. previous state, decisions made and why, " + "blockers encountered, tools/APIs used.\n" + "DROP: per-file details, exact timestamps, intermediate steps, " + "individual error messages.\n" + "Timeline granularity: sessions.\n" + "End with: [Expand for details about: ]" + ), + }, + 2: { + "name": "Arc (d2)", + "instruction": ( + "Condense these summaries into goal-to-outcome arcs.\n" + "KEEP: goal definition, final outcome, what carries forward, " + "open questions, architectural decisions.\n" + "DROP: session-level detail, individual decisions, specific " + "tools used, intermediate blockers.\n" + "Timeline granularity: days.\n" + "End with: [Expand for details about: ]" + ), + }, + 3: { + "name": "Durable (d3+)", + "instruction": ( + "Condense these summaries into durable long-term context.\n" + "KEEP: project identity, architectural decisions, user preferences, " + "recurring patterns, key relationships.\n" + "DROP: anything that wouldn't matter after 2 weeks of inactivity.\n" + "Timeline granularity: date ranges.\n" + "End with: [Expand for details about: ]" + ), + }, +} + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return { + "config": DEFAULT_CONFIG.copy(), + "dag_nodes": [], + "dag_edges": [], + "entry_count": 0, + "compact_history": [], + } + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {"config": DEFAULT_CONFIG.copy(), "dag_nodes": [], "dag_edges": [], + "entry_count": 0, "compact_history": []} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Memory parsing ─────────────────────────────────────────────────────────── + +def parse_memory(memory_path: Path) -> list[dict]: + """Parse MEMORY.md into individual entries.""" + if not memory_path.exists(): + return [] + text = memory_path.read_text() + entries = [] + current = [] + current_header = "" + idx = 0 + + for line in text.split("\n"): + # Detect entry boundaries: lines starting with - or ## or timestamps + is_boundary = ( + line.startswith("- ") or + line.startswith("## ") or + re.match(r'^\d{4}-\d{2}-\d{2}', line.strip()) + ) + if is_boundary and current: + entries.append({ + "id": f"e-{idx:04d}", + "header": current_header, + "content": "\n".join(current).strip(), + "line_count": len(current), + }) + idx += 1 + current = [line] + current_header = line.strip()[:80] + else: + current.append(line) + if not current_header and line.strip(): + current_header = line.strip()[:80] + + if current: + entries.append({ + "id": f"e-{idx:04d}", + "header": current_header, + "content": "\n".join(current).strip(), + "line_count": len(current), + }) + return entries + + +def estimate_tokens(text: str) -> int: + """Rough token estimate: ~4 chars per token.""" + return len(text) // 4 + + +# ── DAG operations ─────────────────────────────────────────────────────────── + +def gen_summary_id(depth: int, index: int) -> str: + return f"s-d{depth}-{index:03d}" + + +def get_depth_prompt(depth: int) -> dict: + if depth >= 3: + return DEPTH_PROMPTS[3] + return DEPTH_PROMPTS.get(depth, DEPTH_PROMPTS[0]) + + +def generate_leaf_summary(entries: list[dict], depth: int = 0) -> str: + """Generate a deterministic summary from entries using depth-aware rules.""" + prompt = get_depth_prompt(depth) + lines = [] + + if depth == 0: + # Leaf: preserve operational detail + for e in entries: + content = e["content"] + # Keep first 3 lines of each entry for detail + entry_lines = content.split("\n")[:3] + lines.append(" | ".join(l.strip() for l in entry_lines if l.strip())) + topics = extract_topics(entries) + summary = "; ".join(lines[:10]) + if len(lines) > 10: + summary += f" ... (+{len(lines)-10} more entries)" + summary += f"\n[Expand for details about: {', '.join(topics[:5])}]" + + elif depth == 1: + # Condensed: focus on changes and decisions + for e in entries: + content = e.get("content", "") + first_line = content.split("\n")[0].strip() + lines.append(first_line) + topics = extract_topics(entries) + summary = "Session: " + "; ".join(lines[:8]) + if len(lines) > 8: + summary += f" ... (+{len(lines)-8} more)" + summary += f"\n[Expand for details about: {', '.join(topics[:5])}]" + + elif depth == 2: + # Arc: goal → outcome + all_text = " ".join(e.get("content", "") for e in entries) + topics = extract_topics(entries) + summary = f"Arc ({len(entries)} summaries): {all_text[:200].strip()}" + summary += f"\n[Expand for details about: {', '.join(topics[:5])}]" + + else: + # Durable: long-term context only + all_text = " ".join(e.get("content", "") for e in entries) + topics = extract_topics(entries) + summary = f"Durable context: {all_text[:150].strip()}" + summary += f"\n[Expand for details about: {', '.join(topics[:5])}]" + + return summary + + +def extract_topics(entries: list[dict]) -> list[str]: + """Extract key topics from a set of entries.""" + # Simple keyword extraction: find capitalized words, paths, and technical terms + all_text = " ".join(e.get("content", e.get("header", "")) for e in entries) + words = re.findall(r'[A-Z][a-z]+(?:[A-Z][a-z]+)*|[a-z]+(?:[-_][a-z]+)+|/[\w/.-]+', all_text) + # Deduplicate preserving order + seen = set() + topics = [] + for w in words: + low = w.lower() + if low not in seen and len(w) > 3: + seen.add(low) + topics.append(w) + return topics[:10] + + +def compact_entries(state: dict, max_depth: int | None = None) -> dict: + """Run leaf + condensation passes on MEMORY.md entries.""" + config = state.get("config") or DEFAULT_CONFIG + chunk_size = config.get("chunk_size", 20) + fanout = config.get("fanout", 5) + depth_limit = max_depth if max_depth is not None else config.get("max_depth", 4) + + entries = parse_memory(MEMORY_FILE) + if not entries: + return {"entries_processed": 0, "leaves_created": 0, + "condensations": 0, "max_depth_reached": 0} + + nodes = state.get("dag_nodes") or [] + edges = state.get("dag_edges") or [] + + # Track existing entry coverage + existing_entry_ids = set() + for node in nodes: + if node.get("source_type") == "entries": + sr = node.get("source_range", "") + for eid in sr.split(","): + existing_entry_ids.add(eid.strip()) + + # Find unprocessed entries + new_entries = [e for e in entries if e["id"] not in existing_entry_ids] + + if not new_entries: + return {"entries_processed": 0, "leaves_created": 0, + "condensations": 0, "max_depth_reached": 0} + + # Step 1: Create leaf summaries (d0) + leaves_created = 0 + leaf_idx = len([n for n in nodes if n.get("depth") == 0]) + + for i in range(0, len(new_entries), chunk_size): + chunk = new_entries[i:i + chunk_size] + summary_text = generate_leaf_summary(chunk, depth=0) + sid = gen_summary_id(0, leaf_idx) + + topics = extract_topics(chunk) + node = { + "id": sid, + "depth": 0, + "content": summary_text, + "expand_footer": f"Expand for details about: {', '.join(topics[:5])}", + "token_count": estimate_tokens(summary_text), + "created_at": datetime.now().isoformat(), + "source_type": "entries", + "source_range": ", ".join(e["id"] for e in chunk), + "is_active": True, + } + nodes.append(node) + leaf_idx += 1 + leaves_created += 1 + + # Step 2: Condensation passes (d1, d2, d3+) + condensations = 0 + max_depth_reached = 0 + + for depth in range(1, depth_limit + 1): + # Find active nodes at depth-1 + parent_depth = depth - 1 + active_at_depth = [n for n in nodes if n.get("depth") == parent_depth and n.get("is_active")] + + if len(active_at_depth) <= fanout: + break + + # Condense in groups of fanout + condense_idx = len([n for n in nodes if n.get("depth") == depth]) + + for i in range(0, len(active_at_depth), fanout): + group = active_at_depth[i:i + fanout] + if len(group) < 2: + continue + + summary_text = generate_leaf_summary(group, depth=depth) + sid = gen_summary_id(depth, condense_idx) + + node = { + "id": sid, + "depth": depth, + "content": summary_text, + "expand_footer": extract_topics(group), + "token_count": estimate_tokens(summary_text), + "created_at": datetime.now().isoformat(), + "source_type": "summaries", + "source_range": ", ".join(g["id"] for g in group), + "is_active": True, + } + nodes.append(node) + + # Deactivate children and create edges + for g in group: + g["is_active"] = False + edges.append({"parent_id": sid, "child_id": g["id"]}) + + condense_idx += 1 + condensations += 1 + max_depth_reached = max(max_depth_reached, depth) + + state["dag_nodes"] = nodes + state["dag_edges"] = edges + state["entry_count"] = len(entries) + + return { + "entries_processed": len(new_entries), + "leaves_created": leaves_created, + "condensations": condensations, + "max_depth_reached": max_depth_reached, + "total_nodes": len(nodes), + } + + +def dissolve_node(state: dict, node_id: str) -> bool: + """Reverse a condensation — reactivate children, remove parent.""" + nodes = state.get("dag_nodes") or [] + edges = state.get("dag_edges") or [] + + target = None + for n in nodes: + if n["id"] == node_id: + target = n + break + + if not target: + return False + + if target.get("depth", 0) == 0: + print(f"Error: cannot dissolve leaf node {node_id}") + return False + + # Find and reactivate children + child_ids = [e["child_id"] for e in edges if e["parent_id"] == node_id] + for n in nodes: + if n["id"] in child_ids: + n["is_active"] = True + + # Remove parent node and edges + state["dag_nodes"] = [n for n in nodes if n["id"] != node_id] + state["dag_edges"] = [e for e in edges if e["parent_id"] != node_id] + + return True + + +# ── Search ─────────────────────────────────────────────────────────────────── + +def search_dag(state: dict, query: str) -> list[dict]: + """Search across all DAG nodes and raw entries.""" + results = [] + query_lower = query.lower() + tokens = set(query_lower.split()) + + # Search DAG nodes + for node in (state.get("dag_nodes") or []): + content = node.get("content", "").lower() + if query_lower in content or any(t in content for t in tokens): + match_count = sum(1 for t in tokens if t in content) + results.append({ + "type": "summary", + "id": node["id"], + "depth": node.get("depth", 0), + "content": node["content"][:200], + "relevance": match_count / len(tokens) if tokens else 0, + "is_active": node.get("is_active", False), + }) + + # Search raw entries + entries = parse_memory(MEMORY_FILE) + for entry in entries: + content = entry.get("content", "").lower() + if query_lower in content or any(t in content for t in tokens): + match_count = sum(1 for t in tokens if t in content) + results.append({ + "type": "entry", + "id": entry["id"], + "depth": -1, + "content": entry["content"][:200], + "relevance": match_count / len(tokens) if tokens else 0, + "is_active": True, + }) + + results.sort(key=lambda r: (-r["relevance"], r["depth"])) + return results + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_compact(state: dict, max_depth: int | None, fmt: str) -> None: + result = compact_entries(state, max_depth) + now = datetime.now().isoformat() + state["last_compact_at"] = now + + history = state.get("compact_history") or [] + history.insert(0, {"compacted_at": now, **result}) + state["compact_history"] = history[:MAX_HISTORY] + save_state(state) + + if fmt == "json": + print(json.dumps(result, indent=2)) + else: + print(f"\nMemory DAG Compaction — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("-" * 50) + print(f" Entries processed: {result['entries_processed']}") + print(f" Leaves created: {result['leaves_created']}") + print(f" Condensations: {result['condensations']}") + print(f" Max depth reached: {result['max_depth_reached']}") + print(f" Total DAG nodes: {result.get('total_nodes', len(state.get('dag_nodes', [])))}") + print() + + if result["entries_processed"] == 0: + print(" No new entries to compact.") + else: + print(" DAG updated successfully.") + print() + + +def cmd_tree(state: dict, fmt: str) -> None: + nodes = state.get("dag_nodes") or [] + edges = state.get("dag_edges") or [] + + if fmt == "json": + print(json.dumps({"nodes": len(nodes), "edges": len(edges), + "dag": [{"id": n["id"], "depth": n["depth"], + "active": n.get("is_active", False), + "tokens": n.get("token_count", 0)} + for n in nodes]}, indent=2)) + return + + print(f"\nMemory DAG Tree — {len(nodes)} nodes, {len(edges)} edges") + print("-" * 50) + + # Build parent map + children_of = {} + for e in edges: + children_of.setdefault(e["parent_id"], []).append(e["child_id"]) + + # Find roots (nodes with no parent) + child_ids = {e["child_id"] for e in edges} + roots = [n for n in nodes if n["id"] not in child_ids and n.get("is_active")] + + def print_node(node, indent=0): + prefix = " " * indent + active = "+" if node.get("is_active") else "-" + depth_label = f"d{node.get('depth', 0)}" + content_preview = node.get("content", "")[:60].replace("\n", " ") + print(f"{prefix}{active} [{depth_label}] {node['id']} ({node.get('token_count', 0)} tok)") + print(f"{prefix} \"{content_preview}...\"") + for child_id in children_of.get(node["id"], []): + child = next((n for n in nodes if n["id"] == child_id), None) + if child: + print_node(child, indent + 1) + + if not roots: + # Show all leaf nodes if no hierarchy yet + for n in sorted(nodes, key=lambda x: x["id"]): + print_node(n) + else: + for root in sorted(roots, key=lambda x: x["id"]): + print_node(root) + print() + + +def cmd_search(state: dict, query: str, fmt: str) -> None: + results = search_dag(state, query) + + if fmt == "json": + print(json.dumps({"query": query, "results": results[:20]}, indent=2)) + else: + print(f"\nSearch: \"{query}\" — {len(results)} results") + print("-" * 50) + for r in results[:15]: + depth_label = f"d{r['depth']}" if r["depth"] >= 0 else "raw" + active = "+" if r["is_active"] else "-" + print(f" {active} [{depth_label}] {r['id']} (relevance: {r['relevance']:.1%})") + print(f" \"{r['content'][:100]}...\"") + print() + + +def cmd_inspect(state: dict, node_id: str, fmt: str) -> None: + nodes = state.get("dag_nodes") or [] + edges = state.get("dag_edges") or [] + + target = next((n for n in nodes if n["id"] == node_id), None) + if not target: + print(f"Error: node '{node_id}' not found.") + sys.exit(1) + + children = [e["child_id"] for e in edges if e["parent_id"] == node_id] + parents = [e["parent_id"] for e in edges if e["child_id"] == node_id] + + if fmt == "json": + print(json.dumps({"node": target, "children": children, "parents": parents}, indent=2)) + else: + print(f"\nInspect: {node_id}") + print("-" * 50) + print(f" Depth: d{target.get('depth', 0)}") + print(f" Active: {target.get('is_active', False)}") + print(f" Tokens: {target.get('token_count', 0)}") + print(f" Created: {target.get('created_at', '?')}") + print(f" Source: {target.get('source_type', '?')}: {target.get('source_range', '?')}") + print(f" Parents: {', '.join(parents) if parents else 'none (root)'}") + print(f" Children: {', '.join(children) if children else 'none (leaf)'}") + print(f"\n Content:") + for line in target.get("content", "").split("\n"): + print(f" {line}") + print() + + +def cmd_dissolve(state: dict, node_id: str, fmt: str) -> None: + success = dissolve_node(state, node_id) + if success: + save_state(state) + if fmt == "json": + print(json.dumps({"dissolved": node_id, "success": True})) + else: + print(f"\n Dissolved {node_id} — children reactivated.") + else: + if fmt == "json": + print(json.dumps({"dissolved": node_id, "success": False})) + else: + print(f"\n Failed to dissolve {node_id}.") + sys.exit(1) + + +def cmd_status(state: dict) -> None: + nodes = state.get("dag_nodes") or [] + last = state.get("last_compact_at", "never") + entry_count = state.get("entry_count", 0) + + active = [n for n in nodes if n.get("is_active")] + depth_dist = {} + for n in nodes: + d = n.get("depth", 0) + depth_dist[d] = depth_dist.get(d, 0) + 1 + + total_tokens = sum(n.get("token_count", 0) for n in active) + + print(f"\nMemory DAG Compactor — Last compact: {last}") + print("-" * 50) + print(f" Entries tracked: {entry_count}") + print(f" Total DAG nodes: {len(nodes)}") + print(f" Active nodes: {len(active)}") + print(f" Active tokens: ~{total_tokens}") + print(f" Depth distribution:") + for d in sorted(depth_dist.keys()): + print(f" d{d}: {depth_dist[d]} nodes") + print() + + history = state.get("compact_history") or [] + if history: + h = history[0] + print(f" Last run: {h.get('entries_processed', 0)} entries → " + f"{h.get('leaves_created', 0)} leaves, " + f"{h.get('condensations', 0)} condensations") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Memory DAG Compactor") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--compact", action="store_true", help="Run leaf + condensation passes") + group.add_argument("--tree", action="store_true", help="Print summary DAG as a tree") + group.add_argument("--search", type=str, metavar="QUERY", help="Search across all depths") + group.add_argument("--inspect", type=str, metavar="ID", help="Inspect a summary node") + group.add_argument("--dissolve", type=str, metavar="ID", help="Reverse a condensation") + group.add_argument("--status", action="store_true", help="Show DAG stats and health") + parser.add_argument("--depth", type=int, metavar="N", help="Max depth for compaction") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.compact: + cmd_compact(state, args.depth, args.format) + elif args.tree: + cmd_tree(state, args.format) + elif args.search: + cmd_search(state, args.search, args.format) + elif args.inspect: + cmd_inspect(state, args.inspect, args.format) + elif args.dissolve: + cmd_dissolve(state, args.dissolve, args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() diff --git a/skills/openclaw-native/memory-dag-compactor/example-state.yaml b/skills/openclaw-native/memory-dag-compactor/example-state.yaml new file mode 100644 index 0000000..a0f6832 --- /dev/null +++ b/skills/openclaw-native/memory-dag-compactor/example-state.yaml @@ -0,0 +1,90 @@ +# Example runtime state for memory-dag-compactor +last_compact_at: "2026-03-16T23:00:12.445000" +config: + chunk_size: 20 + fanout: 5 + max_depth: 4 + token_budget: 8000 +dag_nodes: + - id: s-d0-000 + depth: 0 + content: | + Built memory-graph-builder skill with graph.py companion script; + pushed to branch skill/memory-graph-builder; PR #28 merged. + Added node/edge extraction from MEMORY.md, Jaccard dedup at 0.7. + [Expand for details about: memory-graph-builder, graph.py, Jaccard, PR] + expand_footer: "memory-graph-builder, graph.py, Jaccard, PR" + token_count: 52 + created_at: "2026-03-16T23:00:10.000000" + source_type: entries + source_range: "e-0000, e-0001, e-0002, e-0003, e-0004" + is_active: false + - id: s-d0-001 + depth: 0 + content: | + Built config-encryption-auditor with audit.py; scans for 8 API key + patterns + 3 token patterns; PR #29 merged. Also built + tool-description-optimizer with 5-dimension scoring; PR #30 merged. + [Expand for details about: config-encryption-auditor, audit.py, optimizer] + expand_footer: "config-encryption-auditor, audit.py, optimizer" + token_count: 58 + created_at: "2026-03-16T23:00:11.000000" + source_type: entries + source_range: "e-0005, e-0006, e-0007, e-0008, e-0009" + is_active: false + - id: s-d1-000 + depth: 1 + content: | + Session: Built 4 OpenLobster-inspired skills (memory-graph-builder, + config-encryption-auditor, tool-description-optimizer, mcp-health-checker). + All merged via PRs #28-#31. README updated to 44 skills. + [Expand for details about: OpenLobster, PRs, README, skills] + expand_footer: "OpenLobster, PRs, README, skills" + token_count: 44 + created_at: "2026-03-16T23:00:12.000000" + source_type: summaries + source_range: "s-d0-000, s-d0-001" + is_active: true +dag_edges: + - parent_id: s-d1-000 + child_id: s-d0-000 + - parent_id: s-d1-000 + child_id: s-d0-001 +entry_count: 10 +compact_history: + - compacted_at: "2026-03-16T23:00:12.445000" + entries_processed: 10 + leaves_created: 2 + condensations: 1 + max_depth_reached: 1 + total_nodes: 3 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Cron runs nightly at 11pm: python3 compact.py --compact +# +# Memory DAG Compaction — 2026-03-16 23:00 +# ────────────────────────────────────────────────── +# Entries processed: 10 +# Leaves created: 2 +# Condensations: 1 +# Max depth reached: 1 +# Total DAG nodes: 3 +# +# python3 compact.py --tree +# +# Memory DAG Tree — 3 nodes, 2 edges +# ────────────────────────────────────────────────── +# + [d1] s-d1-000 (44 tok) +# "Session: Built 4 OpenLobster-inspired skills..." +# - [d0] s-d0-000 (52 tok) +# "Built memory-graph-builder skill with graph.py..." +# - [d0] s-d0-001 (58 tok) +# "Built config-encryption-auditor with audit.py..." +# +# python3 compact.py --search "encryption" +# +# Search: "encryption" — 2 results +# ────────────────────────────────────────────────── +# - [d0] s-d0-001 (relevance: 100%) +# "Built config-encryption-auditor with audit.py..." +# + [d1] s-d1-000 (relevance: 50%) +# "Session: Built 4 OpenLobster-inspired skills..." From 896e4847747cdd29f148b11b99b1db47c6287433 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Tue, 17 Mar 2026 00:48:22 +0530 Subject: [PATCH 19/23] Add large-file-interceptor skill (#36) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Detects oversized files that would blow the context window, generates structural exploration summaries (JSON schema, CSV columns, Python imports, log patterns), and stores compact reference cards. Supports scan, summarize, restore, and audit. No cron — invoked on demand. Inspired by lossless-claw's large file interception layer. Co-authored-by: Claude Opus 4.6 --- .../large-file-interceptor/SKILL.md | 109 ++++ .../large-file-interceptor/STATE_SCHEMA.yaml | 34 ++ .../large-file-interceptor/example-state.yaml | 66 +++ .../large-file-interceptor/intercept.py | 497 ++++++++++++++++++ 4 files changed, 706 insertions(+) create mode 100644 skills/openclaw-native/large-file-interceptor/SKILL.md create mode 100644 skills/openclaw-native/large-file-interceptor/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/large-file-interceptor/example-state.yaml create mode 100755 skills/openclaw-native/large-file-interceptor/intercept.py diff --git a/skills/openclaw-native/large-file-interceptor/SKILL.md b/skills/openclaw-native/large-file-interceptor/SKILL.md new file mode 100644 index 0000000..e479943 --- /dev/null +++ b/skills/openclaw-native/large-file-interceptor/SKILL.md @@ -0,0 +1,109 @@ +--- +name: large-file-interceptor +version: "1.0" +category: openclaw-native +description: Detects oversized files that would blow the context window, generates structural exploration summaries, and stores compact references — preventing a single paste from consuming the entire budget. +stateful: true +--- + +# Large File Interceptor + +## What it does + +A single large file paste can consume 60–80% of the context window, leaving no room for actual work. Large File Interceptor detects oversized files, generates a structural summary (schema, columns, imports, key definitions), stores the original externally, and replaces it with a compact reference card. + +Inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw)'s large file interception layer, which automatically extracts files exceeding 25k tokens. + +## When to invoke + +- Before processing any file the agent reads or receives — check size first +- When context budget is running low and large files may be the cause +- After a paste or file read — retroactively scan for oversized content +- Periodically to audit what's consuming the most context budget + +## How to use + +```bash +python3 intercept.py --scan # Scan a file or directory +python3 intercept.py --scan --threshold 10000 # Custom token threshold +python3 intercept.py --summarize # Generate structural summary for a file +python3 intercept.py --list # List all intercepted files +python3 intercept.py --restore # Retrieve original file content +python3 intercept.py --audit # Show context budget impact +python3 intercept.py --status # Last scan summary +python3 intercept.py --format json # Machine-readable output +``` + +## Structural exploration summaries + +The interceptor generates different summaries based on file type: + +| File type | Summary includes | +|---|---| +| JSON/YAML | Top-level schema, key types, array lengths, nested depth | +| CSV/TSV | Column names, row count, sample values, data types per column | +| Python/JS/TS | Imports, class definitions, function signatures, export list | +| Markdown | Heading structure, word count per section, link count | +| Log files | Time range, error count, unique error patterns, frequency | +| Binary/Other | File size, MIME type, magic bytes | + +## Reference card format + +When a file is intercepted, the original is stored in `~/.openclaw/lcm-files/` and replaced with: + +``` +[FILE REFERENCE: ref-001] +Original: /path/to/large-file.json +Size: 145,230 bytes (~36,307 tokens) +Type: JSON — API response payload + +Structure: + - Root: object with 3 keys + - "data": array of 1,247 objects + - "metadata": object (pagination, timestamps) + - "errors": empty array + +Key fields in data[]: id, name, email, created_at, status +Sample: {"id": 1, "name": "...", "status": "active"} + +To retrieve full content: python3 intercept.py --restore ref-001 +``` + +## Procedure + +**Step 1 — Scan before processing** + +```bash +python3 intercept.py --scan /path/to/file.json +``` + +If the file exceeds the token threshold (default: 25,000 tokens), it generates a structural summary and stores a reference. + +**Step 2 — Audit context impact** + +```bash +python3 intercept.py --audit +``` + +Shows all files in the current workspace ranked by token impact, with recommendations for which to intercept. + +**Step 3 — Restore when needed** + +```bash +python3 intercept.py --restore ref-001 +``` + +Retrieves the original file content from storage for detailed inspection. + +## State + +Intercepted file registry and reference cards stored in `~/.openclaw/skill-state/large-file-interceptor/state.yaml`. Original files stored in `~/.openclaw/lcm-files/`. + +Fields: `last_scan_at`, `intercepted_files`, `total_tokens_saved`, `scan_history`. + +## Notes + +- Never deletes or modifies original files — intercept creates a copy + reference +- Token threshold is configurable (default: 25,000 ~= 100KB of text) +- Reference cards are typically 200–400 tokens vs. 25,000+ for the original +- Supports recursive directory scanning with `--scan /path/to/dir` diff --git a/skills/openclaw-native/large-file-interceptor/STATE_SCHEMA.yaml b/skills/openclaw-native/large-file-interceptor/STATE_SCHEMA.yaml new file mode 100644 index 0000000..6c421f9 --- /dev/null +++ b/skills/openclaw-native/large-file-interceptor/STATE_SCHEMA.yaml @@ -0,0 +1,34 @@ +version: "1.0" +description: Registry of intercepted large files, reference cards, and token savings. +fields: + last_scan_at: + type: datetime + token_threshold: + type: integer + default: 25000 + description: Files exceeding this token count are intercepted + intercepted_files: + type: list + description: All intercepted file references + items: + ref_id: { type: string, description: "Reference ID (e.g. ref-001)" } + original_path: { type: string } + stored_path: { type: string, description: "Path in ~/.openclaw/lcm-files/" } + file_type: { type: string, description: "Detected file type" } + original_tokens: { type: integer } + summary_tokens: { type: integer } + tokens_saved: { type: integer } + summary: { type: string, description: "Structural exploration summary" } + intercepted_at: { type: datetime } + total_tokens_saved: + type: integer + description: Cumulative tokens saved by interception + scan_history: + type: list + description: Rolling log of past scans (last 20) + items: + scanned_at: { type: datetime } + path_scanned: { type: string } + files_checked: { type: integer } + files_intercepted: { type: integer } + tokens_saved: { type: integer } diff --git a/skills/openclaw-native/large-file-interceptor/example-state.yaml b/skills/openclaw-native/large-file-interceptor/example-state.yaml new file mode 100644 index 0000000..162404b --- /dev/null +++ b/skills/openclaw-native/large-file-interceptor/example-state.yaml @@ -0,0 +1,66 @@ +# Example runtime state for large-file-interceptor +last_scan_at: "2026-03-16T14:30:05.000000" +token_threshold: 25000 +intercepted_files: + - ref_id: ref-001 + original_path: "/Users/you/project/data/api-response.json" + stored_path: "/Users/you/.openclaw/lcm-files/ref-001_a3b2c1d4e5f6.json" + file_type: JSON + original_tokens: 36307 + summary_tokens: 180 + tokens_saved: 36127 + summary: | + Root: object with 3 keys + "data": array of 1247 objects + "metadata": object (5 keys) + "errors": empty array + Item keys: id, name, email, created_at, status + intercepted_at: "2026-03-16T14:30:03.000000" + - ref_id: ref-002 + original_path: "/Users/you/project/logs/server.log" + stored_path: "/Users/you/.openclaw/lcm-files/ref-002_f7e8d9c0b1a2.log" + file_type: Log + original_tokens: 52800 + summary_tokens: 220 + tokens_saved: 52580 + summary: | + Total lines: 8450 + Time range: 2026-03-15T00:00 → 2026-03-16T14:29 + Errors: 23, Warnings: 87 + Unique error patterns: 5 + ConnectionError: host N.N.N.N port N + TimeoutError: request exceeded Nms + ValueError: invalid JSON at line N + intercepted_at: "2026-03-16T14:30:04.000000" +total_tokens_saved: 88707 +scan_history: + - scanned_at: "2026-03-16T14:30:05.000000" + path_scanned: "/Users/you/project" + files_checked: 48 + files_intercepted: 2 + tokens_saved: 88707 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# python3 intercept.py --scan /Users/you/project +# +# Intercepted: api-response.json (36,307 tokens → 180 tokens) +# Reference card: +# [FILE REFERENCE: ref-001] +# Original: /Users/you/project/data/api-response.json +# Size: 145,230 bytes (~36,307 tokens) +# Type: JSON — API response payload +# ... +# +# Intercepted: server.log (52,800 tokens → 220 tokens) +# ... +# +# Scan Complete — 48 files checked, 2 intercepted, ~88,707 tokens saved +# +# python3 intercept.py --audit +# +# Context Budget Audit +# ────────────────────────────────────────────── +# Intercepted files: 2 +# Original token cost: ~89,107 +# Summary token cost: ~400 +# Total tokens saved: ~88,707 +# Compression ratio: 99% diff --git a/skills/openclaw-native/large-file-interceptor/intercept.py b/skills/openclaw-native/large-file-interceptor/intercept.py new file mode 100755 index 0000000..d7b231a --- /dev/null +++ b/skills/openclaw-native/large-file-interceptor/intercept.py @@ -0,0 +1,497 @@ +#!/usr/bin/env python3 +""" +Large File Interceptor for openclaw-superpowers. + +Detects oversized files, generates structural summaries, and stores +compact references to prevent context window blowout. + +Usage: + python3 intercept.py --scan + python3 intercept.py --scan --threshold 10000 + python3 intercept.py --summarize + python3 intercept.py --list + python3 intercept.py --restore + python3 intercept.py --audit + python3 intercept.py --status + python3 intercept.py --format json +""" + +import argparse +import csv +import hashlib +import io +import json +import mimetypes +import os +import re +import shutil +import sys +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "large-file-interceptor" / "state.yaml" +FILE_STORE = OPENCLAW_DIR / "lcm-files" +DEFAULT_THRESHOLD = 25000 # tokens +MAX_HISTORY = 20 + +# File extensions to analyze +TEXT_EXTENSIONS = { + ".json", ".yaml", ".yml", ".csv", ".tsv", ".xml", + ".py", ".js", ".ts", ".jsx", ".tsx", ".go", ".rs", ".java", + ".md", ".txt", ".log", ".conf", ".cfg", ".ini", ".toml", + ".html", ".css", ".sql", ".sh", ".bash", ".zsh", + ".env", ".gitignore", ".dockerfile", +} + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"intercepted_files": [], "total_tokens_saved": 0, "scan_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {"intercepted_files": [], "total_tokens_saved": 0, "scan_history": []} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +def estimate_tokens(text: str) -> int: + return len(text) // 4 + + +def next_ref_id(state: dict) -> str: + existing = state.get("intercepted_files") or [] + return f"ref-{len(existing)+1:03d}" + + +# ── File type detection and structural analysis ────────────────────────────── + +def detect_file_type(path: Path) -> str: + ext = path.suffix.lower() + type_map = { + ".json": "JSON", ".yaml": "YAML", ".yml": "YAML", + ".csv": "CSV", ".tsv": "TSV", + ".py": "Python", ".js": "JavaScript", ".ts": "TypeScript", + ".jsx": "JSX", ".tsx": "TSX", + ".go": "Go", ".rs": "Rust", ".java": "Java", + ".md": "Markdown", ".txt": "Text", + ".log": "Log", ".xml": "XML", ".html": "HTML", + ".css": "CSS", ".sql": "SQL", + ".sh": "Shell", ".bash": "Shell", ".zsh": "Shell", + } + return type_map.get(ext, "Unknown") + + +def analyze_json(content: str) -> str: + """Structural summary for JSON files.""" + try: + data = json.loads(content) + except json.JSONDecodeError: + return "Invalid JSON — parse error" + + lines = [] + if isinstance(data, dict): + lines.append(f"Root: object with {len(data)} keys") + for key, val in list(data.items())[:10]: + if isinstance(val, list): + item_type = type(val[0]).__name__ if val else "empty" + lines.append(f' "{key}": array of {len(val)} {item_type}s') + elif isinstance(val, dict): + lines.append(f' "{key}": object ({len(val)} keys)') + else: + lines.append(f' "{key}": {type(val).__name__} = {str(val)[:50]}') + if len(data) > 10: + lines.append(f" ... +{len(data)-10} more keys") + elif isinstance(data, list): + lines.append(f"Root: array of {len(data)} items") + if data: + item = data[0] + if isinstance(item, dict): + lines.append(f" Item keys: {', '.join(list(item.keys())[:8])}") + sample = json.dumps(item, default=str)[:100] + lines.append(f" Sample: {sample}") + return "\n".join(lines) + + +def analyze_csv(content: str, delimiter: str = ",") -> str: + """Structural summary for CSV/TSV files.""" + lines = [] + reader = csv.reader(io.StringIO(content), delimiter=delimiter) + rows = list(reader) + if not rows: + return "Empty file" + + headers = rows[0] if rows else [] + lines.append(f"Columns ({len(headers)}): {', '.join(headers[:10])}") + if len(headers) > 10: + lines.append(f" ... +{len(headers)-10} more columns") + lines.append(f"Rows: {len(rows)-1} (excluding header)") + + # Sample values + if len(rows) > 1: + sample = rows[1] + for i, (h, v) in enumerate(zip(headers[:5], sample[:5])): + lines.append(f" {h}: {v[:50]}") + return "\n".join(lines) + + +def analyze_python(content: str) -> str: + """Structural summary for Python files.""" + lines = [] + imports = re.findall(r'^(?:import|from)\s+.+', content, re.MULTILINE) + classes = re.findall(r'^class\s+(\w+)', content, re.MULTILINE) + functions = re.findall(r'^def\s+(\w+)\(([^)]*)\)', content, re.MULTILINE) + + if imports: + lines.append(f"Imports ({len(imports)}): {'; '.join(imports[:5])}") + if classes: + lines.append(f"Classes: {', '.join(classes)}") + if functions: + func_sigs = [f"{name}({args[:30]})" for name, args in functions[:10]] + lines.append(f"Functions ({len(functions)}): {', '.join(func_sigs)}") + lines.append(f"Total lines: {content.count(chr(10))+1}") + return "\n".join(lines) + + +def analyze_javascript(content: str) -> str: + """Structural summary for JS/TS files.""" + lines = [] + imports = re.findall(r'^import\s+.+', content, re.MULTILINE) + exports = re.findall(r'^export\s+(?:default\s+)?(?:class|function|const|let|var|interface|type)\s+(\w+)', + content, re.MULTILINE) + classes = re.findall(r'^(?:export\s+)?class\s+(\w+)', content, re.MULTILINE) + functions = re.findall(r'^(?:export\s+)?(?:async\s+)?function\s+(\w+)', content, re.MULTILINE) + + if imports: + lines.append(f"Imports ({len(imports)}): {'; '.join(imports[:5])}") + if exports: + lines.append(f"Exports: {', '.join(exports[:10])}") + if classes: + lines.append(f"Classes: {', '.join(classes)}") + if functions: + lines.append(f"Functions: {', '.join(functions[:10])}") + lines.append(f"Total lines: {content.count(chr(10))+1}") + return "\n".join(lines) + + +def analyze_markdown(content: str) -> str: + """Structural summary for Markdown files.""" + lines = [] + headings = re.findall(r'^(#{1,6})\s+(.+)', content, re.MULTILINE) + word_count = len(content.split()) + link_count = len(re.findall(r'\[([^\]]+)\]\([^)]+\)', content)) + + lines.append(f"Word count: {word_count}") + if headings: + lines.append(f"Headings ({len(headings)}):") + for level, text in headings[:10]: + lines.append(f" {' '*(len(level)-1)}{level} {text}") + lines.append(f"Links: {link_count}") + return "\n".join(lines) + + +def analyze_log(content: str) -> str: + """Structural summary for log files.""" + lines_list = content.split("\n") + lines = [f"Total lines: {len(lines_list)}"] + + # Detect time range + timestamps = re.findall(r'\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}', content) + if timestamps: + lines.append(f"Time range: {timestamps[0]} → {timestamps[-1]}") + + # Error patterns + errors = [l for l in lines_list if re.search(r'(?i)error|exception|fatal|panic', l)] + warns = [l for l in lines_list if re.search(r'(?i)warn', l)] + lines.append(f"Errors: {len(errors)}, Warnings: {len(warns)}") + + if errors: + unique_errors = set() + for e in errors[:20]: + pattern = re.sub(r'\d+', 'N', e.strip()[:80]) + unique_errors.add(pattern) + lines.append(f"Unique error patterns: {len(unique_errors)}") + for p in list(unique_errors)[:3]: + lines.append(f" {p}") + + return "\n".join(lines) + + +def generate_summary(path: Path, content: str) -> str: + """Generate a structural exploration summary based on file type.""" + file_type = detect_file_type(path) + + analyzers = { + "JSON": analyze_json, + "YAML": lambda c: analyze_json(json.dumps(yaml.safe_load(c))) if HAS_YAML else f"YAML file, {len(c)} chars", + "CSV": lambda c: analyze_csv(c, ","), + "TSV": lambda c: analyze_csv(c, "\t"), + "Python": analyze_python, + "JavaScript": analyze_javascript, + "TypeScript": analyze_javascript, + "JSX": analyze_javascript, + "TSX": analyze_javascript, + "Markdown": analyze_markdown, + "Log": analyze_log, + } + + analyzer = analyzers.get(file_type) + if analyzer: + try: + return analyzer(content) + except Exception as e: + return f"Analysis failed: {str(e)[:100]}" + else: + return f"File type: {file_type}\nSize: {len(content)} chars\nLines: {content.count(chr(10))+1}" + + +def generate_reference_card(ref_id: str, path: Path, content: str, summary: str) -> str: + """Generate a compact reference card for an intercepted file.""" + tokens = estimate_tokens(content) + file_type = detect_file_type(path) + + card = f"""[FILE REFERENCE: {ref_id}] +Original: {path} +Size: {len(content):,} bytes (~{tokens:,} tokens) +Type: {file_type} + +Structure: +{summary} + +To retrieve full content: python3 intercept.py --restore {ref_id}""" + return card + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_scan(state: dict, scan_path: str, threshold: int, fmt: str) -> None: + path = Path(scan_path).resolve() + now = datetime.now().isoformat() + + if not path.exists(): + print(f"Error: path '{scan_path}' not found.") + sys.exit(1) + + # Collect files to scan + files = [] + if path.is_file(): + files = [path] + else: + for ext in TEXT_EXTENSIONS: + files.extend(path.rglob(f"*{ext}")) + + checked = 0 + intercepted = 0 + tokens_saved = 0 + + for fp in files: + try: + content = fp.read_text(errors="replace") + except (PermissionError, OSError): + continue + + checked += 1 + tokens = estimate_tokens(content) + + if tokens <= threshold: + continue + + # Intercept this file + ref_id = next_ref_id(state) + summary = generate_summary(fp, content) + summary_tokens = estimate_tokens(summary + ref_id) + saved = tokens - summary_tokens + + # Store original + FILE_STORE.mkdir(parents=True, exist_ok=True) + file_hash = hashlib.sha256(content.encode()).hexdigest()[:12] + stored_name = f"{ref_id}_{file_hash}{fp.suffix}" + stored_path = FILE_STORE / stored_name + stored_path.write_text(content) + + record = { + "ref_id": ref_id, + "original_path": str(fp), + "stored_path": str(stored_path), + "file_type": detect_file_type(fp), + "original_tokens": tokens, + "summary_tokens": summary_tokens, + "tokens_saved": saved, + "summary": summary, + "intercepted_at": now, + } + + files_list = state.get("intercepted_files") or [] + files_list.append(record) + state["intercepted_files"] = files_list + intercepted += 1 + tokens_saved += saved + + if fmt != "json": + card = generate_reference_card(ref_id, fp, content, summary) + print(f"\n Intercepted: {fp.name} ({tokens:,} tokens → {summary_tokens:,} tokens)") + print(f" Reference card:\n{card}\n") + + state["total_tokens_saved"] = (state.get("total_tokens_saved") or 0) + tokens_saved + state["last_scan_at"] = now + + history = state.get("scan_history") or [] + history.insert(0, { + "scanned_at": now, "path_scanned": str(path), + "files_checked": checked, "files_intercepted": intercepted, + "tokens_saved": tokens_saved, + }) + state["scan_history"] = history[:MAX_HISTORY] + save_state(state) + + if fmt == "json": + print(json.dumps({"files_checked": checked, "files_intercepted": intercepted, + "tokens_saved": tokens_saved}, indent=2)) + else: + print(f"\nScan Complete — {checked} files checked, {intercepted} intercepted, ~{tokens_saved:,} tokens saved") + + +def cmd_summarize(path_str: str, fmt: str) -> None: + path = Path(path_str).resolve() + if not path.exists(): + print(f"Error: file '{path_str}' not found.") + sys.exit(1) + + content = path.read_text(errors="replace") + summary = generate_summary(path, content) + tokens = estimate_tokens(content) + summary_tokens = estimate_tokens(summary) + + if fmt == "json": + print(json.dumps({"file": str(path), "type": detect_file_type(path), + "original_tokens": tokens, "summary_tokens": summary_tokens, + "summary": summary}, indent=2)) + else: + print(f"\nStructural Summary: {path.name}") + print(f" Type: {detect_file_type(path)} | {tokens:,} tokens → ~{summary_tokens:,} tokens") + print("-" * 50) + print(summary) + print() + + +def cmd_list(state: dict, fmt: str) -> None: + files = state.get("intercepted_files") or [] + if fmt == "json": + print(json.dumps({"intercepted_files": files, "total": len(files)}, indent=2)) + else: + print(f"\nIntercepted Files — {len(files)} total") + print("-" * 60) + for f in files: + saved = f.get("tokens_saved", 0) + print(f" {f['ref_id']} {f['file_type']:>10} {f['original_tokens']:>8,} tok → " + f"{f['summary_tokens']:>6,} tok (saved {saved:,})") + print(f" {f['original_path']}") + total = state.get("total_tokens_saved", 0) + print(f"\n Total tokens saved: ~{total:,}") + print() + + +def cmd_restore(state: dict, ref_id: str) -> None: + files = state.get("intercepted_files") or [] + target = next((f for f in files if f["ref_id"] == ref_id), None) + if not target: + print(f"Error: reference '{ref_id}' not found.") + sys.exit(1) + + stored = Path(target["stored_path"]) + if stored.exists(): + print(stored.read_text()) + else: + print(f"Error: stored file not found at {stored}") + sys.exit(1) + + +def cmd_audit(state: dict, fmt: str) -> None: + files = state.get("intercepted_files") or [] + total_original = sum(f.get("original_tokens", 0) for f in files) + total_summary = sum(f.get("summary_tokens", 0) for f in files) + total_saved = state.get("total_tokens_saved", 0) + + if fmt == "json": + print(json.dumps({"files": len(files), "total_original_tokens": total_original, + "total_summary_tokens": total_summary, "total_saved": total_saved}, indent=2)) + else: + print(f"\nContext Budget Audit") + print("-" * 50) + print(f" Intercepted files: {len(files)}") + print(f" Original token cost: ~{total_original:,}") + print(f" Summary token cost: ~{total_summary:,}") + print(f" Total tokens saved: ~{total_saved:,}") + if total_original > 0: + ratio = total_saved / total_original * 100 + print(f" Compression ratio: {ratio:.0f}%") + print() + + # Top consumers + sorted_files = sorted(files, key=lambda f: f.get("original_tokens", 0), reverse=True) + if sorted_files: + print(" Top context consumers:") + for f in sorted_files[:5]: + print(f" {f['ref_id']} {f['original_tokens']:>8,} tok {f['original_path']}") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_scan_at", "never") + files = state.get("intercepted_files") or [] + total_saved = state.get("total_tokens_saved", 0) + print(f"\nLarge File Interceptor — Last scan: {last}") + print(f" {len(files)} files intercepted | ~{total_saved:,} tokens saved") + history = state.get("scan_history") or [] + if history: + h = history[0] + print(f" Last: {h.get('files_checked',0)} checked, " + f"{h.get('files_intercepted',0)} intercepted at {h.get('path_scanned','?')}") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Large File Interceptor") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--scan", type=str, metavar="PATH", help="Scan a file or directory") + group.add_argument("--summarize", type=str, metavar="FILE", help="Generate structural summary") + group.add_argument("--list", action="store_true", help="List all intercepted files") + group.add_argument("--restore", type=str, metavar="REF_ID", help="Retrieve original file") + group.add_argument("--audit", action="store_true", help="Show context budget impact") + group.add_argument("--status", action="store_true", help="Last scan summary") + parser.add_argument("--threshold", type=int, default=DEFAULT_THRESHOLD, help="Token threshold (default: 25000)") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.scan: + cmd_scan(state, args.scan, args.threshold, args.format) + elif args.summarize: + cmd_summarize(args.summarize, args.format) + elif args.list: + cmd_list(state, args.format) + elif args.restore: + cmd_restore(state, args.restore) + elif args.audit: + cmd_audit(state, args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From d79639eb23733d724e2b50e2f5909d48b36cfa9f Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Tue, 17 Mar 2026 00:51:12 +0530 Subject: [PATCH 20/23] Add context-assembly-scorer skill (#37) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Scores how well current context represents full conversation history. 5 dimensions: topic coverage, recency bias, entity continuity, decision retention, task continuity. Detects blind spots — topics the agent has effectively forgotten. Cron every 4h. Inspired by lossless-claw's context assembly system. Co-authored-by: Claude Opus 4.6 --- .../context-assembly-scorer/SKILL.md | 94 ++++ .../context-assembly-scorer/STATE_SCHEMA.yaml | 31 ++ .../example-state.yaml | 74 ++++ .../context-assembly-scorer/score.py | 419 ++++++++++++++++++ 4 files changed, 618 insertions(+) create mode 100644 skills/openclaw-native/context-assembly-scorer/SKILL.md create mode 100644 skills/openclaw-native/context-assembly-scorer/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/context-assembly-scorer/example-state.yaml create mode 100755 skills/openclaw-native/context-assembly-scorer/score.py diff --git a/skills/openclaw-native/context-assembly-scorer/SKILL.md b/skills/openclaw-native/context-assembly-scorer/SKILL.md new file mode 100644 index 0000000..0487be6 --- /dev/null +++ b/skills/openclaw-native/context-assembly-scorer/SKILL.md @@ -0,0 +1,94 @@ +--- +name: context-assembly-scorer +version: "1.0" +category: openclaw-native +description: Scores how well the current context represents the full conversation — detects information blind spots, stale summaries, and coverage gaps that cause the agent to forget critical details. +stateful: true +cron: "0 */4 * * *" +--- + +# Context Assembly Scorer + +## What it does + +When an agent compacts context, it loses information. But how much? And which information? Context Assembly Scorer answers these questions by measuring **coverage** — the ratio of important topics in the full conversation history that are represented in the current assembled context. + +Inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw)'s context assembly system, which carefully selects which summaries to include in each turn's context to maximize information coverage. + +## When to invoke + +- Automatically every 4 hours (cron) — silent coverage check +- Before starting a task that depends on prior context — verify nothing critical is missing +- After compaction — measure information loss +- When the agent says "I don't remember" — diagnose why + +## Coverage dimensions + +| Dimension | What it measures | Weight | +|---|---|---| +| Topic coverage | % of conversation topics present in current context | 2x | +| Recency bias | Whether recent context is over-represented vs. older important context | 1.5x | +| Entity continuity | Named entities (files, people, APIs) mentioned in history that are missing from context | 2x | +| Decision retention | Architectural decisions and user preferences still accessible | 2x | +| Task continuity | Active/pending tasks that might be lost after compaction | 1.5x | + +## How to use + +```bash +python3 score.py --score # Score current context assembly +python3 score.py --score --verbose # Detailed per-dimension breakdown +python3 score.py --blind-spots # List topics missing from context +python3 score.py --drift # Compare current vs. previous scores +python3 score.py --status # Last score summary +python3 score.py --format json # Machine-readable output +``` + +## Procedure + +**Step 1 — Score context coverage** + +```bash +python3 score.py --score +``` + +The scorer reads MEMORY.md (full history) and compares it against what's currently accessible. Outputs a coverage score from 0–100% with a letter grade. + +**Step 2 — Find blind spots** + +```bash +python3 score.py --blind-spots +``` + +Lists specific topics, entities, and decisions that exist in full history but are missing from current context — these are what the agent has effectively "forgotten." + +**Step 3 — Track drift over time** + +```bash +python3 score.py --drift +``` + +Shows how coverage has changed across the last 20 scores. Identify if compaction is progressively losing more information. + +## Grading + +| Grade | Coverage | Meaning | +|---|---|---| +| A | 90–100% | Excellent — minimal information loss | +| B | 75–89% | Good — minor gaps, unlikely to cause issues | +| C | 60–74% | Fair — some important context missing | +| D | 40–59% | Poor — significant blind spots | +| F | 0–39% | Critical — agent is operating with major gaps | + +## State + +Coverage scores and blind spot history stored in `~/.openclaw/skill-state/context-assembly-scorer/state.yaml`. + +Fields: `last_score_at`, `current_score`, `blind_spots`, `score_history`. + +## Notes + +- Read-only — does not modify context or memory +- Topic extraction uses keyword clustering, not LLM calls +- Entity detection uses regex patterns for file paths, URLs, class names, API endpoints +- Decision detection looks for markers: "decided", "chose", "prefer", "always", "never" +- Recency bias is measured as the ratio of recent-vs-old entry representation diff --git a/skills/openclaw-native/context-assembly-scorer/STATE_SCHEMA.yaml b/skills/openclaw-native/context-assembly-scorer/STATE_SCHEMA.yaml new file mode 100644 index 0000000..89a1b18 --- /dev/null +++ b/skills/openclaw-native/context-assembly-scorer/STATE_SCHEMA.yaml @@ -0,0 +1,31 @@ +version: "1.0" +description: Context coverage scores, blind spot tracking, and drift history. +fields: + last_score_at: + type: datetime + current_score: + type: object + fields: + overall: { type: float, description: "0-100 coverage percentage" } + grade: { type: string } + topic_coverage: { type: float } + recency_bias: { type: float } + entity_continuity: { type: float } + decision_retention: { type: float } + task_continuity: { type: float } + blind_spots: + type: list + description: Topics/entities missing from current context + items: + type: { type: enum, values: [topic, entity, decision, task] } + name: { type: string } + importance: { type: enum, values: [critical, high, medium, low] } + last_seen: { type: string, description: "When this was last in context" } + score_history: + type: list + description: Rolling log of past scores (last 20) + items: + scored_at: { type: datetime } + overall: { type: float } + grade: { type: string } + blind_spot_count: { type: integer } diff --git a/skills/openclaw-native/context-assembly-scorer/example-state.yaml b/skills/openclaw-native/context-assembly-scorer/example-state.yaml new file mode 100644 index 0000000..a2d139d --- /dev/null +++ b/skills/openclaw-native/context-assembly-scorer/example-state.yaml @@ -0,0 +1,74 @@ +# Example runtime state for context-assembly-scorer +last_score_at: "2026-03-16T16:00:08.000000" +current_score: + overall: 72.3 + grade: C + topic_coverage: 82.0 + recency_bias: 65.5 + entity_continuity: 68.0 + decision_retention: 75.0 + task_continuity: 70.0 +blind_spots: + - type: decision + name: "Decided to use Jaccard similarity threshold of 0.7 for deduplication" + importance: critical + last_seen: "in full memory" + - type: entity + name: "/skills/openclaw-native/heartbeat-governor/governor.py" + importance: high + last_seen: "in full memory" + - type: task + name: "TODO: add --dry-run flag to radar.py before next release" + importance: high + last_seen: "in full memory" + - type: entity + name: "https://github.com/Neirth/OpenLobster" + importance: medium + last_seen: "in full memory" +score_history: + - scored_at: "2026-03-16T16:00:08.000000" + overall: 72.3 + grade: C + blind_spot_count: 12 + - scored_at: "2026-03-16T12:00:05.000000" + overall: 85.1 + grade: B + blind_spot_count: 5 + - scored_at: "2026-03-16T08:00:03.000000" + overall: 91.2 + grade: A + blind_spot_count: 2 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Cron runs every 4 hours: python3 score.py --score --verbose +# +# Context Assembly Score — 2026-03-16 16:00 +# ─────────────────────────────────────────────────────── +# Overall: 72.3% Grade: C +# Topic coverage: 82.0% (2x weight) +# Recency bias: 65.5% (1.5x weight) +# Entity continuity: 68.0% (2x weight) +# Decision retention: 75.0% (2x weight) +# Task continuity: 70.0% (1.5x weight) +# +# Memory stats: +# Topics: 284 unique | Entities: 47 +# Decisions: 12 | Tasks: 8 +# Blind spots: 12 +# +# python3 score.py --blind-spots +# +# Blind Spots — 12 items missing from context +# ─────────────────────────────────────────────────────── +# !! [CRITICAL] decision: Decided to use Jaccard similarity threshold... +# ! [ HIGH] entity: /skills/openclaw-native/heartbeat-governor/... +# ! [ HIGH] task: TODO: add --dry-run flag to radar.py... +# +# python3 score.py --drift +# +# Coverage Drift — 3 data points +# ─────────────────────────────────────────────────────── +# 2026-03-16T16:00 [=======---] 72.3% (C) 12 blind spots +# 2026-03-16T12:00 [=========-] 85.1% (B) 5 blind spots +# 2026-03-16T08:00 [=========+] 91.2% (A) 2 blind spots +# +# Trend: declining (-12.8%) diff --git a/skills/openclaw-native/context-assembly-scorer/score.py b/skills/openclaw-native/context-assembly-scorer/score.py new file mode 100755 index 0000000..0897293 --- /dev/null +++ b/skills/openclaw-native/context-assembly-scorer/score.py @@ -0,0 +1,419 @@ +#!/usr/bin/env python3 +""" +Context Assembly Scorer for openclaw-superpowers. + +Scores how well the current context represents the full conversation. +Detects information blind spots, stale summaries, and coverage gaps. + +Usage: + python3 score.py --score + python3 score.py --score --verbose + python3 score.py --blind-spots + python3 score.py --drift + python3 score.py --status + python3 score.py --format json +""" + +import argparse +import json +import os +import re +import sys +from collections import Counter +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "context-assembly-scorer" / "state.yaml" +MEMORY_FILE = OPENCLAW_DIR / "workspace" / "MEMORY.md" +CONTEXT_FILE = OPENCLAW_DIR / "workspace" / "CONTEXT.md" +MAX_HISTORY = 20 + +# ── Patterns for entity/decision detection ─────────────────────────────────── + +ENTITY_PATTERNS = [ + re.compile(r'(?:/[\w./-]+\.[\w]+)'), # file paths + re.compile(r'https?://[^\s)]+'), # URLs + re.compile(r'(?:class|def|function|const)\s+(\w+)'), # code definitions + re.compile(r'(?:GET|POST|PUT|DELETE|PATCH)\s+/[\w/-]+'), # API endpoints + re.compile(r'`([^`]{3,40})`'), # inline code refs +] + +DECISION_MARKERS = [ + "decided", "chose", "chosen", "prefer", "preference", + "always", "never", "must", "should not", "agreed", + "convention", "standard", "rule", "policy", "approach", +] + +TASK_MARKERS = [ + "todo", "TODO", "FIXME", "HACK", "pending", "in progress", + "blocked", "waiting", "next step", "follow up", "need to", +] + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"blind_spots": [], "score_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {"blind_spots": [], "score_history": []} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +# ── Extraction ─────────────────────────────────────────────────────────────── + +def extract_topics(text: str) -> Counter: + """Extract topic keywords from text.""" + # Remove code blocks and URLs + cleaned = re.sub(r'```[\s\S]*?```', '', text) + cleaned = re.sub(r'https?://\S+', '', cleaned) + # Tokenize + words = re.findall(r'[a-z][a-z0-9_-]{2,}', cleaned.lower()) + # Filter stopwords + stopwords = { + "the", "and", "for", "that", "this", "with", "from", "are", "was", + "were", "been", "have", "has", "had", "will", "would", "could", + "should", "not", "but", "its", "also", "can", "into", "when", + "then", "than", "more", "some", "each", "all", "any", "our", + "your", "their", "which", "about", "just", "like", "very", + } + filtered = [w for w in words if w not in stopwords and len(w) > 2] + return Counter(filtered) + + +def extract_entities(text: str) -> set: + """Extract named entities from text.""" + entities = set() + for pattern in ENTITY_PATTERNS: + matches = pattern.findall(text) + for m in matches: + if isinstance(m, tuple): + m = m[0] + if len(m) > 2: + entities.add(m.strip()) + return entities + + +def extract_decisions(text: str) -> list[str]: + """Extract decision statements from text.""" + decisions = [] + for line in text.split("\n"): + line_lower = line.lower() + for marker in DECISION_MARKERS: + if marker in line_lower: + decisions.append(line.strip()[:120]) + break + return decisions + + +def extract_tasks(text: str) -> list[str]: + """Extract task/todo references from text.""" + tasks = [] + for line in text.split("\n"): + line_lower = line.lower() + for marker in TASK_MARKERS: + if marker in line_lower: + tasks.append(line.strip()[:120]) + break + return tasks + + +# ── Scoring ────────────────────────────────────────────────────────────────── + +def score_topic_coverage(memory_topics: Counter, context_topics: Counter) -> float: + """Score: what % of important memory topics appear in context.""" + if not memory_topics: + return 100.0 + # Focus on top 50 topics by frequency + top_topics = {t for t, _ in memory_topics.most_common(50)} + if not top_topics: + return 100.0 + covered = sum(1 for t in top_topics if context_topics.get(t, 0) > 0) + return round(covered / len(top_topics) * 100, 1) + + +def score_recency_bias(memory_text: str, context_text: str) -> float: + """Score: is context over-representing recent entries vs. older important ones.""" + memory_lines = memory_text.split("\n") + total = len(memory_lines) + if total < 10: + return 100.0 + + # Split memory into thirds: old, mid, recent + third = total // 3 + old_topics = extract_topics("\n".join(memory_lines[:third])) + mid_topics = extract_topics("\n".join(memory_lines[third:2*third])) + recent_topics = extract_topics("\n".join(memory_lines[2*third:])) + ctx_topics = extract_topics(context_text) + + # Score each third's representation + old_covered = sum(1 for t in old_topics if ctx_topics.get(t, 0) > 0) + mid_covered = sum(1 for t in mid_topics if ctx_topics.get(t, 0) > 0) + recent_covered = sum(1 for t in recent_topics if ctx_topics.get(t, 0) > 0) + + old_pct = old_covered / max(len(old_topics), 1) * 100 + mid_pct = mid_covered / max(len(mid_topics), 1) * 100 + recent_pct = recent_covered / max(len(recent_topics), 1) * 100 + + # Penalize if old coverage is much lower than recent + if recent_pct > 0: + balance = (old_pct + mid_pct) / (2 * recent_pct) * 100 + return round(min(100.0, balance), 1) + return 100.0 + + +def score_entity_continuity(memory_entities: set, context_entities: set) -> float: + """Score: named entities in history that are missing from context.""" + if not memory_entities: + return 100.0 + covered = len(memory_entities & context_entities) + return round(covered / len(memory_entities) * 100, 1) + + +def score_decision_retention(memory_decisions: list, context_text: str) -> float: + """Score: are decisions still accessible in context.""" + if not memory_decisions: + return 100.0 + ctx_lower = context_text.lower() + retained = sum(1 for d in memory_decisions + if any(word in ctx_lower for word in d.lower().split()[:5])) + return round(retained / len(memory_decisions) * 100, 1) + + +def score_task_continuity(memory_tasks: list, context_text: str) -> float: + """Score: are active tasks still visible in context.""" + if not memory_tasks: + return 100.0 + ctx_lower = context_text.lower() + retained = sum(1 for t in memory_tasks + if any(word in ctx_lower for word in t.lower().split()[:5])) + return round(retained / len(memory_tasks) * 100, 1) + + +def compute_overall(tc, rb, ec, dr, tcont) -> float: + """Weighted overall score.""" + weighted = tc * 2.0 + rb * 1.5 + ec * 2.0 + dr * 2.0 + tcont * 1.5 + total_weight = 2.0 + 1.5 + 2.0 + 2.0 + 1.5 + return round(weighted / total_weight, 1) + + +def get_grade(score: float) -> str: + if score >= 90: + return "A" + elif score >= 75: + return "B" + elif score >= 60: + return "C" + elif score >= 40: + return "D" + return "F" + + +def find_blind_spots(memory_text: str, context_text: str) -> list[dict]: + """Find specific items missing from context.""" + spots = [] + mem_entities = extract_entities(memory_text) + ctx_entities = extract_entities(context_text) + missing_entities = mem_entities - ctx_entities + + for entity in sorted(missing_entities)[:20]: + importance = "high" if "/" in entity or "http" in entity else "medium" + spots.append({ + "type": "entity", + "name": entity[:80], + "importance": importance, + "last_seen": "in full memory", + }) + + mem_decisions = extract_decisions(memory_text) + ctx_lower = context_text.lower() + for d in mem_decisions[:10]: + if not any(word in ctx_lower for word in d.lower().split()[:5]): + spots.append({ + "type": "decision", + "name": d[:80], + "importance": "critical", + "last_seen": "in full memory", + }) + + mem_tasks = extract_tasks(memory_text) + for t in mem_tasks[:10]: + if not any(word in ctx_lower for word in t.lower().split()[:5]): + spots.append({ + "type": "task", + "name": t[:80], + "importance": "high", + "last_seen": "in full memory", + }) + + # Sort by importance + order = {"critical": 0, "high": 1, "medium": 2, "low": 3} + spots.sort(key=lambda s: order.get(s["importance"], 3)) + return spots + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_score(state: dict, verbose: bool, fmt: str) -> None: + memory_text = MEMORY_FILE.read_text() if MEMORY_FILE.exists() else "" + context_text = CONTEXT_FILE.read_text() if CONTEXT_FILE.exists() else memory_text + + if not memory_text: + print("No MEMORY.md found — nothing to score.") + return + + mem_topics = extract_topics(memory_text) + ctx_topics = extract_topics(context_text) + mem_entities = extract_entities(memory_text) + ctx_entities = extract_entities(context_text) + mem_decisions = extract_decisions(memory_text) + mem_tasks = extract_tasks(memory_text) + + tc = score_topic_coverage(mem_topics, ctx_topics) + rb = score_recency_bias(memory_text, context_text) + ec = score_entity_continuity(mem_entities, ctx_entities) + dr = score_decision_retention(mem_decisions, context_text) + tcont = score_task_continuity(mem_tasks, context_text) + overall = compute_overall(tc, rb, ec, dr, tcont) + grade = get_grade(overall) + now = datetime.now().isoformat() + + score_data = { + "overall": overall, + "grade": grade, + "topic_coverage": tc, + "recency_bias": rb, + "entity_continuity": ec, + "decision_retention": dr, + "task_continuity": tcont, + } + state["current_score"] = score_data + state["last_score_at"] = now + + history = state.get("score_history") or [] + blind_spots = find_blind_spots(memory_text, context_text) + history.insert(0, { + "scored_at": now, "overall": overall, + "grade": grade, "blind_spot_count": len(blind_spots), + }) + state["score_history"] = history[:MAX_HISTORY] + state["blind_spots"] = blind_spots + save_state(state) + + if fmt == "json": + print(json.dumps(score_data, indent=2)) + else: + print(f"\nContext Assembly Score — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("-" * 55) + print(f" Overall: {overall:>5}% Grade: {grade}") + if verbose: + print(f" Topic coverage: {tc:>5}% (2x weight)") + print(f" Recency bias: {rb:>5}% (1.5x weight)") + print(f" Entity continuity: {ec:>5}% (2x weight)") + print(f" Decision retention: {dr:>5}% (2x weight)") + print(f" Task continuity: {tcont:>5}% (1.5x weight)") + print(f"\n Memory stats:") + print(f" Topics: {len(mem_topics)} unique | Entities: {len(mem_entities)}") + print(f" Decisions: {len(mem_decisions)} | Tasks: {len(mem_tasks)}") + print(f" Blind spots: {len(blind_spots)}") + print() + + +def cmd_blind_spots(state: dict, fmt: str) -> None: + memory_text = MEMORY_FILE.read_text() if MEMORY_FILE.exists() else "" + context_text = CONTEXT_FILE.read_text() if CONTEXT_FILE.exists() else memory_text + spots = find_blind_spots(memory_text, context_text) + + if fmt == "json": + print(json.dumps({"blind_spots": spots, "count": len(spots)}, indent=2)) + else: + print(f"\nBlind Spots — {len(spots)} items missing from context") + print("-" * 55) + icons = {"critical": "!!", "high": "!", "medium": "~", "low": "."} + for s in spots[:25]: + icon = icons.get(s["importance"], "?") + print(f" {icon} [{s['importance'].upper():>8}] {s['type']:>8}: {s['name']}") + print() + + +def cmd_drift(state: dict, fmt: str) -> None: + history = state.get("score_history") or [] + if fmt == "json": + print(json.dumps({"score_history": history}, indent=2)) + else: + print(f"\nCoverage Drift — {len(history)} data points") + print("-" * 55) + if not history: + print(" No score history yet.") + else: + for h in history[:15]: + ts = h.get("scored_at", "?")[:16] + overall = h.get("overall", 0) + grade = h.get("grade", "?") + spots = h.get("blind_spot_count", 0) + bar = "=" * int(overall / 10) + "-" * (10 - int(overall / 10)) + print(f" {ts} [{bar}] {overall}% ({grade}) {spots} blind spots") + # Trend + if len(history) >= 2: + latest = history[0].get("overall", 0) + prev = history[1].get("overall", 0) + delta = latest - prev + trend = "improving" if delta > 0 else "declining" if delta < 0 else "stable" + print(f"\n Trend: {trend} ({'+' if delta>0 else ''}{delta}%)") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_score_at", "never") + score = state.get("current_score") or {} + print(f"\nContext Assembly Scorer — Last score: {last}") + if score: + print(f" Overall: {score.get('overall', 0)}% ({score.get('grade', '?')})") + spots = state.get("blind_spots") or [] + print(f" Blind spots: {len(spots)}") + critical = sum(1 for s in spots if s.get("importance") == "critical") + if critical: + print(f" Critical blind spots: {critical}") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Context Assembly Scorer") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--score", action="store_true", help="Score current context assembly") + group.add_argument("--blind-spots", action="store_true", help="List topics missing from context") + group.add_argument("--drift", action="store_true", help="Compare scores over time") + group.add_argument("--status", action="store_true", help="Last score summary") + parser.add_argument("--verbose", action="store_true", help="Detailed per-dimension breakdown") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.score: + cmd_score(state, args.verbose, args.format) + elif args.blind_spots: + cmd_blind_spots(state, args.format) + elif args.drift: + cmd_drift(state, args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From 1133442c61c453ff1022bb8aa92e21f5bf04e30f Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Tue, 17 Mar 2026 00:53:31 +0530 Subject: [PATCH 21/23] Add compaction-resilience-guard skill (#38) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Monitors compaction for failures (empty, inflation, garbled, repetition) and enforces a 3-level fallback chain: normal → aggressive → deterministic truncation. Ensures compaction always makes forward progress. Inspired by lossless-claw's three-level escalation system. Co-authored-by: Claude Opus 4.6 --- .../compaction-resilience-guard/SKILL.md | 94 +++++ .../STATE_SCHEMA.yaml | 30 ++ .../example-state.yaml | 65 +++ .../compaction-resilience-guard/guard.py | 373 ++++++++++++++++++ 4 files changed, 562 insertions(+) create mode 100644 skills/openclaw-native/compaction-resilience-guard/SKILL.md create mode 100644 skills/openclaw-native/compaction-resilience-guard/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/compaction-resilience-guard/example-state.yaml create mode 100755 skills/openclaw-native/compaction-resilience-guard/guard.py diff --git a/skills/openclaw-native/compaction-resilience-guard/SKILL.md b/skills/openclaw-native/compaction-resilience-guard/SKILL.md new file mode 100644 index 0000000..f8ac747 --- /dev/null +++ b/skills/openclaw-native/compaction-resilience-guard/SKILL.md @@ -0,0 +1,94 @@ +--- +name: compaction-resilience-guard +version: "1.0" +category: openclaw-native +description: Monitors memory compaction for failures and enforces a three-level fallback chain — normal, aggressive, deterministic truncation — ensuring compaction always makes forward progress. +stateful: true +--- + +# Compaction Resilience Guard + +## What it does + +Memory compaction can fail silently: the LLM produces empty output, summaries that are *larger* than their input, or garbled text. When this happens, compaction stalls and context overflows. + +Compaction Resilience Guard enforces a three-level escalation chain inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw): + +| Level | Strategy | When used | +|---|---|---| +| L1 — Normal | Standard summarization prompt | First attempt | +| L2 — Aggressive | Low temperature, reduced reasoning, shorter output target | After L1 failure | +| L3 — Deterministic | Pure truncation: keep first N + last N lines, drop middle | After L2 failure | + +This ensures compaction **always makes progress** — even if the LLM is broken. + +## When to invoke + +- After any compaction event — validate the output +- When context usage approaches 90% — compaction may be failing +- When summaries seem unusually long or empty — detect inflation +- As a pre-check before memory-dag-compactor runs + +## How to use + +```bash +python3 guard.py --check # Validate recent compaction outputs +python3 guard.py --check --file # Check a specific summary file +python3 guard.py --simulate # Run the 3-level chain on sample text +python3 guard.py --report # Show failure/escalation history +python3 guard.py --status # Last check summary +python3 guard.py --format json # Machine-readable output +``` + +## Failure detection + +The guard detects these compaction failures: + +| Failure | How detected | Action | +|---|---|---| +| Empty output | Summary length < 10 chars | Escalate to next level | +| Inflation | Summary tokens > input tokens | Escalate to next level | +| Garbled text | Entropy score > 5.0 (random chars) | Escalate to next level | +| Repetition | Same 20+ char phrase repeated 3+ times | Escalate to next level | +| Truncation marker | Contains `[FALLBACK]` or `[TRUNCATED]` | Record as L3 usage | +| Stale | Summary unchanged from previous run | Flag for review | + +## Procedure + +**Step 1 — Check recent compaction outputs** + +```bash +python3 guard.py --check +``` + +Validates all summary nodes in memory-dag-compactor state. Reports failures by level and whether escalation was needed. + +**Step 2 — Simulate the fallback chain** + +```bash +python3 guard.py --simulate "$(cat long-text.txt)" +``` + +Runs the 3-level chain on sample text to test that each level produces valid output. + +**Step 3 — Review escalation history** + +```bash +python3 guard.py --report +``` + +Shows how often each level was used. High L2/L3 usage indicates the primary summarization prompt needs improvement. + +## State + +Failure counts, escalation history, and per-summary validation results stored in `~/.openclaw/skill-state/compaction-resilience-guard/state.yaml`. + +Fields: `last_check_at`, `level_usage`, `failures`, `check_history`. + +## Notes + +- Read-only monitoring — does not perform compaction itself +- Works alongside memory-dag-compactor as a quality gate +- Deterministic truncation (L3) preserves first 30% and last 20% of input, drops middle +- Entropy is measured using Shannon entropy on character distribution +- High L3 usage (>10% of compactions) suggests a systemic LLM issue diff --git a/skills/openclaw-native/compaction-resilience-guard/STATE_SCHEMA.yaml b/skills/openclaw-native/compaction-resilience-guard/STATE_SCHEMA.yaml new file mode 100644 index 0000000..df559f5 --- /dev/null +++ b/skills/openclaw-native/compaction-resilience-guard/STATE_SCHEMA.yaml @@ -0,0 +1,30 @@ +version: "1.0" +description: Compaction failure tracking, escalation history, and level usage stats. +fields: + last_check_at: + type: datetime + level_usage: + type: object + description: How often each fallback level was used + fields: + l1_normal: { type: integer, default: 0 } + l2_aggressive: { type: integer, default: 0 } + l3_deterministic: { type: integer, default: 0 } + failures: + type: list + description: Recent compaction failures detected + items: + summary_id: { type: string } + failure_type: { type: enum, values: [empty, inflation, garbled, repetition, stale] } + level_used: { type: integer, description: "1, 2, or 3" } + input_tokens: { type: integer } + output_tokens: { type: integer } + detected_at: { type: datetime } + check_history: + type: list + description: Rolling log of past checks (last 20) + items: + checked_at: { type: datetime } + summaries_checked: { type: integer } + failures_found: { type: integer } + escalations: { type: integer } diff --git a/skills/openclaw-native/compaction-resilience-guard/example-state.yaml b/skills/openclaw-native/compaction-resilience-guard/example-state.yaml new file mode 100644 index 0000000..df38ba6 --- /dev/null +++ b/skills/openclaw-native/compaction-resilience-guard/example-state.yaml @@ -0,0 +1,65 @@ +# Example runtime state for compaction-resilience-guard +last_check_at: "2026-03-16T23:05:00.000000" +level_usage: + l1_normal: 42 + l2_aggressive: 3 + l3_deterministic: 1 +failures: + - summary_id: s-d0-012 + failure_type: inflation + level_used: 2 + input_tokens: 500 + output_tokens: 620 + detected_at: "2026-03-16T23:04:58.000000" + - summary_id: s-d1-005 + failure_type: repetition + level_used: 3 + input_tokens: 800 + output_tokens: 200 + detected_at: "2026-03-15T23:05:00.000000" +check_history: + - checked_at: "2026-03-16T23:05:00.000000" + summaries_checked: 18 + failures_found: 1 + escalations: 1 + - checked_at: "2026-03-15T23:05:00.000000" + summaries_checked: 15 + failures_found: 1 + escalations: 1 + - checked_at: "2026-03-14T23:05:00.000000" + summaries_checked: 12 + failures_found: 0 + escalations: 0 +# ── Walkthrough ────────────────────────────────────────────────────────────── +# python3 guard.py --check +# +# Compaction Resilience Check — 2026-03-16 23:05 +# ────────────────────────────────────────────────── +# Summaries checked: 18 +# Failures found: 1 +# Escalations needed: 1 +# Status: DEGRADED +# +# ! s-d0-012: inflation (entropy=3.2, 620 tok) +# +# python3 guard.py --report +# +# Compaction Resilience Report +# ────────────────────────────────────────────────── +# Total compactions tracked: 46 +# L1 Normal: 42 (91%) +# L2 Aggressive: 3 (7%) +# L3 Deterministic: 1 (2%) +# +# Recent failures: 2 +# s-d0-012: inflation (L2) +# s-d1-005: repetition (L3) +# +# python3 guard.py --simulate "$(cat long-text.txt)" +# +# Fallback Chain Simulation +# ────────────────────────────────────────────────── +# Input: 2500 tokens (10000 chars) +# Level used: L1 (l1_normal) +# Output: 1000 tokens +# Compression: 40% diff --git a/skills/openclaw-native/compaction-resilience-guard/guard.py b/skills/openclaw-native/compaction-resilience-guard/guard.py new file mode 100755 index 0000000..3e926e2 --- /dev/null +++ b/skills/openclaw-native/compaction-resilience-guard/guard.py @@ -0,0 +1,373 @@ +#!/usr/bin/env python3 +""" +Compaction Resilience Guard for openclaw-superpowers. + +Monitors memory compaction for failures and enforces a 3-level +fallback chain: normal → aggressive → deterministic truncation. + +Usage: + python3 guard.py --check + python3 guard.py --check --file + python3 guard.py --simulate + python3 guard.py --report + python3 guard.py --status + python3 guard.py --format json +""" + +import argparse +import json +import math +import os +import re +import sys +from collections import Counter +from datetime import datetime +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "compaction-resilience-guard" / "state.yaml" +DAG_STATE_FILE = OPENCLAW_DIR / "skill-state" / "memory-dag-compactor" / "state.yaml" +MAX_HISTORY = 20 + +# Thresholds +MIN_SUMMARY_LENGTH = 10 +MAX_ENTROPY = 5.0 +REPETITION_THRESHOLD = 3 +REPETITION_MIN_LENGTH = 20 + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"level_usage": {"l1_normal": 0, "l2_aggressive": 0, "l3_deterministic": 0}, + "failures": [], "check_history": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {"level_usage": {"l1_normal": 0, "l2_aggressive": 0, "l3_deterministic": 0}, + "failures": [], "check_history": []} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +def load_dag_state() -> dict: + if not DAG_STATE_FILE.exists(): + return {} + try: + text = DAG_STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def estimate_tokens(text: str) -> int: + return len(text) // 4 + + +# ── Failure detection ──────────────────────────────────────────────────────── + +def shannon_entropy(text: str) -> float: + """Calculate Shannon entropy of character distribution.""" + if not text: + return 0.0 + freq = Counter(text) + total = len(text) + entropy = 0.0 + for count in freq.values(): + p = count / total + if p > 0: + entropy -= p * math.log2(p) + return round(entropy, 2) + + +def detect_repetition(text: str) -> bool: + """Detect if the same 20+ char phrase is repeated 3+ times.""" + if len(text) < REPETITION_MIN_LENGTH * REPETITION_THRESHOLD: + return False + # Sliding window of REPETITION_MIN_LENGTH chars + phrases = Counter() + for i in range(len(text) - REPETITION_MIN_LENGTH): + phrase = text[i:i + REPETITION_MIN_LENGTH] + phrases[phrase] += 1 + return any(count >= REPETITION_THRESHOLD for count in phrases.values()) + + +def validate_summary(content: str, input_tokens: int = 0) -> dict: + """Validate a single summary and return failure info if any.""" + failures = [] + + # Check: empty + if len(content.strip()) < MIN_SUMMARY_LENGTH: + failures.append("empty") + + # Check: inflation (summary larger than input) + output_tokens = estimate_tokens(content) + if input_tokens > 0 and output_tokens > input_tokens: + failures.append("inflation") + + # Check: garbled (high entropy = random characters) + entropy = shannon_entropy(content) + if entropy > MAX_ENTROPY: + failures.append("garbled") + + # Check: repetition + if detect_repetition(content): + failures.append("repetition") + + # Check: fallback markers + if "[FALLBACK]" in content or "[TRUNCATED]" in content: + failures.append("truncation_marker") + + return { + "valid": len(failures) == 0, + "failures": failures, + "output_tokens": output_tokens, + "entropy": entropy, + } + + +# ── Three-level fallback chain ─────────────────────────────────────────────── + +def l1_normal(text: str) -> str: + """Level 1: Standard summarization — keep first 40% by lines.""" + lines = text.split("\n") + keep = max(3, len(lines) * 40 // 100) + summary = "\n".join(lines[:keep]) + return summary.strip() + + +def l2_aggressive(text: str) -> str: + """Level 2: Aggressive — keep only lines with substance (no blanks, short lines).""" + lines = text.split("\n") + substantial = [l for l in lines if len(l.strip()) > 20] + keep = max(3, len(substantial) * 30 // 100) + summary = "\n".join(substantial[:keep]) + return summary.strip() + + +def l3_deterministic(text: str) -> str: + """Level 3: Deterministic truncation — first 30% + last 20%, drop middle.""" + lines = text.split("\n") + total = len(lines) + if total <= 5: + return text.strip() + head_count = max(2, total * 30 // 100) + tail_count = max(1, total * 20 // 100) + head = lines[:head_count] + tail = lines[-tail_count:] + dropped = total - head_count - tail_count + summary = "\n".join(head) + f"\n[... {dropped} lines truncated ...]\n" + "\n".join(tail) + return summary.strip() + + +def run_fallback_chain(text: str) -> tuple[str, int]: + """Run the 3-level fallback chain, return (result, level_used).""" + input_tokens = estimate_tokens(text) + + # Level 1 + result = l1_normal(text) + check = validate_summary(result, input_tokens) + if check["valid"]: + return result, 1 + + # Level 2 + result = l2_aggressive(text) + check = validate_summary(result, input_tokens) + if check["valid"]: + return result, 2 + + # Level 3 — always succeeds + result = l3_deterministic(text) + return result, 3 + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_check(state: dict, file_path: str | None, fmt: str) -> None: + now = datetime.now().isoformat() + failures_found = 0 + escalations = 0 + summaries_checked = 0 + + if file_path: + # Check a specific file + path = Path(file_path) + if not path.exists(): + print(f"Error: file '{file_path}' not found.") + sys.exit(1) + content = path.read_text() + check = validate_summary(content) + summaries_checked = 1 + if not check["valid"]: + failures_found = 1 + else: + # Check all DAG summaries + dag = load_dag_state() + nodes = dag.get("dag_nodes") or [] + results = [] + + for node in nodes: + content = node.get("content", "") + check = validate_summary(content) + summaries_checked += 1 + if not check["valid"]: + failures_found += 1 + for f_type in check["failures"]: + failure_record = { + "summary_id": node.get("id", "unknown"), + "failure_type": f_type, + "level_used": 1, + "input_tokens": 0, + "output_tokens": check["output_tokens"], + "detected_at": now, + } + existing_failures = state.get("failures") or [] + existing_failures.append(failure_record) + state["failures"] = existing_failures[-50:] # Keep last 50 + escalations += 1 + + results.append({ + "id": node.get("id"), "failures": check["failures"], + "entropy": check["entropy"], "tokens": check["output_tokens"], + }) + + if fmt != "json" and results: + for r in results: + print(f" ! {r['id']}: {', '.join(r['failures'])} " + f"(entropy={r['entropy']}, {r['tokens']} tok)") + + state["last_check_at"] = now + history = state.get("check_history") or [] + history.insert(0, { + "checked_at": now, "summaries_checked": summaries_checked, + "failures_found": failures_found, "escalations": escalations, + }) + state["check_history"] = history[:MAX_HISTORY] + save_state(state) + + if fmt == "json": + print(json.dumps({"summaries_checked": summaries_checked, + "failures_found": failures_found, + "escalations": escalations}, indent=2)) + else: + print(f"\nCompaction Resilience Check — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("-" * 50) + print(f" Summaries checked: {summaries_checked}") + print(f" Failures found: {failures_found}") + print(f" Escalations needed: {escalations}") + status = "HEALTHY" if failures_found == 0 else "DEGRADED" + print(f" Status: {status}") + print() + + +def cmd_simulate(state: dict, text: str, fmt: str) -> None: + input_tokens = estimate_tokens(text) + result, level = run_fallback_chain(text) + output_tokens = estimate_tokens(result) + + # Update level usage + usage = state.get("level_usage") or {"l1_normal": 0, "l2_aggressive": 0, "l3_deterministic": 0} + level_key = {1: "l1_normal", 2: "l2_aggressive", 3: "l3_deterministic"}[level] + usage[level_key] = usage.get(level_key, 0) + 1 + state["level_usage"] = usage + save_state(state) + + if fmt == "json": + print(json.dumps({"level_used": level, "input_tokens": input_tokens, + "output_tokens": output_tokens, "result": result[:500]}, indent=2)) + else: + print(f"\nFallback Chain Simulation") + print("-" * 50) + print(f" Input: {input_tokens} tokens ({len(text)} chars)") + print(f" Level used: L{level} ({level_key})") + print(f" Output: {output_tokens} tokens") + ratio = round(output_tokens / max(input_tokens, 1) * 100) + print(f" Compression: {ratio}%") + print(f"\n Result preview:") + for line in result.split("\n")[:10]: + print(f" {line}") + if result.count("\n") > 10: + print(f" ... ({result.count(chr(10))-10} more lines)") + print() + + +def cmd_report(state: dict, fmt: str) -> None: + usage = state.get("level_usage") or {} + failures = state.get("failures") or [] + total = sum(usage.values()) + + if fmt == "json": + print(json.dumps({"level_usage": usage, "total_compactions": total, + "recent_failures": failures[-10:]}, indent=2)) + else: + print(f"\nCompaction Resilience Report") + print("-" * 50) + print(f" Total compactions tracked: {total}") + if total > 0: + l1 = usage.get("l1_normal", 0) + l2 = usage.get("l2_aggressive", 0) + l3 = usage.get("l3_deterministic", 0) + print(f" L1 Normal: {l1:>5} ({l1/total*100:.0f}%)") + print(f" L2 Aggressive: {l2:>5} ({l2/total*100:.0f}%)") + print(f" L3 Deterministic: {l3:>5} ({l3/total*100:.0f}%)") + if l3 / total > 0.1: + print(f"\n WARNING: L3 usage > 10% — indicates systemic LLM issue") + print(f"\n Recent failures: {len(failures)}") + for f in failures[-5:]: + print(f" {f.get('summary_id', '?')}: {f.get('failure_type', '?')} " + f"(L{f.get('level_used', '?')})") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_check_at", "never") + usage = state.get("level_usage") or {} + total = sum(usage.values()) + l3 = usage.get("l3_deterministic", 0) + print(f"\nCompaction Resilience Guard — Last check: {last}") + print(f" {total} compactions tracked | L3 fallbacks: {l3}") + history = state.get("check_history") or [] + if history: + h = history[0] + print(f" Last: {h.get('summaries_checked', 0)} checked, " + f"{h.get('failures_found', 0)} failures") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Compaction Resilience Guard") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--check", action="store_true", help="Validate recent compaction outputs") + group.add_argument("--simulate", type=str, metavar="TEXT", help="Run 3-level chain on sample text") + group.add_argument("--report", action="store_true", help="Show failure/escalation history") + group.add_argument("--status", action="store_true", help="Last check summary") + parser.add_argument("--file", type=str, metavar="PATH", help="Check a specific summary file") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.check: + cmd_check(state, args.file, args.format) + elif args.simulate: + cmd_simulate(state, args.simulate, args.format) + elif args.report: + cmd_report(state, args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From 2ec09753c58ffb9376078dd911efc93f74761441 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Tue, 17 Mar 2026 00:57:00 +0530 Subject: [PATCH 22/23] Add memory-integrity-checker skill (#39) Validates memory summary DAGs with 8 structural checks: orphan nodes, circular references, token inflation, broken lineage, stale active, empty nodes, duplicate edges, depth mismatch. Auto-fixes safe issues. Cron Sundays 3am. Inspired by lossless-claw's DAG integrity checking system. Co-authored-by: Claude Opus 4.6 --- .../memory-integrity-checker/SKILL.md | 93 ++++ .../STATE_SCHEMA.yaml | 33 ++ .../example-state.yaml | 71 +++ .../memory-integrity-checker/integrity.py | 489 ++++++++++++++++++ 4 files changed, 686 insertions(+) create mode 100644 skills/openclaw-native/memory-integrity-checker/SKILL.md create mode 100644 skills/openclaw-native/memory-integrity-checker/STATE_SCHEMA.yaml create mode 100644 skills/openclaw-native/memory-integrity-checker/example-state.yaml create mode 100755 skills/openclaw-native/memory-integrity-checker/integrity.py diff --git a/skills/openclaw-native/memory-integrity-checker/SKILL.md b/skills/openclaw-native/memory-integrity-checker/SKILL.md new file mode 100644 index 0000000..31fa0c9 --- /dev/null +++ b/skills/openclaw-native/memory-integrity-checker/SKILL.md @@ -0,0 +1,93 @@ +--- +name: memory-integrity-checker +version: "1.0" +category: openclaw-native +description: Validates memory summary DAGs for structural integrity — detects orphan nodes, circular references, token inflation, broken lineage, and stale summaries that corrupt the agent's memory. +stateful: true +cron: "0 3 * * 0" +--- + +# Memory Integrity Checker + +## What it does + +As memory DAGs grow through compaction, they can develop structural problems: orphan nodes with no parent, circular reference loops, summaries that inflated instead of compressing, broken lineage chains, and stale nodes that should have been dissolved. These problems silently corrupt the agent's memory. + +Memory Integrity Checker runs 8 structural checks on the DAG, generates a repair plan, and optionally auto-fixes safe issues. + +Inspired by [lossless-claw](https://github.com/Martian-Engineering/lossless-claw)'s DAG integrity checking system, which detects and repairs corrupted summaries. + +## When to invoke + +- Automatically Sundays at 3am (cron) — weekly structural audit +- After a crash or unexpected shutdown — check for corruption +- When the agent's memory seems inconsistent — diagnose structural issues +- Before a major compaction or prune operation — ensure clean starting state + +## Integrity checks (8 total) + +| Check | What it detects | Severity | +|---|---|---| +| ORPHAN_NODE | Node with no parent and not a root | HIGH | +| CIRCULAR_REF | Circular parent-child loops in the DAG | CRITICAL | +| TOKEN_INFLATION | Summary has more tokens than its combined children | HIGH | +| BROKEN_LINEAGE | Edge references a node ID that doesn't exist | CRITICAL | +| STALE_ACTIVE | Active node older than 30 days with no children | MEDIUM | +| EMPTY_NODE | Node with empty or whitespace-only content | HIGH | +| DUPLICATE_EDGE | Same parent-child edge appears multiple times | LOW | +| DEPTH_MISMATCH | Node's depth doesn't match its position in the DAG | MEDIUM | + +## How to use + +```bash +python3 integrity.py --check # Run all 8 integrity checks +python3 integrity.py --check --fix # Auto-fix safe issues +python3 integrity.py --check --only ORPHAN_NODE # Run a specific check +python3 integrity.py --repair-plan # Generate repair plan without fixing +python3 integrity.py --status # Last check summary +python3 integrity.py --format json # Machine-readable output +``` + +## Procedure + +**Step 1 — Run integrity checks** + +```bash +python3 integrity.py --check +``` + +Runs all 8 checks and reports findings by severity. + +**Step 2 — Review repair plan** + +```bash +python3 integrity.py --repair-plan +``` + +For each finding, shows what the auto-fix would do: +- ORPHAN_NODE → reattach to nearest active root or deactivate +- DUPLICATE_EDGE → remove duplicates +- EMPTY_NODE → deactivate +- STALE_ACTIVE → deactivate + +**Step 3 — Apply safe fixes** + +```bash +python3 integrity.py --check --fix +``` + +Auto-fixes LOW and MEDIUM severity issues. HIGH and CRITICAL require manual review. + +## State + +Check results, finding history, and repair actions stored in `~/.openclaw/skill-state/memory-integrity-checker/state.yaml`. + +Fields: `last_check_at`, `findings`, `check_history`, `repairs_applied`. + +## Notes + +- Reads from memory-dag-compactor's state file — does not maintain its own DAG +- Auto-fix only applies to LOW and MEDIUM severity issues +- CRITICAL issues (circular refs, broken lineage) require manual intervention +- Circular reference detection uses DFS with visited-set tracking +- Token inflation check compares parent tokens vs. sum of children tokens diff --git a/skills/openclaw-native/memory-integrity-checker/STATE_SCHEMA.yaml b/skills/openclaw-native/memory-integrity-checker/STATE_SCHEMA.yaml new file mode 100644 index 0000000..c18b7d9 --- /dev/null +++ b/skills/openclaw-native/memory-integrity-checker/STATE_SCHEMA.yaml @@ -0,0 +1,33 @@ +version: "1.0" +description: DAG integrity check results, findings, and repair history. +fields: + last_check_at: + type: datetime + findings: + type: list + description: Integrity issues found in the most recent check + items: + check: { type: string, description: "Check name (e.g. ORPHAN_NODE)" } + severity: { type: enum, values: [CRITICAL, HIGH, MEDIUM, LOW] } + node_id: { type: string } + detail: { type: string } + auto_fixable: { type: boolean } + check_history: + type: list + description: Rolling log of past checks (last 20) + items: + checked_at: { type: datetime } + nodes_checked: { type: integer } + findings: { type: integer } + critical: { type: integer } + high: { type: integer } + medium: { type: integer } + low: { type: integer } + repairs_applied: + type: list + description: History of auto-fix actions taken + items: + repaired_at: { type: datetime } + check: { type: string } + node_id: { type: string } + action: { type: string } diff --git a/skills/openclaw-native/memory-integrity-checker/example-state.yaml b/skills/openclaw-native/memory-integrity-checker/example-state.yaml new file mode 100644 index 0000000..f2ac1b0 --- /dev/null +++ b/skills/openclaw-native/memory-integrity-checker/example-state.yaml @@ -0,0 +1,71 @@ +# Example runtime state for memory-integrity-checker +last_check_at: "2026-03-16T03:00:12.000000" +findings: + - check: ORPHAN_NODE + severity: HIGH + node_id: s-d1-003 + detail: "Depth 1 node has no parent — should be connected to a d0 parent" + auto_fixable: true + - check: STALE_ACTIVE + severity: MEDIUM + node_id: s-d0-002 + detail: "Active node is 45 days old with no children" + auto_fixable: true + - check: DUPLICATE_EDGE + severity: LOW + node_id: "s-d2-001->s-d1-000" + detail: "Duplicate edge in DAG" + auto_fixable: true +check_history: + - checked_at: "2026-03-16T03:00:12.000000" + nodes_checked: 24 + findings: 3 + critical: 0 + high: 1 + medium: 1 + low: 1 + - checked_at: "2026-03-09T03:00:10.000000" + nodes_checked: 18 + findings: 0 + critical: 0 + high: 0 + medium: 0 + low: 0 +repairs_applied: + - repaired_at: "2026-03-16T03:00:12.000000" + check: DUPLICATE_EDGE + node_id: "s-d2-001->s-d1-000" + action: "Removed duplicate edges" + - repaired_at: "2026-03-16T03:00:12.000000" + check: STALE_ACTIVE + node_id: s-d0-002 + action: "Deactivated stale node s-d0-002" +# ── Walkthrough ────────────────────────────────────────────────────────────── +# Cron runs Sundays at 3am: python3 integrity.py --check --fix +# +# Memory Integrity Check — 2026-03-16 03:00 +# ─────────────────────────────────────────────────────── +# Nodes checked: 24 | Edges: 18 +# Findings: 3 (0 critical, 1 high, 1 medium, 1 low) +# +# ! [ HIGH] ORPHAN_NODE: s-d1-003 +# Depth 1 node has no parent [auto-fixable] +# ~ [ MEDIUM] STALE_ACTIVE: s-d0-002 +# Active node is 45 days old with no children [auto-fixable] +# . [ LOW] DUPLICATE_EDGE: s-d2-001->s-d1-000 +# Duplicate edge in DAG [auto-fixable] +# +# Repairs applied: 2 +# + Removed duplicate edges +# + Deactivated stale node s-d0-002 +# +# Status: DEGRADED +# +# python3 integrity.py --repair-plan +# +# Repair Plan — 3 findings +# ─────────────────────────────────────────────────────── +# Auto-fixable (3): +# ORPHAN_NODE on s-d1-003: Depth 1 node has no parent +# STALE_ACTIVE on s-d0-002: Active node is 45 days old +# DUPLICATE_EDGE on s-d2-001->s-d1-000: Duplicate edge diff --git a/skills/openclaw-native/memory-integrity-checker/integrity.py b/skills/openclaw-native/memory-integrity-checker/integrity.py new file mode 100755 index 0000000..5c48083 --- /dev/null +++ b/skills/openclaw-native/memory-integrity-checker/integrity.py @@ -0,0 +1,489 @@ +#!/usr/bin/env python3 +""" +Memory Integrity Checker for openclaw-superpowers. + +Validates memory summary DAGs for structural integrity — orphan nodes, +circular references, token inflation, broken lineage, stale summaries. + +Usage: + python3 integrity.py --check + python3 integrity.py --check --fix + python3 integrity.py --check --only ORPHAN_NODE + python3 integrity.py --repair-plan + python3 integrity.py --status + python3 integrity.py --format json +""" + +import argparse +import json +import os +import sys +from datetime import datetime, timedelta +from pathlib import Path + +try: + import yaml + HAS_YAML = True +except ImportError: + HAS_YAML = False + +OPENCLAW_DIR = Path(os.environ.get("OPENCLAW_HOME", Path.home() / ".openclaw")) +STATE_FILE = OPENCLAW_DIR / "skill-state" / "memory-integrity-checker" / "state.yaml" +DAG_STATE_FILE = OPENCLAW_DIR / "skill-state" / "memory-dag-compactor" / "state.yaml" +MAX_HISTORY = 20 +STALE_DAYS = 30 + + +# ── State helpers ──────────────────────────────────────────────────────────── + +def load_state() -> dict: + if not STATE_FILE.exists(): + return {"findings": [], "check_history": [], "repairs_applied": []} + try: + text = STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {"findings": [], "check_history": [], "repairs_applied": []} + + +def save_state(state: dict) -> None: + STATE_FILE.parent.mkdir(parents=True, exist_ok=True) + if HAS_YAML: + with open(STATE_FILE, "w") as f: + yaml.dump(state, f, default_flow_style=False, allow_unicode=True) + + +def load_dag_state() -> dict: + if not DAG_STATE_FILE.exists(): + return {} + try: + text = DAG_STATE_FILE.read_text() + return (yaml.safe_load(text) or {}) if HAS_YAML else {} + except Exception: + return {} + + +def save_dag_state(dag: dict) -> None: + if HAS_YAML: + with open(DAG_STATE_FILE, "w") as f: + yaml.dump(dag, f, default_flow_style=False, allow_unicode=True) + + +# ── Integrity checks ──────────────────────────────────────────────────────── + +def check_orphan_nodes(nodes: list, edges: list) -> list[dict]: + """Find nodes with no parent that aren't root nodes (depth > 0).""" + findings = [] + child_ids = {e["child_id"] for e in edges} + parent_ids = {e["parent_id"] for e in edges} + + for node in nodes: + nid = node.get("id", "") + depth = node.get("depth", 0) + # Root nodes (not a child of anything) at depth 0 are fine + if nid not in child_ids and depth > 0: + findings.append({ + "check": "ORPHAN_NODE", + "severity": "HIGH", + "node_id": nid, + "detail": f"Depth {depth} node has no parent — should be connected to a d{depth-1} parent", + "auto_fixable": True, + }) + return findings + + +def check_circular_refs(nodes: list, edges: list) -> list[dict]: + """Detect circular parent-child loops using DFS.""" + findings = [] + children_of = {} + for e in edges: + children_of.setdefault(e["parent_id"], []).append(e["child_id"]) + + def has_cycle(start: str, visited: set, stack: set) -> bool: + visited.add(start) + stack.add(start) + for child in children_of.get(start, []): + if child in stack: + return True + if child not in visited and has_cycle(child, visited, stack): + return True + stack.discard(start) + return False + + visited = set() + for node in nodes: + nid = node.get("id", "") + if nid not in visited: + if has_cycle(nid, visited, set()): + findings.append({ + "check": "CIRCULAR_REF", + "severity": "CRITICAL", + "node_id": nid, + "detail": f"Circular reference detected in DAG involving node {nid}", + "auto_fixable": False, + }) + return findings + + +def check_token_inflation(nodes: list, edges: list) -> list[dict]: + """Find summaries with more tokens than their combined children.""" + findings = [] + children_of = {} + for e in edges: + children_of.setdefault(e["parent_id"], []).append(e["child_id"]) + + node_map = {n.get("id"): n for n in nodes} + + for node in nodes: + nid = node.get("id", "") + parent_tokens = node.get("token_count", 0) + child_ids = children_of.get(nid, []) + + if not child_ids or parent_tokens == 0: + continue + + children_tokens = sum( + node_map.get(cid, {}).get("token_count", 0) + for cid in child_ids + ) + + if children_tokens > 0 and parent_tokens > children_tokens: + ratio = round(parent_tokens / children_tokens, 1) + findings.append({ + "check": "TOKEN_INFLATION", + "severity": "HIGH", + "node_id": nid, + "detail": f"Parent ({parent_tokens} tok) > children ({children_tokens} tok) — {ratio}x inflation", + "auto_fixable": False, + }) + return findings + + +def check_broken_lineage(nodes: list, edges: list) -> list[dict]: + """Find edges referencing non-existent node IDs.""" + findings = [] + node_ids = {n.get("id") for n in nodes} + + for edge in edges: + if edge["parent_id"] not in node_ids: + findings.append({ + "check": "BROKEN_LINEAGE", + "severity": "CRITICAL", + "node_id": edge["parent_id"], + "detail": f"Edge references non-existent parent: {edge['parent_id']}", + "auto_fixable": True, + }) + if edge["child_id"] not in node_ids: + findings.append({ + "check": "BROKEN_LINEAGE", + "severity": "CRITICAL", + "node_id": edge["child_id"], + "detail": f"Edge references non-existent child: {edge['child_id']}", + "auto_fixable": True, + }) + return findings + + +def check_stale_active(nodes: list, edges: list) -> list[dict]: + """Find active nodes older than STALE_DAYS with no children.""" + findings = [] + parent_ids = {e["parent_id"] for e in edges} + cutoff = datetime.now() - timedelta(days=STALE_DAYS) + + for node in nodes: + if not node.get("is_active"): + continue + nid = node.get("id", "") + created = node.get("created_at", "") + if nid in parent_ids: + continue # Has children, not stale + + try: + created_dt = datetime.fromisoformat(created) + if created_dt < cutoff: + age_days = (datetime.now() - created_dt).days + findings.append({ + "check": "STALE_ACTIVE", + "severity": "MEDIUM", + "node_id": nid, + "detail": f"Active node is {age_days} days old with no children", + "auto_fixable": True, + }) + except (ValueError, TypeError): + pass + return findings + + +def check_empty_nodes(nodes: list) -> list[dict]: + """Find nodes with empty or whitespace-only content.""" + findings = [] + for node in nodes: + content = node.get("content", "") + if len(content.strip()) < 10: + findings.append({ + "check": "EMPTY_NODE", + "severity": "HIGH", + "node_id": node.get("id", "unknown"), + "detail": f"Node has empty or near-empty content ({len(content.strip())} chars)", + "auto_fixable": True, + }) + return findings + + +def check_duplicate_edges(edges: list) -> list[dict]: + """Find duplicate parent-child edges.""" + findings = [] + seen = set() + for edge in edges: + key = (edge["parent_id"], edge["child_id"]) + if key in seen: + findings.append({ + "check": "DUPLICATE_EDGE", + "severity": "LOW", + "node_id": f"{edge['parent_id']}->{edge['child_id']}", + "detail": "Duplicate edge in DAG", + "auto_fixable": True, + }) + seen.add(key) + return findings + + +def check_depth_mismatch(nodes: list, edges: list) -> list[dict]: + """Check that node depth matches its actual position in the DAG.""" + findings = [] + children_of = {} + for e in edges: + children_of.setdefault(e["parent_id"], []).append(e["child_id"]) + + node_map = {n.get("id"): n for n in nodes} + + for node in nodes: + nid = node.get("id", "") + depth = node.get("depth", 0) + child_ids = children_of.get(nid, []) + + for cid in child_ids: + child = node_map.get(cid, {}) + child_depth = child.get("depth", 0) + if child_depth != depth - 1: + findings.append({ + "check": "DEPTH_MISMATCH", + "severity": "MEDIUM", + "node_id": nid, + "detail": f"Parent d{depth} has child d{child_depth} — expected d{depth-1}", + "auto_fixable": False, + }) + return findings + + +ALL_CHECKS = { + "ORPHAN_NODE": check_orphan_nodes, + "CIRCULAR_REF": check_circular_refs, + "TOKEN_INFLATION": check_token_inflation, + "BROKEN_LINEAGE": check_broken_lineage, + "STALE_ACTIVE": check_stale_active, + "EMPTY_NODE": lambda nodes, edges: check_empty_nodes(nodes), + "DUPLICATE_EDGE": lambda nodes, edges: check_duplicate_edges(edges), + "DEPTH_MISMATCH": check_depth_mismatch, +} + + +# ── Auto-fix ───────────────────────────────────────────────────────────────── + +def apply_fix(dag: dict, finding: dict) -> str | None: + """Apply a safe auto-fix for a finding. Returns action description or None.""" + check = finding["check"] + nid = finding["node_id"] + nodes = dag.get("dag_nodes") or [] + edges = dag.get("dag_edges") or [] + + if check == "ORPHAN_NODE": + for n in nodes: + if n.get("id") == nid: + n["is_active"] = False + return f"Deactivated orphan node {nid}" + + elif check == "EMPTY_NODE": + for n in nodes: + if n.get("id") == nid: + n["is_active"] = False + return f"Deactivated empty node {nid}" + + elif check == "STALE_ACTIVE": + for n in nodes: + if n.get("id") == nid: + n["is_active"] = False + return f"Deactivated stale node {nid}" + + elif check == "DUPLICATE_EDGE": + seen = set() + new_edges = [] + for e in edges: + key = (e["parent_id"], e["child_id"]) + if key not in seen: + seen.add(key) + new_edges.append(e) + dag["dag_edges"] = new_edges + return f"Removed duplicate edges" + + elif check == "BROKEN_LINEAGE": + node_ids = {n.get("id") for n in nodes} + dag["dag_edges"] = [e for e in edges + if e["parent_id"] in node_ids and e["child_id"] in node_ids] + return f"Removed edges referencing non-existent nodes" + + return None + + +# ── Commands ───────────────────────────────────────────────────────────────── + +def cmd_check(state: dict, fix: bool, only: str | None, fmt: str) -> None: + dag = load_dag_state() + nodes = dag.get("dag_nodes") or [] + edges = dag.get("dag_edges") or [] + now = datetime.now().isoformat() + + if not nodes: + print("No DAG nodes found — memory-dag-compactor may not have run yet.") + return + + all_findings = [] + checks_to_run = {only: ALL_CHECKS[only]} if only and only in ALL_CHECKS else ALL_CHECKS + + for name, check_fn in checks_to_run.items(): + findings = check_fn(nodes, edges) + all_findings.extend(findings) + + # Count by severity + counts = {"CRITICAL": 0, "HIGH": 0, "MEDIUM": 0, "LOW": 0} + for f in all_findings: + counts[f["severity"]] = counts.get(f["severity"], 0) + 1 + + # Apply fixes if requested + repairs = [] + if fix: + fixable = [f for f in all_findings if f.get("auto_fixable") and f["severity"] in ("LOW", "MEDIUM")] + for finding in fixable: + action = apply_fix(dag, finding) + if action: + repairs.append({ + "repaired_at": now, + "check": finding["check"], + "node_id": finding["node_id"], + "action": action, + }) + if repairs: + save_dag_state(dag) + existing_repairs = state.get("repairs_applied") or [] + existing_repairs.extend(repairs) + state["repairs_applied"] = existing_repairs[-50:] + + state["last_check_at"] = now + state["findings"] = all_findings + history = state.get("check_history") or [] + history.insert(0, { + "checked_at": now, "nodes_checked": len(nodes), + "findings": len(all_findings), + "critical": counts["CRITICAL"], "high": counts["HIGH"], + "medium": counts["MEDIUM"], "low": counts["LOW"], + }) + state["check_history"] = history[:MAX_HISTORY] + save_state(state) + + if fmt == "json": + print(json.dumps({"nodes_checked": len(nodes), "findings": all_findings, + "counts": counts, "repairs": repairs}, indent=2)) + else: + print(f"\nMemory Integrity Check — {datetime.now().strftime('%Y-%m-%d %H:%M')}") + print("-" * 55) + print(f" Nodes checked: {len(nodes)} | Edges: {len(edges)}") + print(f" Findings: {len(all_findings)} " + f"({counts['CRITICAL']} critical, {counts['HIGH']} high, " + f"{counts['MEDIUM']} medium, {counts['LOW']} low)") + print() + + severity_icons = {"CRITICAL": "!!", "HIGH": "!", "MEDIUM": "~", "LOW": "."} + for f in all_findings: + icon = severity_icons.get(f["severity"], "?") + fix_mark = " [auto-fixable]" if f.get("auto_fixable") else "" + print(f" {icon} [{f['severity']:>8}] {f['check']}: {f['node_id']}") + print(f" {f['detail']}{fix_mark}") + + if repairs: + print(f"\n Repairs applied: {len(repairs)}") + for r in repairs: + print(f" + {r['action']}") + + status = "HEALTHY" if not all_findings else "DEGRADED" if not counts["CRITICAL"] else "CRITICAL" + print(f"\n Status: {status}") + print() + + if counts["CRITICAL"] > 0: + sys.exit(1) + + +def cmd_repair_plan(state: dict, fmt: str) -> None: + dag = load_dag_state() + nodes = dag.get("dag_nodes") or [] + edges = dag.get("dag_edges") or [] + + all_findings = [] + for check_fn in ALL_CHECKS.values(): + all_findings.extend(check_fn(nodes, edges)) + + fixable = [f for f in all_findings if f.get("auto_fixable")] + manual = [f for f in all_findings if not f.get("auto_fixable")] + + if fmt == "json": + print(json.dumps({"auto_fixable": fixable, "manual_required": manual}, indent=2)) + else: + print(f"\nRepair Plan — {len(all_findings)} findings") + print("-" * 55) + if fixable: + print(f"\n Auto-fixable ({len(fixable)}):") + for f in fixable: + print(f" {f['check']} on {f['node_id']}: {f['detail']}") + if manual: + print(f"\n Manual review required ({len(manual)}):") + for f in manual: + print(f" [{f['severity']}] {f['check']} on {f['node_id']}: {f['detail']}") + if not all_findings: + print(" No issues found — DAG is healthy.") + print() + + +def cmd_status(state: dict) -> None: + last = state.get("last_check_at", "never") + findings = state.get("findings") or [] + critical = sum(1 for f in findings if f.get("severity") == "CRITICAL") + print(f"\nMemory Integrity Checker — Last check: {last}") + print(f" {len(findings)} findings | {critical} critical") + repairs = state.get("repairs_applied") or [] + if repairs: + print(f" {len(repairs)} repairs applied total") + print() + + +def main(): + parser = argparse.ArgumentParser(description="Memory Integrity Checker") + group = parser.add_mutually_exclusive_group(required=True) + group.add_argument("--check", action="store_true", help="Run all integrity checks") + group.add_argument("--repair-plan", action="store_true", help="Generate repair plan") + group.add_argument("--status", action="store_true", help="Last check summary") + parser.add_argument("--fix", action="store_true", help="Auto-fix safe issues") + parser.add_argument("--only", type=str, metavar="CHECK", + choices=list(ALL_CHECKS.keys()), help="Run a specific check") + parser.add_argument("--format", choices=["text", "json"], default="text") + args = parser.parse_args() + + state = load_state() + if args.check: + cmd_check(state, args.fix, args.only, args.format) + elif args.repair_plan: + cmd_repair_plan(state, args.format) + elif args.status: + cmd_status(state) + + +if __name__ == "__main__": + main() From 41983b7072d64bea0516cbeaca7f6cdaf642c086 Mon Sep 17 00:00:00 2001 From: ArchieIndian Date: Tue, 17 Mar 2026 00:58:41 +0530 Subject: [PATCH 23/23] =?UTF-8?q?Update=20README:=2044=20=E2=86=92=2049=20?= =?UTF-8?q?skills=20(5=20lossless-claw-inspired=20additions)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add memory-dag-compactor, large-file-interceptor, context-assembly-scorer, compaction-resilience-guard, and memory-integrity-checker. Update badges, comparison table, architecture diagram, companion scripts list. Co-Authored-By: Claude Opus 4.6 --- README.md | 30 ++++++++++++++++++------------ 1 file changed, 18 insertions(+), 12 deletions(-) diff --git a/README.md b/README.md index 35e4d92..7a6c760 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,11 @@ # openclaw-superpowers -**44 ready-to-use skills that make your AI agent autonomous, self-healing, and self-improving.** +**49 ready-to-use skills that make your AI agent autonomous, self-healing, and self-improving.** -[![Skills](https://img.shields.io/badge/skills-44-blue)](#skills-included) +[![Skills](https://img.shields.io/badge/skills-49-blue)](#skills-included) [![Security](https://img.shields.io/badge/security_skills-6-green)](#security--guardrails) -[![Cron](https://img.shields.io/badge/cron_scheduled-12-orange)](#openclaw-native-28-skills) -[![Scripts](https://img.shields.io/badge/companion_scripts-15-purple)](#companion-scripts) +[![Cron](https://img.shields.io/badge/cron_scheduled-15-orange)](#openclaw-native-33-skills) +[![Scripts](https://img.shields.io/badge/companion_scripts-20-purple)](#companion-scripts) [![License: MIT](https://img.shields.io/badge/license-MIT-yellow.svg)](LICENSE) A plug-and-play skill library for [OpenClaw](https://github.com/openclaw/openclaw) — the open-source AI agent runtime. Gives your agent structured thinking, security guardrails, persistent memory, cron scheduling, self-recovery, and the ability to write its own new skills during conversation. @@ -20,12 +20,13 @@ Built for developers who want their AI agent to run autonomously 24/7, not just Most AI agent frameworks give you a chatbot that forgets everything between sessions. OpenClaw is different — it runs persistently, handles multi-hour tasks, and has native cron scheduling. But out of the box, it doesn't know *how* to use those capabilities well. -**openclaw-superpowers bridges that gap.** Install once, and your agent immediately knows how to: +**openclaw-superpowers bridges that gap.** Install 49 skills in one command, and your agent immediately knows how to: - **Think before it acts** — brainstorming, planning, and systematic debugging skills prevent the "dive in and break things" failure mode - **Protect itself** — 6 security skills detect prompt injection, block dangerous actions, audit installed code, and scan for leaked credentials - **Run unattended** — 12 cron-scheduled skills handle memory cleanup, health checks, budget tracking, and community monitoring while you sleep - **Recover from failures** — self-recovery, loop-breaking, and task handoff skills keep long-running work alive across crashes and restarts +- **Never forget** — DAG-based memory compaction, integrity checking, and context scoring ensure the agent preserves critical information even in month-long conversations - **Improve itself** — the agent can write new skills during normal conversation using `create-skill`, encoding your preferences as permanent behaviors --- @@ -50,7 +51,7 @@ cd ~/.openclaw/extensions/superpowers && ./install.sh openclaw gateway restart ``` -`install.sh` symlinks all 44 skills, creates state directories for stateful skills, and registers cron jobs — everything in one step. That's it. Your agent now has superpowers. +`install.sh` symlinks all 49 skills, creates state directories for stateful skills, and registers cron jobs — everything in one step. That's it. Your agent now has superpowers. --- @@ -78,7 +79,7 @@ Methodology skills that work in any AI agent runtime. Adapted from [obra/superpo | `skill-conflict-detector` | Detects name shadowing and description-overlap conflicts between installed skills | `detect.py` | | `skill-portability-checker` | Validates OS/binary dependencies in companion scripts; catches non-portable calls | `check.py` | -### OpenClaw-Native (28 skills) +### OpenClaw-Native (33 skills) Skills that require OpenClaw's persistent runtime — cron scheduling, session state, or long-running execution. These are the skills that make a 24/7 autonomous agent actually work reliably. @@ -112,6 +113,11 @@ Skills that require OpenClaw's persistent runtime — cron scheduling, session s | `config-encryption-auditor` | Scans config directories for plaintext API keys, tokens, and world-readable permissions | Sundays 9am | `audit.py` | | `tool-description-optimizer` | Scores skill descriptions for trigger quality — clarity, specificity, keyword density — and suggests rewrites | — | `optimize.py` | | `mcp-health-checker` | Monitors MCP server connections for health, latency, and availability; detects stale connections | every 6h | `check.py` | +| `memory-dag-compactor` | Builds hierarchical summary DAGs from MEMORY.md with depth-aware prompts (d0 leaf → d3+ durable) | daily 11pm | `compact.py` | +| `large-file-interceptor` | Detects oversized files, generates structural exploration summaries, stores compact references | — | `intercept.py` | +| `context-assembly-scorer` | Scores how well current context represents full conversation; detects blind spots | every 4h | `score.py` | +| `compaction-resilience-guard` | Monitors compaction for failures; enforces normal → aggressive → deterministic fallback chain | — | `guard.py` | +| `memory-integrity-checker` | Validates summary DAGs for orphans, circular refs, token inflation, broken lineage | Sundays 3am | `integrity.py` | ### Community (1 skill) @@ -142,12 +148,12 @@ Six skills form a defense-in-depth security layer for autonomous agents: | Feature | openclaw-superpowers | obra/superpowers | Custom prompts | |---|---|---|---| -| Skills included | **44** | 8 | 0 | +| Skills included | **49** | 8 | 0 | | Self-modifying (agent writes new skills) | Yes | No | No | -| Cron scheduling | **12 scheduled skills** | No | No | +| Cron scheduling | **15 scheduled skills** | No | No | | Persistent state across sessions | **YAML state schemas** | No | No | | Security guardrails | **6 defense-in-depth skills** | No | No | -| Companion scripts with CLI | **15 scripts** | No | No | +| Companion scripts with CLI | **20 scripts** | No | No | | Memory graph / knowledge graph | Yes | No | No | | MCP server health monitoring | Yes | No | No | | API spend tracking & budget enforcement | Yes | No | No | @@ -169,7 +175,7 @@ Six skills form a defense-in-depth security layer for autonomous agents: │ │ │ ├── SKILL.md │ │ │ └── TEMPLATE.md │ │ └── ... -│ ├── openclaw-native/ # 28 persistent-runtime skills +│ ├── openclaw-native/ # 33 persistent-runtime skills │ │ ├── memory-graph-builder/ │ │ │ ├── SKILL.md # Skill definition + YAML frontmatter │ │ │ ├── STATE_SCHEMA.yaml # State shape (committed, versioned) @@ -192,7 +198,7 @@ Six skills form a defense-in-depth security layer for autonomous agents: Skills marked with a script ship a small executable alongside their `SKILL.md`: -- **15 Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`, `graph.py`, `optimize.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies; `pyyaml` is optional but recommended. +- **20 Python scripts** (`run.py`, `audit.py`, `check.py`, `guard.py`, `bridge.py`, `onboard.py`, `sync.py`, `doctor.py`, `loadout.py`, `governor.py`, `detect.py`, `test.py`, `radar.py`, `graph.py`, `optimize.py`, `compact.py`, `intercept.py`, `score.py`, `integrity.py`) — run directly to manipulate state, generate reports, or trigger actions. No extra dependencies; `pyyaml` is optional but recommended. - **`vet.sh`** — Pure bash scanner; runs on any system with grep. - Every script supports `--help` and `--format json`. Dry-run mode available on scripts that make changes. - See the `example-state.yaml` in each skill directory for sample state and a commented walkthrough of cron behaviour.