ArchieIndian · Mar 16, 2026
diff --git a/skills/openclaw-native/context-assembly-scorer/SKILL.md b/skills/openclaw-native/context-assembly-scorer/SKILL.md
@@ -0,0 +1,94 @@
+---
+name: context-assembly-scorer
+version: "1.0"
+category: openclaw-native
+description: Scores how well the current context represents the full conversation — detects information blind spots, stale summaries, and coverage gaps that cause the agent to forget critical details.
+stateful: true
+cron: "0 */4 * * *"
+---
+
+# Context Assembly Scorer
+
+## What it does
+
+When an agent compacts context, it loses information. But how much, and which information? Context Assembly Scorer answers these questions by measuring **coverage**: the fraction of important topics in the full conversation history that are still represented in the currently assembled context.
+
+The skill is inspired by the context assembly system in [lossless-claw](https://github.com/Martian-Engineering/lossless-claw), which selects which summaries to include in each turn's context to maximize information coverage.
+
+## When to invoke
+
+- Automatically every 4 hours (cron) — silent coverage check
+- Before starting a task that depends on prior context — verify nothing critical is missing
+- After compaction — measure information loss
+- When the agent says "I don't remember" — diagnose why
+
+## Coverage dimensions
+
+| Dimension | What it measures | Weight |
+|---|---|---|
+| Topic coverage | % of conversation topics present in current context | 2x |
+| Recency bias | Whether recent context is over-represented vs. older important context | 1.5x |
+| Entity continuity | Named entities (files, people, APIs) mentioned in history that are missing from context | 2x |
+| Decision retention | Architectural decisions and user preferences still accessible | 2x |
+| Task continuity | Active/pending tasks that might be lost after compaction | 1.5x |
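One way to read the weights in the table above is as a plain weighted mean of the five per-dimension scores. The sketch below assumes that reading; the diff does not state the formula, and `WEIGHTS` and `weighted_overall` are hypothetical names. Applied to the per-dimension values in `example-state.yaml`, it yields 72.6 rather than the recorded 72.3, so the actual `score.py` presumably combines or rounds the dimensions differently.

```python
# Hypothetical sketch: combine per-dimension scores using the table's weights.
# ASSUMPTION: a plain weighted mean. The source does not state the formula.
WEIGHTS = {
    "topic_coverage": 2.0,
    "recency_bias": 1.5,
    "entity_continuity": 2.0,
    "decision_retention": 2.0,
    "task_continuity": 1.5,
}


def weighted_overall(scores: dict[str, float]) -> float:
    """Return the weighted mean of per-dimension scores (each 0-100)."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(scores[name] * weight for name, weight in WEIGHTS.items())
    return weighted_sum / total_weight


# Per-dimension values copied from example-state.yaml.
example = {
    "topic_coverage": 82.0,
    "recency_bias": 65.5,
    "entity_continuity": 68.0,
    "decision_retention": 75.0,
    "task_continuity": 70.0,
}

print(round(weighted_overall(example), 1))
```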
+
+## How to use
+
+```bash
+python3 score.py --score                      # Score current context assembly
+python3 score.py --score --verbose             # Detailed per-dimension breakdown
+python3 score.py --blind-spots                 # List topics missing from context
+python3 score.py --drift                       # Compare current vs. previous scores
+python3 score.py --status                      # Last score summary
+python3 score.py --format json                 # Machine-readable output
+```
+
+## Procedure
+
+**Step 1 — Score context coverage**
+
+```bash
+python3 score.py --score
+```
+
+The scorer reads MEMORY.md (full history) and compares it against what's currently accessible. It outputs a coverage score from 0 to 100% with a letter grade.
+
+**Step 2 — Find blind spots**
+
+```bash
+python3 score.py --blind-spots
+```
+
+Lists specific topics, entities, and decisions that exist in full history but are missing from current context — these are what the agent has effectively "forgotten."
+
+**Step 3 — Track drift over time**
+
+```bash
+python3 score.py --drift
+```
+
+Shows how coverage has changed across the last 20 scores. Use it to check whether compaction is progressively losing more information.
+
+## Grading
+
+| Grade | Coverage | Meaning |
+|---|---|---|
+| A | 90–100% | Excellent — minimal information loss |
+| B | 75–89% | Good — minor gaps, unlikely to cause issues |
+| C | 60–74% | Fair — some important context missing |
+| D | 40–59% | Poor — significant blind spots |
+| F | 0–39% | Critical — agent is operating with major gaps |
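The grading table can be sketched as a threshold lookup. The table lists integer bands, so how a fractional score such as 89.5 is graded is not specified; this sketch assumes each band's lower bound acts as the cutoff. `letter_grade` is a hypothetical name, not a function confirmed to exist in `score.py`.

```python
# Hypothetical sketch of the grading table as a threshold lookup.
# ASSUMPTION: each band's lower bound is the cutoff, so 89.5 grades as B.
GRADE_CUTOFFS = [(90.0, "A"), (75.0, "B"), (60.0, "C"), (40.0, "D"), (0.0, "F")]


def letter_grade(coverage: float) -> str:
    """Map a 0-100 coverage percentage to a letter grade."""
    if not 0.0 <= coverage <= 100.0:
        raise ValueError(f"coverage must be within 0-100, got {coverage}")
    for lower_bound, grade in GRADE_CUTOFFS:
        if coverage >= lower_bound:
            return grade
    raise AssertionError("unreachable: the 0.0 cutoff matches every valid score")


# The three scores recorded in example-state.yaml.
for score in (91.2, 85.1, 72.3):
    print(score, letter_grade(score))
```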
+
+## State
+
+Coverage scores and blind spot history are stored in `~/.openclaw/skill-state/context-assembly-scorer/state.yaml`.
+
+Fields: `last_score_at`, `current_score`, `blind_spots`, `score_history`.
+
+## Notes
+
+- Read-only — does not modify context or memory
+- Topic extraction uses keyword clustering, not LLM calls
+- Entity detection uses regex patterns for file paths, URLs, class names, API endpoints
+- Decision detection looks for markers: "decided", "chose", "prefer", "always", "never"
+- Recency bias is measured as the ratio of recent-vs-old entry representation
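The regex-based detection described in the notes above could look roughly like the following. The marker words come from the notes; the patterns themselves, and the names `extract_entities`, `decision_lines`, and `missing_entities`, are illustrative assumptions and not the patterns `score.py` actually uses. Only URLs and slash-separated file paths are covered here.

```python
import re

# Marker words taken from the notes; matching them as word prefixes is an assumption.
DECISION_RE = re.compile(r"\b(?:decided|chose|prefer|always|never)\w*", re.IGNORECASE)
# Hypothetical entity patterns: http(s) URLs and absolute slash-separated paths.
URL_RE = re.compile(r"https?://[^\s)>\"']+")
PATH_RE = re.compile(r"/(?:[\w.-]+/)+[\w.-]+")


def extract_entities(text: str) -> set[str]:
    """Collect URLs first, then file paths from the text with URLs removed."""
    urls = set(URL_RE.findall(text))
    without_urls = URL_RE.sub(" ", text)
    return urls | set(PATH_RE.findall(without_urls))


def decision_lines(text: str) -> list[str]:
    """Return the lines that contain at least one decision marker."""
    return [line for line in text.splitlines() if DECISION_RE.search(line)]


def missing_entities(history: str, context: str) -> set[str]:
    """Entities present in the full history but absent from current context."""
    return extract_entities(history) - extract_entities(context)


history = (
    "We decided to use a Jaccard similarity threshold of 0.7.\n"
    "See /skills/openclaw-native/heartbeat-governor/governor.py "
    "and https://github.com/Neirth/OpenLobster\n"
    "Unrelated line."
)
context = "See https://github.com/Neirth/OpenLobster"

print(decision_lines(history))
print(missing_entities(history, context))
```

Removing URLs before searching for paths keeps the path component of a URL from being reported as a separate file entity.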
diff --git a/skills/openclaw-native/context-assembly-scorer/STATE_SCHEMA.yaml b/skills/openclaw-native/context-assembly-scorer/STATE_SCHEMA.yaml
@@ -0,0 +1,31 @@
+version: "1.0"
+description: Context coverage scores, blind spot tracking, and drift history.
+fields:
+  last_score_at:
+    type: datetime
+  current_score:
+    type: object
+    fields:
+      overall:          { type: float, description: "0-100 coverage percentage" }
+      grade:            { type: string }
+      topic_coverage:   { type: float }
+      recency_bias:     { type: float }
+      entity_continuity: { type: float }
+      decision_retention: { type: float }
+      task_continuity:  { type: float }
+  blind_spots:
+    type: list
+    description: Topics/entities missing from current context
+    items:
+      type:       { type: enum, values: [topic, entity, decision, task] }
+      name:       { type: string }
+      importance: { type: enum, values: [critical, high, medium, low] }
+      last_seen:  { type: string, description: "When this was last in context" }
+  score_history:
+    type: list
+    description: Rolling log of past scores (last 20)
+    items:
+      scored_at:  { type: datetime }
+      overall:    { type: float }
+      grade:      { type: string }
+      blind_spot_count: { type: integer }
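The "rolling log of past scores (last 20)" in the schema above can be kept with a simple prepend-and-trim step. This sketch assumes newest-first ordering, which matches `example-state.yaml` but is not stated in the schema; `record_score` and `MAX_HISTORY` are hypothetical names.

```python
# Hypothetical sketch of maintaining the rolling score_history list.
# ASSUMPTION: entries are ordered newest first, as in example-state.yaml.
MAX_HISTORY = 20  # "last 20" per the schema description


def record_score(history: list[dict], entry: dict) -> list[dict]:
    """Return a new history with `entry` first, trimmed to MAX_HISTORY items."""
    return [entry, *history][:MAX_HISTORY]


history: list[dict] = []
for run in range(25):
    history = record_score(
        history,
        {"scored_at": f"run-{run}", "overall": 80.0, "grade": "B", "blind_spot_count": 3},
    )

print(len(history), history[0]["scored_at"], history[-1]["scored_at"])
```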
diff --git a/skills/openclaw-native/context-assembly-scorer/example-state.yaml b/skills/openclaw-native/context-assembly-scorer/example-state.yaml
@@ -0,0 +1,74 @@
+# Example runtime state for context-assembly-scorer
+last_score_at: "2026-03-16T16:00:08.000000"
+current_score:
+  overall: 72.3
+  grade: C
+  topic_coverage: 82.0
+  recency_bias: 65.5
+  entity_continuity: 68.0
+  decision_retention: 75.0
+  task_continuity: 70.0
+blind_spots:
+  - type: decision
+    name: "Decided to use Jaccard similarity threshold of 0.7 for deduplication"
+    importance: critical
+    last_seen: "in full memory"
+  - type: entity
+    name: "/skills/openclaw-native/heartbeat-governor/governor.py"
+    importance: high
+    last_seen: "in full memory"
+  - type: task
+    name: "TODO: add --dry-run flag to radar.py before next release"
+    importance: high
+    last_seen: "in full memory"
+  - type: entity
+    name: "https://github.com/Neirth/OpenLobster"
+    importance: medium
+    last_seen: "in full memory"
+score_history:
+  - scored_at: "2026-03-16T16:00:08.000000"
+    overall: 72.3
+    grade: C
+    blind_spot_count: 12
+  - scored_at: "2026-03-16T12:00:05.000000"
+    overall: 85.1
+    grade: B
+    blind_spot_count: 5
+  - scored_at: "2026-03-16T08:00:03.000000"
+    overall: 91.2
+    grade: A
+    blind_spot_count: 2
+# ── Walkthrough ──────────────────────────────────────────────────────────────
+# Cron runs every 4 hours:  python3 score.py --score --verbose
+#
+#   Context Assembly Score — 2026-03-16 16:00
+#   ───────────────────────────────────────────────────────
+#     Overall:              72.3%  Grade: C
+#     Topic coverage:       82.0%  (2x weight)
+#     Recency bias:         65.5%  (1.5x weight)
+#     Entity continuity:    68.0%  (2x weight)
+#     Decision retention:   75.0%  (2x weight)
+#     Task continuity:      70.0%  (1.5x weight)
+#
+#     Memory stats:
+#       Topics: 284 unique | Entities: 47
+#       Decisions: 12 | Tasks: 8
+#     Blind spots: 12
+#
+# python3 score.py --blind-spots
+#
+#   Blind Spots — 12 items missing from context
+#   ───────────────────────────────────────────────────────
+#   !! [CRITICAL] decision: Decided to use Jaccard similarity threshold...
+#   !  [    HIGH]   entity: /skills/openclaw-native/heartbeat-governor/...
+#   !  [    HIGH]     task: TODO: add --dry-run flag to radar.py...
+#
+# python3 score.py --drift
+#
+#   Coverage Drift — 3 data points
+#   ───────────────────────────────────────────────────────
+#   2026-03-16T16:00  [=======---] 72.3% (C)  12 blind spots
+#   2026-03-16T12:00  [=========-] 85.1% (B)  5 blind spots
+#   2026-03-16T08:00  [=========+] 91.2% (A)  2 blind spots
+#
+#   Trend: declining (-12.8%)
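The trend figure in the walkthrough above (-12.8%) equals the latest score minus the previous one (72.3 - 85.1), so the drift summary can be sketched as below. Only the "declining" label appears in the source; the "improving" and "stable" labels, the `tolerance` parameter, and the function name `drift_trend` are assumptions made for illustration.

```python
# Hypothetical sketch of the drift summary, with history ordered newest first.
# ASSUMPTION: trend = latest overall minus previous overall, as the -12.8 suggests.
def drift_trend(history: list[dict], tolerance: float = 1.0) -> tuple[str, float]:
    """Return (label, delta) comparing the newest score with the one before it."""
    if len(history) < 2:
        return ("insufficient data", 0.0)
    delta = round(history[0]["overall"] - history[1]["overall"], 1)
    if delta > tolerance:
        label = "improving"  # assumed label, not shown in the source
    elif delta < -tolerance:
        label = "declining"
    else:
        label = "stable"  # assumed label, not shown in the source
    return (label, delta)


# Scores copied from example-state.yaml, newest first.
example_history = [
    {"scored_at": "2026-03-16T16:00:08.000000", "overall": 72.3},
    {"scored_at": "2026-03-16T12:00:05.000000", "overall": 85.1},
    {"scored_at": "2026-03-16T08:00:03.000000", "overall": 91.2},
]

label, delta = drift_trend(example_history)
print(f"Trend: {label} ({delta:+.1f}%)")
```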