docs: ARCHITECTURE.md — system overview + data flow diagrams (#116) #135
| Original file line number | Diff line number | Diff line change | ||||
|---|---|---|---|---|---|---|
| @@ -0,0 +1,168 @@ | ||||||
| # ARCHITECTURE.md | ||||||
|
|
||||||
| > wagl — local-first agent memory backed by libSQL | ||||||
|
|
||||||
| ## System Overview | ||||||
|
|
||||||
| ``` | ||||||
| ┌─────────────────────────────────────────────────────┐ | ||||||
| │ Agent / Human │ | ||||||
| │ (OpenClaw, ChatGPT, Claude, CLI) │ | ||||||
| └──────┬──────────┬──────────┬───────────┬────────────┘ | ||||||
| │ │ │ │ | ||||||
| ▼ ▼ ▼ ▼ | ||||||
| ┌────────┐ ┌────────┐ ┌────────┐ ┌──────────┐ | ||||||
| │ CLI │ │ Server │ │ MCP │ │ OpenClaw │ | ||||||
| │ (wagl) │ │ (REST) │ │(stdio) │ │ plugin │ | ||||||
| └───┬────┘ └───┬────┘ └───┬────┘ └────┬─────┘ | ||||||
| │ │ │ │ | ||||||
| └──────────┴──────┬───┴────────────┘ | ||||||
| │ | ||||||
| ▼ | ||||||
| ┌──────────────────┐ | ||||||
| │ wagl-db │ | ||||||
| │ (libSQL layer) │ | ||||||
| └────────┬─────────┘ | ||||||
| │ | ||||||
| ┌────────┴─────────┐ | ||||||
| │ wagl-core │ | ||||||
| │ (types, no IO) │ | ||||||
| └──────────────────┘ | ||||||
| │ | ||||||
| ┌────────┴─────────┐ | ||||||
| │ libSQL │ | ||||||
| │ (local + Turso │ | ||||||
| │ embedded sync) │ | ||||||
| └──────────────────┘ | ||||||
|
Comment on lines +27 to +36 (Contributor):
The diagram draws
As written, this will mislead contributors into thinking |
||||||
| ``` | ||||||
|
|
||||||
| ## Crate Structure | ||||||
|
|
||||||
| ### `crates/core` — Types (no IO) | ||||||
| Pure data types shared across all crates. No database, network, or filesystem dependencies. | ||||||
|
|
||||||
| - `MemoryItem` — the fundamental unit: text + tags + scores + metadata | ||||||
| - `ExpertiseItem`, `FocusItem` — curated domain slices | ||||||
| - `IntentItem` — terminology/alias mappings | ||||||
| - `MemoryEdge` — graph edges between items | ||||||
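A minimal sketch of what a pure, IO-free `MemoryItem` could look like, assuming the "text + tags + scores + metadata" description above. The field names, the `id` field, the default `i_score` of 1.0, and the helper methods are hypothetical and are not taken from the actual crate.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape of the fundamental memory unit.
/// Field names are illustrative assumptions, not the crate's real definition.
#[derive(Debug, Clone, PartialEq)]
pub struct MemoryItem {
    pub id: String,
    pub text: String,
    pub tags: Vec<String>,
    /// Emotional valence, documented range -10 to +10.
    pub d_score: f32,
    /// Confidence multiplier, documented range 0 to 2.
    pub i_score: f32,
    pub metadata: BTreeMap<String, String>,
}

impl MemoryItem {
    /// Builds an item with neutral scores (assumed defaults: d = 0, i = 1).
    pub fn new(id: &str, text: &str) -> Self {
        Self {
            id: id.to_string(),
            text: text.to_string(),
            tags: Vec::new(),
            d_score: 0.0,
            i_score: 1.0,
            metadata: BTreeMap::new(),
        }
    }

    /// Builder-style helper that appends one tag.
    pub fn with_tag(mut self, tag: &str) -> Self {
        self.tags.push(tag.to_string());
        self
    }

    /// Exact-match tag lookup.
    pub fn has_tag(&self, tag: &str) -> bool {
        self.tags.iter().any(|t| t == tag)
    }
}
```

Keeping the type free of database or network handles is what lets every other crate depend on it without pulling in IO.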
|
|
||||||
| ### `crates/db` — Database Layer | ||||||
| All libSQL/SQLite interaction lives here. Owns the schema and migrations. | ||||||
|
|
||||||
| - `MemoryDb` — connection wrapper with sync support | ||||||
| - `migrate.rs` — schema versioning (currently v1, v2 in PR) | ||||||
|
Contributor
Suggested change
|
||||||
| - `vector_ext.rs` — sqlite-vec extension loading | ||||||
| - Insert/update/query/recall for all item types | ||||||
| - Turso embedded replica sync (`db.sync()`) | ||||||
|
|
||||||
| **Schema highlights:** | ||||||
| - `memory_items` — main table with vector embeddings column | ||||||
| - `expertise_items` + `expertise_index` — curated knowledge areas | ||||||
| - `focus_items` + `focus_index` — project/discussion-specific memory | ||||||
| - `memory_edges` — graph relationships between items | ||||||
| - `intent_items` — term/alias mappings | ||||||
|
|
||||||
| ### `crates/cli` — The `wagl` Binary (~4600 lines) | ||||||
| The primary interface, with 40+ subcommands grouped by category: | ||||||
|
|
||||||
| | Category | Commands | | ||||||
| |----------|----------| | ||||||
| | **Core CRUD** | `put`, `get`, `query`, `search`, `everything`, `forget` | | ||||||
| | **Canonical** | `canon get\|set\|list` | | ||||||
| | **Recall** | `recall` (deterministic packs), `scores` (explain scoring) | | ||||||
| | **Quality** | `reconcile`, `audit-quality`, `audit`, `import-missing` | | ||||||
| | **Lifecycle** | `sleep`, `morning`, `decay`, `gc` | | ||||||
| | **Knowledge** | `expertise`, `focus`, `promote` | | ||||||
| | **Graph** | `link`, `neighbors`, `reconstruct` | | ||||||
| | **Capture** | `capture`, `reflex` | | ||||||
| | **Data** | `bundle` (export/import), `curate`, `ingest`, `sync` | | ||||||
| | **Infra** | `init`, `status`, `stats`, `embed`, `trust`, `intent`, `serve` | | ||||||
|
|
||||||
| ### `crates/server` — HTTP REST API | ||||||
| Axum-based server started via `wagl serve`. Exposes core operations over HTTP with JSON request/response. | ||||||
|
|
||||||
| - `PUT /items` — store memory | ||||||
| - `GET /items/:id` — retrieve by ID | ||||||
| - `POST /recall` — recall packs | ||||||
| - `POST /query` — text search | ||||||
| - `POST /search` — vector search | ||||||
|
Comment on lines +84 to +88:
The router exposed by |
||||||
| - Webhook handlers for external event capture | ||||||
|
|
||||||
| ### `crates/mcp` — MCP Server (stdio transport) | ||||||
| Model Context Protocol server for direct LLM tool integration. | ||||||
|
|
||||||
| - 5 tools: `store`, `recall`, `query`, `search`, `forget` | ||||||
| - stdio transport (launched as subprocess by MCP clients) | ||||||
|
Comment on lines +94 to +95:
The MCP server registered in |
||||||
| - Separate binary: `wagl-mcp` | ||||||
|
|
||||||
| ## Data Flow: Recall | ||||||
|
|
||||||
| ``` | ||||||
| Agent asks: "What do I know about Chris's project?" | ||||||
| │ | ||||||
| ▼ | ||||||
| wagl recall "Chris's project" | ||||||
| │ | ||||||
| ├─── 1. Always include canon items | ||||||
| │ (canon:user.profile, canon:user.preferences) | ||||||
| │ | ||||||
| ├─── 2. Vector search (sqlite-vec cosine similarity) | ||||||
| │ Query → embedding → nearest neighbors | ||||||
| │ | ||||||
| ├─── 3. Keyword search (LIKE matching) | ||||||
| │ Extract significant terms → text match | ||||||
| │ | ||||||
| ├─── 4. Hybrid ranking | ||||||
| │ Vector hits ∩ keyword hits ranked first | ||||||
| │ Score: salience × recency × |d_score| boost | ||||||
| │ | ||||||
| ├─── 5. Multi-pass (if configured) | ||||||
| │ Pass 1: high-valence (|d_score| ≥ threshold) | ||||||
| │ Pass 2: standard relevance | ||||||
| │ Pass 3: recency boost | ||||||
| │ | ||||||
| └─── 6. Return ranked pack with provenance | ||||||
| Each item: text + tags + scores + "why included" | ||||||
| ``` | ||||||
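The hybrid ranking step (step 4) can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: the `Candidate` struct, the `(1 + |d_score| / 10)` form of the boost, and the tie-break on `id` are all hypothetical. The tie-break is there to keep the ordering deterministic for identical inputs.

```rust
use std::collections::HashSet;

/// Hypothetical candidate row produced by the vector and keyword searches.
#[derive(Debug, Clone)]
pub struct Candidate {
    pub id: String,
    pub salience: f64,
    pub recency: f64,
    pub d_score: f64,
}

/// Assumed scoring formula: salience × recency × (1 + |d_score| / 10).
/// The exact form of the |d_score| boost is not specified in the document.
pub fn score(c: &Candidate) -> f64 {
    c.salience * c.recency * (1.0 + c.d_score.abs() / 10.0)
}

/// Ranks items found by either search; items found by both come first.
pub fn hybrid_rank(
    candidates: &[Candidate],
    vector_hits: &HashSet<String>,
    keyword_hits: &HashSet<String>,
) -> Vec<String> {
    let mut ranked: Vec<(bool, f64, &str)> = candidates
        .iter()
        .filter(|c| vector_hits.contains(&c.id) || keyword_hits.contains(&c.id))
        .map(|c| {
            let in_both = vector_hits.contains(&c.id) && keyword_hits.contains(&c.id);
            (in_both, score(c), c.id.as_str())
        })
        .collect();

    // Order: intersection first, then descending score, then id as a stable tie-break.
    ranked.sort_by(|a, b| {
        b.0.cmp(&a.0)
            .then(b.1.total_cmp(&a.1))
            .then(a.2.cmp(b.2))
    });

    ranked.into_iter().map(|(_, _, id)| id.to_string()).collect()
}
```

Because the comparator is a total order with no random component, the same candidates and hit sets always produce the same pack order.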
|
|
||||||
| ## Scoring System | ||||||
|
|
||||||
| Three scores work together to surface the right memories: | ||||||
|
|
||||||
| - **D-Score** (feeling): -10 to +10. Emotional valence. Negative = bad experience, positive = great. | ||||||
| - **I-Score** (intuition): 0 to 2. Confidence/accuracy multiplier. | ||||||
| - **EV** (experience value): `d_score × i_score`. Ranking bonus in recall. | ||||||
|
|
||||||
| Items with high |EV| receive a larger ranking bonus and surface earlier. Items with low or zero EV rank by relevance alone. | ||||||
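As a worked example of the EV formula, here is a small sketch. Clamping the inputs to their documented ranges is an assumption about how out-of-range values might be handled, and the function name is hypothetical.

```rust
/// Experience value: d_score × i_score.
/// Inputs are clamped to the documented ranges (an assumption, not confirmed behavior):
/// d_score in [-10, 10], i_score in [0, 2], so EV falls in [-20, 20].
pub fn experience_value(d_score: f64, i_score: f64) -> f64 {
    let d = d_score.clamp(-10.0, 10.0);
    let i = i_score.clamp(0.0, 2.0);
    d * i
}
```

For instance, a strongly negative memory (`d_score = -8`) held with low confidence (`i_score = 0.5`) gives an EV of -4, so its magnitude of 4 is what would feed the ranking bonus.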
|
|
||||||
| ## Embedding Strategy | ||||||
|
|
||||||
| - **Local CLI**: embeddings via configurable OpenAI-compatible endpoint | ||||||
| - **Cloud (zumie.ai)**: Gemini `gemini-embedding-001` (768 dimensions) | ||||||
|
Contributor
The same applies to line 155:
Context Used: AGENTS.md (source) |
||||||
| - **Storage**: `Float32Array` in sqlite-vec `float32[N]` column | ||||||
| - **Fallback**: graceful degradation when embeddings unavailable (keyword-only search) | ||||||
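To illustrate the fallback behavior, here is a sketch of cosine similarity over `f32` vectors plus a mode selector that degrades to keyword-only search when no query embedding is available. This is plain Rust for illustration. It is not the sqlite-vec implementation, and `SearchMode` and `choose_mode` are hypothetical names.

```rust
/// Cosine similarity between two equal-length vectors.
/// Returns None for mismatched lengths, empty input, or a zero-norm vector.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Hypothetical search modes for the recall path.
#[derive(Debug, PartialEq)]
pub enum SearchMode {
    Hybrid,
    KeywordOnly,
}

/// Falls back to keyword-only search when the embedding is missing or empty.
pub fn choose_mode(query_embedding: Option<&[f32]>) -> SearchMode {
    match query_embedding {
        Some(v) if !v.is_empty() => SearchMode::Hybrid,
        _ => SearchMode::KeywordOnly,
    }
}
```

Returning `Option` instead of panicking on bad input keeps the caller in control of when to degrade.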
|
|
||||||
| ## Sync Architecture | ||||||
|
|
||||||
| ``` | ||||||
| Local DB (libSQL) ←──embedded replica──→ Turso Cloud | ||||||
| │ | ||||||
| └── wagl sync (push/pull) | ||||||
| ``` | ||||||
|
|
||||||
| - Local-first: works fully offline | ||||||
| - Turso embedded replicas for cloud sync | ||||||
| - Per-user isolated databases in zumie.ai (multi-tenant) | ||||||
|
|
||||||
| ## Key Design Decisions | ||||||
|
|
||||||
| 1. **Local-first, not cloud-first** — CLI works without network. Cloud is additive. | ||||||
| 2. **libSQL over SQLite** — Turso sync, vector support via sqlite-vec, same SQLite compatibility. | ||||||
| 3. **Tags over schema fields** — Canonical conventions (`canon:user.profile`) use tags to avoid schema rigidity. Schema fields added only when behavior depends on them (scores, emotions, actionable). | ||||||
| 4. **Deterministic recall** — Same query + same DB = same results. No randomness in ranking. | ||||||
| 5. **Doctrine in the binary** — `wagl skill` embeds the behavioral contract. Agents learn from the tool. | ||||||
| 6. **Multi-crate for isolation** — core (types) → db (storage) → cli/server/mcp (surfaces). No circular deps. | ||||||
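Decision 3 (tags over schema fields) can be illustrated with a small sketch that reads canonical keys out of plain tags. The `canon:` prefix comes from the example above. The function names and the choice to sort and deduplicate are assumptions made for illustration.

```rust
/// Extracts the key from a canonical tag such as "canon:user.profile".
/// Returns None for non-canonical tags and for a bare "canon:" prefix.
pub fn canon_key(tag: &str) -> Option<&str> {
    tag.strip_prefix("canon:").filter(|k| !k.is_empty())
}

/// Collects all canonical keys from an item's tags, sorted and deduplicated
/// so the result is stable regardless of tag order.
pub fn canon_keys(tags: &[String]) -> Vec<&str> {
    let mut keys: Vec<&str> = tags.iter().filter_map(|t| canon_key(t)).collect();
    keys.sort_unstable();
    keys.dedup();
    keys
}
```

Since the convention lives entirely in the tag string, adding a new canonical key needs no migration, which is the flexibility the decision describes.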
|
|
||||||
| --- | ||||||
|
|
||||||
| *Generated 2026-03-21. See `AGENTS.md` for contributor orientation.* | ||||||
`# ARCHITECTURE.md` as a rendered heading looks odd (the `.md` extension is typically omitted). Compare with `AGENTS.md`, which uses a plain prose heading. Consider: