Merged
29 changes: 5 additions & 24 deletions AGENTS.md
@@ -3,54 +3,35 @@

### Architecture

<!-- lore:019c8f8c-47c3-71a2-b5fd-248a2cfeba78 -->
* **Lore temporal pruning runs after distillation and curation on session.idle**: In src/index.ts, session.idle awaits backgroundDistill and backgroundCurate sequentially before running temporal.prune(). Ordering is critical: pruning must not delete unprocessed messages. Pruning defaults: 120-day retention, 1GB max storage (in .lore.json under pruning.retention and pruning.maxStorage). These generous defaults were chosen because the system was new — earlier proposals of 7d/200MB were based on insufficient data.
<!-- lore:019c904b-791e-772a-ab2b-93ac892a960c -->
* **buildSection() renders AGENTS.md directly without formatKnowledge**: AGENTS.md export/import: buildSection() iterates DB entries grouped by category, emitting `<!-- lore:UUID -->` markers, serialized via remark. splitFile() scans ALL_START_MARKERS (current + historical) — self-healing: N duplicate sections collapse to 1 on next export. Import dedup handled by curator LLM at startup when file changed. Missing marker = hand-written; duplicate UUID = first wins. ltm.create() has title-based dedup guard (case-insensitive, skipped for explicit IDs from cross-machine import). Marker text says 'maintained by the coding agent' (not 'auto-maintained') so LLM agents include it in commits.
<!-- lore:019c8f4f-67c8-7cf4-b93b-c5ec46ed94b6 -->
* **Lore DB uses incremental auto_vacuum to prevent free-page bloat**: Lore's SQLite DB uses incremental auto_vacuum (schema version 3 migration) to prevent free-page bloat from deletions. The migration sets `PRAGMA auto_vacuum = INCREMENTAL` then `VACUUM` outside a transaction. temporal_messages is the primary storage consumer (~51MB); the knowledge table is tiny.
<!-- lore:019c94bd-042d-73c0-b850-192d1d62fa68 -->
* **Knowledge entry distribution across projects — worktree sessions create separate project IDs**: Knowledge entries are scoped by project_id from ensureProject(projectPath). OpenCode worktree sessions (paths like `~/.local/share/opencode/worktree/<hash>/<slug>/`) each get their own project_id. A single repo can have multiple project_ids: one for the real path, separate ones per worktree session. Project-specific entries (cross_project=0) are invisible across different project_ids. Cross-project entries (cross_project=1) are shared globally.
<!-- lore:019c94bd-042b-7215-b0a0-05719fcd39b2 -->
* **LTM injection pipeline: system transform → forSession → formatKnowledge → gradient deduction**: LTM is injected via experimental.chat.system.transform hook. Flow: getLtmBudget() computes ceiling as (contextLimit - outputReserved - overhead) * ltmFraction (default 10%, configurable 2-30%). forSession() loads project-specific entries unconditionally + cross-project entries scored by term overlap, greedy-packs into budget. formatKnowledge() renders as markdown. setLtmTokens() records consumption so gradient deducts it. Key: LTM goes into output.system (system prompt), not the message array — invisible to tryFit(), counts against overhead budget.
<!-- lore:019c8f8c-47c3-71a2-b5fd-248a2cfeba78 -->
* **Lore temporal pruning runs after distillation and curation on session.idle**: In src/index.ts, the session.idle handler now awaits both backgroundDistill and backgroundCurate (changed from fire-and-forget) before running temporal.prune(). This ordering is critical: pruning must run after both pipelines complete so it never deletes messages that haven't been processed. The prune call is wrapped in try/catch and logs when rows are deleted. Config comes from cfg.pruning.retention (days) and cfg.pruning.maxStorage (MB).
<!-- lore:019c904b-791e-772a-ab2b-93ac892a960c -->
* **buildSection() renders AGENTS.md directly without formatKnowledge**: AGENTS.md export/import architecture: buildSection() iterates DB entries grouped by category, emitting `<!-- lore:UUID -->` markers before each bullet, serialized via remark. splitFile() scans ALL_START_MARKERS (current + historical marker text) to find every lore section span — self-healing: N duplicate sections collapse to 1 on next export. Adding new marker variants requires only appending to ALL_START_MARKERS. Import dedup: curator LLM handles semantic dedup at import time (startup when file changed). For merge conflicts: missing marker = hand-written (curator deduplicates); duplicate UUID = first wins; malformed = hand-written. ltm.create() has a title-based dedup guard (case-insensitive, skipped for explicit IDs from cross-machine import). LLM agents skip AGENTS.md in commits because it looks auto-generated — fixed via system prompt instruction and changing marker text from 'auto-maintained' to 'maintained by the coding agent'.
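
The splitFile() self-healing behavior described in these entries can be sketched as follows. The marker strings and the end-marker name here are invented for illustration; only the scan-all-variants and duplicate-collapse behavior mirrors the entries.

```typescript
// Sketch only: marker text below is assumed, not the real lore markers.
const ALL_START_MARKERS = [
  "<!-- lore:section maintained by the coding agent -->", // current wording
  "<!-- lore:section auto-maintained -->",                // historical wording
];
const END_MARKER = "<!-- lore:section:end -->";

/** Remove every lore section (any marker variant). N duplicate sections
 *  collapse to zero here; the next export re-emits exactly one. */
function splitFile(text: string): { rest: string; sections: string[] } {
  const sections: string[] = [];
  let rest = text;
  for (const start of ALL_START_MARKERS) {
    let i: number;
    while ((i = rest.indexOf(start)) !== -1) {
      const end = rest.indexOf(END_MARKER, i);
      const stop = end === -1 ? rest.length : end + END_MARKER.length;
      sections.push(rest.slice(i, stop));
      rest = rest.slice(0, i) + rest.slice(stop);
    }
  }
  return { rest, sections };
}
```

Because the scan covers historical marker variants too, a file that accumulated duplicate sections (e.g. from a bad merge) heals itself on the next export cycle.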

### Decision

<!-- lore:019c904b-7924-7187-8471-8ad2423b8946 -->
* **Curator prompt scoped to code-relevant knowledge only**: CURATOR_SYSTEM in src/prompt.ts now explicitly excludes: general ecosystem knowledge available online, business strategy and marketing positioning, product pricing models, third-party tool details not needed for development, and personal contact information. This was added after the curator extracted entries about OpenWork integration strategy (including an email address), Lore Cloud pricing tiers, and AGENTS.md ecosystem facts — none of which help an agent write code. The curatorUser() function also appends guidance to prefer updating existing entries over creating new ones for the same concept, reducing duplicate creation.
<!-- lore:019c8f8c-47c6-7e5a-8e93-7721dc1378dc -->
* **Lore pruning defaults: 120-day retention, 1GB size cap**: User chose 120 days retention (not the originally proposed 7 days) and 1GB max storage (not 200MB). Rationale: the system is new and 7 days/200MB was based on only one week of data from a one-week-old system — not indicative of real growth. The generous defaults preserve recall capability and historical context. Config is in .lore.json under pruning.retention (days) and pruning.maxStorage (MB).
<!-- lore:dd60622e-6cf3-48c7-9715-f44fb054e150 -->
* **Use uuidv7 npm package for knowledge entry IDs**: User chose the `uuidv7` npm package (https://npmx.dev/package/uuidv7, by LiosK) over a self-contained ~15 line implementation. The package is RFC 9562 compliant, provides a `uuidv7()` function that returns standard UUID string format, has a 42-bit counter for sub-millisecond monotonic ordering, and is clean/minimal. Usage: `import { uuidv7 } from 'uuidv7'; const id = uuidv7();` replaces `crypto.randomUUID()` in `ltm.ts:31`. Added as a runtime dependency in package.json.
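
For illustration only — the decision above deliberately favors the package over hand-rolling — a minimal RFC 9562-shaped generator looks roughly like this. It omits the package's 42-bit counter, so IDs created within the same millisecond are not ordered; `uuidv7Sketch` is an invented name.

```typescript
import { randomBytes } from "node:crypto";

// Sketch of the UUIDv7 layout: 48-bit big-endian ms timestamp, version
// nibble 7, RFC variant bits, remaining bits random. The real `uuidv7`
// package adds a counter for sub-millisecond monotonicity; this does not.
function uuidv7Sketch(): string {
  const ts = Date.now();
  const b = randomBytes(16);
  b[0] = Math.floor(ts / 2 ** 40) & 0xff;
  b[1] = Math.floor(ts / 2 ** 32) & 0xff;
  b[2] = Math.floor(ts / 2 ** 24) & 0xff;
  b[3] = Math.floor(ts / 2 ** 16) & 0xff;
  b[4] = Math.floor(ts / 2 ** 8) & 0xff;
  b[5] = ts & 0xff;
  b[6] = (b[6] & 0x0f) | 0x70; // version 7
  b[8] = (b[8] & 0x3f) | 0x80; // RFC variant
  const hex = b.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

The leading timestamp bytes are what make v7 IDs sort roughly by creation time, which is the property the knowledge DB benefits from.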

### Gotcha

<!-- lore:019ca60f-0390-74ce-9381-b4296d94a553 -->
* **React useState async pitfall**: React useState setter is async — reading state immediately after setState returns stale value in dashboard components
<!-- lore:019ca60f-036f-7341-bbb7-dda2ea63294f -->
* **TypeScript strict mode caveat**: TypeScript strict null checks require explicit undefined handling
<!-- lore:019c91d6-04af-7334-8374-e8bbf14cb43d -->
* **Calibration used DB message count instead of transformed window count — caused layer 0 false passthrough**: Multiple gradient calibration bugs caused context overflow: (1) Calibration used DB message count instead of transformed window count — after compression (e.g. 50 msgs), delta saw ~1 new message → layer 0 passthrough → all raw messages sent → overflow. Fix: use getLastTransformedCount(). (2) actualInput formula omitted cache.write — on cold-cache turns actualInput became ~3 instead of 150K → layer 0 → overflow. Fix: include cache.write in both the formula and calibration guard. (3) Trailing pure-text assistant messages after tryFit cause Anthropic prefill errors. But messages with ANY tool parts must NOT be dropped (SDK converts to tool_result user-role). Drop predicate: `hasToolParts`, not `hasPendingTool`. (4) Stats PATCH on message parts was removed — it was write-only dead code causing system-reminder persistence bug. Don't mutate parts you don't own.
* **Calibration used DB message count instead of transformed window count — caused layer 0 false passthrough**: Lore gradient calibration bugs that caused context overflow: (1) Used DB message count instead of transformed window count — after compression, delta saw ~1 new msg → layer 0 passthrough → overflow. Fix: getLastTransformedCount(). (2) actualInput omitted cache.write — cold-cache turns showed ~3 tokens instead of 150K → layer 0. Fix: include cache.write. (3) Trailing pure-text assistant messages cause Anthropic prefill errors, but messages with tool parts must NOT be dropped (SDK converts to tool_result user-role). Drop predicate: `hasToolParts`. (4) Don't mutate message parts you don't own — removed stats PATCH that caused system-reminder persistence bug.
<!-- lore:019c91c0-cdf3-71c9-be52-7f6441fb643e -->
* **Lore plugin only protects projects where it's registered in opencode.json**: The lore gradient transform only runs for projects with lore registered in opencode.json (or globally in ~/.config/opencode/). Projects without it get zero context management — messages accumulate until overflow triggers a stuck compaction loop. This caused a 404K-token overflow in a getsentry/cli session with no opencode.json.
<!-- lore:019c91ad-4d47-7afc-90e0-239a9eda57a4 -->
* **Stuck compaction loops leave orphaned user+assistant message pairs in DB**: When OpenCode compaction overflows, it creates paired user+assistant messages per retry (assistant has error.name:'ContextOverflowError', mode:'compaction'). These accumulate and worsen the session. Recovery: find last good assistant message (has tokens, no error), delete all messages after it from both `message` and `part` tables. Use json_extract(data, '$.error.name') to identify compaction debris.
<!-- lore:019c8f4f-67ca-7212-a8c4-8a75b230ceea -->
* **Lore test suite uses live DB — no test isolation for db.test.ts**: The lore test suite (test/db.test.ts, test/ltm.test.ts) uses the live DB at ~/.local/share/opencode-lore/lore.db — no LORE_DB_PATH override. Test fixtures create entries with 019c9026-* UUIDs that persist and leak into AGENTS.md exports. Known leaked entries: 'Kubernetes deployment pattern', 'TypeScript strict mode caveat', 'React useState async pitfall', 'Fine entry'. These require periodic manual cleanup from the DB. Fix needed: set LORE_DB_PATH to a temp file in tests.
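
The isolation fix called for in the last entry can be sketched as below, assuming the DB module reads LORE_DB_PATH at load time (so this must run before the module is imported, e.g. in a test setup/preload file). The mkdtemp pattern is standard Node; everything else follows the entry.

```typescript
// Point LORE_DB_PATH at a throwaway file before the lore DB module loads,
// so test fixtures never touch ~/.local/share/opencode-lore/lore.db.
import { mkdtempSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const dir = mkdtempSync(join(tmpdir(), "lore-test-")); // unique per test run
process.env.LORE_DB_PATH = join(dir, "lore.db");
// ...subsequent imports of the DB module open the temp file, and fixture
// UUIDs (019c9026-*) can no longer leak into real AGENTS.md exports.
```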

### Pattern

<!-- lore:019ca60f-0371-7021-9a40-bf59215147e1 -->
* **Kubernetes deployment pattern**: Use helm charts for Kubernetes deployments with resource limits
<!-- lore:019c8ae9-2e54-7276-966a-befe699db589 -->
* **Use SDK internal client for HTTP requests in OpenCode plugins**: OpenCode plugins should use `(ctx.client as any)._client.patch()` instead of raw fetch() with ctx.serverUrl. The _client is the HeyAPI Client with correct base URL and interceptors, avoiding ConnectionRefused in TUI-only mode. Supports path interpolation. Caveat: _client is private/undocumented. Note: lore's use of this pattern was removed — prefer dedicated endpoints over part mutation.

### Preference

<!-- lore:019ca60f-0376-72e0-805c-e1927dacdbed -->
* **General coding preference**: Prefer explicit error handling over silent failures
<!-- lore:019ca60f-0335-786c-83bd-a5cfcccbf223 -->
* **Code style**: User prefers no backwards-compat shims, fix callers directly
<!-- lore:019ca19d-fc02-7657-b2e9-7764658c01a5 -->
* **Code style**: User prefers no backwards-compat shims — fix callers directly. Prefer explicit error handling over silent failures. Derive thresholds from existing constants rather than hardcoding magic numbers (e.g., use `raw.length <= COL_COUNT` instead of `n < 10_000`). In CI, define shared env vars at workflow level, not per-job.
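
The derive-from-constants preference can be made concrete with a toy sketch; COL_COUNT and isWithinBounds are invented names for the example.

```typescript
// Invented example of "derive thresholds from existing constants".
const COL_COUNT = 10_000; // pretend this constant already exists in the module

// Derived from the constant — stays correct if COL_COUNT ever changes:
const isWithinBounds = (raw: unknown[]): boolean => raw.length <= COL_COUNT;

// Avoided: a detached magic number such as `raw.length < 10_000`, which
// silently drifts out of sync when the real constant is updated.
```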

72 changes: 72 additions & 0 deletions README.md
@@ -110,6 +110,78 @@ To use a local clone instead of the published package:
}
```

## Configuration

Create a `.lore.json` file in your project root to customize behavior. All fields are optional — defaults are shown below:

```jsonc
{
// Set to false to disable long-term knowledge entirely. Temporal storage,
// distillation, gradient context management, and the recall tool (conversation
// search) remain active; only the curator, knowledge injection, and AGENTS.md
// sync are turned off.
"knowledge": { "enabled": true },

// Tune the curator that extracts knowledge from conversations.
"curator": {
"enabled": true, // set false to stop extracting knowledge entries
"onIdle": true, // run curation when a session goes idle
"afterTurns": 10, // run curation after N user turns
"maxEntries": 25 // consolidate when entries exceed this count
},

// AGENTS.md export/import — the universal agents file format.
"agentsFile": {
"enabled": true, // set false to disable AGENTS.md sync
"path": "AGENTS.md" // change to e.g. "CLAUDE.md" or ".cursor/rules/lore.md"
},

// Context budget fractions (of usable context window).
"budget": {
"distilled": 0.25, // distilled history prefix
"raw": 0.4, // recent raw messages
"output": 0.25, // reserved for model output
"ltm": 0.10 // long-term knowledge in system prompt (2-30%)
},

// Distillation thresholds.
"distillation": {
"minMessages": 8, // min undistilled messages before distilling
"maxSegment": 50 // max messages per distillation chunk
},

// Temporal message pruning.
"pruning": {
"retention": 120, // days to keep distilled messages
"maxStorage": 1024 // max storage in MB before emergency pruning
},

// Include cross-project knowledge entries. Default: true.
"crossProject": true
}
```
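
As a rough sense of scale, the fractions above translate to token budgets like this, assuming a 200K-token context window. Lore's real accounting also subtracts overhead before applying fractions (and the `ltm` fraction applies after output reserve), so treat this as an illustration only.

```typescript
// Illustration: fractional budgets → token counts for an assumed
// 200K-token context window. Not Lore's actual accounting.
const contextLimit = 200_000;
const budget = { distilled: 0.25, raw: 0.4, output: 0.25, ltm: 0.1 };

const tokens = Object.fromEntries(
  Object.entries(budget).map(([name, frac]) => [name, Math.round(contextLimit * frac)]),
);
// tokens.output → 50000 tokens reserved for the model's reply
// tokens.ltm    → 20000 tokens of knowledge in the system prompt
```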

### Disabling long-term knowledge

If you prefer to manage long-term knowledge yourself and only want conversation search capabilities, set:

```json
{
"knowledge": { "enabled": false }
}
```

This disables:
- **Knowledge extraction** — the curator won't extract patterns, decisions, or gotchas from conversations
- **Knowledge injection** — no knowledge entries are added to the system prompt
- **AGENTS.md sync** — no import/export of the agents file

This keeps active:
- **Temporal storage** — all messages are still stored and searchable
- **Distillation** — conversations are still distilled for context management
- **Gradient context manager** — context window is still managed automatically
- **The `recall` tool** — the agent can still search conversation history and distillations (knowledge search is skipped)

## What to expect

Once Lore is active, you should notice several changes:
10 changes: 10 additions & 0 deletions src/config.ts
@@ -23,6 +23,16 @@ export const LoreConfig = z.object({
      metaThreshold: z.number().min(3).default(10),
    })
    .default({}),
  knowledge: z
    .object({
      /** Set to false to disable long-term knowledge storage and system-prompt injection.
       * Conversation recall (temporal search, distillation search) and context management
       * (gradient transform, distillation) remain fully active. Disabling this turns off
       * the curator, knowledge DB writes, AGENTS.md sync, and LTM injection into the
       * system prompt. Default: true. */
      enabled: z.boolean().default(true),
    })
    .default({}),
  curator: z
    .object({
      enabled: z.boolean().default(true),