feat: add environment + origin_domain provenance fields (#119) by GoZumie · Pull Request #127 · BigInformatics/wagl

GoZumie · 2026-03-22T01:28:17Z

What

Adds environment (TEXT) and origin_domain (TEXT) columns to memory_items, threaded through all layers.

Schema

Migration v1→v2: adds columns with defaults
CLI: --environment (default: 'cli') and --origin (default: hostname)
Server/MCP: accept in PutItemRequest
Centralized memory_item_from_row() helper — eliminates ~200 lines of duplicated row-parsing

Why

Every memory item should know where it came from. When multiple agents/surfaces write memories, provenance is critical for debugging, filtering, and trust.

Closes #119

PR Review by Greptile

Greptile Summary

This PR threads environment and origin_domain provenance fields through the full memory-item stack (schema, types, DB layer, CLI, server) and centralises row-parsing into a single memory_item_from_row() helper, eliminating ~200 lines of duplicated code. It also ships a new Day subcommand and the three daily_* tables as a schema v2 migration.

DB layer (lib.rs, migrate.rs): column additions via apply_pending_alters are correct; MEMORY_ITEM_SELECT constant and memory_item_from_row() cleanly unify all query paths; migration v1→v2 runs in a transaction and is well-tested.
Server (server/src/lib.rs): PutItemRequest correctly exposes and passes through both provenance fields.
CLI (main.rs, ingest_cmd.rs): Put and CanonPut commands default environment to "cli" and origin_domain to the system hostname. However, several other CLI-originated item constructors (People, Trust, Decide/ReDecide, import_missing_files) still set environment: None, which is inconsistent with the stated goal. The new Day subcommand's validate_day_args function also does not guard against --date being silently ignored when combined with --invalidate.
MCP handler (mcp/src/handler.rs): both fields are hardcoded to None and MemoryStoreParams has no corresponding inputs, meaning every agent-written memory will have no provenance — directly at odds with the motivation for this PR.
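
To illustrate the centralisation described above, here is a minimal, library-free sketch of a shared column projection paired with a single row parser. It is not the PR's code: the trimmed-down `MemoryItem`, the column list inside `MEMORY_ITEM_SELECT`, and the `Row` alias (a plain vector standing in for a database row) are all assumptions made for illustration.

```rust
/// Hypothetical, trimmed-down MemoryItem; the real struct has more fields.
#[derive(Debug, PartialEq)]
struct MemoryItem {
    id: String,
    text: String,
    environment: Option<String>,
    origin_domain: Option<String>,
}

/// Single source of truth for the column projection (column names assumed).
const MEMORY_ITEM_SELECT: &str = "id, text, environment, origin_domain";

/// Stand-in for a database row: one nullable text value per selected column.
type Row = Vec<Option<String>>;

/// Parses a row produced by a `SELECT {MEMORY_ITEM_SELECT} ...` query.
/// The index positions below must match the order in MEMORY_ITEM_SELECT.
fn memory_item_from_row(row: &Row) -> Result<MemoryItem, String> {
    let expected = MEMORY_ITEM_SELECT.split(',').count();
    if row.len() != expected {
        return Err(format!("expected {expected} columns, got {}", row.len()));
    }
    // Required columns fail loudly on NULL; provenance columns may be NULL.
    let required = |idx: usize, name: &str| {
        row[idx]
            .clone()
            .ok_or_else(|| format!("column {name} is NULL"))
    };
    Ok(MemoryItem {
        id: required(0, "id")?,
        text: required(1, "text")?,
        environment: row[2].clone(),
        origin_domain: row[3].clone(),
    })
}
```

Because every query path would share the same constant and the same parser, adding a column means touching two adjacent definitions rather than each query site.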

Confidence Score: 3/5

Safe to merge for the DB/server/type changes, but the MCP handler gap and inconsistent CLI environment tagging mean the primary provenance goal is only partially achieved.
The DB refactor and migration are solid and well-tested. The server-side provenance pass-through is correct. The P1 issue — MCP items silently storing NULL for both provenance fields — directly contradicts the PR's stated motivation and the MCP surface is a first-class agent write path. The inconsistent environment=None in several CLI paths (People, Trust, Decide) and the silent --date/--invalidate UX gap are meaningful correctness/consistency gaps on top of that. One targeted fix to the MCP handler (setting environment="mcp" and ideally adding fields to MemoryStoreParams) plus a pass over the None-environment CLI paths would bring this to 4+.
Files needing attention: crates/mcp/src/handler.rs (provenance silently lost), crates/cli/src/main.rs (inconsistent environment tagging and Day arg validation gap)

Important Files Changed

Filename	Overview
crates/db/src/lib.rs	Core change: adds MEMORY_ITEM_SELECT constant, centralises row-parsing in memory_item_from_row(), adds environment/origin_domain to all INSERT statements, and introduces daily_entries/daily_memories/daily tables with full CRUD. Column index mapping in memory_item_from_row() is correct against MEMORY_ITEM_SELECT.
crates/db/src/migrate.rs	Adds migrate_v1_to_v2 under a BEGIN IMMEDIATE transaction, creating the three daily narrative tables and bumping schema_version to 2. Tests cover the new migration path. The environment/origin_domain columns are handled via apply_pending_alters in init(), not this migration.
crates/core/src/types.rs	Adds optional environment and origin_domain fields to MemoryItem with #[serde(default)], and introduces DailyEntry and DailySummary structs. Non-breaking change.
crates/cli/src/main.rs	Adds --environment/--origin CLI flags to Put/CanonPut (defaulting to "cli"/hostname), adds the Day subcommand for daily narrative, and threads the new fields into existing commands. Several CLI-originated item constructors (People, Trust, Decide, ReDecide, import_missing_files) still set environment: None, inconsistent with Put which defaults to "cli". validate_day_args also misses the --date + --invalidate guard.
crates/mcp/src/handler.rs	Compiles the new MemoryItem fields by setting both to None; MemoryStoreParams has no corresponding fields so MCP callers can never supply provenance. This means every agent-written memory has environment=NULL, directly undermining the PR's stated provenance goal.
crates/server/src/lib.rs	Correctly adds environment and origin_domain to PutItemRequest and threads them through to the MemoryItem constructor.
crates/cli/src/ingest_cmd.rs	Sets environment to "cli" for transcript items but hardcodes origin_domain: None instead of calling default_origin_domain(), inconsistent with the Put command path.
crates/cli/tests/temporal_recall_smoke.rs	Test fixtures updated to include the new fields as None; no functional changes to the test logic.

Sequence Diagram

sequenceDiagram
    participant CLI as CLI (Put/Canon)
    participant Ingest as CLI (Ingest)
    participant MCP as MCP Handler
    participant Server as HTTP Server
    participant DB as MemoryDb

    CLI->>DB: put(MemoryItem)<br/>environment="cli", origin_domain=hostname
    Ingest->>DB: put(MemoryItem)<br/>environment="cli", origin_domain=None
    MCP->>DB: put(MemoryItem)<br/>environment=None, origin_domain=None
    Server->>DB: put(MemoryItem)<br/>environment=req.environment, origin_domain=req.origin_domain

    note over DB: memory_items<br/>+ environment TEXT<br/>+ origin_domain TEXT

    DB-->>CLI: MemoryItem (with provenance)
    DB-->>Ingest: MemoryItem (partial provenance)
    DB-->>MCP: MemoryItem (no provenance)
    DB-->>Server: MemoryItem (caller-supplied provenance)

Comments Outside Diff (1)

crates/cli/src/main.rs, lines 298-310

--date silently ignored when combined with --invalidate

validate_day_args rejects --date paired with --update and with --list, but there is no corresponding guard for --date --invalidate. Since the invalidation path looks up the entry by ID only and never reads date, passing --date with --invalidate will succeed without error while the date is silently dropped, which is likely to confuse users.

Consider adding the mirror check:
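
The reviewer's suggested snippet is not preserved in this copy of the conversation. As a rough sketch of what such a guard could look like, assuming a hypothetical `DayArgs` struct whose fields mirror the flags named above (the field types and error messages are illustrative, not taken from the PR):

```rust
/// Hypothetical subset of the `wagl day` arguments; field names are assumed
/// from the flags mentioned in the review, not copied from the PR.
#[derive(Default)]
struct DayArgs {
    date: Option<String>,
    update: Option<String>,
    list: bool,
    /// ID of the entry to invalidate.
    invalidate: Option<String>,
}

fn validate_day_args(args: &DayArgs) -> Result<(), String> {
    if args.date.is_some() {
        // Guards the review says already exist.
        if args.update.is_some() {
            return Err("--date cannot be combined with --update".to_string());
        }
        if args.list {
            return Err("--date cannot be combined with --list".to_string());
        }
        // The missing mirror check: invalidation looks entries up by ID only,
        // so a supplied date would otherwise be dropped without notice.
        if args.invalidate.is_some() {
            return Err("--date cannot be combined with --invalidate".to_string());
        }
    }
    Ok(())
}
```

Rejecting the combination up front turns a silently ignored flag into an explicit error message.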

Last reviewed commit: "feat: add environmen..."

Greptile also left 2 inline comments on this PR.

…es (#119, #120)

Schema migration v1→v2:
- memory_items: new environment TEXT and origin_domain TEXT columns
- New tables: daily_entries (append-only), daily_memories (junction), daily (materialized summary)

Core types:
- MemoryItem: add environment, origin_domain fields
- New DailyEntry, DailySummary structs

DB layer:
- Centralized memory_item_from_row() helper (eliminates ~200 lines of duplicated row-parsing code across get/query/recall/everything)
- MEMORY_ITEM_SELECT constant for consistent column projection
- environment/origin_domain threaded through all INSERT and SELECT paths
- TranscriptOpts, NoteOpts, DecisionOpts: add provenance fields
- New daily methods: insert_daily_entry, get_daily_entries, list_daily_dates, invalidate_daily_entry, link_daily_memory, refresh_daily_summary

CLI:
- Cmd::Put: --environment and --origin flags (defaults: cli, hostname)
- Cmd::Day: full subcommand with --update, --date, --context, --list, --include-invalidated, --invalidate, --reason

Server + MCP:
- PutItemRequest: accept environment/origin_domain
- MCP handler: pass through provenance fields

Tests:
- Migration tests updated for v2 schema
- New migration_v1_to_v2_adds_daily_tables test

greptile-apps · 2026-03-22T01:32:40Z

crates/mcp/src/handler.rs

+            environment: None,
+            origin_domain: None,


MCP items silently lose all provenance

environment and origin_domain are both hardcoded to None in the MCP handler, and MemoryStoreParams has no corresponding fields, so there's no way for an MCP caller to supply them either. The PR's stated motivation is:

"When multiple agents/surfaces write memories, provenance is critical for debugging, filtering, and trust."

The MCP surface is precisely the primary path for agent-driven memory writes, yet every item stored through it will have NULL for both provenance fields. At a minimum, you should tag items with the surface they arrived on so the goal is actually achieved:

Suggested change

-            environment: None,
-            origin_domain: None,
+            environment: Some("mcp".to_string()),
+            origin_domain: None,

Separately, consider adding environment and origin_domain fields to MemoryStoreParams so MCP callers can supply their own provenance, the same way the HTTP server does in PutItemRequest.
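
One possible shape for that follow-up, sketched under assumptions: the `MemoryStoreParams` fields shown here, the `Provenance` struct, and the `mcp_provenance` helper are hypothetical names, and letting the caller override `environment` is a design choice this sketch makes rather than something the PR confirms.

```rust
/// Hypothetical subset of the MCP `memory_store` parameters.
#[derive(Default)]
struct MemoryStoreParams {
    text: String,
    /// Optional caller-supplied provenance (assumed additions).
    environment: Option<String>,
    origin_domain: Option<String>,
}

/// Resolved provenance for a single write.
#[derive(Debug, PartialEq)]
struct Provenance {
    environment: Option<String>,
    origin_domain: Option<String>,
}

/// Trims the value and treats blank strings as absent.
fn non_blank(value: Option<&str>) -> Option<String> {
    value
        .map(str::trim)
        .filter(|s| !s.is_empty())
        .map(str::to_string)
}

/// Uses caller-supplied values when present; otherwise tags the write with
/// the surface it arrived on so the row is never left without an environment.
fn mcp_provenance(params: &MemoryStoreParams) -> Provenance {
    Provenance {
        environment: non_blank(params.environment.as_deref())
            .or_else(|| Some("mcp".to_string())),
        origin_domain: non_blank(params.origin_domain.as_deref()),
    }
}
```

With a fallback like this, rows written through MCP stay distinguishable from legacy rows even when the caller sends no provenance at all.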

greptile-apps · 2026-03-22T01:32:41Z

crates/cli/src/main.rs

                            id,
                            r#type: "people".to_string(),


Inconsistent environment across CLI-originated items

These people and trust-adjustment items are created directly by the CLI, but environment is None while the Put and CanonPut commands always default it to "cli". The same gap exists in several other places added by this PR:

crates/cli/src/main.rs – People command (environment: None, ~line 3596) and trust-adjustment observation (~line 3650)

crates/cli/src/main.rs – Decide/ReDecide paths (environment: None, ~lines 4117 and 4234)

crates/cli/src/main.rs – import_missing_files helper (environment: None, ~line 4744)

crates/cli/src/ingest_cmd.rs – Transcript ingestion sets environment: Some("cli") but origin_domain: None instead of default_origin_domain()

If provenance is the goal, all CLI-originated writes should consistently carry environment: Some("cli".to_string()) and call default_origin_domain() for origin_domain, or at least document why some paths intentionally omit it.
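
One way to keep the CLI paths consistent is to build every CLI-originated item from a shared base value using struct update syntax. The sketch below assumes a simplified `MemoryItem` and hypothetical helper names (`cli_item_base`, `people_item`, `decision_item`); the real constructors in main.rs carry more fields.

```rust
/// Hypothetical, simplified MemoryItem for illustration only.
#[derive(Debug, Clone, PartialEq, Default)]
struct MemoryItem {
    id: String,
    r#type: String,
    text: String,
    environment: Option<String>,
    origin_domain: Option<String>,
}

/// Base value for any CLI-originated write: the environment is always "cli",
/// and the origin is whatever the caller resolved (for example the hostname).
fn cli_item_base(origin_domain: Option<String>) -> MemoryItem {
    MemoryItem {
        environment: Some("cli".to_string()),
        origin_domain,
        ..Default::default()
    }
}

/// Each command sets only its own fields and inherits provenance from the base.
fn people_item(id: String, text: String, origin_domain: Option<String>) -> MemoryItem {
    MemoryItem {
        id,
        r#type: "people".to_string(),
        text,
        ..cli_item_base(origin_domain)
    }
}

fn decision_item(id: String, text: String, origin_domain: Option<String>) -> MemoryItem {
    MemoryItem {
        id,
        r#type: "decision".to_string(),
        text,
        ..cli_item_base(origin_domain)
    }
}
```

Because no constructor spells out `environment: None`, a new command cannot forget to tag its writes without bypassing the base helper.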

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 800ef1db3b

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-22T01:35:03Z

crates/mcp/src/handler.rs

+            environment: None,
+            origin_domain: None,


Expose provenance fields in MCP memory_store

memory_store still hard-codes environment and origin_domain to None, and MemoryStoreParams has no matching inputs, so every memory written through the MCP surface loses the provenance this change is trying to add. In deployments that ingest memories via MCP, downstream filtering/debugging by source will never work because those rows are indistinguishable from legacy items.

GoZumie · 2026-03-22T01:49:23Z

Self-review notes:

✅ default_origin_domain() correctly tries HOSTNAME then COMPUTERNAME, trims, and filters empty strings.
✅ Both Cmd::Put and CanonCmd::Set accept --environment and --origin flags.
✅ memory_item_from_row() centralization eliminates ~200 lines of duplicated parsing — this is the highest-value change.
✅ Schema migration v1→v2 adds columns with sensible defaults.
✅ Server/MCP properly thread environment/origin_domain through PutItemRequest.
✅ Ingest command sets environment="cli" for transcripts.
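
For readers without the diff at hand, the hostname fallback described in the first item could look roughly like the sketch below. The body is a reconstruction from that one-line description, not the PR's code; in particular, the injectable `origin_domain_from` helper and the choice to fall through to COMPUTERNAME when HOSTNAME is set but blank are assumptions.

```rust
use std::env;

/// Resolves an origin from a key lookup: tries HOSTNAME, then COMPUTERNAME,
/// trims each value, and skips values that are empty after trimming.
/// The lookup is injected so the logic can be tested without touching the
/// process environment (this split is an assumption of the sketch).
fn origin_domain_from(lookup: impl Fn(&str) -> Option<String>) -> Option<String> {
    ["HOSTNAME", "COMPUTERNAME"]
        .iter()
        .filter_map(|key| lookup(*key))
        .map(|value| value.trim().to_string())
        .find(|value| !value.is_empty())
}

/// Reads the real process environment.
fn default_origin_domain() -> Option<String> {
    origin_domain_from(|key| env::var(key).ok())
}
```

Note that HOSTNAME is often a shell variable that is not exported to child processes, so a lookup like this can return `None` even on a machine that has a hostname.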

⚠️ Note: PR #128 (daily tables) was cherry-picked from the same combined commit. Merge #127 first, then #128 can rebase to show only the daily-table delta.

…omain Addresses Greptile P1: MCP items were silently losing all provenance (environment and origin_domain hardcoded to None).

GoZumie · 2026-03-22T01:54:36Z

Pushed fix for P1 finding:

✅ Greptile P1: MCP items lose provenance — environment now defaults to "mcp", origin_domain accepted as parameter
ℹ️ Greptile P2 (inconsistent environment across CLI paths): Valid observation — trust/people commands don't pass provenance. Low priority since those are admin commands, not memory storage.
ℹ️ Codex P2: Same as Greptile P1 — fixed.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ae7e4264de

chatgpt-codex-connector · 2026-03-22T01:59:29Z

crates/cli/src/main.rs

+                        environment: None,
+                        origin_domain: None,


Tag capture/reflex writes with the new provenance fields

The wagl capture branch still constructs MemoryItem with both provenance fields set to None, and the Reflex wrapper repeats the same pattern a few lines below. In any setup that relies on policy-governed capture for decisions/incidents/deploys, every captured row remains indistinguishable from legacy items, so filtering or debugging by source will still fail on a first-class CLI write path.

chatgpt-codex-connector · 2026-03-22T01:59:29Z

crates/cli/src/ingest_cmd.rs

+                environment: Some("cli".to_string()),
+                origin_domain: None,


Preserve host provenance when ingesting transcripts

run_ingest_transcripts now marks imported turns as environment="cli", but it still hard-codes origin_domain to None. When transcript archives are imported from multiple hosts or agents, the new origin_domain column never distinguishes those memories, so the bulk-ingest path still loses half of the provenance this change is trying to add.

GoZumie requested a review from ChrisCompton as a code owner March 22, 2026 01:28

greptile-apps bot reviewed Mar 22, 2026

chatgpt-codex-connector bot reviewed Mar 22, 2026

This was referenced Mar 22, 2026

feat: daily narrative memory tables + wagl day CLI (#120) #128

Merged

docs: ARCHITECTURE.md — system overview + data flow diagrams (#116) #135

Merged

fix: MCP memory_store now sets environment='mcp' and accepts origin_d…

ae7e426

…omain Addresses Greptile P1: MCP items were silently losing all provenance (environment and origin_domain hardcoded to None).

chatgpt-codex-connector bot reviewed Mar 22, 2026

GoZumie merged commit b4b5a38 into dev Mar 23, 2026
4 checks passed

GoZumie deleted the feat/environment-origin-domain branch March 23, 2026 10:29

GoZumie merged 2 commits into dev from feat/environment-origin-domain
