diff --git a/.planning/ROADMAP.md b/.planning/ROADMAP.md
index bc9c95d..a3262de 100644
--- a/.planning/ROADMAP.md
+++ b/.planning/ROADMAP.md
@@ -13,6 +13,9 @@
 - ✅ **v9.0 Multi-Runtime Support** — Multi-runtime converter system (shipped 2026-03-16)
 - ✅ **v9.1.0 Generic Skills-Based Runtime Portability** — Phases 26-28 (shipped 2026-03-16)
 - ✅ **v9.2.0 Documentation Accuracy Audit** — Phases 29-33 (completed 2026-03-17)
+- ⬜ **v9.3.0 LangExtract + Config Spec** — Phases 34-35
+  - Phase 34: Config command spec + file watcher step (12-step wizard formalized)
+  - Phase 35: LangExtract document graph extractor (multi-provider, retire LLMEntityExtractor as default)

 ## Phases

@@ -245,7 +248,9 @@ Plans:
 | 31 | 2/2 | Complete | 2026-03-17 | 2026-03-17 |
 | 32 | v9.2.0 | Complete | 2026-03-17 | 2026-03-17 |
 | 33 | v9.2.0 | Complete | 2026-03-17 | 2026-03-17 |
+| 34 | v9.3.0 | Complete | 2026-03-17 | 2026-03-17 |
+| 35 | v9.3.0 | Complete | 2026-03-17 | 2026-03-17 |

 ---
 *Roadmap created: 2026-02-07*
-*Last updated: 2026-03-17 — Phase 33 complete (2/2 plans), v9.2.0 milestone complete (11/11 plans)*
+*Last updated: 2026-03-17 — Phases 34-35 complete, v9.3.0 milestone (LangExtract + Config Spec)*
diff --git a/.planning/phases/34-config-command-spec/SPEC.md b/.planning/phases/34-config-command-spec/SPEC.md
new file mode 100644
index 0000000..155e9d7
--- /dev/null
+++ b/.planning/phases/34-config-command-spec/SPEC.md
@@ -0,0 +1,409 @@
+# Phase 34: Config Command Spec
+
+## Purpose
+
+Formal specification for the `/agent-brain:agent-brain-config` command.
+
+This document is the **source of truth** for the 12-step wizard behavior. The command file
+(`agent-brain-plugin/commands/agent-brain-config.md`) is the implementation; this spec is the
+contract. Any drift between the two is a bug.
+ +--- + +## Trigger Conditions + +The command is invoked explicitly: +``` +/agent-brain:agent-brain-config +``` + +It is also referenced as a prerequisite from: +- `/agent-brain:agent-brain-setup` (full setup wizard) +- `/agent-brain:agent-brain-install` (post-install config step) + +--- + +## 12-Step Wizard + +### Step 1: Detect Config File Location + +**Goal:** Identify which config file is active so subsequent steps edit the correct file. + +**Action:** +```bash +agent-brain config path +agent-brain config show +``` + +**Config search order (highest to lowest priority):** +1. `AGENT_BRAIN_CONFIG` env var +2. `$AGENT_BRAIN_STATE_DIR/config.yaml` +3. `./config.yaml` +4. `.agent-brain/config.yaml` or `.claude/agent-brain/config.yaml` (walk up from CWD) +5. `~/.config/agent-brain/config.yaml` (XDG preferred) +6. `~/.agent-brain/config.yaml` (legacy, deprecated) + +**Output:** Path to the active config file. + +--- + +### Step 2: Run Pre-Flight Detection + +**Goal:** Consolidate environment state into a single JSON blob used by all subsequent steps. + +**Action:** Run `ab-setup-check.sh` script from the plugin. + +**Output keys:** +| Key | Type | Description | +|-----|------|-------------| +| `ollama_running` | bool | Whether Ollama is reachable on localhost:11434 | +| `docker_available` | bool | Whether Docker is installed | +| `config_file_path` | str | Active config file path | +| `available_postgres_port` | int | First free port in 5432-5442 range | +| `large_dirs` | list | Dirs with >1000 files or >100MB (for exclude suggestions) | + +--- + +### Step 3: Provider Selection + +**Goal:** Choose the embedding + summarization provider stack. 
+ +**AskUserQuestion options:** +| # | Name | Providers | Keys Required | +|---|------|-----------|---------------| +| 1 | Ollama (Local) | ollama/nomic-embed-text + ollama/llama3.2 | None | +| 2 | OpenAI + Anthropic | openai/text-embedding-3-large + anthropic/claude-haiku | OPENAI_API_KEY, ANTHROPIC_API_KEY | +| 3 | Google Gemini | gemini/text-embedding-004 + gemini/gemini-2.0-flash | GOOGLE_API_KEY | +| 4 | Custom Mix | User-chosen | Varies | +| 5 | Ollama + Mistral | ollama/nomic-embed-text + ollama/mistral-small3.2 | None | + +**Config keys written:** +```yaml +embedding: + provider: "" + model: "" + base_url: "" # only for ollama + api_key: "" # only for cloud providers + +summarization: + provider: "" + model: "" + base_url: "" # only for ollama + api_key: "" # only for cloud providers +``` + +**Env var equivalents:** +- `EMBEDDING_PROVIDER`, `EMBEDDING_MODEL` +- `SUMMARIZATION_PROVIDER`, `SUMMARIZATION_MODEL` +- `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY` + +--- + +### Step 4: Provider Setup Instructions + +**Goal:** Show provider-specific setup instructions based on Step 3 selection. + +**Per-option output:** Installation commands, model pull commands, config YAML snippet. + +**Error states:** +- Ollama not installed → show install instructions +- Ollama not running → show `ollama serve` +- Missing API key for cloud provider → show export command + +--- + +### Step 5: Storage Backend Selection + +**Goal:** Choose ChromaDB or PostgreSQL. + +**AskUserQuestion options:** +| # | Backend | Description | +|---|---------|-------------| +| 1 | ChromaDB (Default) | Local-first, zero ops | +| 2 | PostgreSQL + pgvector | Larger datasets, requires database | + +**Auto-discovery:** When PostgreSQL selected, scan ports 5432-5442 for first free port. 
+ +**Config keys written:** +```yaml +storage: + backend: "postgres" # or "chroma" + postgres: + host: "localhost" + port: + database: "agent_brain" + user: "agent_brain" + password: "agent_brain_dev" + pool_size: 10 + pool_max_overflow: 10 + language: "english" + hnsw_m: 16 + hnsw_ef_construction: 64 + debug: false +``` + +**Env var equivalents:** +- `AGENT_BRAIN_STORAGE_BACKEND` — `"chroma"` or `"postgres"` + +**Resolution order:** `AGENT_BRAIN_STORAGE_BACKEND` env > `storage.backend` YAML > default `"chroma"`. + +--- + +### Step 6: Indexing Excludes + +**Goal:** Configure which directories to skip during indexing. + +**Action:** Use large_dirs output from Step 2 pre-flight scan to suggest excludes. + +**Default excluded patterns (no config needed):** +`node_modules`, `.venv`, `venv`, `__pycache__`, `.git`, `dist`, `build`, `target`, `.next`, `.nuxt`, `coverage` + +**AskUserQuestion options:** +| # | Action | +|---|--------| +| 1 | Use defaults | +| 2 | Add custom exclude patterns | +| 3 | Skip | + +**Config keys written (if custom):** +```json +{ "exclude_patterns": ["**/my-custom-dir/**"] } +``` +Written to `.agent-brain/config.json`. + +--- + +### Step 7: GraphRAG Configuration + +**Goal:** Enable/configure the knowledge graph index. + +**AskUserQuestion (GraphRAG enable):** +| # | Option | +|---|--------| +| 1 | Disabled (Default) | +| 2 | Enabled — JSON persistence | +| 3 | Enabled + Kuzu — persistent graph store | + +**If option 2 or 3: AskUserQuestion (extraction mode):** +| # | Extractor | Requirements | +|---|-----------|--------------| +| 1 | AST / Code Metadata | Any provider, no API key | +| 2 | LLM Entity Extractor (Anthropic) | ANTHROPIC_API_KEY (legacy, Anthropic-only) | +| 3 | LangExtract (Multi-Provider) | Configured summarization provider (Gemini, OpenAI, Claude, Ollama) | + +**Auto-default:** If no `ANTHROPIC_API_KEY`, default to option 1 (AST) or option 3 if +summarization provider is configured. 
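The auto-default rule above can be expressed as a small decision function (illustrative only; the function name is an assumption, and the behavior when `ANTHROPIC_API_KEY` is present follows the command file's general preference for AST, which this spec does not state explicitly):

```python
def default_extractor_option(anthropic_key_set: bool, summarization_configured: bool) -> int:
    """Return the Step 7 extractor option to preselect (1=AST, 2=LLM, 3=LangExtract)."""
    if not anthropic_key_set:
        # Without ANTHROPIC_API_KEY the legacy LLM extractor is unusable:
        # prefer LangExtract when a summarization provider exists, else AST.
        return 3 if summarization_configured else 1
    # With the key present, AST remains the recommended default for code repos.
    return 1
```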
+ +**Config keys written:** +```yaml +graphrag: + enabled: true + store_type: "simple" # or "kuzu" + index_path: ".agent-brain/graph_index" + traversal_depth: 2 + use_llm_extraction: false # true only for option 2 + use_code_metadata: true # true for options 1 and 3 +``` + +**Env var equivalents:** +- `ENABLE_GRAPH_INDEX=true` +- `GRAPH_STORE_TYPE=simple` or `kuzu` +- `GRAPH_USE_LLM_EXTRACTION=true/false` +- `GRAPH_USE_CODE_METADATA=true/false` +- `GRAPH_DOC_EXTRACTOR=langextract` or `none` (for option 3) + +--- + +### Step 8: Caching Configuration + +**Goal:** Configure embedding cache and query cache. + +**Embedding cache AskUserQuestion:** +| # | Option | +|---|--------| +| 1 | Use defaults (500 MB disk, 1000 in-memory) | +| 2 | Customize | +| 3 | Disable | + +**Query cache AskUserQuestion:** +| # | Option | +|---|--------| +| 1 | Use defaults (300s TTL, 256 max results) | +| 2 | Customize | +| 3 | Disable | + +**Config keys written:** +```yaml +cache: + embedding_max_disk_mb: + embedding_max_mem_entries: + query_cache_ttl: + query_cache_max_size: +``` + +**Env var equivalents:** +- `EMBEDDING_CACHE_MAX_DISK_MB` +- `EMBEDDING_CACHE_MAX_MEM_ENTRIES` +- `QUERY_CACHE_TTL` +- `QUERY_CACHE_MAX_SIZE` + +--- + +### Step 9: File Watcher Configuration + +**Goal:** Enable automatic re-indexing when files change. 
+ +**AskUserQuestion:** +| # | Option | +|---|--------| +| 1 | Disabled (Default) — index manually with `agent-brain index` | +| 2 | Enabled — server watches indexed folders and re-indexes changed files | + +**If option 2:** +- Ask for global debounce (default 30s, valid range 5–300s) +- Set env var: `AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=` +- Inform user: per-folder watch control is set at index time, not here + +**Key facts:** +| Item | Value | +|------|-------| +| Global debounce env var | `AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS` (default: 30) | +| Per-folder watch mode | `watch_mode`: `"off"` or `"auto"` | +| Per-folder debounce | `watch_debounce_seconds` (falls back to global if unset) | +| Storage | `indexed_folders.jsonl` per-folder entry | +| Enable per folder | `agent-brain folders add ./src --watch auto --debounce 10` | +| Job source marker | `source="auto"` in queue (watcher-triggered) | +| Dedup key | `dedupe_key` prevents double-indexing same path | + +**YAML config (no config.yaml key — global debounce is env-only):** +```bash +# Global debounce (env var only, not in config.yaml): +export AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=30 + +# Per-folder watch (set at index time, not in config.yaml): +agent-brain folders add ./src --watch auto --debounce 10 +agent-brain folders add ./docs --watch auto # uses global debounce +``` + +--- + +### Step 10: Reranking Configuration + +**Goal:** Enable/configure two-stage search reranking. 
+ +**AskUserQuestion:** +| # | Option | +|---|--------| +| 1 | Disabled (Default) | +| 2 | sentence-transformers (local, no API key) | +| 3 | Ollama | + +**Config keys written:** +```yaml +reranker: + provider: "sentence-transformers" + model: "cross-encoder/ms-marco-MiniLM-L-6-v2" +``` + +**Env var equivalents:** +- `ENABLE_RERANKING=true` +- `RERANKER_PROVIDER=sentence-transformers|ollama` +- `RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2` +- `RERANKER_TOP_K_MULTIPLIER=10` +- `RERANKER_MAX_CANDIDATES=100` + +--- + +### Step 11: Chunking & Search Tuning + +**Goal:** Tune chunk size, overlap, and query defaults for content type. + +**AskUserQuestion:** Use defaults or customize. + +**Config keys (env-only, no config.yaml block):** +- `DEFAULT_CHUNK_SIZE` (default 512, range 128–2048) +- `DEFAULT_CHUNK_OVERLAP` (default 50) +- `DEFAULT_TOP_K` (default 5) +- `DEFAULT_SIMILARITY_THRESHOLD` (default 0.7) + +**Content type guidance:** +- Source code → 256–512 +- Prose/docs → 512–1024 +- Long-form → 1024–2048 + +--- + +### Step 12: Server & Deployment Configuration + +**Goal:** Configure server bind address, port, and instance mode. + +**AskUserQuestion:** +| # | Option | +|---|--------| +| 1 | Local (Default) — 127.0.0.1:8000 | +| 2 | Network — 0.0.0.0:8000 | +| 3 | Custom port | + +**Security warning:** Binding to `0.0.0.0` requires a reverse proxy with auth. 
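A minimal reverse-proxy sketch for that warning (nginx with basic auth in front of the default bind address; the hostname and htpasswd path are placeholder assumptions, and production setups would also add TLS):

```nginx
server {
    listen 80;
    server_name brain.internal.example;  # placeholder hostname

    # Agent Brain has no built-in authentication, so gate it at the proxy
    auth_basic           "Agent Brain";
    auth_basic_user_file /etc/nginx/agent-brain.htpasswd;

    location / {
        proxy_pass http://127.0.0.1:8000;  # API_HOST/API_PORT defaults
        proxy_set_header Host $host;
    }
}
```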
+ +**Config keys (env-only):** +- `API_HOST` (default `127.0.0.1`) +- `API_PORT` (default `8000`) +- `AGENT_BRAIN_MODE` (`project` or `shared`) +- `AGENT_BRAIN_STATE_DIR` (optional state dir override) +- `DEBUG` (default `false`) + +--- + +### Advanced Configuration Reference + +Settings rarely changed, listed for completeness: +`CHROMA_PERSIST_DIR`, `BM25_INDEX_PATH`, `COLLECTION_NAME`, `EMBEDDING_DIMENSIONS`, +`EMBEDDING_BATCH_SIZE`, `MAX_CHUNK_SIZE`, `MIN_CHUNK_SIZE`, `MAX_TOP_K`, +`AGENT_BRAIN_MAX_QUEUE`, `AGENT_BRAIN_JOB_TIMEOUT`, `AGENT_BRAIN_MAX_RETRIES`, +`AGENT_BRAIN_CHECKPOINT_INTERVAL`, `EMBEDDING_CACHE_PERSIST_STATS`, +`AGENT_BRAIN_STRICT_MODE`, `GRAPH_EXTRACTION_MODEL`, `GRAPH_RRF_K` + +--- + +## Output Format Per Step + +Each step shows a YAML/env snippet of what was configured, e.g.: + +``` +Step 9 Complete: File Watcher +============================== + +Global debounce: 30s + AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=30 + +Per-folder watcher is configured at index time: + agent-brain folders add ./src --watch auto --debounce 10 + +Restart server to apply: + agent-brain stop && agent-brain start +``` + +--- + +## Error States and Fallbacks + +| Condition | Response | +|-----------|----------| +| Ollama not installed | Show install instructions | +| Ollama not running | Show `ollama serve` | +| Missing API key | Show export command | +| kuzu not installed | Show install command | +| langextract not installed | Show `poetry install --extras graphrag` | +| No free PostgreSQL port | Show manual port config instructions | +| ab-setup-check.sh not found | Fall back to manual env detection | + +--- + +## Version + +This spec was introduced in v9.3.0 (Phase 34). It must be updated whenever the +`agent-brain-config.md` command file changes. The version field in the command frontmatter +must match the CLI version at release time. + +Current command version: matches CLI `agent-brain-cli` version. 
diff --git a/.planning/phases/35-langextract-document-extractor/SPEC.md b/.planning/phases/35-langextract-document-extractor/SPEC.md new file mode 100644 index 0000000..b9a912f --- /dev/null +++ b/.planning/phases/35-langextract-document-extractor/SPEC.md @@ -0,0 +1,107 @@ +# Phase 35: LangExtract Document Extractor + +## Goal + +Add `LangExtractExtractor` for document-chunk graph extraction. Code chunks continue using +`CodeMetadataExtractor`. `LLMEntityExtractor` (Anthropic-only) becomes a legacy fallback. + +--- + +## Routing Logic + +``` +source_type == "code" → CodeMetadataExtractor (AST, no API key — unchanged) +source_type == "document" → LangExtractExtractor (multi-provider: Anthropic/Claude, + OpenAI, Gemini, Ollama) + ↓ graceful degradation if langextract not installed → [] + ↓ legacy fallback if GRAPH_USE_LLM_EXTRACTION=true → LLMEntityExtractor +``` + +--- + +## New Settings (settings.py) + +| Setting | Default | Description | +|---------|---------|-------------| +| `GRAPH_DOC_EXTRACTOR` | `"langextract"` | `"langextract"` or `"none"` | +| `GRAPH_LANGEXTRACT_PROVIDER` | `""` | Override provider (default: `SUMMARIZATION_PROVIDER`) | +| `GRAPH_LANGEXTRACT_MODEL` | `""` | Override model (default: `SUMMARIZATION_MODEL`) | + +--- + +## Provider Resolution (LangExtractExtractor) + +Priority order: +1. `GRAPH_LANGEXTRACT_PROVIDER` (explicit override) +2. `SUMMARIZATION_PROVIDER` (reuse configured summarization provider) +3. `"ollama"` (safe default) + +Model priority: +1. `GRAPH_LANGEXTRACT_MODEL` (explicit override) +2. `SUMMARIZATION_MODEL` (reuse configured summarization model) +3. `""` (langextract provider default) + +--- + +## LangExtractExtractor Design + +```python +class LangExtractExtractor: + def __init__(self, provider, model, max_triplets): ... + def extract_triplets(self, text, max_triplets, source_chunk_id) -> list[GraphTriple]: ... + def _convert_relations(self, relations, source_chunk_id) -> list[GraphTriple]: ... 
+``` + +- `extract_triplets` uses lazy import (`try: from langextract import extract_relations`) +- Returns `[]` gracefully when langextract not installed +- Handles both dict-style and object-style relation returns from langextract +- Handles `head`/`tail` field names as aliases for `subject`/`object` + +--- + +## GraphIndexManager Changes + +Added `langextract_extractor` parameter. Updated `_extract_from_document`: + +```python +# 1. code chunks → CodeMetadataExtractor (unchanged) +# 2. doc chunks → LangExtractExtractor (new, when GRAPH_DOC_EXTRACTOR=langextract) +# 3. legacy fallback → LLMEntityExtractor (when GRAPH_USE_LLM_EXTRACTION=true, non-code) +``` + +`LLMEntityExtractor` is **retained** in the codebase but no longer the default path for +documents. It remains active only when `GRAPH_USE_LLM_EXTRACTION=true` and +`GRAPH_DOC_EXTRACTOR != "langextract"`. + +--- + +## Config Command Update + +Step 7 (GraphRAG) now shows three extractor options: +1. AST / Code Metadata (recommended for code repos) +2. LLM Entity Extractor (legacy, Anthropic-only) +3. 
LangExtract (multi-provider, uses `SUMMARIZATION_PROVIDER`) + +--- + +## Files Changed + +| File | Change | +|------|--------| +| `agent_brain_server/config/settings.py` | Added `GRAPH_DOC_EXTRACTOR`, `GRAPH_LANGEXTRACT_PROVIDER`, `GRAPH_LANGEXTRACT_MODEL` | +| `agent_brain_server/indexing/graph_extractors.py` | Added `LangExtractExtractor`, `get_langextract_extractor`, updated `reset_extractors` | +| `agent_brain_server/indexing/graph_index.py` | Added `LangExtractExtractor` import and routing in `_extract_from_document` | +| `agent-brain-plugin/commands/agent-brain-config.md` | Step 7 + Step 9 updates | +| `tests/unit/test_graph_extractors.py` | Added `TestLangExtractExtractor` test class | + +--- + +## Verification Checklist + +- [x] `task before-push` passes (285 tests, 74% coverage) +- [x] `LangExtractExtractor` tests pass with mocked langextract +- [x] Graceful degradation: `langextract` not installed → returns `[]` +- [x] Routing: code chunks → `CodeMetadataExtractor`, doc chunks → `LangExtractExtractor` +- [x] `GRAPH_DOC_EXTRACTOR=none` → no LLM extraction for documents +- [x] Plugin deployed: `~/.claude/plugins/agent-brain/commands/agent-brain-config.md` updated +- [x] Phase 34 spec created at `.planning/phases/34-config-command-spec/SPEC.md` diff --git a/agent-brain-cli/poetry.lock b/agent-brain-cli/poetry.lock index 6075812..0c1f908 100644 --- a/agent-brain-cli/poetry.lock +++ b/agent-brain-cli/poetry.lock @@ -2,13 +2,13 @@ [[package]] name = "agent-brain-rag" -version = "9.0.0" +version = "9.2.0" description = "Agent Brain RAG - Intelligent document indexing and semantic search server that gives AI agents long-term memory" optional = false python-versions = "^3.10" groups = ["main"] files = [] -develop = false +develop = true [package.dependencies] anthropic = "^0.40.0" @@ -7518,4 +7518,4 @@ type = ["pytest-mypy"] [metadata] lock-version = "2.1" python-versions = "^3.10" -content-hash = "718d2a43103b26eeed80547e84b4bf795c1edfacb2c09258386d4cfd0c76eed1" 
+content-hash = "261b698cd2ff05722bd30021d0cf4b0fa48b5b5f4ae515d68ab4e87835c8bbe8" diff --git a/agent-brain-cli/pyproject.toml b/agent-brain-cli/pyproject.toml index ec0c375..a2d21fa 100644 --- a/agent-brain-cli/pyproject.toml +++ b/agent-brain-cli/pyproject.toml @@ -27,7 +27,7 @@ httpx = "^0.28.0" rich = "^13.9.0" pyyaml = "^6.0.0" pydantic = "^2.10.0" -agent-brain-rag = "^9.2.0" +agent-brain-rag = {path = "../agent-brain-server", develop = true} [tool.poetry.group.dev.dependencies] pytest = "^8.3.0" diff --git a/agent-brain-plugin/.claude-plugin/plugin.json b/agent-brain-plugin/.claude-plugin/plugin.json index 7950850..181615f 100644 --- a/agent-brain-plugin/.claude-plugin/plugin.json +++ b/agent-brain-plugin/.claude-plugin/plugin.json @@ -1,7 +1,7 @@ { "name": "agent-brain", "description": "Document search with hybrid BM25/semantic retrieval, GraphRAG knowledge graphs, and pluggable providers for Claude Code. Index documentation and code, then search using keyword matching, semantic similarity, graph relationships, or comprehensive multi-mode fusion.", - "version": "8.0.0", + "version": "9.3.0", "author": { "name": "Spillwave Solutions", "email": "rick@spillwave.com" diff --git a/agent-brain-plugin/commands/agent-brain-config.md b/agent-brain-plugin/commands/agent-brain-config.md index 91dda17..73b2e3e 100644 --- a/agent-brain-plugin/commands/agent-brain-config.md +++ b/agent-brain-plugin/commands/agent-brain-config.md @@ -1,6 +1,6 @@ --- name: agent-brain-config -description: Configure providers, API keys, and indexing settings for Agent Brain (providers, exclude patterns) +description: 12-step wizard to configure all Agent Brain settings — providers, storage, GraphRAG, reranking, caching, file watcher, chunking, and server deployment parameters: [] skills: - configuring-agent-brain @@ -539,6 +539,541 @@ These are excluded by default (no config needed): | `**/.nuxt/**` | Nuxt.js build cache | | `**/coverage/**` | Test coverage reports | +## Step 7: Configure 
GraphRAG (Knowledge Graph)
+
+After storage backend selection, ask whether to enable graph indexing.
+
+### AskUserQuestion: GraphRAG Selection
+
+```
+Would you like to enable GraphRAG (knowledge graph indexing)?
+
+GraphRAG extracts entity relationships from your documents and code, enabling
+graph-based queries like "what classes depend on X?" alongside standard search.
+
+Options:
+1. Disabled (Default) - Standard vector + BM25 hybrid search only
+2. Enabled - GraphRAG with JSON persistence (no extra dependencies)
+3. Enabled + Kuzu - GraphRAG with Kuzu persistent graph store (requires kuzu install)
+```
+
+### If Option 2 or 3: AskUserQuestion: Extraction Mode
+
+**IMPORTANT:** Check which provider is configured first:
+
+```bash
+# Check if ANTHROPIC_API_KEY is available
+[ -n "${ANTHROPIC_API_KEY}" ] && echo "set" || echo "not set"
+```
+
+Then ask:
+
+```
+Which graph extractor would you like to use?
+
+Agent Brain has three extractors:
+
+1. AST / Code Metadata (Recommended) - Extracts function calls, imports, class
+   hierarchies directly from code. Works with ANY provider, no API key needed.
+   Best for code repositories.
+
+2. LLM Entity Extractor (Legacy) - Uses Anthropic API to extract semantic triplets
+   from text. Requires ANTHROPIC_API_KEY (does NOT use Ollama even if Ollama is
+   your embedding/summarization provider). Best for prose/documentation.
+
+3. LangExtract (Multi-Provider) - Uses your configured summarization provider
+   (Gemini, OpenAI, Claude, Ollama) for document semantic extraction. Zero new
+   config needed if summarization is already set up.
+   Set GRAPH_DOC_EXTRACTOR=langextract (default when langextract installed).
+```
+
+**Auto-default:** If the user is using Ollama OR no `ANTHROPIC_API_KEY` is set,
+default to Option 1 (AST) for code repos.
If the user has documents/prose and +`SUMMARIZATION_PROVIDER` is configured, suggest Option 3 (LangExtract) as well: +``` +Defaulting to AST/Code Metadata extractor — LLM extraction requires +ANTHROPIC_API_KEY which is not set in this environment. + +Tip: For prose/documentation, enable LangExtract (Option 3) to use your +configured summarization provider for semantic entity extraction. +``` + +### If Option 2 (Simple / JSON persistence) + AST extractor: + +Add to config.yaml: + +```yaml +graphrag: + enabled: true + store_type: "simple" + index_path: ".agent-brain/graph_index" + traversal_depth: 2 + use_llm_extraction: false + use_code_metadata: true +``` + +Or via environment variables: +```bash +export ENABLE_GRAPH_INDEX=true +export GRAPH_STORE_TYPE=simple # JSON persistence, no kuzu needed +export GRAPH_INDEX_PATH=.agent-brain/graph_index +export GRAPH_USE_LLM_EXTRACTION=false # No ANTHROPIC_API_KEY required +export GRAPH_USE_CODE_METADATA=true # AST-based relationship extraction +export GRAPH_TRAVERSAL_DEPTH=2 +``` + +### If Option 2 + LLM extractor (Anthropic key confirmed present): + +```bash +export ENABLE_GRAPH_INDEX=true +export GRAPH_STORE_TYPE=simple +export GRAPH_INDEX_PATH=.agent-brain/graph_index +export GRAPH_USE_LLM_EXTRACTION=true +export GRAPH_USE_CODE_METADATA=false +export GRAPH_TRAVERSAL_DEPTH=2 +``` + +### If Option 2 + LangExtract (multi-provider, uses SUMMARIZATION_PROVIDER): + +```bash +export ENABLE_GRAPH_INDEX=true +export GRAPH_STORE_TYPE=simple +export GRAPH_INDEX_PATH=.agent-brain/graph_index +export GRAPH_USE_LLM_EXTRACTION=false +export GRAPH_USE_CODE_METADATA=true # keep AST for code chunks +export GRAPH_DOC_EXTRACTOR=langextract # LangExtract for document chunks +export GRAPH_TRAVERSAL_DEPTH=2 +# Optional: override the provider/model used for LangExtract extraction +# (defaults to SUMMARIZATION_PROVIDER/SUMMARIZATION_MODEL) +# export GRAPH_LANGEXTRACT_PROVIDER=ollama +# export GRAPH_LANGEXTRACT_MODEL=mistral-small3.2:latest +``` + 
+Requires langextract to be installed (included in graphrag extras): +```bash +cd agent-brain-server && poetry install --extras graphrag +``` + +### If Option 3 (Kuzu / Persistent): + +First check if kuzu is installed: + +```bash +python3 -c "import kuzu" 2>/dev/null && echo "kuzu available" || echo "kuzu NOT installed" +``` + +If not installed: +``` +Kuzu requires the optional graphrag-kuzu dependency: + cd agent-brain-server && poetry install --extras graphrag-kuzu + +Or install directly: + uv pip install kuzu +``` + +Add to config.yaml (with preferred extractor from above): + +```yaml +graphrag: + enabled: true + store_type: "kuzu" + index_path: ".agent-brain/graph_index" + traversal_depth: 2 + use_llm_extraction: false # or true if ANTHROPIC_API_KEY is available + use_code_metadata: true +``` + +**Note:** Enabling GraphRAG increases indexing time. Re-index after enabling: +```bash +agent-brain reset --yes && agent-brain index ./your-docs +``` + +### If Option 2 or 3: AskUserQuestion: GraphRAG Tuning (Optional) + +After extraction mode selection, offer tuning: + +``` +Would you like to tune GraphRAG extraction settings? +(Default values work well for most projects) + +Options: +1. Use defaults — traversal_depth=2, max_triplets=10 per chunk +2. Customize — adjust depth and triplet density +``` + +**If Option 2 (Customize):** + +```bash +# Traversal depth: how many hops to follow from a matched entity (default: 2) +# Higher = richer context, slower queries (range: 1-5) +export GRAPH_TRAVERSAL_DEPTH=2 + +# Max triplets per chunk (default: 10) +# Higher = more relationships extracted, slower indexing +export GRAPH_MAX_TRIPLETS_PER_CHUNK=10 +``` + +Config YAML equivalent: +```yaml +graphrag: + traversal_depth: 2 + max_triplets_per_chunk: 10 +``` + +## Step 8: Configure Caching + +### Embedding Cache + +The embedding cache reduces API costs by storing computed embeddings locally (two-tier: in-memory LRU + SQLite disk). 
Always beneficial for cloud providers; less relevant for Ollama. + +### AskUserQuestion: Embedding Cache + +``` +Configure embedding cache settings? + +The embedding cache avoids recomputing embeddings for unchanged content. +Healthy cache shows >80% hit rate after first full index. + +Options: +1. Use defaults - 500 MB disk cache, 1000 in-memory entries +2. Customize - Set disk size and memory entries manually +3. Disable - No caching (not recommended for cloud providers) +``` + +**If Option 2 (Customize):** Ask for disk limit (MB) and memory entries, then add to config.yaml: + +```yaml +cache: + embedding_max_disk_mb: # e.g. 1000 + embedding_max_mem_entries: # e.g. 2000 +``` + +Or via environment variables: +```bash +export EMBEDDING_CACHE_MAX_DISK_MB=1000 +export EMBEDDING_CACHE_MAX_MEM_ENTRIES=2000 +``` + +**If Option 3 (Disable):** +```bash +export EMBEDDING_CACHE_MAX_DISK_MB=0 +``` + +### Query Cache + +The query cache stores identical query results for a TTL window. Graph and multi-mode queries are never cached. + +### AskUserQuestion: Query Cache + +``` +Configure query result cache? + +Caches repeated identical queries to reduce latency. +Note: graph and multi-mode queries are never cached. + +Options: +1. Use defaults - 300s TTL, 256 max results +2. Customize - Set TTL and max size +3. Disable - TTL=0 (no caching) +``` + +**If Option 2 (Customize):** Add to config.yaml: + +```yaml +cache: + query_cache_ttl: # e.g. 600 for stable codebases + query_cache_max_size: # e.g. 512 for large query workloads +``` + +Or via environment variables: +```bash +export QUERY_CACHE_TTL=600 +export QUERY_CACHE_MAX_SIZE=512 +``` + +**If Option 3 (Disable):** +```bash +export QUERY_CACHE_TTL=0 +``` + +## Step 9: Configure File Watcher (Auto-Reindex on Change) + +### AskUserQuestion: File Watcher + +``` +Would you like to enable automatic re-indexing when files change? + +Options: +1. Disabled (Default) — Index manually with `agent-brain index` +2. 
Enabled — Server watches indexed folders and re-indexes changed files + Debounce: 30s by default (prevents re-index thrashing on rapid saves) +``` + +### If Option 2 (Enabled): + +Ask for global debounce (default 30s, valid range 5–300s): + +``` +How many seconds should the watcher wait after a file change before re-indexing? +(Default: 30s — prevents re-indexing on every keystroke during active editing) +``` + +Set the global debounce via environment variable: + +```bash +export AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=30 +``` + +**Key facts about the file watcher:** + +| Item | Detail | +|------|--------| +| Global debounce | `AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS` (default 30s) | +| Per-folder watch | Set at index time, not in config.yaml | +| Enable per folder | `agent-brain folders add ./src --watch auto --debounce 10` | +| Disable per folder | `agent-brain folders add ./src --watch off` | +| View watch status | `agent-brain folders list` shows watch_mode per folder | +| Job source | Watcher-triggered jobs appear with `source="auto"` in queue | +| Deduplication | Same path is never double-indexed (dedupe_key prevents this) | + +**Note:** `watch_mode` is a per-folder setting configured at index time. The +`AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS` env var sets the global default debounce that +applies when no per-folder override is set. 
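The fallback described in that note can be sketched as follows (an illustration modeled on the per-folder fields in `indexed_folders.jsonl`; `effective_debounce` is a hypothetical helper, not shipped code):

```python
import os

GLOBAL_DEFAULT_SECONDS = 30  # mirrors the AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS default

def effective_debounce(folder_entry: dict) -> int:
    """Resolve one folder's debounce: per-folder override wins, else the global env default."""
    per_folder = folder_entry.get("watch_debounce_seconds")
    if per_folder is not None:
        return int(per_folder)
    return int(os.environ.get("AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS", GLOBAL_DEFAULT_SECONDS))
```

So `agent-brain folders add ./src --watch auto --debounce 10` pins that folder to 10s, while folders added without `--debounce` track the global setting.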
+
+```
+Step 9 Complete: File Watcher
+==============================
+
+Global debounce: 30s
+  AGENT_BRAIN_WATCH_DEBOUNCE_SECONDS=30
+
+Per-folder watcher is configured at index time:
+  agent-brain folders add ./src --watch auto                 # use global debounce
+  agent-brain folders add ./docs --watch auto --debounce 10  # 10s override
+
+View current watch settings:
+  agent-brain folders list
+
+Restart server to apply:
+  agent-brain stop && agent-brain start
+```
+
+---
+
+## Step 10: Configure Reranking (Two-Stage Search Quality)
+
+Reranking adds a second scoring pass after the initial hybrid search, re-ranking
+candidates with a cross-encoder model for higher-precision results. It's off by default
+because it requires additional model downloads.
+
+### AskUserQuestion: Reranking
+
+```
+Would you like to enable two-stage reranking for higher search precision?
+
+Reranking re-scores the top candidates with a cross-encoder model, improving
+result quality at the cost of slightly higher query latency.
+
+Options:
+1. Disabled (Default) — single-stage hybrid BM25 + vector search
+2. Enabled (sentence-transformers) — local cross-encoder, no API key needed
+   Requires: pip install sentence-transformers (first run auto-downloads ~90 MB)
+3.
Enabled (Ollama) — uses Ollama for reranking, requires Ollama running +``` + +**If Option 2 (sentence-transformers):** + +```bash +export ENABLE_RERANKING=true +export RERANKER_PROVIDER=sentence-transformers +export RERANKER_MODEL=cross-encoder/ms-marco-MiniLM-L-6-v2 # default +``` + +Config YAML (`reranker` block): +```yaml +reranker: + provider: "sentence-transformers" + model: "cross-encoder/ms-marco-MiniLM-L-6-v2" +``` + +**If Option 3 (Ollama):** + +```bash +export ENABLE_RERANKING=true +export RERANKER_PROVIDER=ollama +``` + +Config YAML: +```yaml +reranker: + provider: "ollama" + base_url: "http://localhost:11434" +``` + +**Advanced tuning** (rarely needed): +```bash +# How many candidates Stage 1 retrieves for Stage 2 to rerank (default: top_k × 10) +export RERANKER_TOP_K_MULTIPLIER=10 +# Hard cap on Stage 1 candidates (default: 100) +export RERANKER_MAX_CANDIDATES=100 +``` + +--- + +## Step 11: Chunking & Search Tuning + +Default values work well for most projects. Ask only if the user wants to tune +indexing quality or search behavior. + +### AskUserQuestion: Tuning + +``` +Would you like to adjust chunking and search defaults? +(These affect indexing quality and search result count) + +Options: +1. Use defaults — chunk_size=512, overlap=50, top_k=5, threshold=0.7 +2. Customize — adjust for your content type +3. Skip +``` + +**If Option 2 (Customize):** + +#### Chunk Size + +``` +What chunk size would you like? + +Larger chunks = more context per result but less precise retrieval. +Smaller chunks = more precise but may cut mid-thought. + +Recommended by content type: +- Source code: 256–512 (default: 512) +- Prose/docs: 512–1024 +- Long-form books: 1024–2048 + +Enter chunk size (128–2048, default: 512): +``` + +```bash +export DEFAULT_CHUNK_SIZE=512 +export DEFAULT_CHUNK_OVERLAP=50 # tokens of overlap between adjacent chunks +``` + +#### Search Top-K + +``` +How many results should queries return by default? 
(default: 5, max: 50)
```

```bash
export DEFAULT_TOP_K=5
```

#### Similarity Threshold

```
Minimum similarity score to include a result (0.0–1.0, default: 0.7)

Lower = more results but potentially less relevant.
Higher = fewer but more precise results.
```

```bash
export DEFAULT_SIMILARITY_THRESHOLD=0.7
```

Note: unlike the `reranker` settings, chunking and search tuning have no block in
config.yaml. Set them as environment variables:
```bash
# All chunking/query settings are env-var only (not in config.yaml)
export DEFAULT_CHUNK_SIZE=512
export DEFAULT_CHUNK_OVERLAP=50
export DEFAULT_TOP_K=5
export DEFAULT_SIMILARITY_THRESHOLD=0.7
```

---

## Step 12: Server & Deployment Configuration

### AskUserQuestion: Deployment Mode

```
How will Agent Brain be deployed?

Options:
1. Local (Default) — binds to 127.0.0.1:8000, accessible only from this machine
2. Network — binds to 0.0.0.0 or specific IP, accessible from other machines
3. Custom port — same as option 1 but on a different port
```

**If Option 2 or 3:**

```bash
# Bind address (default: 127.0.0.1 — localhost only)
# Use 0.0.0.0 to accept connections from any interface
export API_HOST=0.0.0.0

# Port (default: 8000)
export API_PORT=8000
```

**Security note for network deployment:**
```
WARNING: Binding to 0.0.0.0 exposes Agent Brain to your local network.
Agent Brain has no built-in authentication. Use a reverse proxy (nginx, Caddy)
with authentication in front of it for network deployments. 
+``` + +### Multi-Instance Mode (Advanced) + +If running multiple Agent Brain instances (one per project): + +```bash +# "project" (default) — state stored in .agent-brain/ in current directory +# "shared" — state stored in AGENT_BRAIN_STATE_DIR (one shared index) +export AGENT_BRAIN_MODE=project + +# Override state directory (overrides project-level .agent-brain/) +# export AGENT_BRAIN_STATE_DIR=/custom/path/to/state +``` + +### Debug Mode + +```bash +# Enable verbose logging (default: false) +# export DEBUG=true +``` + +--- + +## Advanced Configuration Reference + +Settings not covered by the wizard (rarely need changing): + +| Environment Variable | Default | Description | +|---------------------|---------|-------------| +| `CHROMA_PERSIST_DIR` | `./chroma_db` | ChromaDB storage directory | +| `BM25_INDEX_PATH` | `./bm25_index` | BM25 keyword index directory | +| `COLLECTION_NAME` | `agent_brain_collection` | ChromaDB collection name | +| `EMBEDDING_DIMENSIONS` | `3072` | Vector dimensions (must match model) | +| `EMBEDDING_BATCH_SIZE` | `100` | API batch size for embedding calls | +| `MAX_CHUNK_SIZE` | `2048` | Hard cap on chunk size | +| `MIN_CHUNK_SIZE` | `128` | Minimum chunk size | +| `MAX_TOP_K` | `50` | Maximum results per query | +| `AGENT_BRAIN_MAX_QUEUE` | `100` | Max pending jobs in queue | +| `AGENT_BRAIN_JOB_TIMEOUT` | `7200` | Job timeout in seconds (2 hours) | +| `AGENT_BRAIN_MAX_RETRIES` | `3` | Job retry attempts on failure | +| `AGENT_BRAIN_CHECKPOINT_INTERVAL` | `50` | Progress save interval (files) | +| `EMBEDDING_CACHE_PERSIST_STATS` | `false` | Persist cache hit/miss stats across restarts | +| `AGENT_BRAIN_STRICT_MODE` | `false` | Fail on critical validation errors | +| `GRAPH_EXTRACTION_MODEL` | `claude-haiku-4-5` | LLM model for legacy LLM entity extraction | +| `GRAPH_RRF_K` | `60` | Reciprocal Rank Fusion constant for graph queries | + +All settings can be placed in a `.env` file in the server directory or set as environment variables. 
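As a quick illustration, a project-local `.env` overriding a few of the settings in the
table above might look like the sketch below. The variable names come from the reference
table; the values are illustrative examples, not recommendations.

```bash
# .env — example overrides for the advanced settings above
# (values are illustrative, not recommendations)
EMBEDDING_BATCH_SIZE=50             # smaller embedding batches for rate-limited API keys
AGENT_BRAIN_JOB_TIMEOUT=3600        # cut the job timeout from 2 hours to 1
AGENT_BRAIN_MAX_RETRIES=5           # retry flaky jobs more aggressively
AGENT_BRAIN_CHECKPOINT_INTERVAL=25  # checkpoint indexing progress twice as often
```

Restart the server after editing `.env` so the new values take effect.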
+ +--- + ## CLI Subcommands Reference The `agent-brain config` group has two subcommands: diff --git a/agent-brain-server/agent_brain_server/config/settings.py b/agent-brain-server/agent_brain_server/config/settings.py index 5c9ff3d..7281576 100644 --- a/agent-brain-server/agent_brain_server/config/settings.py +++ b/agent-brain-server/agent_brain_server/config/settings.py @@ -66,9 +66,16 @@ class Settings(BaseSettings): GRAPH_EXTRACTION_MODEL: str = "claude-haiku-4-5" # Model for entity extraction GRAPH_MAX_TRIPLETS_PER_CHUNK: int = 10 # Max triplets per document chunk GRAPH_USE_CODE_METADATA: bool = True # Use AST metadata for code entities - GRAPH_USE_LLM_EXTRACTION: bool = True # Use LLM for additional extraction + # Legacy: Anthropic LLM for doc extraction (superseded by GRAPH_DOC_EXTRACTOR) + GRAPH_USE_LLM_EXTRACTION: bool = False GRAPH_TRAVERSAL_DEPTH: int = 2 # Depth for graph traversal in queries GRAPH_RRF_K: int = 60 # Reciprocal Rank Fusion constant for multi-retrieval + # "langextract" (multi-provider) or "none"; overrides GRAPH_USE_LLM_EXTRACTION + GRAPH_DOC_EXTRACTOR: str = "langextract" + # Override provider for LangExtract (default: from summarization config) + GRAPH_LANGEXTRACT_PROVIDER: str = "" + # Override model for LangExtract (default: from summarization config) + GRAPH_LANGEXTRACT_MODEL: str = "" # Job Queue Configuration (Feature 115) AGENT_BRAIN_MAX_QUEUE: int = 100 # Max pending jobs in queue diff --git a/agent-brain-server/agent_brain_server/indexing/graph_extractors.py b/agent-brain-server/agent_brain_server/indexing/graph_extractors.py index 5ffa83a..df3c514 100644 --- a/agent-brain-server/agent_brain_server/indexing/graph_extractors.py +++ b/agent-brain-server/agent_brain_server/indexing/graph_extractors.py @@ -1,8 +1,14 @@ """Entity extraction for GraphRAG (Feature 113). 
Provides extractors for building the knowledge graph: -- LLMEntityExtractor: Uses LLM to extract entity-relationship triplets -- CodeMetadataExtractor: Extracts relationships from code AST metadata +- LLMEntityExtractor: Legacy Anthropic-only LLM extraction (doc chunks) +- CodeMetadataExtractor: Extracts relationships from code AST metadata (code chunks) +- LangExtractExtractor: Multi-provider extraction via langextract (doc chunks, default) + +Routing in GraphIndexManager._extract_from_document: + source_type == "code" → CodeMetadataExtractor (AST, no API key) + source_type == "document" → LangExtractExtractor (GRAPH_DOC_EXTRACTOR=langextract) + legacy fallback → LLMEntityExtractor (GRAPH_USE_LLM_EXTRACTION=true) All extractors return GraphTriple objects for graph construction. """ @@ -620,9 +626,221 @@ def _extract_go_imports( return triplets +class LangExtractExtractor: + """Multi-provider document graph extraction via LangExtract library. + + Supports Gemini, OpenAI, Claude, and Ollama for document-chunk entity + extraction. Falls back to returning [] when langextract is not installed. + + Provider resolution order: + 1. GRAPH_LANGEXTRACT_PROVIDER (explicit override) + 2. SUMMARIZATION_PROVIDER (reuse configured summarization provider) + 3. "ollama" (safe default if nothing else configured) + + Attributes: + provider: The provider to use for extraction. + model: The model to use for extraction. + max_triplets: Maximum triplets to extract per chunk. + """ + + def __init__( + self, + provider: str | None = None, + model: str | None = None, + max_triplets: int | None = None, + ) -> None: + """Initialize LangExtract extractor. + + Args: + provider: LangExtract provider (defaults to settings resolution chain). + model: Model to use (defaults to settings resolution chain). + max_triplets: Max triplets per chunk (defaults to settings value). 
+ """ + # Priority: explicit > GRAPH_LANGEXTRACT_PROVIDER > summarization > ollama + _summarization_provider = "" + _summarization_model = "" + try: + from agent_brain_server.config.provider_config import ( + load_provider_settings, + ) + + _prov = load_provider_settings() + _summarization_provider = str(_prov.summarization.provider or "") + _summarization_model = str(_prov.summarization.model or "") + except Exception: + pass # Config not loaded yet (e.g. during testing) — use fallback + + self.provider = ( + provider + or settings.GRAPH_LANGEXTRACT_PROVIDER + or _summarization_provider + or "ollama" + ) + # Resolve model: explicit > GRAPH_LANGEXTRACT_MODEL > summarization model > "" + self.model = ( + model or settings.GRAPH_LANGEXTRACT_MODEL or _summarization_model or "" + ) + self.max_triplets = max_triplets or settings.GRAPH_MAX_TRIPLETS_PER_CHUNK + + def extract_triplets( + self, + text: str, + max_triplets: int | None = None, + source_chunk_id: str | None = None, + ) -> list[GraphTriple]: + """Extract entity-relationship triplets from document text. + + Uses langextract library for multi-provider extraction. Returns [] + gracefully when langextract is not installed or extraction fails. + + Args: + text: Document text content to extract entities from. + max_triplets: Override for max triplets (uses instance default). + source_chunk_id: Optional source chunk ID for provenance. + + Returns: + List of GraphTriple objects extracted from text. + Returns empty list on failure (graceful degradation). + """ + if not settings.ENABLE_GRAPH_INDEX: + return [] + + if settings.GRAPH_DOC_EXTRACTOR == "none": + return [] + + if not text: + return [] + + try: + import langextract # noqa: F401 (check availability) + from langextract import extract_relations + except ImportError: + logger.warning( + "langextract not installed; document graph extraction disabled. 
" + "Install: cd agent-brain-server && poetry install --extras graphrag" + ) + return [] + + max_count = max_triplets or self.max_triplets + + # Truncate very long text to avoid token limits + max_chars = 4000 + if len(text) > max_chars: + text = text[:max_chars] + "..." + + try: + relations = extract_relations( + text, + provider=self.provider, + model=self.model or None, + max_relations=max_count, + ) + + triplets = self._convert_relations(relations, source_chunk_id) + + logger.debug( + "langextract_extractor.extract_triplets: completed", + extra={ + "triplet_count": len(triplets), + "provider": self.provider, + "model": self.model, + "text_length": len(text), + "source_chunk_id": source_chunk_id, + }, + ) + return triplets + + except Exception as e: + logger.warning( + "langextract_extractor.extract_triplets: failed", + extra={ + "error": str(e), + "provider": self.provider, + "model": self.model, + "text_length": len(text), + "source_chunk_id": source_chunk_id, + }, + ) + return [] + + def _convert_relations( + self, + relations: Any, + source_chunk_id: str | None, + ) -> list[GraphTriple]: + """Convert langextract relations to GraphTriple list. + + Args: + relations: Relations returned by langextract.extract_relations(). + source_chunk_id: Optional source chunk ID for provenance. + + Returns: + List of GraphTriple objects. 
+ """ + triplets: list[GraphTriple] = [] + + if not relations: + return triplets + + # langextract returns a list of relation dicts or objects + for rel in relations: + try: + # Handle both dict and object-style returns + if isinstance(rel, dict): + subject = str(rel.get("subject") or rel.get("head") or "") + predicate = str(rel.get("relation") or rel.get("predicate") or "") + obj = str(rel.get("object") or rel.get("tail") or "") + subject_type = rel.get("subject_type") or rel.get("head_type") + object_type = rel.get("object_type") or rel.get("tail_type") + else: + subject = str( + getattr(rel, "subject", None) or getattr(rel, "head", "") or "" + ) + predicate = str( + getattr(rel, "relation", None) + or getattr(rel, "predicate", "") + or "" + ) + obj = str( + getattr(rel, "object", None) or getattr(rel, "tail", "") or "" + ) + subject_type = getattr(rel, "subject_type", None) or getattr( + rel, "head_type", None + ) + object_type = getattr(rel, "object_type", None) or getattr( + rel, "tail_type", None + ) + + if not subject or not predicate or not obj: + continue + + predicate = predicate.lower().strip() + if subject_type: + subject_type = normalize_entity_type(str(subject_type)) + if object_type: + object_type = normalize_entity_type(str(object_type)) + + triplets.append( + GraphTriple( + subject=subject, + subject_type=subject_type, + predicate=predicate, + object=obj, + object_type=object_type, + source_chunk_id=source_chunk_id, + ) + ) + except Exception as e: + logger.debug(f"Failed to convert langextract relation: {e}") + continue + + return triplets + + # Module-level singleton instances _llm_extractor: LLMEntityExtractor | None = None _code_extractor: CodeMetadataExtractor | None = None +_langextract_extractor: LangExtractExtractor | None = None def get_llm_extractor() -> LLMEntityExtractor: @@ -641,8 +859,17 @@ def get_code_extractor() -> CodeMetadataExtractor: return _code_extractor +def get_langextract_extractor() -> LangExtractExtractor: + """Get the 
global LangExtract extractor instance.""" + global _langextract_extractor + if _langextract_extractor is None: + _langextract_extractor = LangExtractExtractor() + return _langextract_extractor + + def reset_extractors() -> None: """Reset extractor singletons. Used for testing.""" - global _llm_extractor, _code_extractor + global _llm_extractor, _code_extractor, _langextract_extractor _llm_extractor = None _code_extractor = None + _langextract_extractor = None diff --git a/agent-brain-server/agent_brain_server/indexing/graph_index.py b/agent-brain-server/agent_brain_server/indexing/graph_index.py index d3de320..b521f9b 100644 --- a/agent-brain-server/agent_brain_server/indexing/graph_index.py +++ b/agent-brain-server/agent_brain_server/indexing/graph_index.py @@ -12,8 +12,10 @@ from agent_brain_server.config import settings from agent_brain_server.indexing.graph_extractors import ( CodeMetadataExtractor, + LangExtractExtractor, LLMEntityExtractor, get_code_extractor, + get_langextract_extractor, get_llm_extractor, ) from agent_brain_server.models.graph import ( @@ -38,16 +40,23 @@ class GraphIndexManager: """Manages graph index building and querying. Coordinates: - - Entity extraction from documents (LLM and code metadata) + - Entity extraction from documents (LLM, LangExtract, and code metadata) - Triplet storage in GraphStoreManager - Graph-based retrieval for queries + Extraction routing: + - source_type == "code" → CodeMetadataExtractor (AST, no API key) + - source_type == "document" → LangExtractExtractor (multi-provider) when + GRAPH_DOC_EXTRACTOR == "langextract" + - Legacy fallback → LLMEntityExtractor when GRAPH_USE_LLM_EXTRACTION + All operations are no-ops when ENABLE_GRAPH_INDEX is False. Attributes: graph_store: The underlying graph store manager. - llm_extractor: LLM-based entity extractor. + llm_extractor: LLM-based entity extractor (legacy Anthropic-only). code_extractor: Code metadata extractor. 
+ langextract_extractor: Multi-provider document extractor. """ def __init__( @@ -55,6 +64,7 @@ def __init__( graph_store: GraphStoreManager | None = None, llm_extractor: LLMEntityExtractor | None = None, code_extractor: CodeMetadataExtractor | None = None, + langextract_extractor: LangExtractExtractor | None = None, ) -> None: """Initialize graph index manager. @@ -62,10 +72,14 @@ def __init__( graph_store: Graph store manager (defaults to singleton). llm_extractor: LLM extractor (defaults to singleton). code_extractor: Code extractor (defaults to singleton). + langextract_extractor: LangExtract extractor (defaults to singleton). """ self.graph_store = graph_store or get_graph_store_manager() self.llm_extractor = llm_extractor or get_llm_extractor() self.code_extractor = code_extractor or get_code_extractor() + self.langextract_extractor = ( + langextract_extractor or get_langextract_extractor() + ) self._last_build_time: datetime | None = None self._last_triplet_count: int = 0 @@ -169,7 +183,7 @@ def _extract_from_document(self, doc: Any) -> list[GraphTriple]: source_type = metadata.get("source_type", "doc") language = metadata.get("language") - # 1. Extract from code metadata (fast, deterministic) + # 1. Extract from code metadata (fast, deterministic — code chunks only) if source_type == "code" and settings.GRAPH_USE_CODE_METADATA: code_triplets = self.code_extractor.extract_from_metadata( metadata, source_chunk_id=chunk_id @@ -183,8 +197,19 @@ def _extract_from_document(self, doc: Any) -> list[GraphTriple]: ) triplets.extend(text_triplets) - # 2. Extract using LLM (slower, more comprehensive) - if settings.GRAPH_USE_LLM_EXTRACTION and text: + # 2. Extract from document chunks using LangExtract (multi-provider) + if ( + text + and source_type != "code" + and settings.GRAPH_DOC_EXTRACTOR == "langextract" + ): + doc_triplets = self.langextract_extractor.extract_triplets( + text, source_chunk_id=chunk_id + ) + triplets.extend(doc_triplets) + + # 3. 
Legacy LLM extraction (Anthropic-only fallback) + elif settings.GRAPH_USE_LLM_EXTRACTION and text and source_type != "code": llm_triplets = self.llm_extractor.extract_triplets( text, source_chunk_id=chunk_id ) diff --git a/agent-brain-server/tests/unit/test_graph_extractors.py b/agent-brain-server/tests/unit/test_graph_extractors.py index 7b1aae6..0f1e60e 100644 --- a/agent-brain-server/tests/unit/test_graph_extractors.py +++ b/agent-brain-server/tests/unit/test_graph_extractors.py @@ -6,8 +6,10 @@ from agent_brain_server.indexing.graph_extractors import ( CodeMetadataExtractor, + LangExtractExtractor, LLMEntityExtractor, get_code_extractor, + get_langextract_extractor, get_llm_extractor, reset_extractors, ) @@ -492,6 +494,226 @@ def test_extract_normalizes_class_type(self, mock_settings: MagicMock): assert defined_in[0].subject_type == "Class" # Not "class" +class TestLangExtractExtractor: + """Tests for LangExtractExtractor.""" + + def test_init_with_defaults(self): + """Test initialization resolves provider from settings.""" + extractor = LangExtractExtractor() + + # provider should be resolved (non-empty string) + assert extractor.provider is not None + assert isinstance(extractor.provider, str) + assert extractor.max_triplets > 0 + + def test_init_with_explicit_params(self): + """Test initialization with explicit provider and model.""" + extractor = LangExtractExtractor( + provider="openai", + model="gpt-4o", + max_triplets=5, + ) + + assert extractor.provider == "openai" + assert extractor.model == "gpt-4o" + assert extractor.max_triplets == 5 + + @patch("agent_brain_server.indexing.graph_extractors.settings") + def test_extract_triplets_disabled(self, mock_settings: MagicMock): + """Test extraction is no-op when graph indexing disabled.""" + mock_settings.ENABLE_GRAPH_INDEX = False + mock_settings.GRAPH_DOC_EXTRACTOR = "langextract" + + extractor = LangExtractExtractor(provider="ollama") + result = extractor.extract_triplets("Some document text") + + assert 
result == [] + + @patch("agent_brain_server.indexing.graph_extractors.settings") + def test_extract_triplets_none_extractor(self, mock_settings: MagicMock): + """Test extraction is no-op when GRAPH_DOC_EXTRACTOR=none.""" + mock_settings.ENABLE_GRAPH_INDEX = True + mock_settings.GRAPH_DOC_EXTRACTOR = "none" + + extractor = LangExtractExtractor(provider="ollama") + result = extractor.extract_triplets("Some document text") + + assert result == [] + + @patch("agent_brain_server.indexing.graph_extractors.settings") + def test_extract_triplets_empty_text(self, mock_settings: MagicMock): + """Test extraction returns empty list for empty text.""" + mock_settings.ENABLE_GRAPH_INDEX = True + mock_settings.GRAPH_DOC_EXTRACTOR = "langextract" + + extractor = LangExtractExtractor(provider="ollama") + result = extractor.extract_triplets("") + + assert result == [] + + @patch("agent_brain_server.indexing.graph_extractors.settings") + def test_extract_triplets_langextract_not_installed(self, mock_settings: MagicMock): + """Test graceful degradation when langextract is not installed.""" + mock_settings.ENABLE_GRAPH_INDEX = True + mock_settings.GRAPH_DOC_EXTRACTOR = "langextract" + + extractor = LangExtractExtractor(provider="ollama") + + with patch.dict("sys.modules", {"langextract": None}): + result = extractor.extract_triplets("FastAPI uses Pydantic for validation") + + assert result == [] + + @patch("agent_brain_server.indexing.graph_extractors.settings") + def test_convert_relations_produces_correct_triplets( + self, mock_settings: MagicMock + ): + """Test _convert_relations produces correct GraphTriple list. + + Tests conversion directly rather than mocking langextract's lazy import, + which is separately covered by the graceful-degradation tests. 
+ """ + mock_settings.ENABLE_GRAPH_INDEX = True + mock_settings.GRAPH_DOC_EXTRACTOR = "langextract" + + extractor = LangExtractExtractor(provider="ollama") + + relations = [ + { + "subject": "FastAPI", + "relation": "uses", + "object": "Pydantic", + "subject_type": "Framework", + "object_type": "Library", + }, + { + "subject": "Agent Brain", + "relation": "depends_on", + "object": "ChromaDB", + }, + ] + + triplets = extractor._convert_relations(relations, source_chunk_id="doc_1") + + assert len(triplets) == 2 + assert triplets[0].subject == "FastAPI" + assert triplets[0].predicate == "uses" + assert triplets[0].object == "Pydantic" + assert triplets[0].source_chunk_id == "doc_1" + assert triplets[1].subject == "Agent Brain" + assert triplets[1].predicate == "depends_on" + assert triplets[1].object == "ChromaDB" + + def test_convert_relations_dict_format(self): + """Test _convert_relations handles dict-style relations.""" + extractor = LangExtractExtractor(provider="ollama") + + relations = [ + { + "subject": "FastAPI", + "relation": "uses", + "object": "Pydantic", + "subject_type": "Framework", + "object_type": "Library", + } + ] + + triplets = extractor._convert_relations(relations, source_chunk_id="c1") + + assert len(triplets) == 1 + assert triplets[0].subject == "FastAPI" + assert triplets[0].predicate == "uses" + assert triplets[0].object == "Pydantic" + assert triplets[0].source_chunk_id == "c1" + + def test_convert_relations_head_tail_format(self): + """Test _convert_relations handles head/tail dict format.""" + extractor = LangExtractExtractor(provider="ollama") + + relations = [ + { + "head": "Agent Brain", + "predicate": "depends_on", + "tail": "ChromaDB", + } + ] + + triplets = extractor._convert_relations(relations, source_chunk_id=None) + + assert len(triplets) == 1 + assert triplets[0].subject == "Agent Brain" + assert triplets[0].predicate == "depends_on" + assert triplets[0].object == "ChromaDB" + + def test_convert_relations_object_format(self): + 
"""Test _convert_relations handles object-style relations.""" + extractor = LangExtractExtractor(provider="ollama") + + mock_rel = MagicMock() + mock_rel.subject = "Service" + mock_rel.relation = "calls" + mock_rel.object = "Database" + mock_rel.subject_type = "Class" + mock_rel.object_type = "Database" + # Remove head/tail attributes + del mock_rel.head + del mock_rel.tail + + triplets = extractor._convert_relations([mock_rel], source_chunk_id=None) + + assert len(triplets) == 1 + assert triplets[0].subject == "Service" + assert triplets[0].predicate == "calls" + assert triplets[0].object == "Database" + + def test_convert_relations_skips_incomplete(self): + """Test _convert_relations skips relations with missing fields.""" + extractor = LangExtractExtractor(provider="ollama") + + relations = [ + {"subject": "", "relation": "uses", "object": "Pydantic"}, + {"subject": "FastAPI", "relation": "", "object": "Pydantic"}, + {"subject": "FastAPI", "relation": "uses", "object": ""}, + ] + + triplets = extractor._convert_relations(relations, source_chunk_id=None) + + assert triplets == [] + + def test_convert_relations_empty_input(self): + """Test _convert_relations returns empty list for empty input.""" + extractor = LangExtractExtractor(provider="ollama") + + assert extractor._convert_relations([], source_chunk_id=None) == [] + assert extractor._convert_relations(None, source_chunk_id=None) == [] # type: ignore[arg-type] + + @patch("agent_brain_server.indexing.graph_extractors.settings") + def test_extract_triplets_handles_exception(self, mock_settings: MagicMock): + """Test extraction returns empty list on unexpected exceptions.""" + import sys + + mock_settings.ENABLE_GRAPH_INDEX = True + mock_settings.GRAPH_DOC_EXTRACTOR = "langextract" + mock_settings.GRAPH_MAX_TRIPLETS_PER_CHUNK = 10 + + extractor = LangExtractExtractor(provider="ollama", max_triplets=10) + + mock_langextract = MagicMock() + mock_langextract.extract_relations.side_effect = RuntimeError("provider 
error") + + original = sys.modules.get("langextract") + sys.modules["langextract"] = mock_langextract + try: + result = extractor.extract_triplets("some text") + finally: + if original is None: + sys.modules.pop("langextract", None) + else: + sys.modules["langextract"] = original + + assert result == [] + + class TestModuleFunctions: """Tests for module-level convenience functions.""" @@ -509,6 +731,29 @@ def test_get_code_extractor_singleton(self): assert extractor1 is extractor2 + def test_get_langextract_extractor_singleton(self): + """Test get_langextract_extractor returns singleton.""" + extractor1 = get_langextract_extractor() + extractor2 = get_langextract_extractor() + + assert extractor1 is extractor2 + + def test_reset_extractors_clears_all(self): + """Test reset_extractors clears all three singletons.""" + llm1 = get_llm_extractor() + code1 = get_code_extractor() + langextract1 = get_langextract_extractor() + + reset_extractors() + + llm2 = get_llm_extractor() + code2 = get_code_extractor() + langextract2 = get_langextract_extractor() + + assert llm1 is not llm2 + assert code1 is not code2 + assert langextract1 is not langextract2 + def test_reset_extractors(self): """Test reset_extractors clears singletons.""" extractor1 = get_llm_extractor()