Code intelligence engine for LLMs, AI agents, and RAG pipelines
Understands your codebase. Ranks what matters. Generates context that makes AI actually useful.
```bash
npm install -g infiniloom
```

Also available via `brew install infiniloom`, `cargo install infiniloom`, or `pip install infiniloom`.
Infiniloom parses your codebase with Tree-sitter, builds a dependency graph, ranks symbols with PageRank, and generates structured context that fits your model's token budget. It's not a file concatenator — it understands code.
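The ranking step above can be sketched as plain power iteration over the symbol graph. The toy graph and helper below are illustrative only; the 0.85 damping factor is the one the docs cite, everything else is an assumption about how such a ranker could look:

```python
# Minimal PageRank sketch (power iteration, damping 0.85) over a toy symbol graph.
# An edge A -> B means "A references B"; rank flows toward widely-used symbols.
def pagerank(edges: dict[str, list[str]], damping: float = 0.85, iters: int = 50) -> dict[str, float]:
    nodes = set(edges) | {t for ts in edges.values() for t in ts}
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        nxt = {n: (1.0 - damping) / len(nodes) for n in nodes}
        for src, targets in edges.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling node: distribute its rank evenly so mass is conserved.
                for n in nodes:
                    nxt[n] += damping * rank[src] / len(nodes)
        rank = nxt
    return rank

graph = {"main": ["auth", "db"], "auth": ["db"], "db": []}
ranks = pagerank(graph)
assert ranks["db"] > ranks["auth"] > ranks["main"]  # db is referenced most, so it ranks highest
```

In a real run the graph comes from imports, calls, and inheritance rather than a hand-written dict, and low-signal symbols (generic accessors) are filtered before ranking.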
```bash
# Understand what's in a repo
infiniloom scan .
# → 847 files, 12 languages, 142K tokens (Claude), 6.2s

# Generate ranked, compressed context for Claude
infiniloom pack . -o context.xml
# → XML with symbol map, dependency graph, ranked files, git history

# Generate context for GPT with exact tiktoken counts
infiniloom pack . -f markdown -m gpt4o --max-tokens 80000 -o ctx.md

# See what depends on a file before you change it
infiniloom index . && infiniloom impact . src/auth.rs
# → 12 files depend on auth.rs, 3 test files, call graph with 47 edges

# Get diff context that includes callers and callees, not just changed lines
infiniloom diff --staged --include-diff --depth 2 -o review.xml

# Generate chunks for your vector database
infiniloom embed . -o chunks.jsonl
# → 1,203 content-addressable chunks with call graphs, tags, and signatures
```

| Capability | What It Actually Does |
|---|---|
| AST parsing | Tree-sitter super-queries extract symbols, signatures, docstrings, and call relationships in a single pass across 23 languages |
| PageRank ranking | Builds a symbol graph from imports/calls/inheritance, runs PageRank (damping 0.85, parallel for >100 nodes), filters generic accessors |
| Smart diff expansion | Classifies changes (deletion 1.5x, signature 1.3x, docs 0.3x) and expands context proportionally — deletions get more callers included |
| Content-addressable chunks | BLAKE3 hashing with Unicode NFC normalization. Same code anywhere = same ID. Enables cross-repo deduplication and incremental RAG updates |
| Exact token counting | tiktoken for all OpenAI models (o200k, cl100k). Calibrated estimation (~95% prose, ~85% code) for Claude, Gemini, Llama, and 20+ others |
| Security scanning | 30+ regex patterns detect AWS keys, GitHub tokens, private keys, JWTs, database URLs, Slack webhooks. NFKC normalization catches homoglyph attacks |
| Cache-optimized output | XML output marks cacheable vs dynamic sections for Claude prompt caching |
| Document distillation | 5-stage pipeline (strip, dedup, compress, score, arrange) grounded in LLMLingua research showing 17-21% accuracy improvement |
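The content-addressing row above can be sketched in a few lines. Infiniloom uses BLAKE3; `sha256` stands in here since the point being illustrated is the Unicode NFC normalization, and the `chunk_id` helper is hypothetical:

```python
import hashlib
import unicodedata

def chunk_id(code: str) -> str:
    # Sketch only: the real tool hashes with BLAKE3, not SHA-256.
    # NFC normalization means visually identical code hashes identically.
    normalized = unicodedata.normalize("NFC", code)
    return "ec_" + hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

# The same logical code yields the same ID regardless of Unicode encoding form:
composed = "fn café() {}"          # é as a single code point (U+00E9)
decomposed = "fn cafe\u0301() {}"  # e followed by a combining accent
assert chunk_id(composed) == chunk_id(decomposed)
```

Deriving the ID purely from normalized content is what makes cross-repo deduplication work: the same function vendored into two repositories gets one chunk ID.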
Works with Claude Code, Codex, Gemini CLI, OpenCode, and any terminal-based AI tool:

```bash
# Feed structured context to Claude Code
infiniloom pack . -f xml --redact-secrets | claude "Review this codebase for security issues"

# PR review with dependency context for Codex
infiniloom diff main..feature --include-diff --depth 2 -f markdown | codex "Review these changes"

# Check blast radius before asking any agent to refactor
infiniloom impact . src/core/parser.rs --depth 3 --call-graph
```

Anthropic Claude SDK (`pip install infiniloom anthropic`):
```python
import infiniloom
from anthropic import Anthropic

# Use the Python binding directly — no subprocess needed
context = infiniloom.pack(".", format="xml", model="claude", compression="balanced")

client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=4096,
    messages=[{"role": "user", "content": f"{context}\n\nExplain the authentication flow."}],
)
```

Vercel AI SDK / Mastra (`npm install infiniloom-node ai @ai-sdk/anthropic`):
```typescript
import { pack, scan, embed } from 'infiniloom-node';
import { streamText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

// Use the Node.js binding — native Rust speed, no child process
const context = pack('.', { format: 'xml', model: 'claude', tokenBudget: 60000 });

const result = streamText({
  model: anthropic('claude-sonnet-4-20250514'),
  system: `Codebase context:\n${context}`,
  messages,
});
```

LangChain / LlamaIndex (`pip install infiniloom`):
```python
import infiniloom
import json

# Generate chunks using the Python binding
chunks_jsonl = infiniloom.embed(".", max_tokens=1000)

# Parse and load into your vector DB
for line in chunks_jsonl.strip().split("\n"):
    chunk = json.loads(line)
    # chunk has: id, content, kind, source.symbol, context.calls, context.tags
    upsert_to_vector_db(chunk["id"], chunk["content"], chunk)
```

Infiniloom will expose `pack`, `map`, `diff`, `impact`, and `embed` as MCP tools — making them available to Claude Desktop, Claude Code, Codex, and any MCP-compatible client without custom integration.
```yaml
# .github/workflows/ai-review.yml
- run: infiniloom diff origin/main..HEAD --include-diff --redact-secrets -o context.xml
- run: infiniloom embed . --diff -o updated-chunks.jsonl  # Incremental RAG update
```

| Command | What It Does |
|---|---|
| `pack` | Generate AI-ready context (XML, Markdown, YAML, TOON, JSON) |
| `scan` | Repository statistics — files, tokens across 27 models, languages |
| `map` | PageRank-ranked symbol overview of the most important code |
| `embed` | Content-addressable chunks for vector databases and RAG |
| `diff` | Context for code changes — includes callers, callees, and related tests |
| `index` | Build symbol index and dependency graph (powers `diff` and `impact`) |
| `impact` | Analyze what depends on a file or symbol — blast radius with call graph |
| `chunk` | Split repo into token-budgeted chunks for multi-turn conversations |
| `ingest` | Convert documents (Markdown, HTML, CSV, DOCX, XLSX) with PII redaction |
| `init` | Create `.infiniloom.yaml` with language-specific templates |
| `info` | Show supported models, formats, and configuration |
Output formats: XML (Claude) · Markdown (GPT/Codex) · YAML (Gemini) · TOON (~40% smaller) · JSON (pipelines)
`embed` generates deterministic, AST-aware chunks designed for retrieval:
```json
{
  "id": "ec_a1b2c3d4e5f6g7h8",
  "content": "async fn authenticate(token: &str) -> Result<User, AuthError> {...}",
  "tokens": 245,
  "kind": "function",
  "source": { "file": "src/auth.rs", "symbol": "authenticate", "fqn": "src::auth::authenticate" },
  "context": {
    "signature": "async fn authenticate(token: &str) -> Result<User, AuthError>",
    "calls": ["verify_jwt", "find_user_by_id"],
    "called_by": ["login_handler", "refresh_token"],
    "tags": ["async", "security", "public-api"],
    "cyclomatic_complexity": 4
  }
}
```

Key properties: Content-addressable IDs (BLAKE3) · AST-aware boundaries · Incremental manifest diffing · Call graph context · Hierarchical parent-child linking · Auto-generated semantic tags · Streaming mode for large repos · pgvector/Neptune export
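Because the IDs are content-addressable, incremental manifest diffing reduces to set differences: an edited function gets a new ID, so comparing old and new ID sets is enough to compute the RAG update. A hypothetical sketch (the function and IDs below are illustrative, not Infiniloom's manifest format):

```python
# Sketch: diff two chunk-ID manifests to get the incremental vector-DB update.
def manifest_diff(old_ids: set[str], new_ids: set[str]) -> dict[str, list[str]]:
    return {
        "upsert": sorted(new_ids - old_ids),  # new or edited chunks (edits change the ID)
        "delete": sorted(old_ids - new_ids),  # chunks whose code was removed or rewritten
    }

old = {"ec_aaa", "ec_bbb", "ec_ccc"}
new = {"ec_aaa", "ec_ccc", "ec_ddd"}
diff = manifest_diff(old, new)
# diff["upsert"] == ["ec_ddd"]; diff["delete"] == ["ec_bbb"]
```

Unchanged chunks fall out of both sets, so re-embedding cost scales with the size of the change rather than the size of the repo.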
Works with Pinecone, Weaviate, Qdrant, ChromaDB, pgvector, Milvus — or any system that accepts JSONL.
| Metric | Value |
|---|---|
| 100 files | ~400ms |
| 5,000 files | ~8 seconds |
| Languages | 23 (Tree-sitter AST) |
| Tokenizers | 27 models (exact tiktoken for OpenAI, calibrated for rest) |
| Secret patterns | 30+ (AWS, GitHub, OpenAI, Stripe, SSH keys, JWTs, DB strings, ...) |
| Parallelism | Thread-local parsers, zero mutex contention (Rayon) |
| Docs | Links |
|---|---|
| Getting Started | Quick Start · Installation · Configuration |
| Integration Guides | Claude Code · Codex / GPT · CI/CD |
| Reference | All Commands · Full Reference · Recipes |
| Deep Dives | Languages (23) · Tokenizers (27) · Output Formats |
| Support | FAQ · Troubleshooting · Large Repos |
| API Bindings | Python · Node.js |
- Found a bug? Open an issue
- Have an idea? Start a discussion
- Want to contribute? See CONTRIBUTING.md
```bash
cargo test --workspace && cargo clippy --workspace && cargo fmt --all
```

MIT — see LICENSE.