mcp-name: io.github.oxgeneral/agentmem
Lightweight persistent memory for AI agents. One SQLite file. Hybrid search (keywords + semantics). Zero to 12MB install.
No PyTorch. No cloud. No server. Just memory.
206 unit tests. 107 quality tests on real data. Typed API (16 TypedDict types). Production-ready.
Built by an AI agent that wakes up with no memory every session — and needed a way to remember.
Every AI agent session starts from zero. Context windows compress, conversations end, memory vanishes. agentmem gives agents persistent memory that survives across sessions — in a single SQLite file.
- Hybrid search: FTS5 full-text keywords + vector semantic search, fused with adaptive ranking
- 4 operational modes: from zero dependencies (stdlib only) to best quality (12MB)
- 16 MCP tools: recall, remember, save_state, compact, consolidate, entities, and more
- HTTP REST API: 14 endpoints, zero-dependency server, CORS-ready
- 5 memory tiers: core, learned, episodic, working (auto-expires), procedural (behavioral rules)
- Namespaces: multi-user, multi-agent memory isolation
- Temporal versioning: fact evolution chains with supersedes tracking
- Entity extraction: auto-extracts @mentions, URLs, IPs, env vars, money amounts
- Conversation extraction: auto-extracts facts, decisions, TODOs from chat history
- Importance scoring: auto-scores memories by tier, length, specificity, structure
- Memory consolidation: finds and merges near-duplicate memories
- Recency boost: newer memories rank higher with configurable decay
- Multilingual: Russian keywords via FTS5, English semantics via embeddings
- Fast: <1ms/query hybrid search, <5ms cold start, <0.2ms/chunk import
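As a rough illustration of how hybrid ranking and the recency boost could combine, here is a sketch with invented names (`fuse_scores`, `half_life_days`) and assumed equal keyword/semantic weights — not the library's actual scorer:

```python
import math

def recency_factor(age_days: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: a memory loses half its recency weight every half-life."""
    return 0.5 ** (age_days / half_life_days)

def fuse_scores(bm25: float, cosine: float, age_days: float,
                recency_weight: float = 0.15) -> float:
    """Blend keyword (BM25) and semantic (cosine) scores, then boost newer memories."""
    base = 0.5 * bm25 + 0.5 * cosine  # equal-weight fusion (illustrative)
    return (1 - recency_weight) * base + recency_weight * recency_factor(age_days)

# A fresh memory outranks an old one with the same relevance:
fresh = fuse_scores(bm25=0.8, cosine=0.7, age_days=1)
stale = fuse_scores(bm25=0.8, cosine=0.7, age_days=365)
assert fresh > stale
```

The `recency_weight=0.15` default mirrors the `recall(..., recency_weight=0.15)` call shown in the quickstart.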
# Best quality (sqlite-vec + model2vec, 12MB total)
pip install "agentmem-lite[all]"
# Minimal (sqlite-vec + hash embeddings, 151KB)
pip install agentmem-lite
# Zero dependencies (pure Python, stdlib only)
pip install agentmem-lite --no-deps
# From source
git clone https://github.com/oxgeneral/agentmem && cd agentmem
pip install -e ".[all]"

from agentmem import MemoryStore, get_embedding_model
# Auto-selects best available backend
embed = get_embedding_model()
store = MemoryStore("memory.db", embedding_dim=embed.dim)
store.set_embed_fn(embed)
# Store memories with namespaces
store.remember("Server costs $50/month", tier="core", namespace="infra")
store.remember("API returns 403 without auth", tier="learned", namespace="api")
store.remember("Deployed v2.1 at 15:30", tier="episodic")
# Search — hybrid keyword + semantic, with recency boost
results = store.recall("server costs", recency_weight=0.15)
# Namespace isolation
results = store.recall("server", namespace="infra")
# Save working state before context compression
store.save_state("Working on auth fix, step 3/5, blocked by CORS")
# Add behavioral rules (procedural memory)
store.add_procedure("Always use HTTPS in production")
store.add_procedure("Never expose debug endpoints")
rules = store.get_procedures() # → formatted for system prompt
# Update facts with version chain
store.update_memory(old_id=1, new_content="Server costs $75/month")
history = store.history(memory_id=2) # → trace fact evolution
# Find related memories by entity
related = store.related("10.0.0.1") # → all memories mentioning this IP
entities = store.entities(entity_type="ip") # → list all known IPs
# Auto-extract from conversations
messages = [
{"role": "user", "content": "Set API_KEY to sk-abc123. Always validate input."},
{"role": "assistant", "content": "Noted. I decided to use pydantic for validation."},
]
result = store.process_conversation(messages, namespace="project")
# → extracts config, preferences, decisions automatically
# Maintenance
store.compact(max_age_days=90) # archive old low-value memories
store.consolidate(similarity_threshold=0.85) # merge near-duplicates
# Import markdown files
store.import_markdown("MEMORY.md", tier="core")

# Initialize database
agentmem init --db memory.db
# Import markdown files
agentmem import MEMORY.md --tier core -n my-agent
agentmem import-dir ./daily-logs/ --tier episodic
# Search with namespace filter
agentmem search "deployment process" --limit 5 -n infra
# Manage procedures
agentmem add-procedure "Always use markdown formatting"
agentmem procedures
# View entities and relations
agentmem entities --type ip
agentmem related 10.0.0.1
# Maintenance
agentmem compact --max-age-days 90 --dry-run
agentmem consolidate --threshold 0.85
# Process conversation
agentmem process chat.json -n project
# Stats and export
agentmem stats
agentmem export --tier core

# Run as an MCP server
python -m agentmem --db memory.db

Add to your MCP client config:
{
"mcpServers": {
"memory": {
"command": "python",
"args": ["-m", "agentmem", "--db", "/path/to/memory.db"]
}
}
}

# Start HTTP server
agentmem serve-http --port 8422
# Or directly
agentmem-http --port 8422 --db memory.db

# Store a memory
curl -X POST http://localhost:8422/remember \
-H "Content-Type: application/json" \
-d '{"content": "Server IP is 10.0.0.1", "tier": "core", "namespace": "infra"}'
# Search
curl "http://localhost:8422/recall?query=server+IP&namespace=infra"
# Health check
curl http://localhost:8422/health

16 MCP tools / 14 HTTP endpoints:
| Tool | HTTP | Description |
|---|---|---|
| recall | GET /recall | Hybrid keyword + semantic search |
| remember | POST /remember | Store a new memory |
| save_state | POST /save_state | Emergency save before context compression |
| today | GET /today | Get all memories from today |
| forget | POST /forget | Archive a memory (soft delete) |
| unarchive | POST /unarchive | Restore an archived memory |
| stats | GET /stats | Memory statistics and health |
| compact | POST /compact | Archive low-value memories |
| consolidate | POST /consolidate | Merge near-duplicate memories |
| update_memory | POST /update_memory | Replace a memory with version chain |
| history | GET /history | Trace fact version history |
| related | GET /related | Find memories by entity |
| entities | GET /entities | List all extracted entities |
| get_procedures | — | Get behavioral rules for system prompt |
| add_procedure | — | Add a behavioral rule |
| process_conversation | — | Auto-extract from chat history |
| Tier | Purpose | Auto-compacted | Example |
|---|---|---|---|
| core | Permanent facts | Never | "Server IP is 10.0.0.1" |
| procedural | Behavioral rules | Never | "Always use HTTPS" |
| learned | Discovered knowledge | After 90 days | "API returns 403 without auth" |
| episodic | Events | After 90 days | "Deployed v2.1 at 15:30" |
| working | Current task state | After 24 hours | "Working on step 3/5" |
Isolate memories per user, agent, or project:
# Store in namespaces
store.remember("Alice's API key", namespace="user/alice")
store.remember("Bob's config", namespace="user/bob")
store.remember("Shared fact", namespace="team")
# Search within namespace (prefix matching)
store.recall("API", namespace="user/alice") # only Alice's memories
store.recall("API", namespace="user") # Alice + Bob (prefix match)
store.recall("API") # everything

Track how facts evolve over time:
# Initial fact
r1 = store.remember("Server costs $50/month", tier="core")
# Fact changes — old version archived, linked via supersedes
r2 = store.update_memory(r1["id"], "Server costs $75/month")
# Trace the history
history = store.history(r2["id"])
# → [{"id": 2, "content": "...$75..."}, {"id": 1, "content": "...$50..."}]

Automatic regex-based NER on every remember() call:
| Type | Pattern | Example |
|---|---|---|
| mention | @username | @alice |
| url | https://... | https://api.example.com |
| ip | N.N.N.N | 10.0.0.1 |
| port | :NNNN | :8080 |
| email | user@domain | admin@example.com |
| env_var | ALL_CAPS | OPENAI_API_KEY |
| money | $NNN | $50 |
| path | /unix/path | /etc/nginx/conf.d |
| hashtag | #tag | #deployment |
# Find all memories mentioning an entity
store.related("10.0.0.1")
store.related("@alice", entity_type="mention")
# List all known entities
store.entities() # sorted by memory count
store.entities(entity_type="ip")

Auto-extract memories from chat history (regex-only, no LLM):
messages = [
{"role": "user", "content": "Set DATABASE_URL to postgres://localhost/mydb"},
{"role": "assistant", "content": "I decided to use connection pooling. Important: max 20 connections."},
{"role": "user", "content": "Always validate input. TODO: add rate limiting."},
]
result = store.process_conversation(messages)
# Extracts: config→core, decisions→episodic, preferences→procedural, todos→working, important→core

agentmem automatically selects the best available mode:
| Mode | Install Size | Init Time | Query Time | Dependencies |
|---|---|---|---|---|
| sqlite-vec + model2vec | 12 MB | ~5ms* | ~1ms | sqlite-vec, model2vec, numpy |
| sqlite-vec + hash | 151 KB | ~5ms | ~0.8ms | sqlite-vec |
| pure Python + hash | 0 KB | ~3ms | ~1.8ms | none (stdlib only) |
| pure + int8 quantize | 0 KB | ~3ms | ~3ms | none (stdlib only) |
*With lazy loading — model2vec loads on first query, not on init
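The zero-dependency modes imply embeddings computed without any ML library. One stdlib-only approach is feature hashing of tokens — a sketch of the general idea, not agentmem's actual hash embedding:

```python
import hashlib
import math
import re

def hash_embed(text: str, dim: int = 256) -> list[float]:
    """Feature-hash tokens into a fixed-size vector, then L2-normalize."""
    vec = [0.0] * dim
    for token in re.findall(r"\w+", text.lower()):
        digest = hashlib.md5(token.encode()).digest()
        index = int.from_bytes(digest[:4], "little") % dim  # bucket for this token
        sign = 1.0 if digest[4] % 2 == 0 else -1.0          # sign reduces collision bias
        vec[index] += sign
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cosine(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

# Overlapping vocabulary yields higher similarity than unrelated text:
a = hash_embed("server costs fifty dollars")
b = hash_embed("monthly server costs")
c = hash_embed("completely unrelated gardening tips")
assert cosine(a, b) > cosine(a, c)
```

No model weights are needed, which is how an install can stay at 0 KB; the trade-off is that similarity is purely lexical, with no true semantics.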
┌──────────────────────────────────────────────┐
│ MemoryStore │
│ ┌──────────┐ ┌──────────┐ ┌────────────┐ │
│ │ FTS5 │ │ Vector │ │ Entity │ │
│ │ keywords │ │ Index │ │ Index │ │
│ │ + BM25 │ │ cosine │ │ regex NER │ │
│ └────┬─────┘ └────┬─────┘ └─────┬──────┘ │
│ └──────┬───────┘ │ │
│ Adaptive Hybrid Scorer │ │
│ (query classify + recency + │ │
│ importance boost) │ │
│ ┌──────────────────────────────────┴───────┐ │
│ │ SQLite + WAL │ │
│ │ memories │ memories_fts │ vecs │ entities│ │
│ └──────────────────────────────────────────┘ │
│ One file: memory.db │
└──────────────────────────────────────────────┘
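The single-file layout in the diagram can be sketched with stdlib sqlite3; the schema below is an assumption based on the diagram's table names, not the package's actual DDL:

```python
import sqlite3

# One file holds everything; WAL allows concurrent readers during writes.
con = sqlite3.connect("memory.db")
con.execute("PRAGMA journal_mode=WAL")

con.execute("""
    CREATE TABLE IF NOT EXISTS memories (
        id INTEGER PRIMARY KEY,
        content TEXT NOT NULL,
        tier TEXT DEFAULT 'learned',
        namespace TEXT DEFAULT ''
    )
""")
# FTS5 virtual table gives BM25-ranked keyword search over the same text
con.execute("CREATE VIRTUAL TABLE IF NOT EXISTS memories_fts USING fts5(content)")

con.execute("INSERT INTO memories (content, tier) VALUES (?, ?)",
            ("Server IP is 10.0.0.1", "core"))
con.execute("INSERT INTO memories_fts (content) VALUES (?)",
            ("Server IP is 10.0.0.1",))
con.commit()

hits = con.execute(
    "SELECT content FROM memories_fts WHERE memories_fts MATCH 'server' "
    "ORDER BY bm25(memories_fts)"
).fetchall()
```

The vector and entity indexes would be additional tables in the same file, which is what keeps backup and portability down to copying one `memory.db`.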
| Feature | agentmem | ChromaDB | LanceDB | mem0 | Zep |
|---|---|---|---|---|---|
| Install size | 0-12 MB | 400+ MB | 100+ MB | 500+ MB | Cloud |
| Cold start | 3-5 ms | seconds | seconds | seconds | N/A |
| PyTorch required | No | Yes | No | Yes | N/A |
| Cloud required | No | No | No | Yes | Yes |
| Zero-dep mode | Yes | No | No | No | No |
| Keyword search | FTS5 (BM25) | No | No | No | Yes |
| MCP server | 16 tools | No | No | Yes | No |
| HTTP API | Built-in | Yes | No | Yes | Yes |
| Single file DB | Yes | No | Yes | No | No |
| Namespaces | Yes | Yes | Yes | Yes | Yes |
| Temporal versioning | Yes | No | Yes | No | Yes |
| Entity extraction | Auto (regex) | No | No | No | No |
| Procedural memory | Yes | No | No | No | No |
| Importance scoring | Auto | No | No | No | No |
| Conversation extraction | Auto (regex) | No | No | Yes (LLM) | Yes (LLM) |
| Memory consolidation | Yes | No | No | Yes (LLM) | No |
- 206 unit tests covering core CRUD, namespaces, temporal versioning, entity extraction, consolidation, WAL management, HTTP server, error handling
- 107 quality tests against real-world agent memory data (100 search queries across 10 categories, all passing)
- Benchmark suite with reproducible numbers: <1ms hybrid query, 10K+ inserts/sec, ~835 bytes/memory
- Auto-translation for multilingual queries (Russian → English via deep-translator), lifting the multilingual quality score from 4/10 to 10/10
- Python 3.10, 3.11, 3.12
MIT