diff --git a/docs/sessions/memory.md b/docs/sessions/memory.md index 51d846b79..7b175e441 100644 --- a/docs/sessions/memory.md +++ b/docs/sessions/memory.md @@ -20,17 +20,17 @@ The `BaseMemoryService` defines the interface for managing this searchable, long ## Choosing the Right Memory Service -The ADK offers two distinct `MemoryService` implementations, each tailored to different use cases. Use the table below to decide which is the best fit for your agent. - -| **Feature** | **InMemoryMemoryService** | **VertexAiMemoryBankService** | -| :--- | :--- | :--- | -| **Persistence** | None (data is lost on restart) | Yes (Managed by Vertex AI) | -| **Primary Use Case** | Prototyping, local development, and simple testing. | Building meaningful, evolving memories from user conversations. | -| **Memory Extraction** | Stores full conversation | Extracts [meaningful information](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/generate-memories) from conversations and consolidates it with existing memories (powered by LLM) | -| **Search Capability** | Basic keyword matching. | Advanced semantic search. | -| **Setup Complexity** | None. It's the default. | Low. Requires an [Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview) instance in Vertex AI. | -| **Dependencies** | None. | Google Cloud Project, Vertex AI API | -| **When to use it** | When you want to search across multiple sessions’ chat histories for prototyping. | When you want your agent to remember and learn from past interactions. | +The ADK offers three distinct `MemoryService` implementations, each tailored to different use cases. Use the table below to decide which is the best fit for your agent. 
+ +| **Feature** | **InMemoryMemoryService** | **VertexAiMemoryBankService** | **OpenMemoryService** | +| :--- | :--- | :--- | :--- | +| **Persistence** | None (data is lost on restart) | Yes (Managed by Vertex AI) | Yes (Self-hosted backend) | +| **Primary Use Case** | Prototyping, local development, and simple testing. | Building meaningful, evolving memories from user conversations. | Self-hosted or on-premise deployments with data-sovereignty or cost requirements. | +| **Memory Extraction** | Stores full conversation | Extracts [meaningful information](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/generate-memories) from conversations and consolidates it with existing memories (powered by LLM) | Stores session events with multi-sector embeddings and graceful decay | +| **Search Capability** | Basic keyword matching. | Advanced semantic search. | Advanced semantic search with multi-sector embeddings. | +| **Setup Complexity** | None. It's the default. | Low. Requires an [Agent Engine](https://cloud.google.com/vertex-ai/generative-ai/docs/agent-engine/memory-bank/overview) instance in Vertex AI. | Medium. Requires a self-hosted OpenMemory backend (Docker or Node.js). | +| **Dependencies** | None. | Google Cloud Project, Vertex AI API | Self-hosted OpenMemory server, `httpx` (via `google-adk[openmemory]`) | +| **When to use it** | When you want to search across multiple sessions' chat histories for prototyping. | When you want your agent to remember and learn from past interactions. | When you need self-hosted, open-source memory with full data control, or a cost-effective alternative to cloud services. 
| ## In-Memory Memory @@ -194,6 +194,232 @@ runner = adk.Runner( ) ``` +## OpenMemory + +The `OpenMemoryService` connects your agent to [OpenMemory](https://openmemory.cavira.app/), a self-hosted, open-source memory system that provides brain-inspired multi-sector embeddings, graceful memory decay, and server-side filtering for efficient multi-user agent deployments. + +### How It Works + +OpenMemory provides a production-ready, self-hosted memory backend that implements ADK's `BaseMemoryService` interface. The service handles two key operations: + +* **Storing Memories:** Automatically converts ADK session events to OpenMemory memories, using an enriched content format that embeds author and timestamp metadata. +* **Retrieving Memories:** Leverages OpenMemory's multi-sector embeddings for semantic search and retrieval, with server-side filtering by `user_id` for multi-tenant isolation. + +### Key Features + +* **Multi-sector embeddings:** Factual, emotional, temporal, and relational memory sectors for richer context understanding. +* **Graceful memory decay:** Automatic reinforcement keeps relevant context sharp while allowing less important memories to fade. +* **Server-side filtering:** Efficient multi-user isolation through indexed database queries. +* **Self-hosted:** Full data ownership with no vendor lock-in, well suited to on-premise deployments. +* **Cost-effective:** The project reports a 6-10× lower cost than comparable SaaS memory APIs. + +### Installation + +Install ADK with OpenMemory support: + +```bash +pip install "google-adk[openmemory]" +``` + +This installs `httpx` for making HTTP requests to the OpenMemory API. + +### Prerequisites + +Before you can use OpenMemory, you need: + +1. **A self-hosted OpenMemory backend:** You can run OpenMemory using Docker or by setting up the Node.js backend manually. See the [Self-Hosted Setup](#self-hosted-setup) section below. +2. 
**Environment Variables (Optional):** You can configure OpenMemory via environment variables or pass them directly to the service: + ```bash + export OPENMEMORY_BASE_URL="http://localhost:3000" + export OPENMEMORY_API_KEY="your-api-key" # Optional, only if server requires authentication + ``` + +### Configuration + +You can configure OpenMemory in two ways: + +#### Option 1: Using the CLI (Recommended for `adk web` and `adk api_server`) + +To connect your agent to OpenMemory using the CLI, use the `--memory_service_uri` flag when starting the ADK server. The URI format is `openmemory://<host>:<port>[?api_key=<key>]`. + +```bash title="bash" +# Basic usage +adk web path/to/your/agents_dir --memory_service_uri="openmemory://localhost:3000" + +# With API key +adk web path/to/your/agents_dir --memory_service_uri="openmemory://localhost:3000?api_key=your-secret-key" + +# API server +adk api_server path/to/your/agents_dir --memory_service_uri="openmemory://localhost:3000" +``` + +**Supported URI formats:** +- `openmemory://localhost:3000` → Connects to `http://localhost:3000` +- `openmemory://localhost:3000?api_key=secret` → Connects with API key authentication +- `openmemory://https://example.com` → Connects to `https://example.com` + +#### Option 2: Using Python Code + +Alternatively, you can configure OpenMemory by manually instantiating the `OpenMemoryService` and passing it to the `Runner`: + +```py +from google.adk.memory import OpenMemoryService +from google.adk import Agent, Runner +from google.adk.sessions import InMemorySessionService +from google.adk.artifacts import InMemoryArtifactService + +# Configure OpenMemory with defaults +memory_service = OpenMemoryService( + base_url="http://localhost:3000", + api_key="your-key" # Optional, only if server requires authentication +) + +# Create agent +agent = Agent( + name="my_agent", + model="gemini-2.0-flash", + instruction="You are a helpful assistant." 
+) + +# Use with Runner +runner = Runner( + app_name="my_app", + agent=agent, + session_service=InMemorySessionService(), + artifact_service=InMemoryArtifactService(), + memory_service=memory_service +) + +# Run with memory: create a session, then stream events from run_async +from google.genai import types + +session = await runner.session_service.create_session( + app_name="my_app", user_id="user_123" +) +async for event in runner.run_async( + user_id="user_123", + session_id=session.id, + new_message=types.Content( + role="user", + parts=[types.Part(text="Hello, remember this conversation!")], + ), +): + if event.is_final_response(): + print(event.content.parts[0].text) +``` + +### Advanced Configuration + +You can customize OpenMemory behavior using `OpenMemoryServiceConfig`: + +```py +from google.adk.memory import OpenMemoryService, OpenMemoryServiceConfig + +# Create custom configuration +config = OpenMemoryServiceConfig( + search_top_k=20, # Number of memories to retrieve (default: 10) + timeout=10.0, # Request timeout in seconds (default: 30.0) + user_content_salience=0.9, # Importance score for user messages (default: 0.8) + model_content_salience=0.75, # Importance score for model responses (default: 0.7) + default_salience=0.6, # Fallback salience value (default: 0.6) + enable_metadata_tags=True # Toggle session/app tagging (default: True) +) + +memory_service = OpenMemoryService( + base_url="http://localhost:3000", + api_key="your-api-key", + config=config +) +``` + +**Configuration Parameters:** + +* `search_top_k` (int, default: 10): Maximum number of memories to retrieve per search query. +* `timeout` (float, default: 30.0): HTTP request timeout in seconds. +* `user_content_salience` (float, default: 0.8): Importance score (0.0-1.0) assigned to user messages when storing memories. +* `model_content_salience` (float, default: 0.7): Importance score (0.0-1.0) assigned to model responses when storing memories. +* `default_salience` (float, default: 0.6): Fallback salience value for content without a recognized author. +* `enable_metadata_tags` (bool, default: True): Whether to include session and app tags for filtering memories by application context. + +### Self-Hosted Setup + +OpenMemory can be deployed using Docker (recommended) or by setting up the Node.js backend manually. 
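+Whichever option you choose, it can be useful to sanity-check connectivity from Python before wiring the service into a `Runner`. A minimal sketch using only the standard library (the helper name is hypothetical; the `/health` endpoint matches the `curl` check shown below):

```python
from urllib.error import URLError
from urllib.request import urlopen


def openmemory_is_up(base_url: str = "http://localhost:3000", timeout: float = 2.0) -> bool:
    """Return True if the OpenMemory /health endpoint answers with HTTP 200."""
    try:
        with urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return resp.status == 200
    except (URLError, OSError):
        # Server not running, wrong port, or non-2xx response
        return False


if __name__ == "__main__":
    print("OpenMemory reachable:", openmemory_is_up())
```

Running this before starting `adk web` gives a quick yes/no answer without pulling in `httpx` or the ADK itself.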
+ +#### Option 1: Docker (Recommended) + +The easiest way to run OpenMemory is using Docker: + +```bash +# Run OpenMemory container +docker run -p 3000:3000 cavira/openmemory + +# Or use the production build +docker run -p 3000:3000 cavira/openmemory:production +``` + +Verify it's running: + +```bash +curl http://localhost:3000/health +``` + +#### Option 2: Node.js Backend + +For more control, you can set up the OpenMemory backend manually: + +1. **Clone the OpenMemory repository:** + ```bash + git clone https://github.com/CaviraOSS/OpenMemory.git + cd OpenMemory/backend + ``` + +2. **Install dependencies:** + ```bash + npm install + ``` + +3. **Configure environment variables:** + + Create a `.env` file in `OpenMemory/backend/`: + ```bash + # Embedding Provider (e.g., Gemini) + OM_EMBEDDINGS=gemini + GEMINI_API_KEY=your-gemini-api-key + EMBED_MODE=simple + + # Server Configuration + OM_PORT=3000 + OM_API_KEY=openmemory-secret-key # Optional, for API authentication + + # Database + DB_PATH=./data/openmemory.db + ``` + +4. **Start the server:** + ```bash + npm start + # Server will run on http://localhost:3000 + ``` + +For more detailed setup instructions, see the [OpenMemory documentation](https://openmemory.cavira.app/). + +### Advanced Usage + +#### Multi-User Isolation + +OpenMemory uses server-side filtering by `user_id` for efficient multi-tenant isolation. The `user_id` is passed as a top-level parameter to leverage OpenMemory's indexed database column, ensuring fast queries and proper tenant isolation in production deployments. + +#### App-Level Filtering + +When `enable_metadata_tags=True` (default), OpenMemory automatically tags memories with session and app information. This allows you to filter memories by application context, enabling different memory spaces for different applications. 
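+To build intuition for why the top-level `user_id` matters, here is a deliberately simplified stand-in for the server-side lookup (hypothetical code, not OpenMemory's implementation): partitioning storage by an indexed `user_id` means a query only ever scans one tenant's rows, so other users' memories are structurally invisible.

```python
from collections import defaultdict

# Hypothetical stand-in for OpenMemory's user_id-indexed storage:
# each tenant's memories live in their own partition.
_memories: defaultdict[str, list[str]] = defaultdict(list)


def store(user_id: str, content: str) -> None:
    _memories[user_id].append(content)


def search(user_id: str, query: str) -> list[str]:
    # Only this user's partition is scanned; other tenants are never touched.
    return [m for m in _memories[user_id] if query.lower() in m.lower()]


store("alice", "Prefers metric units")
store("bob", "Prefers imperial units")
print(search("alice", "units"))  # Bob's memory is never returned to Alice
```

A real backend does this with an indexed database column rather than a dictionary, but the isolation property is the same: filtering happens before ranking, not after.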
+ +#### Enriched Content Format + +OpenMemory uses an enriched content format where author and timestamp metadata are embedded directly in the content string during storage: + +``` +[Author: user, Time: 2025-11-04T12:34:56] What is the weather today? +``` + +On retrieval, the service automatically parses this metadata and returns clean content to users. This design avoids N+1 API calls for metadata while preserving context information efficiently. + +### Sample Agent + +See the [OpenMemory sample agent](https://github.com/google/adk-python/tree/main/contributing/samples/open_memory) in the ADK Python repository for a complete example that demonstrates: + +* Setting up OpenMemoryService with custom configuration +* Storing session events to memory +* Retrieving memories across different sessions +* Using memory in agent conversations + +The sample includes setup instructions and shows how to run a complete memory-enabled agent workflow. + ## Using Memory in Your Agent When a memory service is configured, your agent can use a tool or callback to retrieve memories. ADK includes two pre-built tools for retrieving memories: @@ -249,11 +475,11 @@ The memory workflow internally involves these steps: ### Can an agent have access to more than one memory service? -* **Through Standard Configuration: No.** The framework (`adk web`, `adk api_server`) is designed to be configured with one single memory service at a time via the `--memory_service_uri` flag. This single service is then provided to the agent and accessed through the built-in `self.search_memory()` method. From a configuration standpoint, you can only choose one backend (`InMemory`, `VertexAiMemoryBankService`) for all agents served by that process. +* **Through Standard Configuration: No.** The framework (`adk web`, `adk api_server`) is designed to be configured with a single memory service at a time via the `--memory_service_uri` flag. 
This single service is then provided to the agent and accessed through the built-in `self.search_memory()` method. From a configuration standpoint, you can only choose one backend (`InMemory`, `VertexAiMemoryBankService`, `OpenMemoryService`) for all agents served by that process. * **Within Your Agent's Code: Yes, absolutely.** There is nothing preventing you from manually importing and instantiating another memory service directly inside your agent's code. This allows you to access multiple memory sources within a single agent turn. -For example, your agent could use the framework-configured `VertexAiMemoryBankService` to recall conversational history, and also manually instantiate a `InMemoryMemoryService` to look up information in a technical manual. +For example, your agent could use the framework-configured `VertexAiMemoryBankService` to recall conversational history, and also manually instantiate an `OpenMemoryService` to look up information in a self-hosted memory store. #### Example: Using Multiple Memory Services @@ -261,7 +487,7 @@ Here’s how you could implement that in your agent's code: ```python from google.adk.agents import Agent -from google.adk.memory import InMemoryMemoryService, VertexAiMemoryBankService +from google.adk.memory import InMemoryMemoryService, VertexAiMemoryBankService, OpenMemoryService from google.genai import types class MultiMemoryAgent(Agent): @@ -275,6 +501,10 @@ class MultiMemoryAgent(Agent): location="LOCATION", agent_engine_id="AGENT_ENGINE_ID" ) + # Or use OpenMemoryService for self-hosted memory + self.openmemory_service = OpenMemoryService( + base_url="http://localhost:3000" + ) async def run(self, request: types.Content, **kwargs) -> types.Content: user_query = request.parts[0].text @@ -286,11 +516,16 @@ class MultiMemoryAgent(Agent): # 2. 
Search the document knowledge base using the manually created service document_context = await self.vertexai_memorybank_service.search_memory(query=user_query) - # Combine the context from both sources to generate a better response + # 3. Search self-hosted memory using OpenMemory + openmemory_context = await self.openmemory_service.search_memory(query=user_query) + + # Combine the context from all sources to generate a better response prompt = "From our past conversations, I remember:\n" prompt += f"{conversation_context.memories}\n\n" prompt += "From the technical manuals, I found:\n" prompt += f"{document_context.memories}\n\n" + prompt += "From the self-hosted memory, I found:\n" + prompt += f"{openmemory_context.memories}\n\n" prompt += f"Based on all this, here is my answer to '{user_query}':" return await self.llm.generate_content_async(prompt)