Conversation
Orca Security Scan Summary
| Status | Check | Issues by priority |
|---|---|---|
| Infrastructure as Code | View in Orca | |
| SAST | View in Orca | |
| Secrets | View in Orca | |
| Vulnerabilities | View in Orca | |
docs/engram/_includes/quickstart.py
Outdated
```python
# START Connect
client = EngramClient(
    api_key=os.environ["ENGRAM_API_KEY"], base_url="https://api.engram.weaviate.io"
)
```
we shouldn't pass `base_url`. That's the whole idea of an SDK: to encapsulate things like this.
Good point, will remove this completely
```python
# START AddMemory
run = client.memories.add(
    "The user prefers dark mode and uses VS Code as their primary editor.",
    user_id=test_user_id,
```
maybe we can put a static string here to show that `user_id` can be a simple user name? The field is called `user_id` only to emphasize that it has to be unique.
I'm using this because I ran into problems with deduplication with the same user ID when running the tests multiple times; I will see if I can remove it
gotcha. My point is, use some "readable" string instead of a UUID, so users don't think we require a UUID here.
docs/engram/quickstart.md
Outdated
```md
You can select a predefined template when creating a project. For this tutorial, use the **Personalization template**.

The template provides you with a default [group](concepts/groups.md) called `personalization` and a default [topic](concepts/topics.md) called `preferences` (description: *"Stable user preferences, defaults, and behavioral patterns"*). This is enough to get started.
```
The default topic is now called `UserKnowledge` with the description *"Anything relating to the user personally: their personal details (name, age, interpersonal relationships, etc.), their preferences, what they've done or plan to do, etc., i.e., any generic information about the user."*
The template also lets you optionally add a `ConversationSummary` topic, which maintains a single summary for each conversation ID containing the entire history of that conversation so far. Enabling this option makes `conversation_id` required when adding memories (which is why it's disabled by default).
docs/engram/concepts/pipelines.md
Outdated
```md
## Pipeline steps

Each pipeline processes content through a sequence of steps:

1. **Extract** — Pulls structured memories from the input content. The extraction method depends on the [input type](input-data-types.md) (`ExtractFromString`, `ExtractFromConversation`, or `ExtractFromPreExtracted`).
2. **Transform** — Refines extracted memories using existing context. Steps like `TransformWithContext` and `TransformOperations` deduplicate, merge, and resolve conflicts with existing memories.
3. **Commit** — Finalizes the operations (create, update, delete) and persists them to storage.
```
Pipelines can also contain buffer steps. When a pipeline run hits one of these buffers, it pauses and waits for one of several possible triggers to flush and continue the pipeline run (it's during this pause that the run status is `in_buffer`).
These buffers don't have to be at the start of the pipeline (e.g., it's not just queueing and waiting to start running), they can be in the middle too.
One use case for this is aggregated memories over time. For example, we can have a pipeline which has steps like
```
[extract] -> [transform] -> [commit] -> [buffer] -> [transform] -> [commit]
```
where the buffer is configured to trigger after 24hrs. This pipeline would immediately extract/transform/commit memories as they're added. Then, those memories get sent to the buffer. That buffer accumulates all the memories that get added with the same scope (i.e., user/conversation ID) into a single batch that runs through the rest of the pipeline after it triggers. In this case, the second transform step could be configured to combine all those memories into a single "daily activity" memory which is committed by the second commit step.
The default personalisation pipeline doesn't include a buffer at the moment, so hopefully this will be a bit clearer once we have something using it (the continual learning template will have a buffer I think, and it'll be available as a step once pipelines are fully user-configurable).
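The aggregation behaviour described above can be sketched in plain Python. This is a hypothetical illustration (the class and function names are invented; the real step configuration lives in the Engram backend), showing how a buffer accumulates memories per scope and then flushes them as one batch into the remaining transform/commit steps:

```python
from collections import defaultdict

# Hypothetical sketch of a buffer step: memories committed by the first
# half of the pipeline accumulate per scope (e.g. user/conversation ID)
# until the buffer's trigger fires, then flush as one batch into the
# remaining transform -> commit steps.
class BufferStep:
    def __init__(self):
        self._batches = defaultdict(list)  # scope -> accumulated memories

    def add(self, memory, scope):
        # The pipeline run pauses here; its status would be "in_buffer".
        self._batches[scope].append(memory)

    def flush(self, scope):
        # Trigger fires (e.g. after 24 hours); the batch continues
        # through the second transform and commit steps.
        return self._batches.pop(scope, [])

def combine_daily(memories):
    # Stand-in for the second transform step: merge a day's memories
    # into a single "daily activity" memory.
    return "daily activity: " + "; ".join(memories)

buffer = BufferStep()
buffer.add("ran 5km", scope=("user-1", None))
buffer.add("read a paper", scope=("user-1", None))
batch = buffer.flush(("user-1", None))
print(combine_daily(batch))  # -> daily activity: ran 5km; read a paper
```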
docs/engram/guides/store-memories.md
Outdated
```mdx
import PyCode from '!!raw-loader!../_includes/store_memories.py';
import CurlCode from '!!raw-loader!../_includes/store_memories.sh';

Engram supports three [content types](../concepts/input-data-types.md) for storing memories. Each triggers a different extraction [pipeline](../concepts/pipelines.md).
```
> Each triggers a different extraction pipeline
It's not quite a different pipeline for each content type, it's a different entrypoint into that pipeline (that's why the pipelines are DAGs). For example, a pipeline might look like
```
ExtractFromConversation        ExtractFromString
          │                           │
          │                           │
          └─────► TransformWithContext ◄─────┘
                          │
                          ▼
                        Commit
```
so if you add conversation data, it'd pass through the steps

```
[ExtractFromConversation] -> [TransformWithContext] -> [Commit]
```

and if you added string data it'd pass through the steps

```
[ExtractFromString] -> [TransformWithContext] -> [Commit]
```
and that `TransformWithContext` is exactly the same step with the same config etc. That distinction is important when you include buffer steps, because if you route memories extracted from different content types into the same buffer step, they get aggregated together for later transformation steps.
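A minimal sketch of that shared-step routing in plain Python (the dispatch table and function bodies are illustrative, not the SDK's actual internals; only the step names come from the thread above):

```python
# Illustrative sketch: each content type selects a different entrypoint
# (extract step) into the same pipeline DAG; the downstream transform
# step is a single shared instance, not a per-content-type copy.
def extract_from_string(content):
    return [content]

def extract_from_conversation(messages):
    return [m["content"] for m in messages if m["role"] == "user"]

ENTRYPOINTS = {
    "string": extract_from_string,
    "conversation": extract_from_conversation,
}

def transform_with_context(memories):
    # One shared step with one config, regardless of entrypoint.
    return [m.strip().lower() for m in memories]

def run_pipeline(content_type, content):
    memories = ENTRYPOINTS[content_type](content)
    return transform_with_context(memories)  # would then hit Commit

print(run_pipeline("string", "Prefers dark mode "))
print(run_pipeline("conversation", [
    {"role": "user", "content": "I prefer specialty coffee"},
    {"role": "assistant", "content": "Noted!"},
]))
```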
Thank you a lot for the detailed descriptions, they are amazing for the docs 💚
```python
        {"role": "user", "content": "I prefer specialty coffee, not chains."},
    ],
    user_id=test_user_id,
    conversation_id=str(uuid.uuid4()),
```
I don't know the best way to make it clear in the docs, but we want to separate the idea that adding conversation content type data means you need a conversation ID. They're separate concepts, so you can add conversation data (list of role/content) without a conversation_id, and conversely could include a conversation_id for string / pre-extracted content types.
Whether the `conversation_id` is required when adding or not just depends on the topic in that group, not the content type. I'd worry that only including that kwarg when adding conversation data in these examples conflates those two.
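One way to express that rule in code. This is a hypothetical helper (the function names, the `conversation_scoped` flag, and the topic dicts are all invented for illustration), capturing the point that the requirement comes from the group's topics, never from the content type:

```python
# Sketch: whether conversation_id is required depends on the topics in
# the group, never on the content type being added.
def conversation_id_required(group_topics):
    # e.g. a conversation-scoped topic like ConversationSummary forces it.
    return any(t.get("conversation_scoped") for t in group_topics)

def validate_add(group_topics, content_type, conversation_id=None):
    # content_type ("string" / "conversation" / "pre-extracted") is
    # deliberately ignored in this check.
    if conversation_id_required(group_topics) and conversation_id is None:
        raise ValueError("conversation_id is required by a topic in this group")

topics = [{"name": "UserKnowledge", "conversation_scoped": False}]
# Conversation data without a conversation_id is fine here:
validate_add(topics, "conversation")
# And string data may still carry one:
validate_add(topics, "string", conversation_id="support-chat-42")
```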
```md
:::tip
For production systems, implement a polling loop that checks the run status at regular intervals (e.g. every 1-2 seconds) until the status is `completed` or `failed`. The Python SDK provides `client.runs.wait(run_id)` which handles polling automatically and blocks until the run completes.
:::
```
I don't think we want to encourage this generally. The idea is that most of the time people shouldn't be waiting on the pipelines completing, the memories should just be eventually consistent.
While the status could change to `failed` part way through a pipeline running, that should only be because of internal errors. It's probably best IMO to encourage the user to just check the run status that is returned from `memories.add` to see if it errored immediately.
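The suggestion above might look like the sketch below. The `Run` shape and status strings are assumptions based on the snippets in this thread, not the SDK's actual types: inspect the status returned by the add call for an immediate failure, and otherwise rely on eventual consistency instead of polling:

```python
from dataclasses import dataclass

# Sketch of the reviewer's suggestion: check the run returned by
# memories.add for an up-front error, and otherwise let the pipeline
# finish in the background (memories are eventually consistent).
@dataclass
class Run:
    id: str
    status: str  # e.g. "accepted", "in_buffer", "completed", "failed"

def check_immediate_failure(run: Run) -> bool:
    # True only if the add was rejected up front; anything else can be
    # left to complete in the background without a polling loop.
    return run.status == "failed"

run = Run(id="run-123", status="accepted")
if check_immediate_failure(run):
    print(f"run {run.id} failed immediately")
else:
    print(f"run {run.id} accepted; memories will be eventually consistent")
```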
```jsx
<FilteredTextBlock
  text={PyCode}
  startMarker="# START StoreConversation"
  endMarker="# END StoreConversation"
  language="py"
/>
```
Similar comment here as on the run status page - we don't need to wait after adding memories, and definitely shouldn't block the chat response.
It's useful to note that memories are most useful between conversation contexts or from very far back in the conversation etc. We generally don't need to wait until memories from the last message have been added to retrieve them for the next message, because the previous message will still be in context for the LLM.
This tutorial would probably be a good place to demonstrate the optional Include Conversation Summary Topic checkbox in the personalisation template, since that's basically built for exactly this use case.
Once that is enabled, it adds a `ConversationSummary` topic to the group (conversation-scoped, so all adds need a conversation ID when it's enabled). The pipeline keeps that topic updated with the history of the conversation so far, and always rewrites it with updates, i.e., every unique `conversation_id` has just one memory with the `ConversationSummary` topic.
Then in retrieval, you can just fetch that memory by using the `fetch` retrieval type (which I've just noticed is enabled in the backend but not the Python SDK so will TODO that!) using `limit: 1` and passing this conversation's `conversation_id`, and you should always get an up-to-date history of the entire conversation so far that you can use to replace old messages just like you did here.
Will include this as soon as it's available in the Python SDK
What's being changed:

New pages

All under `docs/engram/`: introduction, quickstart, core concepts (memories, topics, groups, scoping, pipelines, runs), four how-to guides, and an API reference overview linking to the interactive Scalar reference at `/engram/api-reference/rest`.

Config changes

- `sidebars.js` — `engramSidebar`
- `secondaryNavbar.js` — Engram nav entry
- `docusaurus.config.js` — Scalar plugin instance for Engram OpenAPI spec
- `static/specs/engram-openapi.json` — Generated OpenAPI spec (user-facing endpoints only)

Type of change:

How has this been tested?

`yarn start`