Skip to content

Fix/sub thread routing and apitoken access#78

Open
rbuergi wants to merge 78 commits intomainfrom
fix/sub-thread-routing-and-apitoken-access
Open

Fix/sub thread routing and apitoken access#78
rbuergi wants to merge 78 commits intomainfrom
fix/sub-thread-routing-and-apitoken-access

Conversation

@rbuergi
Copy link
Copy Markdown
Contributor

@rbuergi rbuergi commented Apr 3, 2026

No description provided.

rbuergi and others added 30 commits March 31, 2026 22:24
Mark CosmosImport and PostgreSqlImport tools as IsPackable=false
to fix NU5019 errors during dotnet pack.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Map ApiToken nodeType to Permission.Api in CreateNodePermissionAttribute
  (same satellite pattern as Thread/Comment)
- Set IsSatelliteType=true on ApiToken node, add validation cache
- Fix delegation null delivery check and add cancellation registration
  to prevent infinite hang on sub-thread routing failures
- Add tests for ApiToken creation and delegation failure handling

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The RoutingGrain was passing deliveries to grains without updating
the target address when path resolution split prefix/remainder.
This caused routing loops for deeply nested sub-thread paths (6+
segments) because the grain received a delivery whose target didn't
match its hub address.

Now mirrors RoutingServiceBase behavior: sets UnifiedPath property
and updates delivery target to the resolved prefix address.

Also adds InternalsVisibleTo for MeshWeaver.Hosting.Orleans to
access WithTarget, and Orleans tests for sub-thread routing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… leak

Organization instances (e.g., PartnerRe) were visible to all authenticated
users in search because of three layers of public read access:
- ConfigureNodeTypeAccess(WithPublicRead) bypassed partition access in SQL
- WithPublicRead() on hub config allowed unauthenticated hub reads
- Access rule returned true for all reads

Now Organization instances require partition-level permissions for read
access. The Organization type definition itself remains visible (it's
nodeType=NodeType which has its own WithPublicRead). Routing is unaffected
as MeshCatalog path resolution is unprotected by design.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The GenerateAccessControlClause was using OR between public_read and
partition_access, meaning any node type with public_read=true (Markdown,
User, Organization) was visible to all authenticated users across ALL
partitions. This leaked cross-partition data in search results.

Now: partition_access is always required for schema-qualified queries.
public_read only skips node-level permission checks within accessible
partitions, not the partition check itself.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Updates the stored procedure so partition_access is always required.
public_read only skips node-level permission checks, not the partition
check. Prevents cross-partition data leakage in global search.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests now reflect that public_read does not bypass partition_access.
- GlobalAdmin tests: grant partition_access to all org schemas
- PublicRead test: verifies no results without partition_access
- CrossPartition access test: asserts other orgs are excluded
- Renamed PartnerRe references to FutuRe in test data

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AsyncLocal doesn't flow through the AI framework's async streaming
and tool invocation chain, so MeshPlugin tool calls (Get, Search,
Create, Update, Patch, Delete) ran without user identity. This caused
"Access denied" when agents tried to update nodes in partitions the
user had access to.

Now each tool call explicitly restores the user's AccessContext from
ThreadExecutionContext.UserAccessContext via SwitchAccessContext.

Also fixes FutuRe schema reference in CrossPartitionSearchTests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Verifies that Get and Patch work when AsyncLocal context is cleared
(simulating AI framework tool invocation). The plugin must restore
the user's identity from ThreadExecutionContext.UserAccessContext.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
SetContext directly instead of SwitchAccessContext — no await needed,
no disposal needed. The AsyncLocal is scoped to the thread's InvokeAsync
async flow so setting it once per tool call is sufficient.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The delegation tool calls meshService.CreateNode(subThreadNode) which
requires Permission.Thread. Without access context (AsyncLocal lost in
AI framework's tool invocation), this fails silently → delegation returns
error → AI retries infinitely creating endless delegation attempts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests end-to-end delegation: agent calls delegate_to_agent tool,
which creates a sub-thread and submits to it. Verifies access context
flows through the AI tool invocation chain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- RoutingGrain: log resolution details at Info level for debugging
- Delegation: guard against depth >= 3 to prevent infinite sub-threads
- ThreadPathResolutionTest: verifies PostgreSQL correctly resolves
  deeply nested _Thread paths via satellite table (all 5 tests pass)
- OrleansDelegationFlowTest: skeleton for end-to-end delegation test

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace per-tool RestoreUserContext() with AccessContextAIFunction
(DelegatingAIFunction) wrapper applied to ALL tools in CreateAgentCore.
Every tool invocation — MeshPlugin, delegation, PlanStorage, etc. —
now automatically restores the user's identity from
ThreadExecutionContext.UserAccessContext before executing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
StreamingCompact was recursively embedding sub-thread StreamingArea
via LayoutAreaControl when tc.Result == null. This caused infinite
grain activations when the sub-thread didn't exist (CreateNode failed
due to missing access context) — each failed activation triggered
another embed attempt.

Now delegation links are static with status indicators (dot/checkmark).
No recursive LayoutAreaControl embedding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- AccessContextToolCallTest: verifies tool calls restore user identity
  from ThreadExecutionContext, even when AsyncLocal is cleared
- StreamingRecursionTest: verifies delegation ToolCalls don't trigger
  recursive LayoutAreaControl embedding
- DelegationDepthGuard: verifies depth >= 3 is detected correctly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Previous guard counted _Thread segments which didn't detect
Worker→Worker→Worker recursion. Now counts segments after _Thread/
to determine real delegation depth (each level adds msgId/subId = 2
segments). Maximum depth = 2 (one delegation level).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ToolStatusFormatter: show agent name (without Agent/ prefix) + task
  preview instead of "Delegating to Agent/Worker"
- appsettings: set MeshWeaver.AI and RoutingGrain to Information level
  so delegation and routing traces appear in App Insights
- Fix delegation depth guard to count actual nesting from path segments

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Delegation entries now use <details> (same as regular tool calls) so
users can expand to see the full task and result. Removed the recursive
LayoutAreaView embed for in-progress delegations that caused stack
overflow via cascading grain activations.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
NotifyParentCompletion was using DelegationTracker (static in-memory
dictionary) which can't work across Orleans silos. Now posts a second
SubmitMessageResponse with Status=ExecutionCompleted back through the
hub, which the parent's RegisterCallback receives to resolve the
delegation TCS.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Tests that SubmitMessageRequest produces both CellsCreated and
ExecutionCompleted responses via RegisterCallback. This is the exact
pattern used by the delegation tool — without the second response,
the parent thread hangs forever.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ration

Root cause: RegisterCallback removes callbacks after first invocation.
The CellsCreated response consumed the callback, leaving nothing for
ExecutionCompleted → parent thread hung forever.

Fix:
- HandleSubmitMessage registers CompletionCallbacks[threadPath] closure
  that posts ResponseFor(originalDelivery) on the thread hub
- NotifyParentCompletion invokes the callback to send ExecutionCompleted
- Delegation tool re-registers callback after CellsCreated response

DelegationCompletionTest verifies both responses arrive via RegisterCallback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
StreamingView now shows:
- Thread title as clickable link
- Executing message's Overview (bubble with text + tool calls)

For executing delegations in the bubble, embeds the sub-thread's
StreamingView (bounded by delegation depth guard, max 2 levels).
For completed delegations, shows expandable details.

No infinite recursion: StreamingView → Overview → Streaming is bounded
by the max delegation depth (2), not by rendering depth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
StreamingView: if thread has executing cell, return its default area.
Otherwise null. No title, no wrapping — simple passthrough.

StreamingCompact delegation rendering:
- Running (Result==null): show name + embed sub-thread's Streaming area
- Completed (Result!=null): show title with link (checkmark)

Recursion is bounded by delegation depth guard (max 2 levels):
StreamingView → Overview → StreamingCompact → sub-thread Streaming →
sub-thread Overview → done (no further delegation at max depth).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…alls

When recovering a stale executing thread after restart, delegation
tool calls now check their sub-thread's status:
- Sub-thread completed (IsExecuting=false): mark as done
- Sub-thread still running: mark as cancelled (parent can't re-subscribe)
- Non-delegation: mark as cancelled

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…de API

- AzureClaudeChatClient: handle DataContent as base64 document/image
  blocks in the Claude API format
- AgentChatClient: detect content: prefix with binary extensions (.pdf,
  .png, .jpg, etc.), load via IContentService as Stream, create
  DataContent and include in ChatMessage.Contents
- Path resolution: local (content:file.pdf → context path) or absolute
  (@OrgA/Doc/content:file.pdf)
- ChatMessage supports mixed content: TextContent + DataContent
- Tests for serialization, path parsing, and content type detection

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Foundation for document format conversion pipeline:
- IContentTransformer interface for binary-to-markdown conversion
- ContentCollection.GetContentAsTextAsync uses registered transformers
- DocSharp.Markdown package added for docx → markdown conversion

Next: restore ContentPlugin with xlsx/docx/pdf readers, wire into
content browser and agent attachments.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 3, 2026

Test Results

2 005 tests   - 530   1 997 ✅  - 525   3m 56s ⏱️ - 1m 19s
   20 suites  -  11       8 💤  -   5 
   20 files    -  11       0 ❌ ±  0 

Results for commit c27c513. ± Comparison against base commit 14ddfb7.

This pull request removes 549 and adds 19 tests. Note that renamed tests count towards both.
MeshWeaver.Data.Test.ActivityTest ‑ Activity_ErrorThenCompleteInOrder_ReportsFailedStatus
MeshWeaver.Data.Test.ActivityTest ‑ Activity_WithError_ReportsFailedStatus
MeshWeaver.Data.Test.ActivityTest ‑ Activity_WorkspaceOperationsFlow_ParentSeesSubActivityError
MeshWeaver.Data.Test.ActivityTest ‑ TestActivity
MeshWeaver.Data.Test.ActivityTest ‑ TestAutoCompletion
MeshWeaver.Data.Test.CombinedAccessRestrictionTest ‑ Create_AsAdmin_ShouldSucceed
MeshWeaver.Data.Test.CombinedAccessRestrictionTest ‑ Create_AsAnonymous_ShouldFailGlobalRestriction
MeshWeaver.Data.Test.CombinedAccessRestrictionTest ‑ Create_AsAuthenticatedNonAdmin_ShouldFailTypeRestriction
MeshWeaver.Data.Test.CombinedAccessRestrictionTest ‑ Update_AsAuthenticatedNonAdmin_ShouldSucceed
MeshWeaver.Data.Test.DataCombinedValidatorTest ‑ CombinedValidators_AllOperationsValid_ShouldSucceed
…
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ ContentAutocomplete_ExactMatch_Gets_Highest_Priority
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ ContentAutocomplete_Filters_And_Scores_By_Query
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ ContentAutocomplete_Wraps_Spaces_In_Quotes
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ DocSharpContentTransformer_Converts_Docx_To_Markdown
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ FileContentProvider_Auto_Converts_Docx
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ FileContentProvider_Returns_PlainText_For_Md
MeshWeaver.ContentCollections.Test.DocxConversionTest ‑ GetDataRequest_Content_Prefix_Returns_Markdown_For_Docx
MeshWeaver.Threading.Test.AccessContextToolCallTest ‑ AccessContextAIFunction_RestoresIdentity_BeforeToolCall
MeshWeaver.Threading.Test.AccessContextToolCallTest ‑ AccessContextAIFunction_WithoutExecutionContext_DoesNotCrash
MeshWeaver.Threading.Test.AccessContextToolCallTest ‑ DelegationDepthGuard_BlocksExcessiveNesting
…

♻️ This comment has been updated with latest results.

rbuergi and others added 29 commits April 3, 2026 13:18
- CircuitAccessHandler uses IMeshQueryCore (no access control) for user
  lookup during login — before user identity is established
- IMeshQueryCore made internal with InternalsVisibleTo for Hosting.Blazor
- Fix double access control in cross-schema GenerateWhereClause

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Direct user nodes (User/{id}) must remain publicly readable for any
authenticated caller (portal hubs, other users). Only children (threads,
activities) delegate to ISecurityService for partition-level access control.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The UNION ALL wrapping subquery uses alias "combined" but
MapOrderBySelector returns "n.xxx" — strip the "n." prefix
for the outer ORDER BY clause.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- ApiTokenService: use WellKnownUsers.System for infrastructure queries
  (bypasses AC via Public inheritance in standard deployments)
- PostgreSqlMeshQuery: System identity returns empty userId to skip AC clause
- AgentChatClient: fix attachment section injection order in prompt assembly
- ToolStatusFormatter: remove stale agent name format that no longer applies
- ContentReferenceIntegrityTest: exclude known non-file icon paths from validation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sub-threads (delegation threads) were showing alongside top-level threads.
Top-level threads have namespace ending with /_Thread, sub-threads have
deeper nesting. Added wildcard support to namespace: query qualifier —
namespace:*/_Thread maps to ILIKE '%/_Thread' in SQL and wildcard match
in the in-memory evaluator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…add Edit area

- Fix HandleResubmitMessage deadlock: move workspace.UpdateMeshNode to handler body
  (grain scheduler), fire-and-forget meshService.CreateNode — no Subscribe callbacks
- Fix HandleSubmitMessage: same pattern — fire-and-forget node creation, state update
  in handler body
- Fix HandleDeleteFromMessage: direct workspace.UpdateMeshNode in handler body
- Fix HandleCancelStream: replace await QueryAsync with workspace reads
- Fix recovery: remove blocking .GetAwaiter().GetResult() call
- Remove ThreadMessage.Id (redundant with MeshNode.Id) — 48+ sites cleaned
- Remove ThreadMessage.DelegationPath (redundant with ToolCallEntry.DelegationPath)
- Add Edit area to ThreadMessage layout, bubble toggles via local Blazor state
- Propagate NodeChangeEntry from sub-threads via SubmitMessageResponse.UpdatedNodes
- Aggregate node changes: same path → min(VersionBefore), max(VersionAfter)
- Add AsynchronousCalls.md: truly async patterns documentation
- Add Orleans resubmit deadlock test (OrleansNodeChangePropagationTest)
- Tune distributed portal logging for AI debugging

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…meout

Orleans test classes each create their own TestCluster. Running them
in parallel causes grain address collisions and "No handler found" errors.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…tion

Restored Zip pattern for node creation — execution must wait for both
user and response nodes to exist before streaming starts. The Subscribe
callback only does hub.Post (safe from any thread), no workspace access.
Workspace update moved to handler body (grain scheduler).

Also: fix test timeout, add xunit.runner.json for Orleans sequential execution.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…rancy)

The grain is already [Reentrant] — InvokeAsync yields at await points.
The delegation hang in OrleansThreadStreamingTest is a DelegationPath
stamping timing issue, not a scheduler deadlock: tool calls appear
(DelegationFlow test passes) but DelegationPath is not stamped in time.

Added logging to ChatClientAgentFactory and ThreadExecution to trace
delegation flow: CreateNode, Post, DelegationStatus, DelegationPaths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ackages

- Microsoft.Agents.AI 1.0.0-rc1 → 1.0.0 (stable)
- Microsoft.Extensions.AI* 10.3.0 → 10.4.0
- OpenAI 2.8.0 → 2.9.1, Azure.Core 1.51.0 → 1.51.1
- Add FunctionInvocation middleware on ChatClientAgent via AsBuilder().Use()
  to forward tool calls in real-time via chat.ForwardToolCall
- This gives the streaming loop visibility into tool invocations that
  FunctionInvokingChatClient otherwise consumes internally

Note: test factories that create ChatClientAgent directly bypass the
middleware — they need to use ChatClientAgentFactory or add their own.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… subscription

- ToolCallingFakeChatClientFactory now wraps agents with function calling
  middleware (same as production ChatClientAgentFactory)
- Delegation_ParentShowsToolCall test uses reactive stream subscription
  instead of polling (child notifies parent pattern)
- Test still shows toolCalls=0 because the test factory doesn't register
  the delegate_to_agent tool — FunctionInvokingChatClient can't invoke
  unregistered tools. Production uses ChatClientAgentFactory which
  registers delegation tools via GetAgentTools().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DelegationFlow test passes vacuously (0 tool calls, All() is true on empty).
Added TODO: PushToResponseMessage sends UpdateThreadMessageContent with
ToolCalls but they arrive empty at the response grain. The middleware
fires (ForwardToolCall) and toolCallLog has entries, but the grain
doesn't receive them. Needs investigation into message routing from
InvokeAsync context to response message grain.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…gurePortalMesh

ConfigurePortalMesh registered FakeChatClientFactory with Order=0,
which won over test-specific factories. This caused wrong agents to run
(plain text instead of tool-calling/delegating) and tool calls to never
appear on response messages.

Fix: each test configurator registers its own factory. No default.
Production (MemexConfiguration) registers real factories via
AddAzureFoundryClaude, AddAzureOpenAI, etc. — no conflict.

Result: DelegationFlow_SubThreadStreamsText_ParentCompletes now passes
with actual tool call assertion (was passing vacuously before).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…elegation tests

- Fix: FunctionResultContent handler now preserves existing DelegationPath
  that was stamped by UpdateDelegationStatus. Previously it overwrote with
  null from ExtractDelegationPath when the result text didn't contain the path.
- Add OrleansDelegationTest with DelegationTestAgentFactory extending
  ChatClientAgentFactory — uses the full production pipeline (delegation tools,
  MeshPlugin, function calling middleware).
- Delegation_ToolCallsAppear_WithDelegationPath: verifies delegation tool call
  appears with DelegationPath set, sub-thread created and completed.
- Resubmit_AfterDelegation_DoesNotDeadlock: verifies no hang after resubmit.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- SharedOrleansFixture: boots TestCluster once per assembly via
  ICollectionFixture. Uses SwappableChatClientFactory for test isolation.
- DelegationTestAgentFactory needs per-grain hub (ChatClientAgentFactory
  subclass), so delegation tests keep per-class cluster.
- Both delegation tests pass individually. Sequential execution within
  same class may timeout due to cluster overhead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…AzureAI)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… remaining ThreadMessage.Id

- DependencyInjection.Abstractions 10.0.3 → 10.0.4
- Logging.Abstractions 10.0.3 → 10.0.4
- Options 10.0.2 → 10.0.3
- Fix ThreadMessage.Id in PathResolution and Security tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- actions/checkout v3/v4 → v6
- actions/setup-dotnet v3/v4 → v5
- actions/upload-artifact v4 → v6

Node.js 20 deprecated June 2026, removed Sept 2026.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…submit flow

- Extract HandleResubmitMessage + HandleDeleteFromMessage to ThreadMessageHandlers.cs
- Delete EditMessageRequest completely (record, type registration, all references)
  Editing is handled purely in Blazor bubble (local isEditing toggle)
- Merge ThreadsLayoutArea.cs into ThreadLayoutAreas.cs, delete old file
- Fix HandleResubmitMessage: single UpdateMeshNode, delete ALL old cells,
  create new output cell, link as last cell only after node exists
- Fix ContextPath: read MainNode from thread, never use threadPath as context
  (avoids GetRemoteStream "owner same as subscriber" crash)
- Add JsonException catch in ConvertJson for transient type mismatches
- Update Edit area docstring (Cancel is Blazor-side)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ddress

After saving a node to persistence, post CreateNodeRequest to the node's
own grain address. The init gate allows CreateNodeRequest through, so
the grain initializes with the correct NodeType and HubConfiguration
before any other messages (like SubmitMessageRequest) arrive.

Without this, sub-thread grains activated by SubmitMessageRequest would
load with default config (no HandleSubmitMessage) because the init gate
hadn't received a CreateNodeRequest to trigger NodeType resolution.

The grain's handler sees "node already exists" — this is expected and
the response is fire-and-forget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Create new output cell FIRST, then in one UpdateMeshNode: truncate
  old messages, delete old cells, link new cell, start execution
- Set IsExecuting=true with "Preparing..." status IMMEDIATELY in handler
  body (before async CreateNode) so the UI shows spinner instantly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removed extra UpdateMeshNode for IsExecuting feedback — the single
update inside CreateNode.Subscribe does everything: truncate, delete
old cells, link new cell, set IsExecuting. One cycle, not two.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Set IsExecuting=true with "Preparing..." status immediately in handler
body so the UI shows the spinner before the async CreateNode round trip.
The Subscribe callback then does the full update (truncate, link, execute).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When CreateNodeRequest routes through the RoutingGrain, the path resolver
resolves to the parent node (child doesn't exist yet) and caches the
result. When SubmitMessageRequest follows, it hits the stale cache and
routes to the wrong grain (parent instead of child).

Fix: only cache exact matches (no remainder). Partial matches are
transient — the child node will exist after CreateNodeRequest completes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Worker and Coder agents now have explicit "CRITICAL: You MUST produce
output" instructions. Every task must end with a write tool call.
Agents must never just describe what they would create.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…lete failures

Posting CreateNodeRequest to the node's own grain after creation caused
5 CI test failures: deleted nodes were re-created by the queued
CreateNodeRequest. The path resolver cache fix (no caching partial
matches) is the correct fix for sub-thread routing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…LISTEN/NOTIFY)

MeshCatalog subscribes to IDataChangeNotifier on construction. When
nodes are created or deleted in the DB, PostgreSqlChangeListener
publishes notifications via LISTEN/NOTIFY on mesh_node_changes channel.
MeshCatalog invalidates the exact path AND all ancestor paths from
the resolution cache.

This fixes sub-thread routing: CreateNodeRequest resolves to parent
(cached), then SubmitMessageRequest hits stale cache. Now the cache
entry is evicted when the sub-thread node is persisted, so the next
resolution finds the correct grain.

All resolutions are cached again (including partial matches with
remainder) since we have proper DB-driven invalidation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant