Fix: Check Process.alive? before sending to SSE handler by mellelieuwes · Pull Request #239 · cloudwalk/hermes-mcp

mellelieuwes · 2026-01-16T11:53:22Z

Summary

This PR includes two related fixes for session handling:

1. Fix stale SSE handler race condition

When an SSE handler process dies but the transport hasn't yet processed the :DOWN message, responses could be silently lost.

The issue:

1. SSE handler process dies (network drop, crash, etc.)
2. :DOWN message is queued to transport GenServer
3. Before transport processes :DOWN, a new request calls get_sse_handler
4. get_sse_handler returns the stale PID
5. send/2 silently drops the message to the dead process
6. Client receives HTTP 202 but never gets the actual response
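
The sequence above can be reproduced outside the library with plain processes. The following is a minimal sketch, not the transport's actual code: the module name `StaleHandlerDemo`, the `{:sse_message, ...}` tuple, and the polling helper are all hypothetical and exist only for illustration. It shows that `send/2` to a dead PID returns normally, and that the :DOWN message is still waiting in the monitoring process's mailbox afterwards.

```elixir
defmodule StaleHandlerDemo do
  # Hypothetical demo module; none of these names come from hermes-mcp.
  def run do
    # Stand-in for an SSE handler process that just waits for messages.
    handler =
      spawn(fn ->
        receive do
          :stop -> :ok
        end
      end)

    # The calling process plays the transport and monitors the handler.
    ref = Process.monitor(handler)

    # The handler dies (network drop, crash, etc.).
    Process.exit(handler, :kill)
    wait_until_dead(handler)

    # The :DOWN message has not been handled yet, so the PID is stale.
    # send/2 does not raise and simply returns the message.
    result = send(handler, {:sse_message, "response"})

    # Only now does the "transport" get around to the :DOWN message.
    down_seen =
      receive do
        {:DOWN, ^ref, :process, ^handler, _reason} -> true
      after
        1_000 -> false
      end

    %{send_result: result, alive?: Process.alive?(handler), down_seen: down_seen}
  end

  # Polls without reading the mailbox, so the :DOWN message stays queued.
  defp wait_until_dead(pid) do
    if Process.alive?(pid) do
      Process.sleep(1)
      wait_until_dead(pid)
    end
  end
end
```

Calling `StaleHandlerDemo.run()` returns a map whose `send_result` is the message that was "sent", even though no process was alive to receive it.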

The fix:

Add Process.alive? check in route_sse_response before sending
If the handler is stale, clean up the entry and establish a new SSE connection
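
As a rough illustration of the check described above, here is a minimal sketch under stated assumptions. It is not the PR's implementation of route_sse_response: the module `StaleCheckSketch`, the `route/3` function, the plain map of session IDs to PIDs, and the return tuples are all hypothetical. The caller is assumed to open a new SSE connection when it sees `:stale`.

```elixir
defmodule StaleCheckSketch do
  # Hypothetical registry: a plain map of session_id => handler pid.
  # This is an assumption for illustration, not the library's data structure.
  def route(handlers, session_id, message) do
    case Map.fetch(handlers, session_id) do
      {:ok, pid} ->
        if Process.alive?(pid) do
          send(pid, {:sse_message, message})
          {:sent, handlers}
        else
          # Stale entry: remove it so the caller can fall back to
          # establishing a new SSE connection for this request.
          {:stale, Map.delete(handlers, session_id)}
        end

      :error ->
        {:no_handler, handlers}
    end
  end
end
```

Note that a check of this kind narrows the window rather than closing it completely: a handler can still exit between `Process.alive?/1` and `send/2`, so the :DOWN handling in the transport remains necessary.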

2. Add session_expired error type

When a session is missing or expired (e.g., after server restart), return a clear error message instead of the vague "Server not initialized".

Changes:

New error code -32001 for session_expired
Clear message: "Session expired or not initialized. Please reconnect."
Enables MCP clients to detect an expired session and potentially auto-reconnect
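
To make the error shape concrete, here is a small sketch assuming the error is delivered as a standard JSON-RPC 2.0 error object (MCP is built on JSON-RPC 2.0, and -32001 falls in the range that specification reserves for implementation-defined server errors). The module `SessionExpiredSketch` and both function names are hypothetical; the exact envelope the library emits may differ.

```elixir
defmodule SessionExpiredSketch do
  # Error code taken from the PR description.
  @session_expired_code -32001

  # Server side (assumed shape): a JSON-RPC 2.0 error response as a plain map.
  def error_response(request_id) do
    %{
      "jsonrpc" => "2.0",
      "id" => request_id,
      "error" => %{
        "code" => @session_expired_code,
        "message" => "Session expired or not initialized. Please reconnect."
      }
    }
  end

  # Client side: match on the code rather than the message text,
  # so a reconnect can be triggered programmatically.
  def session_expired?(%{"error" => %{"code" => @session_expired_code}}), do: true
  def session_expired?(_response), do: false
end
```

A client could call `SessionExpiredSketch.session_expired?/1` on each decoded response and start a new session when it returns `true`.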

Test plan

Added unit tests demonstrating the race condition
All existing tests pass (445 tests, 0 failures)
Verified fixes in production application (Flux MCP server)

🤖 Generated with Claude Code

This fixes a race condition where responses could be silently lost when an SSE handler process died but the transport hadn't yet processed the :DOWN message.

The issue:

1. SSE handler process dies (network drop, crash, etc.)
2. :DOWN message is queued to transport GenServer
3. Before transport processes :DOWN, a new request calls get_sse_handler
4. get_sse_handler returns the stale PID
5. send/2 silently drops the message to the dead process
6. Client receives HTTP 202 but never gets the actual response

The fix adds a Process.alive? check in route_sse_response before sending. If the handler is stale, it cleans up the entry and establishes a new SSE connection for the request.

Instead of returning a generic "Server not initialized" error when a session is missing or expired, return a specific session_expired error with code -32001 and message "Session expired or not initialized. Please reconnect."

This gives clients:

1. A specific error code (-32001) they can detect and handle
2. A clear message telling users what to do
3. Potential for auto-reconnect in MCP clients

mellelieuwes added 2 commits January 16, 2026 12:49

mellelieuwes wants to merge 2 commits into cloudwalk:main from eyra:fix/stale-sse-handler
