Description
Problem
When gateway.tailscale.mode is set to "serve", all OpenClaw CLI commands that use WebSocket connections to the gateway fail with:
gateway connect failed: Error: gateway closed (1000):
nodes status failed: Error: gateway closed (1000 normal closure): no close reason
Gateway target: ws://127.0.0.1:18789
Source: local loopback
This affects:
- openclaw health --json (used by the AlphaClaw watchdog for health checks)
- openclaw nodes status --json (used by the /api/nodes route)
- openclaw nodes pending --json
- openclaw devices list --json
The HTTP /health endpoint works fine — only WS-based CLI commands are broken.
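To confirm the HTTP/WS split on an affected host, a small probe along these lines may help. It is only a sketch: the function names are hypothetical, and it assumes that any 2xx response from /health means "healthy", since the response body format is not described here. Only the /health path and the ws://127.0.0.1:18789 target come from the report.

```typescript
// Sketch only: function names are hypothetical. The /health path and the
// gateway target ws://127.0.0.1:18789 are taken from the report.

/** Map a gateway target (ws://, wss://, http://, https://) to its HTTP /health URL. */
function healthUrlFromTarget(target: string): string {
  const url = new URL(target);
  const secure = url.protocol === "wss:" || url.protocol === "https:";
  // url.host keeps the port when it is not the scheme's default.
  return `${secure ? "https:" : "http:"}//${url.host}/health`;
}

/** Resolve true if /health answers with a 2xx status within timeoutMs. */
async function probeHealth(target: string, timeoutMs = 2000): Promise<boolean> {
  try {
    const res = await fetch(healthUrlFromTarget(target), {
      signal: AbortSignal.timeout(timeoutMs), // abort instead of hanging
    });
    return res.ok; // assumption: 2xx means healthy
  } catch {
    return false; // connection refused, timeout, or other network error
  }
}
```

Calling `probeHealth("ws://127.0.0.1:18789")` with Tailscale Serve enabled should resolve to `true` if the behavior matches the description above, while the WS-based CLI commands still fail.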
Impact
- False health failures: AlphaClaw's watchdog uses openclaw health --json for health checks. When these fail, the UI shows the gateway as "constantly restarting" even though it's running fine.
- Zombie process flood: The /api/nodes dashboard route spawns openclaw nodes status/pending CLI processes that hang indefinitely (WS never completes), accumulating dozens of zombie processes that consume CPU and memory.
- Gateway overload: The zombie WS connections flood the gateway with handshake timeouts, degrading performance.
Root Cause
Tailscale Serve appears to intercept loopback traffic to the gateway port (18789) and break the WS handshake for CLI subprocess connections. When gateway.tailscale.mode is set to "off", all CLI commands work instantly.
Environment
- OpenClaw: 2026.3.13
- AlphaClaw: 0.8.0
- Node: v24.14.0
- OS: Ubuntu Noble (Linux 6.8.0-106-generic)
- Tailscale: 1.94.2
Workaround
Set gateway.tailscale.mode: "off" and use SSH tunnels or gateway.bind: "tailnet" for remote node connections instead of Tailscale Serve.
Suggested Fix
Consider using the HTTP /health endpoint for watchdog health checks instead of the WS-based CLI, since HTTP works regardless of Tailscale Serve configuration. The /api/nodes route could also use an HTTP-based approach or add a timeout to prevent zombie processes.
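The timeout part of this suggestion could look roughly like the sketch below. It is not AlphaClaw's actual code: `runWithTimeout` is a hypothetical helper, and the 5-second limit in the usage note is an arbitrary assumption. It relies on the `timeout` and `killSignal` options of Node's `child_process.execFile`, which kill the child once the limit passes so a hung WS connection cannot leave the process running indefinitely.

```typescript
import { execFile } from "node:child_process";

/**
 * Hypothetical helper: run a CLI command and reject if it does not exit
 * within timeoutMs. Node sends `killSignal` to the child when the limit passes.
 */
function runWithTimeout(cmd: string, args: string[], timeoutMs: number): Promise<string> {
  return new Promise((resolve, reject) => {
    execFile(
      cmd,
      args,
      { timeout: timeoutMs, killSignal: "SIGKILL", encoding: "utf8" },
      (error, stdout) => {
        if (error) reject(error); // non-zero exit, spawn failure, or timeout kill
        else resolve(stdout);
      },
    );
  });
}
```

A route handler could then call `runWithTimeout("openclaw", ["nodes", "status", "--json"], 5000)` and return an error response when the promise rejects. Note that the signal only reaches the direct child, so any processes the CLI itself spawns would need separate handling.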