Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
155 changes: 155 additions & 0 deletions ISSUE_18295_FIX.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,155 @@
# Fix for GitHub Issue #18295: MCP Tool Calling Loop Issues

## Problem Statement

Agents were getting stuck in infinite loops when calling MCP tools like `github-list_commits`. The agent would repeatedly make the same tool call, receive a response indicating the payload was too large, attempt to read the payload file, fail (FILE_NOT_FOUND), and retry - creating an infinite loop that would only stop when the workflow timed out.

## Root Cause Analysis

The issue was caused by the interaction between three factors:

1. **Low Default Threshold**: The default payload size threshold was set to 10KB (10,240 bytes)
2. **Typical GitHub API Response Sizes**: GitHub API responses (especially `list_commits` over 3 days) frequently exceed 10KB:
- Small query (1-5 commits): ~2-5KB
- Medium query (10-30 commits): **10-50KB** ← Often exceeds threshold
- Large query (100+ commits): 100KB-1MB

3. **Inaccessible Payload Path**: When a response exceeded 10KB:
- The middleware would save the payload to disk at an **absolute host path**: `/tmp/jq-payloads/{sessionID}/{queryID}/payload.json`
- The middleware would return metadata with `payloadPath` pointing to this host path
- The agent would try to read this path, but it **doesn't exist in the agent's container filesystem**
- The agent would see `FILE_NOT_FOUND` and retry the tool call
- This created an infinite loop

## Example from Issue #18295

From the user's logs:
```
github-list_commits
└ {"agentInstructions":"The payload was too large for an MCP response. The comp...

✗ bash: cat /tmp/gh-aw/mcp-payloads/***/a47e03f1b3561c858a06b84d5e02eb38/payload.json 2>/dev/null || echo "FILE_NOT_FOUND"
"description": Required
```

The agent received:
- `agentInstructions`: "The payload was too large..."
- `payloadPath`: `/tmp/jq-payloads/{session}/{query}/payload.json`

Then tried to read the file, got `FILE_NOT_FOUND`, and retried the tool call.

## Solution

**Increase the default payload size threshold from 10KB to 512KB (524,288 bytes)**

This ensures that:
1. Typical GitHub API responses (10-50KB) are returned **inline** without disk storage
2. Only truly large responses (>512KB) trigger the payload-to-disk mechanism
3. Agents don't encounter the inaccessible file path issue for normal operations
4. The threshold can still be overridden via:
- Environment variable: `MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD=<bytes>`
- Command-line flag: `--payload-size-threshold <bytes>`
- Config file: `payload_size_threshold = <bytes>`

## Changes Made

### Code Changes

1. **internal/config/config_payload.go**
- Changed `DefaultPayloadSizeThreshold` from `10240` to `524288`
- Updated comment to explain rationale

2. **internal/cmd/flags_logging.go**
- Changed `defaultPayloadSizeThreshold` from `10240` to `524288`
- Updated comment

3. **internal/config/config_core.go**
- Updated comment from "10KB" to "512KB"

4. **internal/cmd/flags_logging_test.go**
- Updated test assertion from `10240` to `524288`

### Documentation Changes

1. **README.md**
- Updated CLI flag default from `10240` to `524288`
- Updated environment variable table default from `10240` to `524288`
- Updated configuration alternative default from `10240` to `524288`

2. **config.example-payload-threshold.toml**
- Updated default from `10240` to `524288`
- Updated examples to use larger, more realistic values:
- 256KB (more aggressive storage)
- 512KB (default)
- 1MB (minimize disk storage)

## Testing

All tests pass with the new default:
- Unit tests: ✅ PASS (all packages)
- Integration tests: ✅ PASS
- Configuration tests: ✅ PASS

## Impact

### Before Fix (10KB threshold)
- GitHub `list_commits` responses frequently exceeded threshold
- Agents got stuck in infinite loops trying to read inaccessible files
- Workflows would timeout after repeatedly calling the same tool
- Poor user experience

### After Fix (512KB threshold)
- GitHub `list_commits` responses are returned inline (typical size 10-50KB)
- Agents receive complete data without file system access issues
- No more infinite loops for typical use cases
- Greatly improved user experience

### Performance Considerations

**Memory Impact**: Minimal. Most responses are still well under 512KB.

**Network Impact**: Reduced. Returning data inline is faster than writing to disk and returning metadata.

**Disk I/O Impact**: Significantly reduced. Fewer responses trigger disk storage.

## Configuration Options

Users can still customize the threshold for their specific needs:

```bash
# Lower threshold (more aggressive disk storage)
MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD=262144 ./awmg --config config.toml

# Higher threshold (minimize disk storage)
MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD=1048576 ./awmg --config config.toml

# Or via config file
[gateway]
payload_size_threshold = 524288
```

## Related Issue

GitHub Issue: https://github.com/github/gh-aw/issues/18295

## Backward Compatibility

✅ **Fully backward compatible**

- Existing configurations continue to work
- Environment variable and CLI flag still functional
- Users can explicitly set the old 10KB threshold if desired
- New default is a **quality-of-life improvement** that makes the gateway work better out-of-the-box

## Future Considerations

If payload path accessibility issues persist for responses >512KB:

1. Consider adding a mount configuration to make payload paths accessible to agents
2. Consider adding a flag to return large payloads inline (disable disk storage)
3. Consider implementing payload compression to reduce size before threshold check
4. Consider per-tool threshold configuration for tools known to return large responses

## Summary

The fix addresses the root cause of the infinite loop issue by ensuring that typical MCP tool responses are small enough to be returned inline, avoiding the inaccessible file path problem entirely. The threshold remains configurable for advanced use cases, maintaining flexibility while providing sensible defaults.
25 changes: 17 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -220,11 +220,18 @@ See **[Configuration Specification](https://github.com/github/gh-aw/blob/main/do

**Configuration Alternatives**:
- **`payloadSizeThreshold`** is not supported in JSON stdin format. Use:
- CLI flag: `--payload-size-threshold <bytes>` (default: 10240)
- CLI flag: `--payload-size-threshold <bytes>` (default: 524288)
- Environment variable: `MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD=<bytes>`
- TOML config file: `payload_size_threshold = <bytes>` in `[gateway]` section
- Payloads **larger** than this threshold are stored to disk and return metadata
- Payloads **smaller than or equal** to this threshold are returned inline
- **`payloadPathPrefix`** is not supported in JSON stdin format. Use:
- CLI flag: `--payload-path-prefix <path>` (default: empty - use actual filesystem path)
- Environment variable: `MCP_GATEWAY_PAYLOAD_PATH_PREFIX=<path>`
- TOML config file: `payload_path_prefix = "<path>"` in `[gateway]` section
- When set, the `payloadPath` returned to clients uses this prefix instead of the actual filesystem path
- Example: Gateway saves to `/tmp/jq-payloads/session/query/payload.json`, but returns `/workspace/payloads/session/query/payload.json` to clients if `payload_path_prefix = "/workspace/payloads"`
- This allows agents running in containers to access payload files via mounted volumes

**Environment Variable Features**:
- **Passthrough**: Set value to empty string (`""`) to pass through from host
Expand Down Expand Up @@ -297,13 +304,14 @@ Flags:
-l, --listen string HTTP server listen address (default "127.0.0.1:3000")
--log-dir string Directory for log files (falls back to stdout if directory cannot be created) (default "/tmp/gh-aw/mcp-logs")
--payload-dir string Directory for storing large payload files (segmented by session ID) (default "/tmp/jq-payloads")
--payload-size-threshold int Size threshold (in bytes) for storing payloads to disk. Payloads larger than this are stored, smaller ones returned inline (default 10240)
--payload-path-prefix string Path prefix to use when returning payloadPath to clients (allows remapping host paths to client/agent container paths)
--payload-size-threshold int Size threshold (in bytes) for storing payloads to disk. Payloads larger than this are stored, smaller ones returned inline (default 524288)
--routed Run in routed mode (each backend at /mcp/<server>)
--sequential-launch Launch MCP servers sequentially during startup (parallel launch is default)
--unified Run in unified mode (all backends at /mcp)
--validate-env Validate execution environment (Docker, env vars) before starting
-v, --verbose count Increase verbosity level (use -v for info, -vv for debug, -vvv for trace)
--version version for awmg
--sequential-launch Launch MCP servers sequentially during startup (parallel launch is default)
--unified Run in unified mode (all backends at /mcp)
--validate-env Validate execution environment (Docker, env vars) before starting
-v, --verbose count Increase verbosity level (use -v for info, -vv for debug, -vvv for trace)
--version version for awmg

Use "awmg [command] --help" for more information about a command.
```
Expand Down Expand Up @@ -333,7 +341,8 @@ When running locally (`run.sh`), these variables are optional (warnings shown if
| `MCP_GATEWAY_API_KEY` | API authentication key | (disabled) |
| `MCP_GATEWAY_LOG_DIR` | Log file directory (sets default for `--log-dir` flag) | `/tmp/gh-aw/mcp-logs` |
| `MCP_GATEWAY_PAYLOAD_DIR` | Large payload storage directory (sets default for `--payload-dir` flag) | `/tmp/jq-payloads` |
| `MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD` | Size threshold in bytes for payload storage (sets default for `--payload-size-threshold` flag) | `10240` |
| `MCP_GATEWAY_PAYLOAD_PATH_PREFIX` | Path prefix for remapping payloadPath returned to clients (sets default for `--payload-path-prefix` flag) | (empty - use actual filesystem path) |
| `MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD` | Size threshold in bytes for payload storage (sets default for `--payload-size-threshold` flag) | `524288` |
| `DEBUG` | Enable debug logging with pattern matching (e.g., `*`, `server:*,launcher:*`) | (disabled) |
| `DEBUG_COLORS` | Control colored debug output (0 to disable, auto-disabled when piping) | Auto-detect |

Expand Down
28 changes: 21 additions & 7 deletions config.example-payload-threshold.toml
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
# 1. Command-line flag: --payload-size-threshold 2048
# 2. Environment variable: MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD=2048
# 3. Config file: payload_size_threshold = 2048
# 4. Default: 10240 bytes (10KB)
# 4. Default: 524288 bytes (512KB)

[gateway]
port = 3000
Expand All @@ -18,21 +18,35 @@ api_key = "your-api-key-here"
# Default: /tmp/jq-payloads
payload_dir = "/tmp/jq-payloads"

# Payload path prefix for remapping file paths returned to clients
# When set, payloadPath uses this prefix instead of the actual filesystem path
# This allows agents in containers to access payload files via mounted volumes
#
# Can also be set via:
# - Flag: --payload-path-prefix /workspace/payloads
# - Env: MCP_GATEWAY_PAYLOAD_PATH_PREFIX=/workspace/payloads
# Default: (empty - use actual filesystem path)
#
# Example:
# Gateway saves to: /tmp/jq-payloads/session123/query456/payload.json
# Returns to client: /workspace/payloads/session123/query456/payload.json
# Agent mounts: -v /tmp/jq-payloads:/workspace/payloads
# payload_path_prefix = "/workspace/payloads"

# Payload size threshold (in bytes) for storing responses to disk
# Payloads LARGER than this threshold are stored to disk and return metadata
# Payloads SMALLER than or equal to this threshold are returned inline
#
# Can also be set via:
# - Flag: --payload-size-threshold 2048
# - Env: MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD=2048
# Default: 10240 bytes (10KB)
# Default: 524288 bytes (512KB)
#
# Examples:
# payload_size_threshold = 512 # 512 bytes - more aggressive file storage
# payload_size_threshold = 1024 # 1KB - more aggressive file storage
# payload_size_threshold = 2048 # 2KB - fewer files, more inline responses
# payload_size_threshold = 10240 # 10KB - default, good for most use cases
payload_size_threshold = 10240
# payload_size_threshold = 262144 # 256KB - more aggressive file storage
# payload_size_threshold = 524288 # 512KB - default, balances inline vs disk storage
# payload_size_threshold = 1048576 # 1MB - minimize disk storage for most use cases
payload_size_threshold = 524288

[servers.github]
command = "docker"
Expand Down
11 changes: 10 additions & 1 deletion internal/cmd/flags_logging.go
Original file line number Diff line number Diff line change
Expand Up @@ -11,20 +11,23 @@ import (
const (
defaultLogDir = "/tmp/gh-aw/mcp-logs"
defaultPayloadDir = "/tmp/jq-payloads"
defaultPayloadSizeThreshold = 10240 // 10KB default threshold
defaultPayloadPathPrefix = "" // Empty by default - use actual filesystem path
defaultPayloadSizeThreshold = 524288 // 512KB default threshold
)

// Logging flag variables
var (
logDir string
payloadDir string
payloadPathPrefix string
payloadSizeThreshold int
)

func init() {
RegisterFlag(func(cmd *cobra.Command) {
cmd.Flags().StringVar(&logDir, "log-dir", getDefaultLogDir(), "Directory for log files (falls back to stdout if directory cannot be created)")
cmd.Flags().StringVar(&payloadDir, "payload-dir", getDefaultPayloadDir(), "Directory for storing large payload files (segmented by session ID)")
cmd.Flags().StringVar(&payloadPathPrefix, "payload-path-prefix", getDefaultPayloadPathPrefix(), "Path prefix to use when returning payloadPath to clients (allows remapping host paths to client/agent container paths)")
cmd.Flags().IntVar(&payloadSizeThreshold, "payload-size-threshold", getDefaultPayloadSizeThreshold(), "Size threshold (in bytes) for storing payloads to disk. Payloads larger than this are stored, smaller ones returned inline")
})
}
Expand All @@ -41,6 +44,12 @@ func getDefaultPayloadDir() string {
return envutil.GetEnvString("MCP_GATEWAY_PAYLOAD_DIR", defaultPayloadDir)
}

// getDefaultPayloadPathPrefix returns the default payload path prefix, checking MCP_GATEWAY_PAYLOAD_PATH_PREFIX
// environment variable first, then falling back to the hardcoded default
func getDefaultPayloadPathPrefix() string {
return envutil.GetEnvString("MCP_GATEWAY_PAYLOAD_PATH_PREFIX", defaultPayloadPathPrefix)
}
Comment on lines +47 to +51
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A new --payload-path-prefix default/env helper was added, but there’s no corresponding test coverage (unlike getDefaultPayloadDir / getDefaultPayloadSizeThreshold). Please add tests for getDefaultPayloadPathPrefix() covering: no env var (default empty), env var set to a value, and env var set to empty string.

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added TestGetDefaultPayloadPathPrefix test with coverage for:

  • No env var (returns default empty string)
  • Env var set to a custom path
  • Env var set to empty string (returns default)

Commit: 0203bba


// getDefaultPayloadSizeThreshold returns the default payload size threshold, checking
// MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD environment variable first, then falling back to the hardcoded default
func getDefaultPayloadSizeThreshold() int {
Expand Down
42 changes: 41 additions & 1 deletion internal/cmd/flags_logging_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -154,7 +154,7 @@ func TestPayloadSizeThresholdFlagDefault(t *testing.T) {
t.Setenv("MCP_GATEWAY_PAYLOAD_SIZE_THRESHOLD", "")

result := getDefaultPayloadSizeThreshold()
assert.Equal(t, 10240, result, "Default should be 10240 bytes")
assert.Equal(t, 524288, result, "Default should be 524288 bytes (512KB)")
}

func TestPayloadSizeThresholdEnvVar(t *testing.T) {
Expand All @@ -163,3 +163,43 @@ func TestPayloadSizeThresholdEnvVar(t *testing.T) {
result := getDefaultPayloadSizeThreshold()
assert.Equal(t, 4096, result, "Environment variable should override default")
}

func TestGetDefaultPayloadPathPrefix(t *testing.T) {
tests := []struct {
name string
envValue string
setEnv bool
expected string
}{
{
name: "no env var - returns default",
setEnv: false,
expected: defaultPayloadPathPrefix,
},
{
name: "env var set - returns custom path",
envValue: "/workspace/payloads",
setEnv: true,
expected: "/workspace/payloads",
},
{
name: "empty env var - returns default",
envValue: "",
setEnv: true,
expected: defaultPayloadPathPrefix,
},
}

for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
if tt.setEnv {
t.Setenv("MCP_GATEWAY_PAYLOAD_PATH_PREFIX", tt.envValue)
} else {
t.Setenv("MCP_GATEWAY_PAYLOAD_PATH_PREFIX", "")
}

result := getDefaultPayloadPathPrefix()
assert.Equal(t, tt.expected, result)
})
}
}
8 changes: 8 additions & 0 deletions internal/cmd/root.go
Original file line number Diff line number Diff line change
Expand Up @@ -225,6 +225,14 @@ func run(cmd *cobra.Command, args []string) error {
cfg.Gateway.PayloadDir = payloadDir
}

// Apply payload path prefix flag (if different from default, it was explicitly set)
if cmd.Flags().Changed("payload-path-prefix") {
cfg.Gateway.PayloadPathPrefix = payloadPathPrefix
} else if payloadPathPrefix != "" && payloadPathPrefix != defaultPayloadPathPrefix {
// Environment variable was set
cfg.Gateway.PayloadPathPrefix = payloadPathPrefix
}

// Apply payload size threshold flag (if different from default, it was explicitly set)
if cmd.Flags().Changed("payload-size-threshold") {
cfg.Gateway.PayloadSizeThreshold = payloadSizeThreshold
Expand Down
9 changes: 8 additions & 1 deletion internal/config/config_core.go
Original file line number Diff line number Diff line change
Expand Up @@ -78,9 +78,16 @@ type GatewayConfig struct {
// PayloadDir is the directory for storing large payloads
PayloadDir string `toml:"payload_dir" json:"payload_dir,omitempty"`

// PayloadPathPrefix is the path prefix to use when returning payloadPath to clients.
// This allows remapping the host filesystem path to a path accessible in the client/agent container.
// If empty, the actual filesystem path (PayloadDir) is returned.
// Example: If PayloadDir="/tmp/jq-payloads" and PayloadPathPrefix="/workspace/payloads",
// then payloadPath will be "/workspace/payloads/{sessionID}/{queryID}/payload.json"
PayloadPathPrefix string `toml:"payload_path_prefix" json:"payload_path_prefix,omitempty"`

// PayloadSizeThreshold is the size threshold (in bytes) for storing payloads to disk.
// Payloads larger than this threshold are stored to disk, smaller ones are returned inline.
// Default: 10240 bytes (10KB)
// Default: 524288 bytes (512KB)
PayloadSizeThreshold int `toml:"payload_size_threshold" json:"payload_size_threshold,omitempty"`
}

Expand Down
6 changes: 4 additions & 2 deletions internal/config/config_payload.go
Original file line number Diff line number Diff line change
Expand Up @@ -7,8 +7,10 @@ const DefaultPayloadDir = "/tmp/jq-payloads"

// DefaultPayloadSizeThreshold is the default size threshold (in bytes) for storing payloads to disk.
// Payloads larger than this threshold are stored to disk, smaller ones are returned inline.
// Default: 10240 bytes (10KB)
const DefaultPayloadSizeThreshold = 10240
// Default: 524288 bytes (512KB) - chosen to accommodate typical MCP tool responses including
// GitHub API queries (list_commits, list_issues, etc.) without triggering disk storage.
// This prevents agent looping issues when payloadPath is not accessible in agent containers.
const DefaultPayloadSizeThreshold = 524288

func init() {
// Register default setter for PayloadDir and PayloadSizeThreshold
Expand Down
Loading
Loading