feat: Retry individual runs on platform inference errors by thomasvangurp · Pull Request #333 · ridgesai/ridges

thomasvangurp · 2026-03-17T00:51:33Z

Summary

Builds on #332 (feat: Detect platform-side inference errors) by @statxc, which adds platform inference error detection
When inference errors reach the threshold, the validator now retries only the specific failed run instead of restarting the entire evaluation
Addresses feedback from @ibraheem-abe in #332: "We want to change this so only that specific test is retried instead of the entire thing"

What changed

| File | Change |
| --- | --- |
| `validator/main.py` | Retry loop in `_run_evaluation_run()`: on inference error threshold, reset counter and retry just this run (up to 2 retries). Only marks as platform error (3050) after exhausting retries. |
| `inference_gateway/error_hash_map.py` | Added `reset_inference_errors()` method to clear error count before retry |
| `inference_gateway/main.py` | Added `POST /api/reset-inference-errors` endpoint |
| `tests/test_inference_error_tracking.py` | Added tests for reset behavior |
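
A minimal sketch of what a per-run reset could look like. The PR names `ErrorHashMap.reset_inference_errors()` but its body is not shown here, so everything else is an assumption for illustration: the dict keyed by a run ID, the `add_inference_error` and `get_inference_errors` helper names, and the lock.

```python
import threading


class ErrorHashMap:
    """Illustrative per-run inference error counter (not the repository's actual class)."""

    def __init__(self):
        self._lock = threading.Lock()
        # Assumed storage: evaluation run ID -> number of inference errors seen.
        self._counts = {}

    def add_inference_error(self, run_id):
        """Hypothetical helper: record one inference error for a run."""
        with self._lock:
            self._counts[run_id] = self._counts.get(run_id, 0) + 1
            return self._counts[run_id]

    def get_inference_errors(self, run_id):
        """Hypothetical helper: current error count for a run (0 if unknown)."""
        with self._lock:
            return self._counts.get(run_id, 0)

    def reset_inference_errors(self, run_id):
        """Clear the count for one run only; other runs keep their counts."""
        with self._lock:
            self._counts.pop(run_id, None)
```

Keying the counter by run is what would let a reset for one problem leave concurrently running problems untouched, which matches the "reset doesn't affect other runs" item in the test plan.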

How it works

Agent runs problem X → hits 5 inference errors → threshold exceeded
  → Validator resets error counter
  → Validator retries ONLY problem X (not the whole evaluation)
  → If retry also fails → one more retry
  → If all 3 attempts fail → mark as platform error (3050)
Meanwhile, problems Y and Z continue running normally
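
The flow above can be sketched as a small retry loop. This is not the code from `_run_evaluation_run()`. `MAX_SINGLE_RUN_RETRIES`, the `>=` threshold comparison, and the 3050 code come from the PR text. The threshold value of 5 is taken from the example above, and the `PlatformError` class and the three callables are hypothetical stand-ins for running the agent, reading the error count from the gateway, and resetting it.

```python
MAX_SINGLE_RUN_RETRIES = 2                 # named in the PR
INFERENCE_ERROR_THRESHOLD = 5              # assumed from the "5 inference errors" example
PLATFORM_TOO_MANY_INFERENCE_ERRORS = 3050  # error code named in the PR


class PlatformError(Exception):
    """Hypothetical exception carrying a platform error code."""

    def __init__(self, code):
        super().__init__(f"platform error {code}")
        self.code = code


def run_with_retries(run_once, get_error_count, reset_error_count):
    """Run one problem, retrying only that problem on too many inference errors."""
    # One initial attempt plus MAX_SINGLE_RUN_RETRIES retries (3 attempts in total).
    for attempt in range(MAX_SINGLE_RUN_RETRIES + 1):
        result = run_once()
        if get_error_count() < INFERENCE_ERROR_THRESHOLD:
            return result  # below threshold: accept this attempt
        if attempt < MAX_SINGLE_RUN_RETRIES:
            reset_error_count()  # clear the counter before the next attempt
    # Every attempt hit the threshold: report a platform-side failure.
    raise PlatformError(PLATFORM_TOO_MANY_INFERENCE_ERRORS)
```

Because the loop wraps a single run, other problems in the same evaluation are not restarted when one run is retried.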

Test plan

Verify `ErrorHashMap.reset_inference_errors()` clears the count
Verify reset doesn't affect other runs
Verify retry loop retries on threshold exceeded
Verify platform error raised after max retries exhausted
Run `python3 -m pytest tests/test_inference_error_tracking.py -v`

🤖 Generated with Claude Code

@statxc

…ailing entire evaluation

Based on ridgesai#332 by @statxc which detects platform-side inference errors. Changes the behavior so that when an evaluation run hits the inference error threshold, only that specific run is retried (up to 2 times) instead of marking the entire evaluation as failed.

Flow:
1. Agent finishes → validator checks /api/usage for inference errors
2. If errors >= threshold and retries remaining:
   - Reset error counter via POST /api/reset-inference-errors
   - Re-run only this specific problem (not the whole evaluation)
3. If errors >= threshold and retries exhausted:
   - Mark as PLATFORM_TOO_MANY_INFERENCE_ERRORS (3050)

New additions on top of ridgesai#332:
- ErrorHashMap.reset_inference_errors() method
- POST /api/reset-inference-errors gateway endpoint
- Retry loop in _run_evaluation_run() with MAX_SINGLE_RUN_RETRIES=2
- Tests for reset behavior

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

thomasvangurp · 2026-03-17T00:51:55Z

@ibraheem-abe This implements the change you requested in #332 — retrying only the specific run instead of restarting the entire evaluation. Would appreciate your review when you get a chance. Thanks!

- Fix test_reset_endpoint_allows_new_inferences_after_threshold to verify reset via usage endpoint instead of calling inference (avoids mock issues)
- Mark TestValidatorRetryLogic as skip when NETUID not set (requires full env)
- Move validator imports inside test to avoid import-time config failures
- 24 pass, 1 skipped (validator retry test needs full environment)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

thomasvangurp wants to merge 2 commits into ridgesai:main from thomasvangurp:feat/retry-single-run-on-inference-errors
