Skip to content

Improve Claude review reliability#14526

Open
xingbowang wants to merge 2 commits intofacebook:mainfrom
xingbowang:fix/claude-review-timeout
Open

Improve Claude review reliability#14526
xingbowang wants to merge 2 commits intofacebook:mainfrom
xingbowang:fix/claude-review-timeout

Conversation

@xingbowang
Copy link
Copy Markdown
Contributor

@xingbowang xingbowang commented Mar 28, 2026

Problems

(1) Claude review times out on large PRs
The claude-code-base-action defaults to timeout_minutes=10. Large PRs (e.g. #14499) get killed with exit code 124 after 600 seconds mid-review.

(2) Exported PRs never get reviewed
PRs exported from Meta's internal pipeline (e.g. #14515) trigger CI within milliseconds of PR creation. The workflow_run payload has pull_requests=[], and the SHA-based fallback also misses because GitHub hasn't registered the PR yet — so Claude review never fires.

Fixes

  • Set timeout_minutes: "60" on both auto-review and manual-review Run Claude steps
  • Retry the SHA→PR lookup up to 5 times with a 10s delay, giving GitHub up to ~50s to register the PR. Also bumped per_page from 30 to 100.

The claude-code-base-action defaults to timeout_minutes=10, which is
insufficient for large PRs. The review of PR facebook#14499 timed out after
600 seconds (exit code 124) while Claude was mid-review.

Set timeout_minutes=60 for both auto-review and manual-review jobs.
@meta-cla meta-cla bot added the CLA Signed label Mar 28, 2026
@github-actions
Copy link
Copy Markdown

✅ clang-tidy: No findings on changed lines

Completed in 0.0s.

@github-actions
Copy link
Copy Markdown

github-actions bot commented Mar 28, 2026

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 020ff0f


Code Review: ci: increase Claude review timeout from 10 to 60 minutes

PR: ci: increase Claude review timeout from 10 to 60 minutes
Author: xingbowang
Scope: .github/workflows/claude-review.yml only (CI configuration, no RocksDB core code)

Summary

This PR makes two changes to the Claude review CI workflow:

  1. Timeout increase: Adds timeout_minutes: "60" to both auto-review and manual-review Run Claude steps.
  2. SHA-to-PR retry logic: Replaces a single API call to find the PR by SHA with a retry loop (up to 5 attempts, 10s delay), and increases per_page from 30 to 100.

Overall Assessment

LGTM. Straightforward CI reliability improvement. The changes are well-motivated and correctly implemented.

Detailed Findings

Finding 1: Timeout increase — Correct and appropriate

Severity: No issue (positive change)

Adding timeout_minutes: "60" to both auto-review (line 199) and manual-review (line 464) steps fixes large-PR timeouts. Both invocation sites are covered.

Finding 2: Retry loop — Correct implementation

Severity: No issue (positive change)

  • First attempt runs immediately (no delay on attempt === 0), preserving current behavior.
  • Subsequent retries wait 10s each, for up to ~50s total — reasonable for GitHub's PR registration lag.
  • break on match correctly exits the loop.
  • Falls through to if (!prNumber) with improved error message on exhaustion.

Finding 3 (Suggestion): per_page: 100 — reasonable but bounded

Severity: Low / Informational

For repos with >100 open PRs, the SHA could still be missed. For RocksDB's typical open-PR count, 100 is sufficient. Pagination could be added later if needed.

Finding 4: No security concerns

  • Permissions remain contents: read only.
  • Same GitHub API call, no new secrets/tokens.
  • headSha is GitHub-controlled (context.payload.workflow_run.head_sha), not user-controlled.

Finding 5: No cost concern with longer timeout

timeout_minutes: "60" is a safety net. Actual runtime is governed by max_turns: "300".

Verdict

Approve. Clean, well-scoped CI fix. No correctness, security, or compatibility concerns.


ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

  • Claude may miss context from files not in the diff
  • Large PRs may be truncated
  • Always apply human judgment to AI suggestions

Commands:

  • /claude-review [context] — Request a code review
  • /claude-query <question> — Ask about the PR or codebase

Exported PRs (e.g. from Meta's internal export pipeline) can trigger
CI within milliseconds of PR creation. When this happens, the
workflow_run payload has pull_requests=[], and the SHA-based fallback
lookup also misses because the PR isn't registered yet.

Fix by retrying the SHA lookup up to 5 times with a 10s delay, giving
GitHub up to ~50 seconds to associate the PR. Also increase per_page
from 30 to 100 to reduce the chance of missing the PR in busy repos.
@xingbowang xingbowang changed the title ci: increase Claude review timeout from 10 to 60 minutes Improve Claude review reliability Mar 28, 2026
@meta-codesync
Copy link
Copy Markdown

meta-codesync bot commented Mar 28, 2026

@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D98636217.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant