[Ci] Improve integration checkout resilience on self-hosted runners #22
Merged
jiahy0825 merged 18 commits into SandAI-org:main on Apr 14, 2026
Replace dual actions/checkout steps with a manual git fetch strategy that retries up to 15 times (20s interval) before failing, to reduce transient network/proxy checkout flakes without changing low-speed threshold settings.
035e140 to 7f7f6b7
Handle pre-existing .git directories by reusing/updating origin remote instead of blindly running `git remote add origin`.
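The remote-handling rule in the commit above can be sketched as a small decision function. This is only an illustration under assumptions: `parse_remotes` and `origin_command` are hypothetical names, and the real script may obtain the remote list differently. It relies only on the standard `git remote add` and `git remote set-url` subcommands.

```python
def parse_remotes(git_remote_output):
    """Split the plain output of `git remote` into a list of remote names."""
    return git_remote_output.split()


def origin_command(existing_remotes, repo_url):
    """Pick the git command that points `origin` at repo_url.

    If `origin` already exists (stale workspace), update its URL;
    otherwise add it. Hypothetical helper, not the PR's actual code.
    """
    if "origin" in existing_remotes:
        return ["git", "remote", "set-url", "origin", repo_url]
    return ["git", "remote", "add", "origin", repo_url]
```

The returned list can be passed to `subprocess.run` inside the workspace directory, so a leftover `.git` from an earlier run no longer makes the step fail on a duplicate remote.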
Keep strict low-speed thresholds while using proxy in early attempts, then fall back to proxy-unset direct fetch with relaxed low-speed limits for later retries.
Shorten retry backoff from 20s to 5s to speed up recovery when transient network errors clear quickly.
Remove common stale .git lock files before/after each retry attempt so leftover locks from previous interrupted runs don't block fetch.
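A minimal sketch of the lock cleanup described in the commit above. The commit only says "common stale .git lock files", so the file list here is an assumption, and `remove_stale_locks` is a hypothetical name.

```python
from pathlib import Path

# Assumed list of lock files; the PR does not enumerate them.
LOCK_FILES = ("index.lock", "shallow.lock", "config.lock", "HEAD.lock")


def remove_stale_locks(repo_dir):
    """Delete leftover lock files under repo_dir/.git and return the removed paths."""
    git_dir = Path(repo_dir) / ".git"
    removed = []
    for name in LOCK_FILES:
        lock = git_dir / name
        if lock.is_file():
            lock.unlink()
            removed.append(lock)
    return removed
```

Calling this before and after each fetch attempt mirrors the commit's intent: an interrupted run that left `index.lock` behind should not block the next attempt.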
Use two-dot diff (`base..head`) instead of three-dot to prevent `no merge base` failures when only shallow commit objects are fetched.
Switch checkout retries to strict/relaxed alternating strategy (odd attempts strict via proxy, even attempts relaxed direct mode).
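The alternating strategy in the commit above might look roughly like the following. It is a sketch under assumptions: the threshold values are invented for illustration, `attempt_env` and `fetch_with_retries` are hypothetical names, and the fetch itself is injected as a callable. The 15-attempt limit and 5s backoff come from the earlier commits. `GIT_HTTP_LOW_SPEED_LIMIT` and `GIT_HTTP_LOW_SPEED_TIME` are the environment overrides git documents for `http.lowSpeedLimit` and `http.lowSpeedTime`.

```python
import os
import time

PROXY_VARS = ("http_proxy", "https_proxy", "all_proxy",
              "HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


def attempt_env(attempt, base_env):
    """Return (mode, env) for a 1-based attempt number.

    Odd attempts keep the proxy and use strict low-speed thresholds;
    even attempts drop the proxy and relax them. Values are assumed.
    """
    env = dict(base_env)
    if attempt % 2 == 1:
        env["GIT_HTTP_LOW_SPEED_LIMIT"] = "1000"  # bytes/s, assumed
        env["GIT_HTTP_LOW_SPEED_TIME"] = "30"     # seconds, assumed
        return "strict-proxy", env
    for name in PROXY_VARS:
        env.pop(name, None)
    env["GIT_HTTP_LOW_SPEED_LIMIT"] = "1"
    env["GIT_HTTP_LOW_SPEED_TIME"] = "120"
    return "relaxed-direct", env


def fetch_with_retries(run_fetch, max_attempts=15, backoff_s=5,
                       base_env=None, sleep=time.sleep):
    """Call run_fetch(env) until it returns True; return (attempt, mode)."""
    base_env = dict(os.environ) if base_env is None else base_env
    for attempt in range(1, max_attempts + 1):
        mode, env = attempt_env(attempt, base_env)
        if run_fetch(env):
            return attempt, mode
        if attempt < max_attempts:
            sleep(backoff_s)
    raise RuntimeError(f"fetch failed after {max_attempts} attempts")
```

In a real step, `run_fetch` would wrap something like `subprocess.run(["git", "fetch", ...], env=env, timeout=...)` and report whether the return code was zero. Keeping it injectable makes the retry policy testable without a network.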
jiahy0825 reviewed on Apr 13, 2026
Move the long retry-based PR checkout logic out of integration_test.yml into .github/scripts/checkout_pr.sh, and keep the workflow step minimal with only REPO_URL/BASE_SHA/HEAD_SHA inputs.
Inline the checkout retry logic back into integration_test.yml so the step does not depend on an external script file that may be unavailable in runner workspaces.
Download .github/scripts/checkout_pr.sh from PR head SHA via GitHub contents API before running checkout, so the workflow no longer depends on local script availability in stale workspaces.
git-fetch consistently times out on self-hosted runners due to network instability. Switch to downloading the HEAD tarball via GitHub REST API (single HTTP request, curl-based) as the primary method, with git-fetch kept as a fallback.
After downloading HEAD/BASE tarballs via GitHub API, create a local git repo with two commits (base -> head) and git-replace refs so that `git rev-parse <sha>` and `git diff base..head` work correctly for downstream CI steps (check_chinese_chars, pre-commit, etc.).
Replace bash checkout_pr.sh with checkout_pr.py. Strategy unchanged: git-fetch with retry first, tarball fallback with synthetic git history.
git-fetch consistently times out on self-hosted runner; tarball via GitHub API is reliable. Swap order: tarball first, git-fetch fallback.
urllib honors the proxy environment variables, and the proxy may throttle large downloads. Switch to a curl subprocess with alternating proxy/direct attempts.
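One way to build the curl invocation for the tarball download is sketched below. The URL shape is GitHub's REST endpoint for repository tarballs; everything else is an assumption for illustration: the helper names, the timeout, the optional token, and the choice that odd attempts use the proxy while even attempts go direct. Direct mode uses curl's `--noproxy '*'`, which disables proxying for all hosts.

```python
def tarball_url(owner, repo, sha):
    """GitHub REST endpoint that redirects to a tarball of the given ref."""
    return f"https://api.github.com/repos/{owner}/{repo}/tarball/{sha}"


def curl_command(url, dest, attempt, token=None, max_time_s=300):
    """Build a curl argv list for a 1-based attempt number.

    Odd attempts honor the environment proxy; even attempts bypass it.
    The ordering and the timeout are assumptions, not the PR's values.
    """
    cmd = ["curl", "--fail", "--location", "--silent", "--show-error",
           "--max-time", str(max_time_s), "--output", dest]
    if token:
        cmd += ["--header", f"Authorization: Bearer {token}"]
    if attempt % 2 == 0:
        cmd += ["--noproxy", "*"]
    cmd.append(url)
    return cmd
```

The list can be handed to `subprocess.run(cmd, check=False)` and the return code inspected, with the attempt counter advanced on failure.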
git-replace is unreliable for mapping real SHAs to synthetic commits. Instead, output local base_ref/head_ref via GITHUB_OUTPUT and use step outputs in downstream steps (check_chinese_chars).
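Publishing the local refs as step outputs could be done along these lines. GitHub Actions reads `name=value` lines appended to the file named by the `GITHUB_OUTPUT` environment variable. The helper name and the example ref values are hypothetical; `base_ref` and `head_ref` are the output names mentioned in the commit.

```python
import os


def write_step_outputs(outputs, path=None):
    """Append single-line name=value pairs to the GITHUB_OUTPUT file."""
    path = path or os.environ["GITHUB_OUTPUT"]
    # Validate first so a bad value does not leave a partial write behind.
    for name, value in outputs.items():
        if "\n" in str(value):
            raise ValueError(f"output {name!r} must be a single line")
    with open(path, "a", encoding="utf-8") as fh:
        for name, value in outputs.items():
            fh.write(f"{name}={value}\n")
```

A downstream step such as check_chinese_chars can then read the values through the `steps.<id>.outputs.base_ref` and `steps.<id>.outputs.head_ref` expressions, where `<id>` is whatever id the checkout step is given.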
🗂️ PR Category
📝 Description
This PR replaces the dual actions/checkout flow with a retry-based manual fetch that is more robust under unstable network/proxy conditions. It adds per-attempt timeouts, alternating strict/relaxed fetch modes (including direct fallback without proxy), stale lock-file cleanup, and a merge-base-free debug diff path to avoid shallow-fetch failures.