OCPBUGS-67313: Fix race condition in external binary extraction #30616
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@stbenjam: This pull request references Jira Issue OCPBUGS-67313, which is invalid:
The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@stbenjam: This pull request references Jira Issue OCPBUGS-67313. The bug has been updated to no longer refer to the pull request using the external bug tracker. All external bug links have been closed. The bug has been moved to the NEW state. |
|
/pipeline required |
|
Scheduling required tests: |
When multiple openshift-tests processes run concurrently and share the same cache directory, they can race on extracting and decompressing the same binaries. This commonly occurs when using shell pipelines like:

    openshift-tests run ... --dry-run | openshift-tests run -f -

Both processes start simultaneously, and without synchronization, one process may delete a .gz file while another is still trying to access it, resulting in errors like "failed to delete original .gz file: no such file or directory".

Add flock-based file locking around the extraction process to ensure only one process extracts a given binary at a time. The lock is automatically released when the process exits, even if killed.
e8b95f9 to a46dd04
|
/jira refresh |
|
@stbenjam: This pull request references Jira Issue OCPBUGS-67313, which is invalid: |
|
/jira refresh |
|
@stbenjam: This pull request references Jira Issue OCPBUGS-67313, which is valid. 3 validation(s) were run on this bug. |
|
/jira backport release-4.21,release-4.20 |
|
@stbenjam: The following backport issues have been created:
Queuing cherrypicks to the requested branches to be created after this PR merges: |
|
@openshift-ci-robot: once the present PR merges, I will cherry-pick it on top of
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
petr-muller
left a comment
/lgtm
/approve
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: petr-muller, stbenjam. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/pipeline required |
|
Scheduling required tests: |
|
/verified by CI |
|
@stbenjam: This PR has been marked as verified by |
|
/override ci/prow/okd-scos-images |
|
@stbenjam: Overrode contexts on behalf of stbenjam: ci/prow/okd-scos-images |
|
@stbenjam: all tests passed! Full PR test history. Your PR dashboard. |
|
@stbenjam: Jira Issue Verification Checks: Jira Issue OCPBUGS-67313 Jira Issue OCPBUGS-67313 has been moved to the MODIFIED state and will move to the VERIFIED state when the change is available in an accepted nightly payload. 🕓 |
|
@openshift-ci-robot: new pull request created: #30641 |
|
@openshift-ci-robot: new pull request created: #30642 |
|
Fix included in accepted release 4.22.0-0.nightly-2025-12-25-043524 |