Merge https://github.com/vmware-tanzu/velero:main (66ac235) into oadp-dev#483
Merge https://github.com/vmware-tanzu/velero:main (66ac235) into oadp-dev#483oadp-rebasebot-app[bot] wants to merge 56 commits intoopenshift:oadp-devfrom
Conversation
…estore exec hooks. Signed-off-by: dongqingcc <dongqingcc@vmware.com>
Revert PR 6105. Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
…-PR9366 Add e2e test case for PR 9366
…mware-tanzu#9581) * Fix DBR stuck when CSI snapshot no longer exists in cloud provider During backup deletion, VolumeSnapshotContentDeleteItemAction creates a new VSC with the snapshot handle from the backup and polls for readiness. If the underlying snapshot no longer exists (e.g., deleted externally), the CSI driver reports Status.Error but checkVSCReadiness() only checks ReadyToUse, causing it to poll for the full 10-minute timeout instead of failing fast. Additionally, the newly created VSC is never cleaned up on failure, leaving orphaned resources in the cluster. This commit: - Adds Status.Error detection in checkVSCReadiness() to fail immediately on permanent CSI driver errors (e.g., InvalidSnapshot.NotFound) - Cleans up the dangling VSC when readiness polling fails Fixes vmware-tanzu#9579 Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com> * Add changelog for PR vmware-tanzu#9581 Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com> * Fix typo in pod_volume_test.go: colume -> volume Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com> --------- Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
The itemOperationTimeout field was missing from the Schedule API type documentation even though it is supported in the Schedule CRD template. This led users to believe the field was not available per-schedule. Fixes vmware-tanzu#9598 Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
…disable_search_in_site Disable Algolia docs search
Signed-off-by: lou <alex1988@outlook.com> Co-authored-by: lou <alex1988@outlook.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
1. Skip deleting the restore files from storage if the backup/BSL is not found 2. Allow deleting the restore files from storage even though the BSL is readonly Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
Add PSA audit and warn labels. Signed-off-by: Xun Jiang <jxun@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
(cherry picked from commit 8e01d1b) Signed-off-by: David Zaninovic <dzaninovic@catalogicsoftware.com>
Replace the base image with paketobuildpacks image Fixes vmware-tanzu#6851 Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Use informer cache with dynamic client for Get calls on restore When enabled, also make the Get call before create. Add server and install parameter to allow disabling this feature, but enable by default Signed-off-by: Scott Seago <sseago@redhat.com>
When creating resources with generateName, apimachinery does not guarantee uniqueness when it appends the random suffix to the generateName stub, so if it fails with already exists error, we need to retry. Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Sebastian Glab <sglab@catalogicsoftware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
PVC block mode backup and restore introduced some OS specific system calls. Those calls are not available for Windows, so add both non Windows version and Windows version code, and return error for block mode on the Windows platform. Signed-off-by: Xun Jiang <jxun@vmware.com>
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
Signed-off-by: allenxu404 <qix2@vmware.com>
add UBI dockerfiles Use numeric user for velero-restic-restore-helper Enable multiarch builds (openshift#135) Use arm64-graviton2 for arm builds (openshift#137) Add required keys for arm builds (openshift#139) Update Travis build job to work w/o changes on new branches Use a full VM for arm Use numeric non-root user for nonroot SCC compatibility
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Mateus Oliveira <msouzaol@redhat.com>
* fix: ARM images Signed-off-by: Mateus Oliveira <msouzaol@redhat.com> * fixup! fix: ARM images Signed-off-by: Mateus Oliveira <msouzaol@redhat.com> --------- Signed-off-by: Mateus Oliveira <msouzaol@redhat.com>
…#336) Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
…openshift#334) (openshift#338) add missing unit test for kopia hashing algo (openshift#337) Introduction of downstream only option to override Kopia default: - hashing algorithm - splitting algorithm - encryption algorithm With introduction of 3 environment variables it is possible to override Kopia algorithms used by Velero: KOPIA_HASHING_ALGORITHM KOPIA_SPLITTER_ALGORITHM KOPIA_ENCRYPTION_ALGORITHM If the env algorithms are not set or they are not within Kopia SupportedAlgorithms, the default algorithm will be used. This behavior is consistent with current behavior without this change. Signed-off-by: Michal Pryc <mpryc@redhat.com> Signed-off-by: Shubham Pampattiwar <shubhampampattiwar7@gmail.com>
The rework of Makefile to make it more readable and inclusion of lint as a target as well extract golangci-lint version from the upstream Dockerfile, so we test in PROW or locally on the same version as upstream. Signed-off-by: Michal Pryc <mpryc@redhat.com>
This fixes the PR openshift#334 where one additional line was in the code. This was not exposed previously as we did not had downstream CI Lint jobs. Signed-off-by: Michal Pryc <mpryc@redhat.com>
* run oadp-operator e2e test from the velero repo execute openshift/oadp-operator e2e tests directly against the velero repo locally or via prow ci Signed-off-by: Wesley Hayutin <weshayutin@gmail.com> * update variable names, add a cleanup * make sure env variable overrides default velero_image Signed-off-by: Wesley Hayutin <weshayutin@gmail.com> * add options to build, push, and only test Signed-off-by: Wesley Hayutin <weshayutin@gmail.com> * add arch to name Signed-off-by: Wesley Hayutin <weshayutin@gmail.com> * remove duplicated clean/rm operator checkout * simplify by dropping export var and use a oneliner Co-authored-by: Tiger Kaovilai <passawit.kaovilai@gmail.com> * drop export and use oneliner Co-authored-by: Tiger Kaovilai <passawit.kaovilai@gmail.com> * just in case, allow oadp to be deployed from makefile Signed-off-by: Wesley Hayutin <weshayutin@gmail.com> * Update Makefile.prow Co-authored-by: Tiger Kaovilai <passawit.kaovilai@gmail.com> --------- Signed-off-by: Wesley Hayutin <weshayutin@gmail.com> Co-authored-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
…t#436) Signed-off-by: Scott Seago <sseago@redhat.com>
Fixes linting error. Signed-off-by: oadp-team-rebase-bot <oadp-maintainers@redhat.com>
Fix golangci-lint version extraction and disable concat-loop check. Signed-off-by: Michal Pryc <mpryc@redhat.com>
WalkthroughThe pull request addresses a CSI snapshot cleanup issue by adding error handling and fast-fail logic to VolumeSnapshotContent processing, includes corresponding unit and end-to-end tests, removes DocSearch integration from documentation, adds itemOperationTimeout field documentation, and updates the changelog. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 1 | ❌ 4❌ Failed checks (4 warnings)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Hi @oadp-rebasebot-app[bot]. Thanks for your PR. I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
internal/delete/actions/csi/volumesnapshotcontent_action_test.go (2)
97-109: Test coverage could be more explicit about cleanup verification.The test case validates that an error is returned when CSI reports an error, but it doesn't explicitly verify that the dangling VSC was cleaned up (deleted). Consider adding an assertion to check that the VSC no longer exists in the fake client after
Executereturns, to fully validate the cleanup behavior described in the test name.💡 Suggested enhancement to verify cleanup
After
p.Execute(...)returns, you could add:// Verify VSC was cleaned up vscList := &snapshotv1api.VolumeSnapshotContentList{} require.NoError(t, crClient.List(context.Background(), vscList)) require.Empty(t, vscList.Items, "Expected VSC to be cleaned up")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/delete/actions/csi/volumesnapshotcontent_action_test.go` around lines 97 - 109, The test should explicitly verify that the dangling VolumeSnapshotContent was deleted after Execute; after calling p.Execute(...) use the test client (crClient) to List VolumeSnapshotContent objects (snapshotv1api.VolumeSnapshotContentList) and assert no items remain by calling require.NoError for the List result and require.Empty on the list.Items (referencing the existing vsc variable and the Execute call) so the test confirms cleanup behavior rather than only checking expectErr.
242-248: Consider reusing existing pointer utilities.The codebase already has
pkg/util/boolptr(used in the main implementation file). For consistency, consider reusing that package forboolPtr. ForstringPtr, this local helper is acceptable since there doesn't appear to be a corresponding utility in the codebase.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/delete/actions/csi/volumesnapshotcontent_action_test.go` around lines 242 - 248, The local test helper boolPtr should be replaced with the existing utility to ensure consistency: remove the local boolPtr function and use the package pkg/util/boolptr (used in the implementation) wherever boolPtr() is referenced in this test (e.g., in volumesnapshotcontent_action_test.go); keep the local stringPtr helper as-is since no shared string pointer utility exists. Ensure the import for the util boolptr package is added and update references from boolPtr(...) to the util's API.test/e2e/basic/restore_exec_hooks.go (1)
51-54: Unusual format string for namespace numbering.The format
%00000dis unusual and likely not producing the intended result. The sequence00000means the minimum width is 0 (the rightmost digit), so there's no zero-padding. If the intent is 5-digit zero-padded numbers, use%05dinstead.💡 Suggested fix
for nsNum := 0; nsNum < r.NamespacesTotal; nsNum++ { - createNSName := fmt.Sprintf("%s-%00000d", r.CaseBaseName, nsNum) + createNSName := fmt.Sprintf("%s-%05d", r.CaseBaseName, nsNum) *r.NSIncluded = append(*r.NSIncluded, createNSName) }Note: This also appears at lines 78 and 142 in
CreateResources()andVerify().🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/basic/restore_exec_hooks.go` around lines 51 - 54, The namespace name formatting in the loop that builds createNSName uses the incorrect format string "%00000d" which doesn't produce zero-padded numbers; change it to "%05d" in the loop that constructs createNSName (inside CreateResources()/the namespace-building loop) and update the other occurrences of the same format (the instances referenced in Verify() and the second CreateResources() occurrence) to use "%05d" so namespace numbers are zero-padded to 5 digits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@site/content/docs/main/api-types/schedule.md`:
- Around line 66-69: Update the grammatical errors in the documentation values:
change "The default value is 4 hour." to "The default value is 4 hours." for the
itemOperationTimeout entry (referenced by the symbol itemOperationTimeout) and
also change "10 minute" to "10 minutes" in the related schedule/config entry
mentioned earlier (ensure the corresponding field name remains correct in the
same doc), keeping plural "minutes" and "hours" for the default value
descriptions.
---
Nitpick comments:
In `@internal/delete/actions/csi/volumesnapshotcontent_action_test.go`:
- Around line 97-109: The test should explicitly verify that the dangling
VolumeSnapshotContent was deleted after Execute; after calling p.Execute(...)
use the test client (crClient) to List VolumeSnapshotContent objects
(snapshotv1api.VolumeSnapshotContentList) and assert no items remain by calling
require.NoError for the List result and require.Empty on the list.Items
(referencing the existing vsc variable and the Execute call) so the test
confirms cleanup behavior rather than only checking expectErr.
- Around line 242-248: The local test helper boolPtr should be replaced with the
existing utility to ensure consistency: remove the local boolPtr function and
use the package pkg/util/boolptr (used in the implementation) wherever boolPtr()
is referenced in this test (e.g., in volumesnapshotcontent_action_test.go); keep
the local stringPtr helper as-is since no shared string pointer utility exists.
Ensure the import for the util boolptr package is added and update references
from boolPtr(...) to the util's API.
In `@test/e2e/basic/restore_exec_hooks.go`:
- Around line 51-54: The namespace name formatting in the loop that builds
createNSName uses the incorrect format string "%00000d" which doesn't produce
zero-padded numbers; change it to "%05d" in the loop that constructs
createNSName (inside CreateResources()/the namespace-building loop) and update
the other occurrences of the same format (the instances referenced in Verify()
and the second CreateResources() occurrence) to use "%05d" so namespace numbers
are zero-padded to 5 digits.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9dacb3df-e509-4af3-8614-195975c0171b
📒 Files selected for processing (9)
changelogs/unreleased/9581-shubham-pampattiwarinternal/delete/actions/csi/volumesnapshotcontent_action.gointernal/delete/actions/csi/volumesnapshotcontent_action_test.gosite/algolia-crawler.jsonsite/content/docs/main/api-types/schedule.mdsite/layouts/docs/docs.htmlsite/layouts/partials/head-docs.htmltest/e2e/basic/restore_exec_hooks.gotest/e2e/e2e_suite_test.go
💤 Files with no reviewable changes (3)
- site/layouts/docs/docs.html
- site/layouts/partials/head-docs.html
- site/algolia-crawler.json
| # ItemOperationTimeout specifies the time used to wait for | ||
| # asynchronous BackupItemAction operations | ||
| # The default value is 4 hour. | ||
| itemOperationTimeout: 4h |
There was a problem hiding this comment.
Fix grammatical error in documentation.
Line 68 contains a grammatical error: "The default value is 4 hour." should be "The default value is 4 hours." (plural).
Note: The same grammatical issue exists in line 65 ("10 minute" should be "10 minutes"). Consider fixing both for consistency.
📝 Proposed fix
# ItemOperationTimeout specifies the time used to wait for
# asynchronous BackupItemAction operations
- # The default value is 4 hour.
+ # The default value is 4 hours.
itemOperationTimeout: 4hIf fixing line 65 as well:
# CSISnapshotTimeout specifies the time used to wait for
# CSI VolumeSnapshot status turns to ReadyToUse during creation, before
- # returning error as timeout. The default value is 10 minute.
+ # returning error as timeout. The default value is 10 minutes.
csiSnapshotTimeout: 10m📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # ItemOperationTimeout specifies the time used to wait for | |
| # asynchronous BackupItemAction operations | |
| # The default value is 4 hour. | |
| itemOperationTimeout: 4h | |
| # ItemOperationTimeout specifies the time used to wait for | |
| # asynchronous BackupItemAction operations | |
| # The default value is 4 hours. | |
| itemOperationTimeout: 4h |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@site/content/docs/main/api-types/schedule.md` around lines 66 - 69, Update
the grammatical errors in the documentation values: change "The default value is
4 hour." to "The default value is 4 hours." for the itemOperationTimeout entry
(referenced by the symbol itemOperationTimeout) and also change "10 minute" to
"10 minutes" in the related schedule/config entry mentioned earlier (ensure the
corresponding field name remains correct in the same doc), keeping plural
"minutes" and "hours" for the default value descriptions.
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: oadp-rebasebot-app[bot], sseago The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Summary by CodeRabbit
Bug Fixes
Documentation
itemOperationTimeoutfield to Schedule API type documentation.Tests