-
Notifications
You must be signed in to change notification settings - Fork 4.8k
monitortest: add crdversionchecker #30603
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: damdo The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Scheduling required tests: |
|
@damdo: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
Risk analysis has seen new tests most likely introduced by this PR. New Test Risks for sha: 1758db2
New tests seen in this PR at sha: 1758db2
|
|
See: and the |
|
/assign @JoelSpeed |
| } | ||
|
|
||
| // CRDCondition captures the condition information for a CRD. | ||
| type CRDCondition struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't use a metav1 condition here?
| if err != nil { | ||
| return fmt.Errorf("unable to determine if cluster is MicroShift: %v", err) | ||
| } | ||
| if isMicroShift { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why don't we support microshift?
| beforeVersions := make(map[string]bool) | ||
| afterVersions := make(map[string]bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, if you're using this as just a set, map[string]struct{} is more idiomatic and more efficient
| // Rationale: When a new API version is introduced during an upgrade, existing | ||
| // data in etcd is still stored in the old format. Setting the new version as | ||
| // the storage version immediately would require a migration. The safe approach | ||
| // is to serve both versions but keep the old version as storage until all | ||
| // existing objects have been migrated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not correct. You have to change the storage version before you can run the storage version migration.
The reason we want the storage version to remain as the old version for 1 release is so that if you roll-back for any reason, the old schema can still decode objects in etcd. If you switched storage version immediately and then rolled back, any object written between the upgrade and rollback would no longer be decodeable
| // - The new version is marked as the storage version | ||
| // - The old version is no longer the storage version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one can be the storage version, so this sounds tautological
| return []*junitapi.JUnitTestCase{{ | ||
| Name: testName, | ||
| SkipMessage: &junitapi.SkipMessage{ | ||
| Message: "Missing CRD snapshots, cannot perform check", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We alway return a skip when not running during an upgrade? Is that what we want vs just outputting an empty JUnit? Not sure what the correct etiquette would be here
| afterStorage := getStorageVersion(afterCRD) | ||
|
|
||
| // Identify newly added versions | ||
| beforeVersionSet := make(map[string]bool) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: idiomatically should be map[string]struct{}
|
|
||
| // Check if a new version became the storage version | ||
| for _, newVersion := range newVersions { | ||
| if afterStorage == newVersion && beforeStorage != newVersion { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first part of this if statement is sufficient. If it's a new version and the storage matches the new version that's added, we can fail
Don't need to check the second half (how could the previous storage be a new version? I don't think the second check can ever be false)
Summary
This PR adds a new monitor test crd-version-checker that tracks CustomResourceDefinition (CRD) changes during cluster upgrades. The test captures CRD snapshots before and after an upgrade, generates a summary of changes, and validates that new API versions follow safe upgrade practices.
Motivation
When new API versions are introduced to CRDs during an upgrade, it's important that:
The storage version remains the previous version (not the new one) until data migration is complete
Both old and new versions are served to maintain compatibility
This test helps catch potential issues where a new API version is immediately set as the storage version, which could cause problems with existing data stored in etcd.
What does this include
Summary Generation
This test writes a JSON summary file (crd-version-summary.json) to storage for inspection
Computes differences between before/after snapshots:
Added CRDs
Removed CRDs
Changed CRDs (with details on added/removed versions and storage changes)
Validation Checks
This test defines the following checks:
Fails if a CRD has a new version added AND that version immediately becomes the storage version
This check ensures safe API version introduction practices