OCPBUGS-62626: only report Progressing=True when progressing towards new configuration #1264

flavianmissi · 2025-10-27T11:29:45Z

No description provided.

openshift-ci-robot · 2025-10-27T11:29:52Z

@flavianmissi: This pull request references Jira Issue OCPBUGS-62626, which is invalid:

expected the bug to target the "4.21.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci · 2025-10-27T11:30:46Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: flavianmissi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [flavianmissi]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

flavianmissi · 2025-10-27T12:11:16Z

/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

openshift-ci · 2025-10-27T12:11:25Z

@flavianmissi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0ac02f00-b32e-11f0-8ae7-8e4f57a461ba-0

flavianmissi · 2025-10-28T12:51:03Z

The payload job failed during setup.

/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

openshift-ci · 2025-10-28T12:51:11Z

@flavianmissi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c3b27b70-b3fc-11f0-8a25-dce41c0f7de4-0

flavianmissi · 2025-10-29T08:05:57Z

Looks like the tests covering the Progressing=True issue were merged yesterday. I think that's why they didn't show up in my latest payload run, so I'll have to try again.

/payload-job periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

openshift-ci · 2025-10-29T08:06:00Z

@flavianmissi: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/19d72d10-b49e-11f0-8428-5cc249faa40f-0

hongkailiu · 2025-10-29T23:50:36Z

Let us try this:

/payload-job-with-prs periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade openshift/origin#30438

openshift-ci · 2025-10-29T23:50:40Z

@hongkailiu: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

periodic-ci-openshift-release-master-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/11b2ee60-b522-11f0-9eef-fe936f49c238-0

hongkailiu · 2025-10-30T13:10:01Z

The result from the job #1264 (comment) is looking good:

$ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/openshift-origin-30438-openshift-cluster-image-registry-operator-1264-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade/1983682929573761024/artifacts/e2e-gcp-ovn-rt-upgrade/openshift-e2e-test/artifacts/junit/e2e-monitor-tests__20251030-013050.xml | rg 'clusteroperator/image-registry should stay Progressing=False' -A1 -B1
    <testcase name="[Monitor:legacy-cvo-invariants][bz-Etcd] clusteroperator/etcd should stay Progressing=False while MCO is Progressing=True" time="0"></testcase>
    <testcase name="[Monitor:legacy-cvo-invariants][bz-Image Registry] clusteroperator/image-registry should stay Progressing=False while MCO is Progressing=True" time="0"></testcase>
    <testcase name="[Monitor:legacy-cvo-invariants][bz-Routing] clusteroperator/ingress should stay Progressing=False while MCO is Progressing=True" time="4153.084">

And https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/openshift-origin-30438-openshift-cluster-image-registry-operator-1264-ci-4.21-upgrade-from-stable-4.20-e2e-gcp-ovn-rt-upgrade/1983682929573761024/artifacts/e2e-gcp-ovn-rt-upgrade/openshift-e2e-test/artifacts/junit/e2e-timelines_spyglass_20251030-013050.html
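The check done with `curl` and `rg` above can also be scripted. The following is a minimal sketch, not part of this PR: it assumes the JUnit file has a single root element containing `testcase` children and that a failing case carries a `failure` child element (the usual JUnit convention). The names `junitSuite`, `junitCase`, `junitFailure`, `failedCases`, and the sample report are all hypothetical.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// junitSuite is a hypothetical, minimal view of a JUnit report: only the
// testcase children of the root element are decoded.
type junitSuite struct {
	TestCases []junitCase `xml:"testcase"`
}

// junitCase keeps the test name and, if present, its failure child.
type junitCase struct {
	Name    string        `xml:"name,attr"`
	Failure *junitFailure `xml:"failure"`
}

// junitFailure holds the text of a failure element (assumed element name).
type junitFailure struct {
	Text string `xml:",chardata"`
}

// failedCases returns the names of test cases whose name contains nameSubstr
// and that have a failure child element.
func failedCases(report []byte, nameSubstr string) ([]string, error) {
	var suite junitSuite
	if err := xml.Unmarshal(report, &suite); err != nil {
		return nil, err
	}
	var failed []string
	for _, tc := range suite.TestCases {
		if strings.Contains(tc.Name, nameSubstr) && tc.Failure != nil {
			failed = append(failed, tc.Name)
		}
	}
	return failed, nil
}

// sample is made-up data for illustration, not output from the CI job.
const sample = `<testsuite>
  <testcase name="clusteroperator/example-ok should stay Progressing=False" time="0"></testcase>
  <testcase name="clusteroperator/example-bad should stay Progressing=False" time="1"><failure>unexpected transition</failure></testcase>
</testsuite>`

func main() {
	failed, err := failedCases([]byte(sample), "should stay Progressing=False")
	if err != nil {
		panic(err)
	}
	fmt.Println("failed cases:", failed)
}
```

An empty result for the `clusteroperator/image-registry` test name would correspond to the passing `<testcase ...></testcase>` entry shown in the comment above.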

flavianmissi · 2025-11-27T13:14:45Z

/retest

flavianmissi · 2025-11-27T13:16:29Z

/retitle OCPBUGS-62626: only report Progressing=True when progressing towards new configuration

flavianmissi · 2025-11-27T13:17:08Z

/jira refresh

openshift-ci-robot · 2025-11-27T13:17:16Z

@flavianmissi: This pull request references Jira Issue OCPBUGS-62626, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug

bug is open, matching expected state (open)
bug target version (4.21.0) matches configured target version for branch (4.21.0)
bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (xiuwang+1@redhat.com), skipping review request.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

…duling

The OperatorProgressing condition API definition states that operators must not report Progressing when reconciling to a previously known state, such as when nodes are rebooted and pods are restarted, or when daemonsets adjust to node reboot or cluster scale-up events.

The NodeCADaemonController was violating this API by reporting Progressing=True with reason "Unavailable" whenever ds.Status.NumberUnavailable > 0, which occurs during normal pod rescheduling operations (node reboots, cluster scale-up, etc.). This caused the IR operator to switch between Progressing=True and Progressing=False during machine-config upgrade windows, generating several unexpected state transitions in CI.

This commit removes the logic that reports Progressing=True based on NumberUnavailable. Now the controller only reports Progressing=True when Generation != ObservedGeneration, which indicates an actual daemonset update is in progress, not just a pod rescheduling.

Co-Authored-By: Claude <noreply@anthropic.com>
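The daemonset rule described in this commit message can be sketched as follows. This is an illustration only, not the code from the commit: `daemonSetView` is a hypothetical stand-in for the few DaemonSet fields the message mentions, and the reason strings `"Progressing"` and `"AsExpected"` are placeholders rather than the operator's actual reasons.

```go
package main

import "fmt"

// daemonSetView is a hypothetical stand-in for the DaemonSet fields named in
// the commit message (metadata.generation, status.observedGeneration,
// status.numberUnavailable). It is not the operator's own type.
type daemonSetView struct {
	Generation         int64
	ObservedGeneration int64
	NumberUnavailable  int32
}

// daemonSetProgressing reports Progressing=True only while a new generation
// has not been observed yet. NumberUnavailable is deliberately ignored, so
// pod rescheduling after a node reboot does not flip the condition.
// The reason strings are placeholders (assumption).
func daemonSetProgressing(ds daemonSetView) (bool, string) {
	if ds.Generation != ds.ObservedGeneration {
		return true, "Progressing"
	}
	return false, "AsExpected"
}

func main() {
	// Same generation, pods temporarily unavailable: reconciling to known state.
	rescheduling := daemonSetView{Generation: 3, ObservedGeneration: 3, NumberUnavailable: 2}
	// Generation bumped but not yet observed: an actual update is in flight.
	rollout := daemonSetView{Generation: 4, ObservedGeneration: 3, NumberUnavailable: 0}

	for _, ds := range []daemonSetView{rescheduling, rollout} {
		progressing, reason := daemonSetProgressing(ds)
		fmt.Printf("unavailable=%d progressing=%t reason=%s\n", ds.NumberUnavailable, progressing, reason)
	}
}
```

The point of the sketch is that unavailability alone never sets Progressing=True; only a generation mismatch does.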

The OperatorProgressing condition API definition states that operators must not report Progressing when reconciling to a previously known state, such as when nodes are rebooted and pods are restarted, or when daemonsets adjust to node reboot or cluster scale-up events.

The IR operator was violating the OperatorProgressing condition semantics by reporting Progressing=True whenever the image registry Deployment was not complete, even when just reconciling to a previously known state.

This commit adds a check for deploy.Generation != deploy.Status.ObservedGeneration before reporting DeploymentNotCompleted. This ensures we only report Progressing=True during actual Deployment updates (when Generation has been bumped but not yet observed), not during normal reconciliation events like:

* pod rescheduling after node reboots
* pods restarting after crashes
* replicas scaling up/down to match existing desired count

Co-Authored-By: Claude <noreply@anthropic.com>
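A minimal sketch of the Deployment check described in this commit message, under stated assumptions: `deploymentView` is a hypothetical stand-in for the relevant Deployment fields, "complete" is assumed here to mean that updated and available replicas both match the desired count (the operator's real completeness test may differ), and the `"Ready"` reason is a placeholder. Only the `DeploymentNotCompleted` reason comes from the commit message.

```go
package main

import "fmt"

// deploymentView is a hypothetical stand-in for the Deployment fields used
// in this sketch; it is not the operator's own type.
type deploymentView struct {
	Generation         int64
	ObservedGeneration int64
	DesiredReplicas    int32
	UpdatedReplicas    int32
	AvailableReplicas  int32
}

// complete is an assumed definition of "Deployment complete": every desired
// replica is both updated and available.
func complete(d deploymentView) bool {
	return d.UpdatedReplicas == d.DesiredReplicas && d.AvailableReplicas == d.DesiredReplicas
}

// deploymentProgressing reports DeploymentNotCompleted only when the
// Deployment is incomplete AND its generation has not been observed yet.
// An incomplete Deployment at an already observed generation is treated as
// reconciling to a previously known state.
func deploymentProgressing(d deploymentView) (bool, string) {
	if !complete(d) && d.Generation != d.ObservedGeneration {
		return true, "DeploymentNotCompleted"
	}
	return false, "Ready" // placeholder reason (assumption)
}

func main() {
	// A pod was evicted by a node reboot: incomplete, but no new generation.
	afterReboot := deploymentView{Generation: 5, ObservedGeneration: 5, DesiredReplicas: 2, UpdatedReplicas: 2, AvailableReplicas: 1}
	// The spec changed and the controller has not observed it yet.
	specChange := deploymentView{Generation: 6, ObservedGeneration: 5, DesiredReplicas: 2, UpdatedReplicas: 0, AvailableReplicas: 2}

	for _, d := range []deploymentView{afterReboot, specChange} {
		progressing, reason := deploymentProgressing(d)
		fmt.Printf("generation=%d observed=%d progressing=%t reason=%s\n", d.Generation, d.ObservedGeneration, progressing, reason)
	}
}
```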

openshift-ci · 2025-12-05T19:53:20Z

@flavianmissi: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-operator | `30eb3f3` | link | true | `/test e2e-aws-operator` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 27, 2025

openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Oct 27, 2025

openshift-ci bot requested a review from ricardomaraschini October 27, 2025 11:30

openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 27, 2025

openshift-ci bot mentioned this pull request Oct 29, 2025

TRT-2386: Revert "Revert "Merge pull request #30296 from hongkailiu/OTA-1637-reboot"" openshift/origin#30438

Merged

openshift-ci bot changed the title ~~WIP OCPBUGS-62626: only report Progressing=True when progressing towards new configuration~~ OCPBUGS-62626: only report Progressing=True when progressing towards new configuration Nov 27, 2025

openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 27, 2025

openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Nov 27, 2025

openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Nov 27, 2025

flavianmissi force-pushed the OCPBUGS-62626 branch from e869359 to 5aeceae Compare December 1, 2025 15:41

flavianmissi and others added 2 commits December 5, 2025 18:00

flavianmissi force-pushed the OCPBUGS-62626 branch from 5aeceae to 30eb3f3 Compare December 5, 2025 17:00
