Skip to content

OCPEDGE-2346: feat(manager): read TLS config from API server#2105

Open
qJkee wants to merge 1 commit intoopenshift:mainfrom
qJkee:OCPEDGE-2346
Open

OCPEDGE-2346: feat(manager): read TLS config from API server#2105
qJkee wants to merge 1 commit intoopenshift:mainfrom
qJkee:OCPEDGE-2346

Conversation

@qJkee
Copy link
Contributor

@qJkee qJkee commented Mar 9, 2026

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Summary by CodeRabbit

  • New Features

    • Applies cluster TLS profile to server endpoints and enables runtime TLS profile updates with graceful reloads.
  • Chores

    • Updated operator bundle payloads and CRD metadata annotations to newer tooling versions.
  • Dependencies

    • Bumped numerous Go, Kubernetes, and platform-related modules to aligned newer versions.
  • Tests

    • Tests updated to use standard library slice utilities for comparisons.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 9, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 9, 2026

@qJkee: This pull request references OCPEDGE-2346 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot requested review from jaypoulz and jeff-roche March 9, 2026 12:49
@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 9, 2026
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: qJkee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 9, 2026
@qJkee
Copy link
Contributor Author

qJkee commented Mar 9, 2026

/retest

1 similar comment
@qJkee
Copy link
Contributor Author

qJkee commented Mar 10, 2026

/retest

@coderabbitai
Copy link

coderabbitai bot commented Mar 11, 2026

Important

Review skipped

Auto reviews are limited based on label configuration.

🚫 Review skipped — only excluded labels are configured. (1)
  • do-not-merge/work-in-progress

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Pro

Run ID: fde2071d-e69e-44a9-8b8c-4244b9be67a3

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Fetch API server TLSProfile at startup, build a tls.Config and inject it into manager and webhook TLS options, add a TLS-profile watcher that cancels the manager context on changes to trigger a restart, bump dependencies, and update CRD annotations and a test import.

Changes

Cohort / File(s) Summary
TLS profile & wiring
cmd/operator/operator.go, cmd/vgmanager/vgmanager.go
Add OpenShift TLSProfile import and controller-runtime-common TLS helper; fetch TLSProfile at startup, build tls.Config (log unsupported ciphers), apply tls.Config to Manager and webhook TLS options.
TLS watcher & shutdown flow
cmd/operator/operator.go, cmd/vgmanager/vgmanager.go
Add TLS profile watcher controller registered with the manager that logs changes and calls cancel() to stop the manager; introduce exported ErrTLSProfileModified in vgmanager and propagate descriptive errors when watcher/setup fails.
Dependency updates
go.mod
Bump multiple direct/indirect modules (OpenShift APIs, controller-runtime-common, controller-runtime, Prometheus, golang.org/x/*, tooling); update replace for topolvm mapping.
CRD manifest annotations
bundle/manifests/.../*.yaml, config/crd/bases/.../*.yaml
Update metadata.annotations.controller-gen.kubebuilder.io/version from v0.17.3 to v0.19.0 across CRD manifests (metadata-only change).
Operator bundle payloads
catalog/lvms-operator/v0.0.1.yaml
Replace base64-encoded payloads in several olm.bundle.object.value.data entries; no other structural changes.
Tests
internal/controllers/vgmanager/validatelvs_test.go
Switch import from k8s.io/utils/strings/slices to standard library slices (usage unchanged).

Sequence Diagram(s)

sequenceDiagram
    participant Supervisor as Process\nSupervisor
    participant Manager as Manager
    participant APIServer as API\nServer
    participant TLSWatcher as TLS\nProfile Watcher

    Supervisor->>Manager: Start manager process
    Manager->>APIServer: GET TLSProfile (startup)
    APIServer-->>Manager: TLSProfile
    Manager->>Manager: Build tls.Config, set Manager & Webhook TLS options
    Supervisor->>Manager: Launch manager with TLS config

    Note over TLSWatcher,APIServer: Runtime monitoring
    TLSWatcher->>APIServer: Watch TLSProfile resource
    APIServer-->>TLSWatcher: Notify on TLSProfile change
    TLSWatcher->>Manager: Log change, call cancel() (ErrTLSProfileModified)
    Manager->>Supervisor: Exit/propagate error to allow restart/reload
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: reading TLS config from the API server and applying it, which is the core feature across modified files.
Stable And Deterministic Test Names ✅ Passed Modified test file uses traditional Go testing with static function names, not Ginkgo patterns. No dynamic values appear in test titles.
Test Structure And Quality ✅ Passed PR introduces no new Ginkgo test files or blocks. Single test file modification is trivial import change. Existing tests follow sound patterns. Test coverage increased from 49.16% to 52.21%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 11, 2026

@qJkee: This pull request references OCPEDGE-2346 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Summary by CodeRabbit

Release Notes

  • New Features

  • Added OpenShift TLS profile integration to enable dynamic TLS configuration updates.

  • Implemented automatic TLS configuration reload when profiles change, improving security management without manual restarts.

  • Enhanced TLS handling with HTTP/1.1 support across core components.

  • Dependencies

  • Updated dependencies to align with OpenShift and Kubernetes ecosystem.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@codecov-commenter
Copy link

codecov-commenter commented Mar 11, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.11%. Comparing base (b5e0e11) to head (a60c248).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #2105      +/-   ##
==========================================
+ Coverage   51.73%   52.11%   +0.38%     
==========================================
  Files          52       52              
  Lines        3901     3901              
==========================================
+ Hits         2018     2033      +15     
+ Misses       1715     1701      -14     
+ Partials      168      167       -1     
Files with missing lines Coverage Δ
internal/controllers/lvmcluster/controller.go 57.08% <ø> (+2.98%) ⬆️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@cmd/vgmanager/vgmanager.go`:
- Around line 197-205: The TLS watcher currently calls cancel() in
SecurityProfileWatcher.OnProfileChange which triggers an abnormal exit; define a
new sentinel error ErrTLSProfileModified, replace the cancel() call with
cancelWithCause(ErrTLSProfileModified) inside the OnProfileChange closure, and
update the existing shutdown/error-handling logic that checks for
ErrConfigModified to also check for ErrTLSProfileModified so TLS reloads exit
via the clean reload path (status 0) rather than the abnormal path.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 671ed331-b528-4e67-84a7-1dc48d4d0fd0

📥 Commits

Reviewing files that changed from the base of the PR and between 52e0d08 and 7fe89a7.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (3)
  • cmd/operator/operator.go
  • cmd/vgmanager/vgmanager.go
  • go.mod

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 11, 2026

@qJkee: This pull request references OCPEDGE-2346 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Summary by CodeRabbit

  • New Features

  • Integrates cluster TLS profile retrieval and applies profile-based TLS configuration to manager and webhook servers.

  • Adds a runtime watcher that detects TLS profile changes and triggers a controlled reload to apply updated TLS settings.

  • Ensures HTTP/1.1 compatibility while using profile-based TLS configs.

  • Dependencies

  • Upgraded multiple Go and Kubernetes/OpenShift dependencies to newer, aligned versions.

  • Tests

  • Updated tests to use standard library utilities for slice comparisons.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
cmd/vgmanager/vgmanager.go (1)

303-312: ⚠️ Potential issue | 🟠 Major

Missing ErrTLSProfileModified check causes abnormal exit on TLS profile changes.

The TLS watcher correctly calls cancelWithCause(ErrTLSProfileModified) at line 205, but the error handling here doesn't check for this error. When a TLS profile change occurs:

  1. cancelWithCause(ErrTLSProfileModified) is called
  2. mgr.Start() returns
  3. The code checks ErrConfigModified (line 303) - no match
  4. The code checks ErrPluginRegistrationFailed (line 306) - no match
  5. Falls through to ctx.Err() check (line 309) - matches because context is cancelled
  6. Logs "exiting abnormally" and calls os.Exit(1)

This results in an abnormal exit instead of the intended clean restart.

🐛 Proposed fix
 	if errors.Is(context.Cause(ctx), ErrConfigModified) {
 		opts.SetupLog.Info("exiting pod due to modified configuration")
 		os.Exit(0)
+	} else if errors.Is(context.Cause(ctx), ErrTLSProfileModified) {
+		opts.SetupLog.Info("exiting pod due to modified TLS profile")
+		os.Exit(0)
 	} else if errors.Is(context.Cause(ctx), icsi.ErrPluginRegistrationFailed) {
 		opts.SetupLog.Error(context.Cause(ctx), "exiting pod due to failed plugin registration")
 		os.Exit(0)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/vgmanager/vgmanager.go` around lines 303 - 312, Add an explicit check for
ErrTLSProfileModified in the shutdown handling so TLS-driven cancellations
perform a clean restart: after the existing errors.Is(context.Cause(ctx),
ErrConfigModified) branch, add a branch that checks
errors.Is(context.Cause(ctx), ErrTLSProfileModified) and log a descriptive info
message via opts.SetupLog (similar style to the ErrConfigModified branch) and
call os.Exit(0); this aligns the handling of
cancelWithCause(ErrTLSProfileModified) from the TLS watcher with the intended
clean exit instead of falling through to the ctx.Err() abnormal-exit path.
go.mod (1)

3-3: ⚠️ Potential issue | 🟠 Major

Update Go version to a valid release.

Go 1.24.11 does not exist. The official Go releases list shows the latest available versions are go1.26.1 and go1.25.8. Update go.mod line 3 to specify a valid Go version.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` at line 3, The go.mod declares an invalid Go version ("go 1.24.11");
update the module's Go version to a valid release tag (for example "go 1.26" or
"go 1.26.1") by replacing the incorrect version token in go.mod so the file
specifies a real supported Go release.
🧹 Nitpick comments (1)
cmd/operator/operator.go (1)

244-257: Consider using cancelWithCause for consistency with vgmanager.

The TLS watcher in vgmanager.go uses cancelWithCause(ErrTLSProfileModified) to enable specific handling of TLS profile changes (logging and clean exit via os.Exit(0)). Here, using plain cancel() will trigger a shutdown, but the exit path differs:

  • In vgmanager.go: TLS profile changes are explicitly logged and exit cleanly.
  • In operator.go: The function returns nil after mgr.Start(), which should result in a clean exit, but without explicit logging indicating the reason.

This is acceptable if the pod restart mechanism handles both cases identically, but for observability and consistency, consider aligning with the vgmanager pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/operator/operator.go` around lines 244 - 257, Replace the plain context
cancel in the TLS watcher callback with the cancel-with-cause pattern used in
vgmanager: inside the OnProfileChange closure of SecurityProfileWatcher (where
cancel() is called), invoke cancelWithCause(ErrTLSProfileModified) instead so
the shutdown path includes the specific cancellation cause and matches
vgmanager's logging/exit handling; ensure ErrTLSProfileModified and
cancelWithCause are imported/available in this scope or propagate them into
operator.go if needed.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cmd/vgmanager/vgmanager.go`:
- Around line 303-312: Add an explicit check for ErrTLSProfileModified in the
shutdown handling so TLS-driven cancellations perform a clean restart: after the
existing errors.Is(context.Cause(ctx), ErrConfigModified) branch, add a branch
that checks errors.Is(context.Cause(ctx), ErrTLSProfileModified) and log a
descriptive info message via opts.SetupLog (similar style to the
ErrConfigModified branch) and call os.Exit(0); this aligns the handling of
cancelWithCause(ErrTLSProfileModified) from the TLS watcher with the intended
clean exit instead of falling through to the ctx.Err() abnormal-exit path.

In `@go.mod`:
- Line 3: The go.mod declares an invalid Go version ("go 1.24.11"); update the
module's Go version to a valid release tag (for example "go 1.26" or "go
1.26.1") by replacing the incorrect version token in go.mod so the file
specifies a real supported Go release.

---

Nitpick comments:
In `@cmd/operator/operator.go`:
- Around line 244-257: Replace the plain context cancel in the TLS watcher
callback with the cancel-with-cause pattern used in vgmanager: inside the
OnProfileChange closure of SecurityProfileWatcher (where cancel() is called),
invoke cancelWithCause(ErrTLSProfileModified) instead so the shutdown path
includes the specific cancellation cause and matches vgmanager's logging/exit
handling; ensure ErrTLSProfileModified and cancelWithCause are
imported/available in this scope or propagate them into operator.go if needed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 936b55ad-2e88-4565-afd9-862ec05ae498

📥 Commits

Reviewing files that changed from the base of the PR and between 7fe89a7 and 489fb6a.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • cmd/operator/operator.go
  • cmd/vgmanager/vgmanager.go
  • go.mod
  • internal/controllers/vgmanager/validatelvs_test.go

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 11, 2026

@qJkee: This pull request references OCPEDGE-2346 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Summary by CodeRabbit

  • New Features

  • Adds cluster TLS profile retrieval and applies profile-based TLS to server endpoints, ensuring HTTP/1.1 compatibility and enabling runtime TLS updates.

  • Introduces a runtime watcher that detects TLS profile changes and triggers a controlled reload to apply updated TLS settings.

  • Dependencies

  • Upgraded multiple Go, Kubernetes, and OpenShift-related dependencies to newer aligned versions.

  • Tests

  • Updated tests to use standard library utilities for slice comparisons.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
cmd/vgmanager/vgmanager.go (2)

303-315: ⚠️ Potential issue | 🟡 Minor

Add explicit handling for normal signal-driven shutdown.

Normal SIGTERM or SIGINT from signal.NotifyContext falls through to the generic error branch, logging as "exiting abnormally" and exiting with code 1. This should be a clean shutdown.

Suggested fix
  cause := context.Cause(ctx)
  if errors.Is(cause, ErrConfigModified) {
 		opts.SetupLog.Info("exiting pod due to modified configuration")
 		os.Exit(0)
 	} else if errors.Is(cause, ErrTLSProfileModified) {
 		opts.SetupLog.Info("exiting pod due to modified TLS profile")
 		os.Exit(0)
 	} else if errors.Is(cause, icsi.ErrPluginRegistrationFailed) {
 		opts.SetupLog.Error(cause, "exiting pod due to failed plugin registration")
 		os.Exit(0)
+	} else if errors.Is(cause, context.Canceled) {
+		opts.SetupLog.Info("received shutdown signal")
+		return nil
 	} else if err := ctx.Err(); err != nil {
 		opts.SetupLog.Error(err, "exiting abnormally")
 		os.Exit(1)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/vgmanager/vgmanager.go` around lines 303 - 315, The shutdown handler
treats signal-driven context cancellation as an abnormal exit; update the
branching after calling context.Cause(ctx) to explicitly handle normal signal
shutdown by checking if ctx.Err() is context.Canceled (and optionally
context.DeadlineExceeded if desired) and call opts.SetupLog.Info with a clear
"exiting due to signal" message and os.Exit(0); keep the existing branches for
ErrConfigModified, ErrTLSProfileModified, icsi.ErrPluginRegistrationFailed, and
the fallback error branch that exits with code 1.

148-211: ⚠️ Potential issue | 🔴 Critical

Add RBAC rule for config.openshift.io/apiservers to both vgmanager and operator service accounts.

This PR adds FetchAPIServerTLSProfile (startup read) and SecurityProfileWatcher (runtime watch) for the cluster-scoped APIServer resource to both vgmanager and operator. The respective RBAC ClusterRoles/Roles lack the required permissions (get,list,watch on apiservers.config.openshift.io), causing startup and runtime failures.

  • Add to config/rbac/vg_manager_clusterrole.yaml:
    - apiGroups:
      - config.openshift.io
      resources:
      - apiservers
      verbs:
      - get
      - list
      - watch
  • Add same rule to operator RBAC (config/rbac/role.yaml or applicable operator ClusterRole if separate).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/vgmanager/vgmanager.go` around lines 148 - 211, The vgmanager and
operator lack RBAC for the cluster-scoped Apiserver resource used by
FetchAPIServerTLSProfile (startup read) and SecurityProfileWatcher (runtime
watch); add a rule granting get, list, watch on apiGroups:
["config.openshift.io"] resources: ["apiservers"] to both the vgmanager
ClusterRole (config/rbac/vg_manager_clusterrole.yaml) and the operator's RBAC
(role/ClusterRole used by the operator service account) so the setupClient and
TLS watcher can successfully read/watch the Apiserver resource at startup and
runtime.
cmd/operator/operator.go (1)

191-257: ⚠️ Potential issue | 🔴 Critical

Add RBAC for config.openshift.io/apiservers access.

Both cmd/operator/operator.go and cmd/vgmanager/vgmanager.go call FetchAPIServerTLSProfile (requires get) and register SecurityProfileWatcher (requires list/watch) on the cluster-scoped APIServer resource. The operator's manager-role ClusterRole in config/rbac/role.yaml lacks this rule and must be updated, or the startup will fail with 403 Forbidden at the FetchAPIServerTLSProfile call.

Add the following rule to config/rbac/role.yaml:

- apiGroups:
  - config.openshift.io
  resources:
  - apiservers
  verbs:
  - get
  - list
  - watch
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/operator/operator.go` around lines 191 - 257, The operator currently
calls FetchAPIServerTLSProfile and registers a SecurityProfileWatcher (see
FetchAPIServerTLSProfile, SecurityProfileWatcher, tlsWatcherController) against
the cluster-scoped APIServer resource but the manager-role ClusterRole in
config/rbac/role.yaml lacks permissions; update config/rbac/role.yaml to add a
rule that grants get, list, and watch on the config.openshift.io apiservers
resource so FetchAPIServerTLSProfile and tlsWatcherController.SetupWithManager
can succeed without 403 errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@cmd/operator/operator.go`:
- Around line 191-257: The operator currently calls FetchAPIServerTLSProfile and
registers a SecurityProfileWatcher (see FetchAPIServerTLSProfile,
SecurityProfileWatcher, tlsWatcherController) against the cluster-scoped
APIServer resource but the manager-role ClusterRole in config/rbac/role.yaml
lacks permissions; update config/rbac/role.yaml to add a rule that grants get,
list, and watch on the config.openshift.io apiservers resource so
FetchAPIServerTLSProfile and tlsWatcherController.SetupWithManager can succeed
without 403 errors.

In `@cmd/vgmanager/vgmanager.go`:
- Around line 303-315: The shutdown handler treats signal-driven context
cancellation as an abnormal exit; update the branching after calling
context.Cause(ctx) to explicitly handle normal signal shutdown by checking if
ctx.Err() is context.Canceled (and optionally context.DeadlineExceeded if
desired) and call opts.SetupLog.Info with a clear "exiting due to signal"
message and os.Exit(0); keep the existing branches for ErrConfigModified,
ErrTLSProfileModified, icsi.ErrPluginRegistrationFailed, and the fallback error
branch that exits with code 1.
- Around line 148-211: The vgmanager and operator lack RBAC for the
cluster-scoped Apiserver resource used by FetchAPIServerTLSProfile (startup
read) and SecurityProfileWatcher (runtime watch); add a rule granting get, list,
watch on apiGroups: ["config.openshift.io"] resources: ["apiservers"] to both
the vgmanager ClusterRole (config/rbac/vg_manager_clusterrole.yaml) and the
operator's RBAC (role/ClusterRole used by the operator service account) so the
setupClient and TLS watcher can successfully read/watch the Apiserver resource
at startup and runtime.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Pro

Run ID: a92d242a-30fe-4e7e-a1ac-b0b4a0111384

📥 Commits

Reviewing files that changed from the base of the PR and between 489fb6a and d631cc6.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (4)
  • cmd/operator/operator.go
  • cmd/vgmanager/vgmanager.go
  • go.mod
  • internal/controllers/vgmanager/validatelvs_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • internal/controllers/vgmanager/validatelvs_test.go

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2026
@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 11, 2026

@qJkee: This pull request references OCPEDGE-2346 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Summary by CodeRabbit

  • New Features

  • Applies cluster TLS profile to server endpoints and enables runtime TLS updates with graceful reloads.

  • Chores

  • Updated operator bundle payloads and CRD metadata annotations to newer tooling versions.

  • Dependencies

  • Bumped numerous Go, Kubernetes, and OpenShift-related modules to aligned newer versions.

  • Tests

  • Tests updated to use standard library slice utilities for comparisons.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 11, 2026
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
cmd/operator/operator.go (1)

142-142: Consider using context.WithCancelCause for consistency with vgmanager.

The operator uses context.WithCancel while vgmanager uses context.WithCancelCause. When the TLS watcher calls cancel() at Line 251, the shutdown reason isn't tracked. This works (returns nil from run), but differs from vgmanager's explicit cause-based exit handling.

♻️ Suggested alignment with vgmanager pattern
 func run(cmd *cobra.Command, _ []string, opts *Options) error {
-	ctx, cancel := context.WithCancel(cmd.Context())
-	defer cancel()
+	ctx, cancelWithCause := context.WithCancelCause(cmd.Context())
+	defer cancelWithCause(nil)

Then update the TLS watcher callback (Line 251) to use cancelWithCause(ErrTLSProfileModified) and add cause-based exit handling after mgr.Start.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/operator/operator.go` at line 142, The code creates ctx and cancel via
context.WithCancel (ctx, cancel := context.WithCancel(cmd.Context())) but should
mirror vgmanager by using context.WithCancelCause so shutdown reasons are
preserved; change to use context.WithCancelCause and replace the cancel variable
with cancelWithCause (or similar), update the TLS watcher callback to call
cancelWithCause(ErrTLSProfileModified) instead of cancel(), and after mgr.Start
add cause-aware exit handling that inspects the cancel cause
(ErrTLSProfileModified) to drive the appropriate shutdown path; reference the
symbols ctx, cancel/cancelWithCause, ErrTLSProfileModified, the TLS watcher
callback, and mgr.Start when making the edits.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cmd/operator/operator.go`:
- Line 142: The code creates ctx and cancel via context.WithCancel (ctx, cancel
:= context.WithCancel(cmd.Context())) but should mirror vgmanager by using
context.WithCancelCause so shutdown reasons are preserved; change to use
context.WithCancelCause and replace the cancel variable with cancelWithCause (or
similar), update the TLS watcher callback to call
cancelWithCause(ErrTLSProfileModified) instead of cancel(), and after mgr.Start
add cause-aware exit handling that inspects the cancel cause
(ErrTLSProfileModified) to drive the appropriate shutdown path; reference the
symbols ctx, cancel/cancelWithCause, ErrTLSProfileModified, the TLS watcher
callback, and mgr.Start when making the edits.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 59b2567d-40f1-49a6-be48-690ac98fc702

📥 Commits

Reviewing files that changed from the base of the PR and between d631cc6 and fd377cf.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • bundle/manifests/lvm.topolvm.io_lvmclusters.yaml
  • bundle/manifests/lvm.topolvm.io_lvmvolumegroupnodestatuses.yaml
  • bundle/manifests/lvm.topolvm.io_lvmvolumegroups.yaml
  • catalog/lvms-operator/v0.0.1.yaml
  • cmd/operator/operator.go
  • cmd/vgmanager/vgmanager.go
  • config/crd/bases/lvm.topolvm.io_lvmclusters.yaml
  • config/crd/bases/lvm.topolvm.io_lvmvolumegroupnodestatuses.yaml
  • config/crd/bases/lvm.topolvm.io_lvmvolumegroups.yaml
  • go.mod
  • internal/controllers/vgmanager/validatelvs_test.go
✅ Files skipped from review due to trivial changes (2)
  • bundle/manifests/lvm.topolvm.io_lvmclusters.yaml
  • config/crd/bases/lvm.topolvm.io_lvmclusters.yaml

@openshift-ci-robot
Copy link

openshift-ci-robot commented Mar 11, 2026

@qJkee: This pull request references OCPEDGE-2346 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Read TLS config from API server and use it where applicable. When config changes - trigger graceful shutdown to recreate servers with new TLS configuration

Summary by CodeRabbit

  • New Features

  • Applies cluster TLS profile to server endpoints and enables runtime TLS profile updates with graceful reloads.

  • Chores

  • Updated operator bundle payloads and CRD metadata annotations to newer tooling versions.

  • Dependencies

  • Bumped numerous Go, Kubernetes, and platform-related modules to aligned newer versions.

  • Tests

  • Tests updated to use standard library slice utilities for comparisons.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
cmd/operator/operator.go (1)

244-257: Consider using context.WithCancelCause for consistency with vgmanager.

Operator uses plain cancel() while vgmanager uses cancelWithCause(ErrTLSProfileModified). The operator's exit path at line 381 returns nil without distinguishing the shutdown reason. This works but makes debugging harder—logs won't indicate the restart was due to TLS profile changes.

If the current behavior is intentional (relying on container restart), this is acceptable. Otherwise, consider aligning with vgmanager's pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/operator/operator.go` around lines 244 - 257, The OnProfileChange handler
in SecurityProfileWatcher currently calls cancel(), losing the shutdown cause;
replace the plain context cancellation with context.WithCancelCause (or your
project's cancelWithCause helper) and invoke
cancelWithCause(ErrTLSProfileModified) (or equivalent) inside the
OnProfileChange closure so the operator’s shutdown reason is preserved and can
be checked/returned from the main run loop (adjust the function that receives
the context so it surfaces the cancellation cause instead of just returning
nil). Ensure references: SecurityProfileWatcher, OnProfileChange, cancel ->
cancelWithCause/ErrTLSProfileModified so callers can log or return the
TLS-profile-modified cause.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@cmd/operator/operator.go`:
- Around line 244-257: The OnProfileChange handler in SecurityProfileWatcher
currently calls cancel(), losing the shutdown cause; replace the plain context
cancellation with context.WithCancelCause (or your project's cancelWithCause
helper) and invoke cancelWithCause(ErrTLSProfileModified) (or equivalent) inside
the OnProfileChange closure so the operator’s shutdown reason is preserved and
can be checked/returned from the main run loop (adjust the function that
receives the context so it surfaces the cancellation cause instead of just
returning nil). Ensure references: SecurityProfileWatcher, OnProfileChange,
cancel -> cancelWithCause/ErrTLSProfileModified so callers can log or return the
TLS-profile-modified cause.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Pro

Run ID: 9c2daa23-9f75-4ad9-a71a-d7511a416972

📥 Commits

Reviewing files that changed from the base of the PR and between fd377cf and c50a9b1.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • bundle/manifests/lvm.topolvm.io_lvmclusters.yaml
  • bundle/manifests/lvm.topolvm.io_lvmvolumegroupnodestatuses.yaml
  • bundle/manifests/lvm.topolvm.io_lvmvolumegroups.yaml
  • catalog/lvms-operator/v0.0.1.yaml
  • cmd/operator/operator.go
  • cmd/vgmanager/vgmanager.go
  • config/crd/bases/lvm.topolvm.io_lvmclusters.yaml
  • config/crd/bases/lvm.topolvm.io_lvmvolumegroupnodestatuses.yaml
  • config/crd/bases/lvm.topolvm.io_lvmvolumegroups.yaml
  • go.mod
  • internal/controllers/vgmanager/validatelvs_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • internal/controllers/vgmanager/validatelvs_test.go
  • bundle/manifests/lvm.topolvm.io_lvmvolumegroups.yaml
  • config/crd/bases/lvm.topolvm.io_lvmclusters.yaml
  • bundle/manifests/lvm.topolvm.io_lvmvolumegroupnodestatuses.yaml

@qJkee
Copy link
Contributor Author

qJkee commented Mar 11, 2026

/retest

@suleymanakbas91
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 11, 2026
@suleymanakbas91
Copy link
Contributor

/retest

Read TLS config from API server and use it where applicable. When
config changes - trigger graceful shutdown to recreate servers
with new TLS configuration
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 12, 2026
@qJkee
Copy link
Contributor Author

qJkee commented Mar 12, 2026

/retest

2 similar comments
@qJkee
Copy link
Contributor Author

qJkee commented Mar 16, 2026

/retest

@qJkee
Copy link
Contributor Author

qJkee commented Mar 16, 2026

/retest

@qJkee
Copy link
Contributor Author

qJkee commented Mar 17, 2026

/retest
hypershift is failing due to infra failures

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Mar 17, 2026

@qJkee: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-hypershift a60c248 link true /test e2e-aws-hypershift

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@suleymanakbas91
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 18, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants