OCPBUGS-60532: Allowlist controller: add node reconciler to sync allowlist on node creation #2858
base: master
Conversation
karampok
commented
Dec 11, 2025
Walkthrough

Removed the in-cluster allowlist ConfigMap manifest, refactored the allowlist controller (init, naming, and ConfigMap/DaemonSet handling), added a node reconciler that applies the allowlist via per-node Jobs with tests, and renamed related public constants.
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting

⛔ Files ignored due to path filters (6)
📒 Files selected for processing (7)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
🧰 Additional context used: 📓 Path-based instructions (1), ⚙️ CodeRabbit configuration file
🔇 Additional comments (9)

⚠️ Warning: There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.5.0): Error: can't load config: unsupported version of the configuration: "". See https://golangci-lint.run/docs/product/migration-guide for migration instructions.
[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: karampok

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Actionable comments posted: 4
🧹 Nitpick comments (4)
pkg/controller/allowlist/nodeReconciler_test.go (1)

395-398: Environment variable manipulation is not parallel-test safe.

Using `os.Setenv`/`os.Unsetenv` in tests modifies global state, which can cause flaky behavior if tests run in parallel. Consider using `t.Setenv()`, which automatically handles cleanup and marks the test as non-parallel.

```diff
 func TestNewAllowlistJobFor(t *testing.T) {
-	os.Setenv("MULTUS_IMAGE", "quay.io/openshift/multus:latest")
-	defer os.Unsetenv("MULTUS_IMAGE")
+	t.Setenv("MULTUS_IMAGE", "quay.io/openshift/multus:latest")
```

pkg/controller/allowlist/allowlist_controller.go (2)
67-72: Predicate filter using `strings.Contains` may match unintended ConfigMaps.

The filter `strings.Contains(object.GetName(), names.AllowlistConfigName)` will match any ConfigMap containing "cni-sysctl-allowlist" in its name, not just the two expected ones. Consider using exact name matching for more precise filtering.

```diff
 predicate.NewPredicateFuncs(func(object crclient.Object) bool {
-	// Only care about cni-sysctl-allowlist, but also watching for default-cni-sysctl-allowlist
-	// as a trigger for creating cni-sysctl-allowlist if it doesn't exist
-	// NOTE: the cni-sysctl-allowlist is hardcoded in pkg/network/multus.go:91
-	return (strings.Contains(object.GetName(), names.AllowlistConfigName))
+	// Watch both the allowlist ConfigMap and the default ConfigMap.
+	// The default CM triggers creation of the allowlist CM if it doesn't exist.
+	name := object.GetName()
+	return name == names.AllowlistConfigName || name == names.DefaultAllowlistConfigName
 }),
```
241-251: Redundant nil check in `allowlistConfigMapExists`.

After a successful `Get()` call, `cm` will always be a valid pointer (not nil), so the `cm != nil` check is redundant.

```diff
 func allowlistConfigMapExists(ctx context.Context, client cnoclient.Client) (bool, error) {
-	cm, err := client.Default().Kubernetes().CoreV1().ConfigMaps(names.MultusNamespace).Get(
+	_, err := client.Default().Kubernetes().CoreV1().ConfigMaps(names.MultusNamespace).Get(
 		ctx, names.AllowlistConfigName, metav1.GetOptions{})
 	if err != nil {
 		if apierrors.IsNotFound(err) {
 			return false, nil
 		}
 		return false, err
 	}
-	return cm != nil, nil
+	return true, nil
 }
```
pkg/controller/allowlist/nodeReconciler.go (1)

194-198: Consider validating the MULTUS_IMAGE environment variable or providing a default fallback.

The container image is set directly from `os.Getenv("MULTUS_IMAGE")` without validation. While this environment variable is consistently set in the operator deployment manifests, defensive programming could prevent confusing failures if the deployment configuration is incomplete. Similar validation could be applied to other image environment variables for consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (1)
- `vendor/modules.txt` is excluded by `!vendor/**`, `!**/vendor/**`
📒 Files selected for processing (6)
- `bindata/allowlist/config/configmap.yaml` (0 hunks)
- `pkg/controller/add_networkconfig.go` (1 hunks)
- `pkg/controller/allowlist/allowlist_controller.go` (11 hunks)
- `pkg/controller/allowlist/nodeReconciler.go` (1 hunks)
- `pkg/controller/allowlist/nodeReconciler_test.go` (1 hunks)
- `pkg/names/names.go` (1 hunks)
💤 Files with no reviewable changes (1)
- bindata/allowlist/config/configmap.yaml
🧰 Additional context used
📓 Path-based instructions (1)
`**`

⚙️ CodeRabbit configuration file

- Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- `pkg/names/names.go`
- `pkg/controller/allowlist/nodeReconciler_test.go`
- `pkg/controller/allowlist/nodeReconciler.go`
- `pkg/controller/add_networkconfig.go`
- `pkg/controller/allowlist/allowlist_controller.go`
🧬 Code graph analysis (3)
pkg/controller/allowlist/nodeReconciler.go (1)
pkg/names/names.go (3)
- `DefaultAllowlistConfigName` (39-39)
- `MultusNamespace` (33-33)
- `AllowlistConfigName` (36-36)
pkg/controller/add_networkconfig.go (1)
pkg/controller/allowlist/nodeReconciler.go (1)
- `AddNodeReconciler` (53-70)
pkg/controller/allowlist/allowlist_controller.go (2)
pkg/names/names.go (2)
- `MultusNamespace` (33-33)
- `AllowlistConfigName` (36-36)

pkg/render/render.go (2)

- `MakeRenderData` (25-30)
- `RenderDir` (34-36)
🔇 Additional comments (6)
pkg/controller/add_networkconfig.go (1)
29-31: LGTM!

The node reconciler registration follows the existing pattern and is correctly placed after the main allowlist controller.
pkg/controller/allowlist/nodeReconciler_test.go (1)
42-280: Good test coverage for the node reconciler.

The test scenarios comprehensively cover the key reconciliation paths: missing default ConfigMap, identical ConfigMaps skipping job creation, different ConfigMaps triggering job creation, and the various job lifecycle states (succeeded, failed with different reasons, active). The use of table-driven tests with descriptive names makes the test intent clear.
pkg/names/names.go (1)
35-39: LGTM!

The renamed constants follow Go naming conventions, and the new `DefaultAllowlistConfigName` constant clearly distinguishes between the default and custom allowlist ConfigMaps.

pkg/controller/allowlist/nodeReconciler.go (2)
50-70: LGTM on the reconciler setup.

The controller setup correctly watches for Node creation events only (via `nodePredicate`), which aligns with the PR's goal of syncing allowlist config when new nodes join the cluster.
83-96: The comparison behavior when `allowlistCM` is not found should be clarified.

When the `AllowlistConfigName` ConfigMap doesn't exist, `IgnoreNotFound(err)` suppresses the error and `allowlistCM` remains an empty `ConfigMap` with nil `Data`. The subsequent `DeepEqual(allowlistCM.Data, defaultCM.Data)` compares `nil` against the default `Data`, which returns `false`, causing a job to be created even when the allowlist is missing.

This behavior appears intentional (a missing allowlist CM triggers configuration), but it's undocumented in the code and lacks test coverage. Consider either: (1) adding explicit handling and comments for the missing CM case, or (2) adding a test case to document this behavior.
Likely an incorrect or invalid review comment.
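To make the nil-versus-populated comparison concrete, here is a standalone sketch of the `reflect.DeepEqual` semantics in question (the allowlist key and value below are invented for illustration):

```go
package main

import (
	"fmt"
	"reflect"
)

func main() {
	// Data of an allowlist ConfigMap that was never found: a nil map.
	var missing map[string]string
	defaults := map[string]string{"allowlist.conf": "^net.ipv4.ping_group_range$"}

	// nil vs. populated map: false, so the reconciler would create a Job.
	fmt.Println(reflect.DeepEqual(missing, defaults)) // false

	// Edge case worth documenting: nil vs. empty non-nil map is also false.
	fmt.Println(reflect.DeepEqual(missing, map[string]string{})) // false
}
```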
pkg/controller/allowlist/allowlist_controller.go (1)
46-51: LGTM on the simplified controller initialization.

The refactored `Add` function with inline reconciler creation is cleaner and follows the same pattern as the new `AddNodeReconciler`.
```diff
 for _, obj := range manifests {
-	err = createObject(ctx, client, obj)
-	if err != nil {
-		return err
+	if err := client.Default().CRClient().Create(ctx, obj); err != nil {
+		return errors.Wrapf(err, "error creating %s %s/%s", obj.GroupVersionKind(), obj.GetNamespace(), obj.GetName())
 	}
 }
```
createObjectsFrom fails on AlreadyExists instead of being idempotent.
When creating objects, if any already exists, the function returns an error. This could cause issues during reconciliation if the ConfigMap was partially created. Consider ignoring AlreadyExists errors to make the operation idempotent.
```diff
 for _, obj := range manifests {
-	if err := client.Default().CRClient().Create(ctx, obj); err != nil {
+	if err := client.Default().CRClient().Create(ctx, obj); err != nil && !apierrors.IsAlreadyExists(err) {
 		return errors.Wrapf(err, "error creating %s %s/%s", obj.GroupVersionKind(), obj.GetNamespace(), obj.GetName())
 	}
 }
```

🤖 Prompt for AI Agents
In pkg/controller/allowlist/allowlist_controller.go around lines 160 to 163, the
loop that creates manifests returns on any error causing AlreadyExists to fail
instead of being idempotent; change the error handling so that when
client.Default().CRClient().Create returns an AlreadyExists error you ignore it
and continue (use the API server error check for AlreadyExists), but for all
other errors return the wrapped error as before; this makes object creation
idempotent during reconciliation.
```go
func newAllowlistJobFor(nodeName string, configMapVersion string) *batchv1.Job {
	// Truncate ResourceVersion to prevent exceeding Kubernetes name limits.
	// Format: cni-sysctl-allowlist-{nodeName}-{first8chars}
	jobName := fmt.Sprintf("cni-sysctl-allowlist-%s-%.8s", nodeName, configMapVersion)
```
Job name may exceed the Kubernetes 63-character limit for long node names.

The job name format `cni-sysctl-allowlist-{nodeName}-{8chars}` carries 30 characters of fixed prefix/suffix overhead. Job names are effectively limited to 63 characters (the name is reused as the `job-name` label value, and DNS-1123 labels cap at 63 characters), but node names can be up to 253 characters. For nodes with names longer than about 33 characters, job creation will fail.

Consider truncating the node name portion with a hash suffix to ensure uniqueness while staying within limits.
```diff
 func newAllowlistJobFor(nodeName string, configMapVersion string) *batchv1.Job {
-	// Truncate ResourceVersion to prevent exceeding Kubernetes name limits.
-	// Format: cni-sysctl-allowlist-{nodeName}-{first8chars}
-	jobName := fmt.Sprintf("cni-sysctl-allowlist-%s-%.8s", nodeName, configMapVersion)
+	// Truncate nodeName and ResourceVersion to prevent exceeding Kubernetes name limits.
+	// Job names must be <= 63 characters.
+	// Format: cni-sysctl-allowlist-{truncatedNodeName}-{first8chars}
+	truncatedNodeName := nodeName
+	if len(nodeName) > 33 {
+		truncatedNodeName = nodeName[:33]
+	}
+	jobName := fmt.Sprintf("cni-sysctl-allowlist-%s-%.8s", truncatedNodeName, configMapVersion)
```

Note: If uniqueness is a concern when truncating, consider appending a short hash of the full node name.
🤖 Prompt for AI Agents
In pkg/controller/allowlist/nodeReconciler.go around lines 161-164, the
generated job name "cni-sysctl-allowlist-{nodeName}-{8chars}" can exceed the
63-character Kubernetes DNS-1123 limit for long node names; change the name
generation to truncate the nodeName portion to a safe length so the full string
<=63 chars and append a short deterministic hash (e.g., first 6-8 hex chars of
SHA256) of the full node name to preserve uniqueness, ensure the truncated part
and hash use only DNS-1123 characters (lowercase letters, digits, and '-'), and
compute lengths based on the fixed prefix/suffix so the final name always fits.
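Sketching that prompt concretely (a minimal illustration, not code from this PR; `safeJobName` and its budget arithmetic are assumptions):

```go
package allowlist

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// safeJobName is a hypothetical helper: it keeps a readable node-name
// prefix, appends a short deterministic SHA-256 suffix of the full node
// name for uniqueness, and budgets lengths so the result never exceeds
// the 63-character limit.
func safeJobName(nodeName, configMapVersion string) string {
	const prefix = "cni-sysctl-allowlist-" // 21 characters
	sum := sha256.Sum256([]byte(nodeName))
	hash := hex.EncodeToString(sum[:])[:8] // hex output is DNS-1123 safe
	// 63 total - prefix - truncated part - "-" - 8-char hash - "-" - 8-char version.
	budget := 63 - len(prefix) - 1 - len(hash) - 1 - 8
	truncated := nodeName
	if len(truncated) > budget {
		truncated = truncated[:budget]
	}
	return fmt.Sprintf("%s%s-%s-%.8s", prefix, truncated, hash, configMapVersion)
}
```

Since node names are already valid DNS-1123 subdomains, truncation plus a hex suffix stays within the allowed character set.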
```diff
 // MultusNamespace is the namespace where applied configuration
 // configmaps are stored.
 // Should match 00_namespace.yaml
-const MULTUS_NAMESPACE = "openshift-multus"
+const MultusNamespace = "openshift-multus"
```
Incorrect/misleading comment for MultusNamespace.
The comment was copied from APPLIED_NAMESPACE and describes "applied configuration configmaps", but MultusNamespace is for multus-related components, not applied configs. The comment should accurately describe the namespace's purpose.
```diff
-// MultusNamespace is the namespace where applied configuration
-// configmaps are stored.
-// Should match 00_namespace.yaml
+// MultusNamespace is the namespace where multus components are deployed.
 const MultusNamespace = "openshift-multus"
```
In pkg/names/names.go around lines 30 to 33, the comment for MultusNamespace
incorrectly describes "applied configuration configmaps" (copied from
APPLIED_NAMESPACE); replace it with a concise, accurate description stating that
MultusNamespace is the Kubernetes namespace where Multus CNI-related components
and resources are installed/managed (e.g., used for Multus daemonsets, configs,
and supporting objects), and keep or update any reference to the corresponding
00_namespace.yaml if applicable.
Force-pushed from bb547d1 to 1c989c7
@karampok: This pull request references Jira Issue OCPBUGS-60532, which is invalid:

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Force-pushed from 1c989c7 to 50b66c5
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/controller/allowlist/nodeReconciler.go (1)
72-143: Consider explicitly handling "job not found" on the AlreadyExists path.

If the Job is deleted between `Create` returning AlreadyExists and the subsequent `Get`, you'll return an error and requeue with backoff. If you want smoother behavior, you can `IgnoreNotFound(err)` there and fall back to creating the Job again on the next reconcile.
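A minimal sketch of that smoothing, assuming a controller-runtime client; `ensureJob` is a hypothetical helper and the status follow-up is elided:

```go
package allowlist

import (
	"context"

	batchv1 "k8s.io/api/batch/v1"
	apierrors "k8s.io/apimachinery/pkg/api/errors"
	crclient "sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
)

// ensureJob treats a Job that disappears between Create (AlreadyExists)
// and the follow-up Get as "not there yet" rather than a hard error, so
// the next reconcile simply recreates it instead of requeueing with backoff.
func ensureJob(ctx context.Context, c crclient.Client, job *batchv1.Job) (reconcile.Result, error) {
	err := c.Create(ctx, job)
	if err == nil {
		return reconcile.Result{}, nil
	}
	if !apierrors.IsAlreadyExists(err) {
		return reconcile.Result{}, err
	}
	existing := &batchv1.Job{}
	if err := c.Get(ctx, crclient.ObjectKeyFromObject(job), existing); err != nil {
		// Ignore NotFound: the Job was deleted in the meantime; fall back
		// to creating it again on the next reconcile.
		return reconcile.Result{}, crclient.IgnoreNotFound(err)
	}
	// ...inspect existing.Status (succeeded/failed/active) here as the
	// reconciler already does...
	return reconcile.Result{}, nil
}
```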
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge base: Disabled due to Reviews -> Disable Knowledge Base setting
⛔ Files ignored due to path filters (6)
- `vendor/github.com/google/go-cmp/cmp/cmpopts/equate.go` is excluded by `!vendor/**`, `!**/vendor/**`
- `vendor/github.com/google/go-cmp/cmp/cmpopts/ignore.go` is excluded by `!vendor/**`, `!**/vendor/**`
- `vendor/github.com/google/go-cmp/cmp/cmpopts/sort.go` is excluded by `!vendor/**`, `!**/vendor/**`
- `vendor/github.com/google/go-cmp/cmp/cmpopts/struct_filter.go` is excluded by `!vendor/**`, `!**/vendor/**`
- `vendor/github.com/google/go-cmp/cmp/cmpopts/xform.go` is excluded by `!vendor/**`, `!**/vendor/**`
- `vendor/modules.txt` is excluded by `!vendor/**`, `!**/vendor/**`
📒 Files selected for processing (5)
- `go.mod` (2 hunks)
- `pkg/controller/add_networkconfig.go` (1 hunks)
- `pkg/controller/allowlist/nodeReconciler.go` (1 hunks)
- `pkg/controller/allowlist/nodeReconciler_test.go` (1 hunks)
- `pkg/names/names.go` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- pkg/controller/allowlist/nodeReconciler_test.go
- go.mod
- pkg/names/names.go
🧰 Additional context used
📓 Path-based instructions (1)
`**`

⚙️ CodeRabbit configuration file

- Focus on major issues impacting performance, readability, maintainability and security. Avoid nitpicks and avoid verbosity.
Files:
- `pkg/controller/add_networkconfig.go`
- `pkg/controller/allowlist/nodeReconciler.go`
🧬 Code graph analysis (2)
pkg/controller/add_networkconfig.go (1)
pkg/controller/allowlist/nodeReconciler.go (1)
- `AddNodeReconciler` (53-70)
pkg/controller/allowlist/nodeReconciler.go (4)
pkg/client/types.go (1)
- `Client` (19-31)

pkg/controller/statusmanager/status_manager.go (2)

- `StatusManager` (98-130)
- `New` (132-153)

pkg/apply/apply.go (1)

- `Object` (22-25)

pkg/names/names.go (3)

- `DefaultAllowlistConfigName` (39-39)
- `MultusNamespace` (33-33)
- `AllowlistConfigName` (36-36)
🔇 Additional comments (2)
pkg/controller/add_networkconfig.go (1)
19-32: Controller registration ordering looks correct (Add before AddNodeReconciler).

pkg/controller/allowlist/nodeReconciler.go (1)

50-70: Watch predicate matches the "node create only" objective.

Also applies to: 145-159
```go
func newAllowlistJobFor(nodeName string, configMapVersion string) *batchv1.Job {
	jobName := fmt.Sprintf("cni-sysctl-allowlist-%.32s-%.8s", nodeName, configMapVersion)
	return &batchv1.Job{
		ObjectMeta: metav1.ObjectMeta{
			Name:      jobName,
			Namespace: names.MultusNamespace,
			Labels: map[string]string{
				"app":  "cni-sysctl-allowlist-job",
				"node": nodeName,
			},
		},
		Spec: batchv1.JobSpec{
			BackoffLimit:            ptr.To(int32(3)),
			TTLSecondsAfterFinished: ptr.To(int32(allowlistJobTTL)),
			ActiveDeadlineSeconds:   ptr.To(int64(allowlistJobActiveDeadline)),
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: map[string]string{
						"app":  "cni-sysctl-allowlist-job",
						"node": nodeName,
					},
					Annotations: map[string]string{
						"target.workload.openshift.io/management": `{"effect": "PreferredDuringScheduling"}`,
					},
				},
				Spec: corev1.PodSpec{
					RestartPolicy:     corev1.RestartPolicyNever,
					PriorityClassName: "openshift-user-critical",
					NodeSelector: map[string]string{
						"kubernetes.io/hostname": nodeName,
					},
					Containers: []corev1.Container{
						{
							Name:    "kube-multus-additional-cni-plugins",
							Image:   os.Getenv("MULTUS_IMAGE"),
							Command: []string{"/bin/bash", "-c", "cp /entrypoint/allowlist.conf /host/etc/cni/tuning/"},
							Resources: corev1.ResourceRequirements{
								Requests: corev1.ResourceList{
									corev1.ResourceCPU:    resource.MustParse("10m"),
									corev1.ResourceMemory: resource.MustParse("10Mi"),
								},
							},
							SecurityContext: &corev1.SecurityContext{
								Privileged: ptr.To(true),
							},
							TerminationMessagePolicy: corev1.TerminationMessageFallbackToLogsOnError,
							VolumeMounts: []corev1.VolumeMount{
								{
									Name:      "cni-sysctl-allowlist",
									MountPath: "/entrypoint",
								},
								{
									Name:      "tuning-conf-dir",
									MountPath: "/host/etc/cni/tuning/",
									ReadOnly:  false,
								},
							},
						},
					},
					Volumes: []corev1.Volume{
						{
							Name: "cni-sysctl-allowlist",
							VolumeSource: corev1.VolumeSource{
								ConfigMap: &corev1.ConfigMapVolumeSource{
									LocalObjectReference: corev1.LocalObjectReference{
										Name: names.AllowlistConfigName,
									},
									DefaultMode: ptr.To(int32(0644)),
								},
							},
						},
						{
							Name: "tuning-conf-dir",
							VolumeSource: corev1.VolumeSource{
								HostPath: &corev1.HostPathVolumeSource{
									Path: "/etc/cni/tuning/",
									Type: ptr.To(corev1.HostPathDirectoryOrCreate),
								},
							},
						},
					},
				},
			},
		},
	}
}
```
Fix long node names + robust node pinning (current labels/nodeSelector can make the Job invalid / unschedulable).

- Label value limit: `Labels["node"] = nodeName` (Job + PodTemplate) breaks when `nodeName` exceeds 63 characters.
- Scheduling: `NodeSelector["kubernetes.io/hostname"] = nodeName` assumes the node name matches that label value (and inherits the 63-char constraint); `PodSpec.NodeName` pins correctly.

Suggested fix (hash for names/labels; keep the full nodeName in annotations):
```diff
diff --git a/pkg/controller/allowlist/nodeReconciler.go b/pkg/controller/allowlist/nodeReconciler.go
@@
 import (
 	"context"
+	"crypto/sha256"
+	"encoding/hex"
 	"fmt"
 	"os"
 	"time"
@@
 func newAllowlistJobFor(nodeName string, configMapVersion string) *batchv1.Job {
-	jobName := fmt.Sprintf("cni-sysctl-allowlist-%.32s-%.8s", nodeName, configMapVersion)
+	sum := sha256.Sum256([]byte(nodeName))
+	nodeHash := hex.EncodeToString(sum[:])[:10]
+	jobName := fmt.Sprintf("cni-sysctl-allowlist-%s-%.8s", nodeHash, configMapVersion)
 	return &batchv1.Job{
 		ObjectMeta: metav1.ObjectMeta{
 			Name:      jobName,
 			Namespace: names.MultusNamespace,
 			Labels: map[string]string{
 				"app":  "cni-sysctl-allowlist-job",
-				"node": nodeName,
+				"node": nodeHash,
 			},
+			Annotations: map[string]string{
+				"allowlist.openshift.io/node-name": nodeName,
+			},
 		},
@@
 		Template: corev1.PodTemplateSpec{
 			ObjectMeta: metav1.ObjectMeta{
 				Labels: map[string]string{
 					"app":  "cni-sysctl-allowlist-job",
-					"node": nodeName,
+					"node": nodeHash,
 				},
 				Annotations: map[string]string{
 					"target.workload.openshift.io/management": `{"effect": "PreferredDuringScheduling"}`,
+					"allowlist.openshift.io/node-name":        nodeName,
 				},
 			},
 			Spec: corev1.PodSpec{
 				RestartPolicy:     corev1.RestartPolicyNever,
 				PriorityClassName: "openshift-user-critical",
-				NodeSelector: map[string]string{
-					"kubernetes.io/hostname": nodeName,
-				},
+				NodeName: nodeName,
 				Containers: []corev1.Container{
```
This PR stacks on top of #2846 (just for the name refactoring)
Rename constants and functions to make it easier for new users to read the code. Variables within the package do not need to be prefixed with "allowlist" since the package context already provides that scope. Also, names in Go should not use ALL_CAPS but CamelCase instead.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Konstantinos Karampogias <karampok@gmail.com>
The bindata/allowlist/config/configmap.yaml file is not used by any code. The controller uses bindata/network/multus/004-sysctl-configmap.yaml as the template for creating the cni-sysctl-allowlist ConfigMap when it doesn't exist. This template is shared with multus, which uses it to create the default-cni-sysctl-allowlist ConfigMap.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Konstantinos Karampogias <karampok@gmail.com>
The runtime.Scheme field was never used: the controller creates objects directly via client.Create() without needing type conversion or owner references.

Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Konstantinos Karampogias <karampok@gmail.com>
Remove createObject() and deleteDaemonSet() helper functions that were each called only once. Inlining them at their call sites improves readability by eliminating unnecessary indirection. This also removes the unused unstructured import.

Signed-off-by: Konstantinos Karampogias <karampok@gmail.com>
…reation
When a node joins the cluster, create a Job on that specific node to sync
the CNI sysctl allowlist configuration.
Implementation:
- Watch node creation events
- Skip any action if ConfigMap unchanged from the default (multus daemon
installs default)
- Create Job with nodeSelector targeting specific node
- If job AlreadyExists, follow up the result
- Delete successful Jobs, preserve failed Jobs for debugging
Why not the daemonset approach:
1. Jobs run only on the new node (1 pod) while DaemonSet runs on all
nodes (100+ pods), so 10 new nodes create 10 Jobs vs 1000 pods.
2. Each Job succeeds or fails independently with logs kept for 24
hours, while DaemonSet waits for all pods to be ready so one stuck
pod blocks the entire update.
3. Multiple Jobs run in parallel without conflicts, while DaemonSet
needs complex logic to handle multiple node events with delays and
state checking.
4. ConfigMap changes use DaemonSet (update all nodes), node additions
use Jobs (update one node), each approach fits its purpose.
Known issues:
1. On restart, both the jobs and the daemonset will run at the same time.
2. On restart (or if the new node is a master node) the job will fail after 10
minutes because, by design, it does not tolerate the master taint.
Assisted-By: Claude <noreply@anthropic.com>
Signed-off-by: Konstantinos Karampogias <karampok@gmail.com>
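As an illustration of the "delete successful Jobs, preserve failed Jobs" step described above (not code from this commit), the conventional way to classify a Job's terminal state is to scan its status conditions:

```go
package allowlist

import (
	batchv1 "k8s.io/api/batch/v1"
	corev1 "k8s.io/api/core/v1"
)

// jobFinished is an illustrative helper showing the standard pattern:
// a Job is terminal when it has a Complete or Failed condition with
// status True; everything else means it is still running or pending.
func jobFinished(job *batchv1.Job) (finished bool, succeeded bool) {
	for _, c := range job.Status.Conditions {
		if c.Status != corev1.ConditionTrue {
			continue
		}
		switch c.Type {
		case batchv1.JobComplete:
			return true, true
		case batchv1.JobFailed:
			return true, false
		}
	}
	return false, false
}
```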
Force-pushed from 50b66c5 to f709c58
@karampok: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests.

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.