feat: implement node locking for NodeSet worker pods #130
giuliocalzo wants to merge 1 commit into SlinkyProject:main from
Conversation
Good morning @vivian-hafener, I rebased and adjusted based on the last pre-commit checks; feel free to review it.
Good afternoon @giuliocalzo, was this resolved by 0ce9ef7? Best regards,
Hi @vivian-hafener, the DaemonSet natively implements the 1:1 node/pod affinity, but the StatefulSet does not have this feature; this PR implements it.
@SkylerMalinowski I've just rebased and sorted out some conflicts.
SkylerMalinowski left a comment:
Let's assume LockNodes=true, LockNodeLifetime=0s, ScalingMode=StatefulSet, and pod-0 is assigned to node-0. If I were to node autoscale my Kubernetes cluster down such that node-0 is scaled-in, the node lock on node-0 would never expire and the scaling logic would create pod-0 with strict affinity for node-0 which no longer exists, hence will never run.
With respect to the NodeAssignments map, is it better to clear a node assignment when the Kube node is NotFound, or to have the scaling logic not create a pod with strict affinity for a node that is NotFound? I'm thinking the former is best, to avoid defunct NodeAssignments ever-increasing the map and potentially going over etcd entry limits, but the latter would best respect LockNodeLifetime=0s.
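The former option could be sketched as a pruning pass over the assignments map. This is a minimal, self-contained illustration; `PruneNodeAssignments`, the map shape, and `liveNodes` are hypothetical stand-ins, not the PR's actual types:

```go
package main

import "fmt"

// PruneNodeAssignments removes pod-to-node assignments whose target node no
// longer exists in the cluster, so defunct entries cannot accumulate in the
// status map. (Hypothetical sketch; names are not from the PR.)
func PruneNodeAssignments(assignments map[string]string, liveNodes map[string]bool) {
	for pod, node := range assignments {
		if !liveNodes[node] {
			delete(assignments, pod) // node is NotFound: drop the stale lock
		}
	}
}

func main() {
	assignments := map[string]string{"pod-0": "node-0", "pod-1": "node-1"}
	liveNodes := map[string]bool{"node-1": true} // node-0 was scaled in
	PruneNodeAssignments(assignments, liveNodes)
	fmt.Println(assignments) // pod-0's defunct assignment is gone
}
```

A pass like this would run during reconciliation, after listing cluster nodes, so a scaled-in node cannot leave a permanent lock behind.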
```go
// Only used when lockNodes is true.
// +optional
// +default:=0
LockNodeLifetime int32 `json:"lockNodeLifetime,omitempty"`
```
Use a Duration for rich value expressions.
```diff
-LockNodeLifetime int32 `json:"lockNodeLifetime,omitempty"`
+LockNodeLifetime metav1.Duration `json:"lockNodeLifetime,omitempty"`
```
```go
validOrdinals := make(map[string]struct{}, replicaCount)
for i := range replicaCount {
	validOrdinals[strconv.Itoa(i)] = struct{}{}
}
```
This is not correct. Unlike appsv1.StatefulSet, NodeSet does not strictly scale-in from highest ordinal to lowest in order. Gaps are allowed and frequent during scale-in.
Let's assume replicas=3 with an initial state of pod-{0,1,2}. Let's say you set replicas=2 and the NodeSet controller gracefully terminated pod-1. Your code would say that pod-2 is invalid despite being perfectly valid, and it would be erroneously cleared from the assignments map.
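To respect ordinal gaps, the validity set could be derived from the pods that actually exist rather than from a contiguous [0, replicas) range. A hedged sketch with a hypothetical helper name:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validOrdinalsFromPods builds the set of ordinals from the pod names that
// actually exist, so a gapped state such as {pod-0, pod-2} after scale-in
// does not mark pod-2's assignment as invalid. (Hypothetical sketch.)
func validOrdinalsFromPods(podNames []string) map[string]struct{} {
	valid := make(map[string]struct{}, len(podNames))
	for _, name := range podNames {
		// The ordinal is the suffix after the last '-', e.g. "pod-2" -> "2".
		idx := strings.LastIndex(name, "-")
		if idx < 0 {
			continue
		}
		ord := name[idx+1:]
		if _, err := strconv.Atoi(ord); err == nil {
			valid[ord] = struct{}{}
		}
	}
	return valid
}

func main() {
	// replicas=2 after scale-in, but the controller removed pod-1, not pod-2.
	valid := validOrdinalsFromPods([]string{"pod-0", "pod-2"})
	_, ok := valid["2"]
	fmt.Println(ok) // true: pod-2 stays valid despite the gap
}
```

Keying validity on live pods rather than a replica-count range tolerates any termination order the controller chooses.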
This functionality was merged in 34f08a2, but the implementation significantly deviated from your PR.
Summary
Add lockNodes and lockNodeLifetime fields to NodeSetSpec to pin worker pods to their assigned Kubernetes nodes. When enabled, the controller records each pod-to-node mapping in NodeSetStatus and injects a requiredDuringSchedulingIgnoredDuringExecution NodeAffinity on pod recreation so each worker always returns to the same physical node.
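For illustration, the injected affinity on a recreated worker pod would look roughly like this (a hedged sketch assuming the lock is keyed on the `kubernetes.io/hostname` label; the exact shape in the controller may differ):

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - node-0   # the node recorded in NodeSetStatus for this pod
```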
The lockNodeLifetime field controls how long the lock persists: 0 means permanent, and a positive value (in seconds) causes the lock to expire after the pod stops running, allowing it to reschedule freely. Running pods continuously refresh their assignment timestamp so the countdown only begins once the pod is no longer active on the node.
Breaking Changes
None.
Testing Notes
Local testing with kind.
Additional Context