Conversation

@jparraga-stackav jparraga-stackav commented Nov 6, 2025

What type of PR is this?

Bug fix

What this PR does / why we need it:

This pull request updates Armada to respect node pod limits. Without this change it is possible for Armada to try to schedule pods to nodes which do not have capacity in terms of concurrently running pods. When this happens the pods get stuck in a pending state and eventually have their leases returned. Presumably Armada counts these pods towards fair share even though they cannot run.

  1. Configures default pod limits in the Armada server to include pods
  2. Configures the Armada scheduler to consider the pods resource in scheduling decisions by default
  3. Updates the executor to sanitize pod resources before submitting pods to k8s
  4. Configures the executor to sanitize the pods resource by default
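The executor-side sanitization in steps 3 and 4 might look roughly like the sketch below. The `resourceList` type and function names here are illustrative stand-ins (the real code operates on `k8s.io/api/core/v1.ResourceList`), not Armada's actual implementation:

```go
package main

import "fmt"

// resourceList stands in for k8s.io/api/core/v1.ResourceList
// (map[v1.ResourceName]resource.Quantity) so this sketch stays
// dependency-free.
type resourceList map[string]string

// sanitizeResourceList deletes every resource named in toSanitize.
// In this PR the executor would apply it to each container's requests
// and limits to strip "pods", which informs scheduling decisions but
// is not a valid runtime resource in Kubernetes.
func sanitizeResourceList(rl resourceList, toSanitize []string) {
	for _, name := range toSanitize {
		delete(rl, name)
	}
}

func main() {
	requests := resourceList{"cpu": "500m", "memory": "1Gi", "pods": "1"}
	sanitizeResourceList(requests, []string{"pods"})
	fmt.Println(requests) // "pods" is removed; cpu and memory remain
}
```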

Which issue(s) this PR fixes:

Fixes #4515

Special notes for your reviewer:

In order to roll out safely, the executors must be updated first. After that, the scheduler and server can be rolled out in any order.

Jason Parraga added 2 commits November 6, 2025 00:47
Signed-off-by: Jason Parraga <jparraga@stackav.com>
Signed-off-by: Jason Parraga <jparraga@stackav.com>
FailedPodChecks podchecks.FailedChecks
PendingPodChecks *podchecks.Checks
FatalPodSubmissionErrors []string
ResourcesToSanitize []string
Member

Can you please add docs for the new field?

Member

In our docs, we often link to config on https://pkg.go.dev; for the executor it will be https://pkg.go.dev/github.com/armadaproject/armada/internal/executor/configuration#ApplicationConfiguration. It is easier for end-users if they can see the config field docs there.
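For the docs to render on pkg.go.dev, a standard Go doc comment on the new field is all that is needed. A sketch follows; the struct name and field placement are illustrative only, not the actual Armada configuration types:

```go
package configuration

// ExecutorConfigFragment is an illustrative stand-in for the real
// executor configuration struct.
type ExecutorConfigFragment struct {
	// ResourcesToSanitize lists resource names (for example "pods")
	// that the executor removes from a pod's requests and limits
	// before submitting the pod to Kubernetes. These resources can
	// inform scheduling decisions but are not valid runtime resources.
	ResourcesToSanitize []string
}
```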

}

// Sanitizes pod resources that may be used during scheduling but are invalid at runtime.
func (submitService *SubmitService) sanitizePodResources(pod *v1.Pod) {
Member

@dejanzele dejanzele Nov 7, 2025

nit: this function can be simplified to a plain sanitizePodResources function; there is no need for it to be a receiver method on SubmitService. Same for sanitizeResourceList.

@JamesMurkin
Contributor

Hey, so generally a good direction, but there are a few things I think we'd like to consider / possibly change:

  1. Generally we're trying to move to a world where all mutation happens on the executor API, and the executor does fewer edits. This PR does not move us in that direction.
  2. We already basically have this functionality in the executor API.
  3. I think we should debate whether we add pods to more of the config:
  • indexedResources - I think add it here
  • dominantResourceFairnessResourcesToConsider - possibly add it here. It'd mean we fair-share on pods, which is probably correct in some cases (happy to leave this for now)
  4. Could we confirm that if you overwrite the config it does actually remove pods? (I can't remember if it merges or overwrites and I don't have time to test now.)
  5. I'm somewhat debating whether we should add it as defaultJobLimits, as it means the API and scheduler now need to be configured in unison. It could just be a scheduler concern (tbh all of this PR could be just a scheduler concern), but I think this is probably the most configurable option for now.


Development

Successfully merging this pull request may close these issues.

Scheduler does not respect node pod limits

3 participants