
feat: run aks node controller at boot time faster by 15s #8082

Merged
awesomenix merged 1 commit into main from nishp/nocse on Mar 23, 2026

@awesomenix awesomenix commented Mar 12, 2026

Summary

  • move scriptless AKS node controller startup earlier by switching generated custom data to cloud-boothook
  • Savings of 30s
  • update the baked aks-node-controller.service ordering to match the earlier-start model while keeping the VHD enable path intact
  • update the e2e hack path to mirror the same boothook-driven startup pattern

Details

  • aks-node-controller/pkg/nodeconfigutils/utils.go now writes the config from boothook and starts aks-node-controller.service immediately
  • parts/linux/cloud-init/artifacts/aks-node-controller.service now waits on network-online.target and stays active (exited) after the one-shot run
  • e2e/vmss.go switches the hack flow from runcmd to a boothook-dropped service and wrapper
  • generate-testdata was run to refresh generated snapshot data impacted by the pkg change

Timings

Latest rerun timings: 16CPU, 32GB RAM

   - boot → CSE start: 13.000s
   - CSE start: +0.000s
   - configureKubeletAndKubectl done: +1.627s
    - installKubeletKubectlFromURL: 5ms
   - ensureContainerd done: +2.137s
   - ensureKubelet done: +4.548s
   - ensureNoDupOnPromiscuBridge done: +7.934s
   - configureNodeExporter done: +8.768s
   - CSE finish: +10.874s
   - containerd started: +7.000s
   - kubelet started: +7.000s
   - first kubelet log: +7.000s
   - runtime initialized: +12.000s
   - main sync loop: +12.000s
   - node registered: +12.000s
   - NodeReady: +13.000s


Latest rerun timings: 2CPU, 8GB RAM

   - boot → CSE start: 16.000s
   - CSE start: +0.000s
   - configureKubeletAndKubectl done: +2.910s
    - installKubeletKubectlFromURL: 7ms
   - ensureContainerd done: +3.677s
   - ensureKubelet done: +6.281s
   - ensureNoDupOnPromiscuBridge done: +12.761s
   - configureNodeExporter done: +14.115s
   - CSE finish: +17.789s
   - containerd started: +12.000s
   - kubelet started: +12.000s
   - runtime initialized: +20.000s
   - node registered: +20.000s
   - NodeReady: +21.000s
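
As a rough tally of the two runs above, the sketch below adds the "boot → CSE start" figure to the "NodeReady" offset. It assumes the `+` offsets are measured from CSE start, which the description implies but does not state; the `run` type and `bootToNodeReady` helper are hypothetical names used only for this illustration.

```go
package main

import (
	"fmt"
	"time"
)

// run holds two figures copied from the timing lists above.
type run struct {
	name           string
	bootToCSEStart time.Duration // "boot → CSE start"
	nodeReady      time.Duration // "NodeReady", assumed to be an offset from CSE start
}

// bootToNodeReady sums the two figures. This is only valid under the
// assumption that the "+" offsets are relative to CSE start.
func bootToNodeReady(r run) time.Duration {
	return r.bootToCSEStart + r.nodeReady
}

func main() {
	runs := []run{
		{name: "16CPU, 32GB RAM", bootToCSEStart: 13 * time.Second, nodeReady: 13 * time.Second},
		{name: "2CPU, 8GB RAM", bootToCSEStart: 16 * time.Second, nodeReady: 21 * time.Second},
	}
	for _, r := range runs {
		fmt.Printf("%s: boot → NodeReady ≈ %s\n", r.name, bootToNodeReady(r))
	}
}
```

Under that assumption the totals come to roughly 26s on the 16CPU VM and 37s on the 2CPU VM.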

Copilot AI review requested due to automatic review settings March 12, 2026 02:18
Contributor

Copilot AI left a comment

Pull request overview

Adjusts the e2e VMSS provisioning flow to run the aks-node-controller hack in the foreground (synchronously) and alters how provisioning status is validated during scenario setup.

Changes:

  • Run /opt/azure/bin/aks-node-controller-hack provision ... synchronously in cloud-init instead of backgrounding it.
  • Stop setting the VMSS CustomScript commandToExecute when provisioning via AKSNodeConfig (commented out).
  • Disable the post-create Custom Script Extension status check (commented out).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| e2e/vmss.go | Runs aks-node-controller-hack provision synchronously; comments out CSE command wiring when using AKSNodeConfig. |
| e2e/test_helpers.go | Comments out the VMSS Custom Script Extension status validation after VMSS creation. |

Comments suppressed due to low confidence (1)

e2e/vmss.go:158

  • cse is no longer set when s.Runtime.AKSNodeConfig != nil. In the DisableScriptLessCompilation path this results in generating CustomData that only writes the aks-node-controller config file, but does not execute aks-node-controller (the VMSS CustomScript extension is skipped because cseCmd is empty). This will prevent the node from being provisioned. Restore wiring so scriptless mode still executes /opt/azure/containers/aks-node-controller provision-wait (or alternatively ensure CustomDataWithHack runs the equivalent of provision-wait). Note: leaving cse empty can also lead to a nil dereference in getBaseVMSSModel for Windows, which assumes an ExtensionProfile exists.
	if s.Runtime.AKSNodeConfig != nil {
		//cse = nodeconfigutils.CSE
		customData = func() string {
			if config.Config.DisableScriptLessCompilation {
				data, err := nodeconfigutils.CustomData(s.Runtime.AKSNodeConfig)
				require.NoError(s.T, err, "failed to generate custom data from AKSNodeConfig")

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.



Key components:

-1. `aks-node-controller.service`: systemd unit that is triggered once cloud-init is complete (guaranteeing that config is present on disk) and then kickstarts bootstrapping.
+1. `aks-node-controller.service`: systemd unit that can be started directly by cloud-boothook as soon as the config file is written, while remaining enabled on the VHD as a fallback boot hook.
Copilot AI Mar 12, 2026

The README states aks-node-controller.service remains enabled on the VHD as a fallback boot hook, but this PR’s unit file change removes the [Install] section (so systemctl enable aks-node-controller.service fails during VHD build). Either update this doc to match the new enable/start model, or restore an enable-able unit definition so the fallback claim is accurate.

Suggested change
-1. `aks-node-controller.service`: systemd unit that can be started directly by cloud-boothook as soon as the config file is written, while remaining enabled on the VHD as a fallback boot hook.
+1. `aks-node-controller.service`: systemd unit that is started directly by cloud-boothook as soon as the config file is written; it is started explicitly by the provisioning flow rather than being persistently enabled on the VHD as a fallback boot hook.

@awesomenix awesomenix changed the title from "feat: run aks node controller in sync" to "feat: run aks node controller at boot time" Mar 12, 2026
@awesomenix awesomenix changed the title from "feat: run aks node controller at boot time" to "feat: run aks node controller at boot time faster by 30s" Mar 12, 2026
@awesomenix awesomenix force-pushed the nishp/nocse branch 2 times, most recently from c874fa5 to 0ac573b Compare March 13, 2026 08:01
Copilot AI review requested due to automatic review settings March 13, 2026 08:01
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 84 changed files in this pull request and generated 7 comments.



Comment on lines 162 to 230
return &aksnodeconfigv1.Configuration{
Version: "v1",
BootstrappingConfig: bootstrappingConfig,
-DisableCustomData: nbc.AgentPoolProfile.IsFlatcar() || nbc.AgentPoolProfile.IsACL(),
+DisableCustomData: true,
LinuxAdminUsername: "azureuser",
VmSize: config.Config.DefaultVMSKU,
ClusterConfig: &aksnodeconfigv1.ClusterConfig{
Location: nbc.ContainerService.Location,
ResourceGroup: nbc.ResourceGroupName,
VmType: aksnodeconfigv1.VmType_VM_TYPE_VMSS,
ClusterNetworkConfig: &aksnodeconfigv1.ClusterNetworkConfig{
SecurityGroupName: cs.Properties.GetNSGName(),
VnetName: cs.Properties.GetVirtualNetworkName(),
VnetResourceGroup: cs.Properties.GetVNetResourceGroupName(),
Subnet: cs.Properties.GetSubnetName(),
RouteTable: cs.Properties.GetRouteTableName(),
},
CloudProviderConfig: &aksnodeconfigv1.CloudProviderConfig{
Backoff: cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderBackoff,
BackoffMode: cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderBackoffMode,
BackoffRetries: to.Ptr(int32(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderBackoffRetries)),
BackoffExponent: to.Ptr(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderBackoffExponent),
BackoffDuration: to.Ptr(int32(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderBackoffDuration)),
BackoffJitter: to.Ptr(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderBackoffJitter),
RateLimit: cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderRateLimit,
RateLimitQps: to.Ptr(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderRateLimitQPS),
RateLimitQpsWrite: to.Ptr(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderRateLimitQPSWrite),
RateLimitBucket: to.Ptr(int32(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderRateLimitBucket)),
RateLimitBucketWrite: to.Ptr(int32(cs.Properties.OrchestratorProfile.KubernetesConfig.CloudProviderRateLimitBucketWrite)),
},
PrimaryScaleSet: nbc.PrimaryScaleSetName,
},
ApiServerConfig: &aksnodeconfigv1.ApiServerConfig{
ApiServerName: cs.Properties.HostedMasterProfile.FQDN,
},
AuthConfig: &aksnodeconfigv1.AuthConfig{
ServicePrincipalId: cs.Properties.ServicePrincipalProfile.ClientID,
ServicePrincipalSecret: cs.Properties.ServicePrincipalProfile.Secret,
TenantId: nbc.TenantID,
SubscriptionId: nbc.SubscriptionID,
AssignedIdentityId: nbc.UserAssignedIdentityClientID,
},
NetworkConfig: &aksnodeconfigv1.NetworkConfig{
NetworkPlugin: aksnodeconfigv1.NetworkPlugin_NETWORK_PLUGIN_KUBENET,
CniPluginsUrl: nbc.CloudSpecConfig.KubernetesSpecConfig.CNIPluginsDownloadURL,
VnetCniPluginsUrl: cs.Properties.OrchestratorProfile.KubernetesConfig.AzureCNIURLLinux,
},
GpuConfig: &aksnodeconfigv1.GpuConfig{
ConfigGpuDriver: true,
GpuDevicePlugin: false,
},
EnableUnattendedUpgrade: true,
KubernetesVersion: cs.Properties.OrchestratorProfile.OrchestratorVersion,
ContainerdConfig: &aksnodeconfigv1.ContainerdConfig{
ContainerdDownloadUrlBase: nbc.CloudSpecConfig.KubernetesSpecConfig.ContainerdDownloadURLBase,
},
OutboundCommand: helpers.GetDefaultOutboundCommand(),
KubernetesCaCert: base64.StdEncoding.EncodeToString([]byte(cs.Properties.CertificateProfile.CaCertificate)),
KubeBinaryConfig: &aksnodeconfigv1.KubeBinaryConfig{
KubeBinaryUrl: cs.Properties.OrchestratorProfile.KubernetesConfig.CustomKubeBinaryURL,
PodInfraContainerImageUrl: nbc.K8sComponents.PodInfraContainerImageURL,
},
KubeProxyUrl: cs.Properties.OrchestratorProfile.KubernetesConfig.CustomKubeProxyImage,
HttpProxyConfig: &aksnodeconfigv1.HttpProxyConfig{
NoProxyEntries: *nbc.HTTPProxyConfig.NoProxy,
},
LocalDnsProfile: &aksnodeconfigv1.LocalDnsProfile{
-EnableLocalDns: true,
+EnableLocalDns: false,
CpuLimitInMilliCores: to.Ptr(int32(2008)),
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 33 out of 85 changed files in this pull request and generated 13 comments.



Copilot AI review requested due to automatic review settings March 13, 2026 19:41
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 89 changed files in this pull request and generated 7 comments.


Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

@awesomenix awesomenix changed the title from "feat: run aks node controller at boot time faster by 30s" to "feat: run aks node controller at boot time faster by 10s" Mar 21, 2026
@awesomenix awesomenix marked this pull request as ready for review March 21, 2026 02:00
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 3 comments.

"SERVICE_ACCOUNT_IMAGE_PULL_DEFAULT_TENANT_ID": config.GetServiceAccountImagePullProfile().GetDefaultTenantId(),
"IDENTITY_BINDINGS_LOCAL_AUTHORITY_SNI": config.GetServiceAccountImagePullProfile().GetLocalAuthoritySni(),
"CSE_TIMEOUT": getCSETimeout(config),
"SKIP_WALA_HOLD": "true",
Contributor

wala lol

nit: SKIP_WAAGENT_HOLD? It's clearer what you're referring to.

return base64.StdEncoding.EncodeToString([]byte(customDataYAML)), nil
}

func writeMIMEPart(writer *multipart.Writer, contentType, content string) error {
Contributor

What does MIME stand for? Can we add a comment explaining it?
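
For context on the question above: MIME stands for Multipurpose Internet Mail Extensions, the format cloud-init uses for multi-part user data. The sketch below shows one way a helper with the `writeMIMEPart` signature could be documented and implemented. The function body, the `buildCustomData` helper, and the part contents are assumptions for illustration, not the code in this PR; the PR's own function additionally base64-encodes the result.

```go
package main

import (
	"bytes"
	"fmt"
	"mime/multipart"
	"net/textproto"
)

// writeMIMEPart appends one part to a MIME (Multipurpose Internet Mail
// Extensions) multipart document. cloud-init reads each part's Content-Type
// header to decide how to handle it, e.g. text/cloud-boothook or
// text/cloud-config. The body here is an assumed implementation.
func writeMIMEPart(writer *multipart.Writer, contentType, content string) error {
	header := textproto.MIMEHeader{}
	header.Set("Content-Type", contentType)
	part, err := writer.CreatePart(header)
	if err != nil {
		return fmt.Errorf("create MIME part %q: %w", contentType, err)
	}
	if _, err := part.Write([]byte(content)); err != nil {
		return fmt.Errorf("write MIME part %q: %w", contentType, err)
	}
	return nil
}

// buildCustomData is a hypothetical helper that assembles a two-part
// document: a boothook part followed by a cloud-config part.
func buildCustomData(boothook, cloudConfig string) (string, error) {
	var customData bytes.Buffer
	writer := multipart.NewWriter(&customData)

	// Top-level headers precede the first boundary.
	fmt.Fprintf(&customData, "MIME-Version: 1.0\r\n")
	fmt.Fprintf(&customData, "Content-Type: multipart/mixed; boundary=%q\r\n\r\n", writer.Boundary())

	if err := writeMIMEPart(writer, "text/cloud-boothook", boothook); err != nil {
		return "", err
	}
	if err := writeMIMEPart(writer, "text/cloud-config", cloudConfig); err != nil {
		return "", err
	}
	// Close writes the terminating boundary.
	if err := writer.Close(); err != nil {
		return "", fmt.Errorf("close multipart writer: %w", err)
	}
	return customData.String(), nil
}

func main() {
	// Placeholder part contents; the real boothook payload is not shown here.
	doc, err := buildCustomData("#!/bin/bash\necho early-start placeholder\n", "#cloud-config\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(doc)
}
```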

-data, err := nodeconfigutils.CustomData(s.Runtime.AKSNodeConfig)
+var data string
+var err error
+if s.VHD.Flatcar {
Contributor

nit: add a comment explaining why this is different for Flatcar?

@@ -1,21 +1,4 @@
echo $(date),$(hostname) > ${PROVISION_OUTPUT};
{{if not .GetDisableCustomData}}
CLOUD_INIT_STATUS_SCRIPT="/opt/azure/containers/cloud-init-status-check.sh";
Collaborator

I understand the effort in this PR is, in part, to remove the hard dependency on cloud-init status being ready. However, cloud-init-status-check.sh was added as a repair item for an intermittent sev2. That doesn't mean we can't remove it; we just need to be aware that removing it could cause intermittent sev2s.

Contributor Author

We run the service before cloud-init has even finished, so waiting for it doesn't make sense.

Contributor Author

Maybe add this as part of provision-wait?

Collaborator

aks-node-controller doesn't depend on it. The status check is more for the provisioning scripts cse_*.sh, which was what I saw from that sev2.

Collaborator

Sync'd offline. Nishchay checked with the original owner of cloud-init-status-check.sh and confirmed this is no longer needed.

"SERVICE_ACCOUNT_IMAGE_PULL_DEFAULT_TENANT_ID": config.GetServiceAccountImagePullProfile().GetDefaultTenantId(),
"IDENTITY_BINDINGS_LOCAL_AUTHORITY_SNI": config.GetServiceAccountImagePullProfile().GetLocalAuthoritySni(),
"CSE_TIMEOUT": getCSETimeout(config),
"SKIP_WALA_HOLD": "true",
Collaborator

@Devinwong Devinwong Mar 23, 2026

Would this not cause the sev2 that @SriHarsha001 encountered with lower end VMs in early Feb?

Contributor Author

We aren't dependent on WAAGENT anymore since we run much earlier.

Collaborator

aks-node-controller doesn't depend on waagent, but the scripts may still depend on it? Anyway, if we can do multiple runs on a lower-end VM, we may get a better understanding of the impact.

Contributor Author

Will test on an A-series VM. Also, we are not dependent on WAagent as in phase 1, since the CSE scripts are not executed as part of waagent. There was a worry that waagent might restart in the middle of CSE execution; right now we only wait.

writer := multipart.NewWriter(&customData)

fmt.Fprintf(&customData, "MIME-Version: 1.0\r\n")
fmt.Fprintf(&customData, "Content-Type: multipart/mixed; boundary=%q\r\n\r\n", writer.Boundary())
Collaborator

Question: what is the reason we need it to be multipart MIME?
IIUC, the first part is the cloud-boothook, which tries to write the aks-node-config to the node ASAP. Is the second part currently only there to print a message? Are we going to use this second part for other purposes in the future?

Contributor Author

@awesomenix awesomenix Mar 23, 2026

The idea is to bundle our current nodecustomdata.yaml. It probably doesn't work for hotfixing, since hotfixing needs to use boothook as well.

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

Comment on lines 3 to +13
 ConditionPathExists=/opt/azure/containers/aks-node-controller-config.json
-After=cloud-init.target
-After=oem-cloudinit.service enable-oem-cloudinit.service
-Wants=cloud-init.target
+After=network-online.target
+Wants=network-online.target

 [Service]
 Type=oneshot
 ExecStart=/opt/azure/containers/aks-node-controller-wrapper.sh
-RemainAfterExit=No
+RemainAfterExit=yes

 [Install]
-WantedBy=cloud-init.target
-WantedBy=oem-cloudinit.service
+WantedBy=basic.target
Copilot AI Mar 23, 2026

Switching the unit to WantedBy=basic.target while keeping ConditionPathExists=/opt/azure/containers/aks-node-controller-config.json can prevent the service from ever running on flows where the config is only written later by cloud-init (e.g., older/custom data that uses write_files). systemd will skip the unit when the condition fails at basic.target time and won’t automatically retry when the file appears. Consider keeping the enable/fallback path tied to cloud-init.target (as before) and rely on the boothook’s explicit systemctl start for early-start, or add a path/trigger that starts the service when the config file is created.
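
The comment above notes that systemd evaluates `ConditionPathExists` once and does not retry when the file appears later. As a loose illustration of the alternative it hints at, the sketch below polls for the config file with a bounded wait. This swaps the suggested systemd path trigger for a simple polling loop; `waitForFile` and `waitForFileTimeout` are hypothetical helpers, the timeout and interval are arbitrary, and none of this is taken from aks-node-controller.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os"
	"time"
)

// waitForFile polls until path exists or ctx is done.
// Hypothetical helper for illustration only.
func waitForFile(ctx context.Context, path string, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		_, err := os.Stat(path)
		if err == nil {
			return nil
		}
		// Any error other than "does not exist" is treated as fatal.
		if !errors.Is(err, os.ErrNotExist) {
			return fmt.Errorf("stat %s: %w", path, err)
		}
		select {
		case <-ctx.Done():
			return fmt.Errorf("waiting for %s: %w", path, ctx.Err())
		case <-ticker.C:
		}
	}
}

// waitForFileTimeout bounds the wait with a timeout.
func waitForFileTimeout(path string, timeout, interval time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	return waitForFile(ctx, path, interval)
}

func main() {
	// Path copied from the unit's ConditionPathExists line.
	const configPath = "/opt/azure/containers/aks-node-controller-config.json"
	if err := waitForFileTimeout(configPath, 2*time.Second, 100*time.Millisecond); err != nil {
		fmt.Println("config not found:", err)
		return
	}
	fmt.Println("config present, provisioning could start")
}
```

A systemd `.path` unit, as the comment suggests, would avoid polling altogether; the loop here only makes the retry behaviour explicit.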

Comment on lines +129 to 131
systemctl daemon-reload
systemctl enable aks-node-controller-hack.service
`
Copilot AI Mar 23, 2026

The boothook creates and enables aks-node-controller-hack.service but never starts it. systemctl enable does not start the unit, and depending on when the boothook runs relative to basic.target, the unit may not run on the first boot at all. To mirror the main boothook flow and make this deterministic, start the unit explicitly (e.g., systemctl start --no-block ...) or use systemctl enable --now ....

Suggested change
 systemctl daemon-reload
-systemctl enable aks-node-controller-hack.service
+systemctl enable --now aks-node-controller-hack.service
 `

5 participants