Copilot AI commented Oct 28, 2025

Pods get stuck in a delete loop when the HNS endpoint is never created during a failed ADD call in Stateless CNI Windows SwiftV1.

Change

Modified deleteEndpointImpl in network/endpoint_windows.go to return nil instead of an error when ep.HnsId is empty:

if ep.HnsId == "" {
    logger.Info("No HNS id found. Skip endpoint deletion", zap.Any("nicType", ep.NICType), zap.String("containerId", ep.ContainerID))
    return nil  // Previously returned error
}

This matches the existing behavior in both deleteEndpointImplHnsV1 and deleteEndpointImplHnsV2, where missing endpoints return nil rather than blocking deletion. The log level changes from Error to Info because an empty HNS ID is now treated as an expected state.

Original prompt

This section details the original issue you should resolve.

<issue_title>Return non-nil in case there is no HNS id found for an endpoint.</issue_title>
<issue_description>What happened:
The pod gets stuck in a delete loop when the HNS ID for the endpoint is never created in a failed ADD call.

The issue is related to this line:

return fmt.Errorf("No HNS id found. Skip endpoint deletion for nicType %v, containerID %s", ep.NICType, ep.ContainerID) //nolint

This should return nil instead of error.
What you expected to happen:
The endpoint should be deletable if the HNS ID was never created for the pod.

How to reproduce it:
Have an ADD call that fails to create the HNS endpoint in Stateless CNI Windows SwiftV1.

Orchestrator and Version (e.g. Kubernetes, Docker):

Operating System (Linux/Windows):

Kernel (e.g. uname -a for Linux or $(Get-ItemProperty -Path "C:\windows\system32\hal.dll").VersionInfo.FileVersion for Windows):

Anything else we need to know?:
[Miscellaneous information that will assist in solving the issue.]
</issue_description>

Comments on the Issue (you are @copilot in this section)


Copilot AI and others added 2 commits October 28, 2025 01:42
…oint deletion

Co-authored-by: behzad-mir <13154712+behzad-mir@users.noreply.github.com>
Co-authored-by: behzad-mir <13154712+behzad-mir@users.noreply.github.com>
Copilot AI changed the title [WIP] Fix return value for missing HNS id in endpoint deletion Return nil when HNS endpoint ID is empty during deletion Oct 28, 2025
Copilot AI requested a review from behzad-mir October 28, 2025 01:48
Copilot finished work on behalf of behzad-mir October 28, 2025 01:48
@behzad-mir behzad-mir marked this pull request as ready for review October 28, 2025 01:54
@behzad-mir behzad-mir requested a review from QxBytes as a code owner October 28, 2025 01:54
Copilot AI review requested due to automatic review settings October 28, 2025 01:54
@behzad-mir behzad-mir requested review from a team and santhoshmprabhu as code owners October 28, 2025 01:54
Copilot AI left a comment

Pull Request Overview

This PR resolves an issue where pods get stuck in a delete loop when HNS endpoints are never created during failed ADD calls in Stateless CNI Windows SwiftV1. The fix changes the error handling to treat missing HNS endpoint IDs as a valid state rather than a blocking error.

  • Changes deleteEndpointImpl to return nil instead of an error when HnsId is empty
  • Downgrades log level from Error to Info since missing HNS ID is now expected
  • Aligns behavior with existing deleteEndpointImplHnsV1 and deleteEndpointImplHnsV2 implementations

@behzad-mir behzad-mir left a comment

Remove the changes from go.sum.
Only endpoint_windows.go needs to change.

@behzad-mir behzad-mir enabled auto-merge October 28, 2025 21:39
@paulyufan2

/azp run Azure Container Networking PR

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@behzad-mir behzad-mir added this pull request to the merge queue Oct 29, 2025
Any commits made after this event will not be merged.
@paulyufan2 paulyufan2 removed this pull request from the merge queue due to a manual request Oct 29, 2025
@paulyufan2 paulyufan2 added this pull request to the merge queue Oct 29, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 29, 2025
@paulyufan2 paulyufan2 added this pull request to the merge queue Oct 30, 2025
Development

Successfully merging this pull request may close these issues.

Return non-nil in case there is no HNS id found for an endpoint.

4 participants