
Conversation

@henrybear327 (Contributor) commented Jan 2, 2026

This change makes watch IDs unique per etcd member rather than per
stream, which improves debugging by allowing storage to identify
clients by watch IDs across connections.

The auto-assignment logic now uses a global atomic counter while
still checking for conflicts with user-specified IDs on the same
stream, so it never allocates an ID that has already been occupied
by a user request.
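
A minimal Go sketch of the auto-assignment described above, using the identifiers that appear in the diff reviewed later in this thread (nextWatchID, nextWatchIDMutex, ws.watchers); the trimmed watchStream and WatchID types are placeholders, and the actual change may differ in detail:

```go
package watchid

import "sync"

// WatchID and watcher are trimmed placeholders for the mvcc types.
type WatchID int64
type watcher struct{}

// watchStream keeps only the per-stream watcher map relevant to ID assignment.
type watchStream struct {
	watchers map[WatchID]*watcher
}

// Member-wide counter shared by every watch stream; guarded by a mutex here,
// matching the diff reviewed later in the thread.
var (
	nextWatchIDMutex sync.Mutex
	nextWatchID      int64
)

// autoAssignID hands out the next member-wide ID, skipping any ID a
// user request has already claimed on this particular stream.
func (ws *watchStream) autoAssignID() WatchID {
	nextWatchIDMutex.Lock()
	defer nextWatchIDMutex.Unlock()
	for ws.watchers[WatchID(nextWatchID)] != nil {
		nextWatchID++
	}
	id := WatchID(nextWatchID)
	nextWatchID++
	return id
}
```

Note that the review discussion below flags a gap in this check: ws.watchers is per stream, so a user-specified ID on another stream can still collide with an auto-assigned one.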

Continuation of the work in #20282.

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: henrybear327
Once this PR has been reviewed and has the lgtm label, please assign siyuanfoundation for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@henrybear327 henrybear327 requested review from fuweid and serathius and removed request for fuweid January 2, 2026 05:02
@henrybear327 (Contributor, Author) commented Jan 2, 2026

@serathius Is there a reason we should keep tracking the watchID per stream?

This PR doesn't yet address the case where a user can allocate the same watchID across different streams.

@henrybear327 henrybear327 self-assigned this Jan 2, 2026
@henrybear327 henrybear327 requested a review from fuweid January 2, 2026 05:05
@codecov (bot) commented Jan 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.57%. Comparing base (a438759) to head (84e3ae3).

Additional details and impacted files
Files with missing lines | Coverage | Δ
server/storage/mvcc/watcher.go | 100.00% <100.00%> | (+3.27%) ⬆️

... and 21 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #21069      +/-   ##
==========================================
+ Coverage   68.44%   68.57%   +0.12%     
==========================================
  Files         429      429              
  Lines       35281    35293      +12     
==========================================
+ Hits        24149    24201      +52     
+ Misses       9732     9699      -33     
+ Partials     1400     1393       -7     

Continue to review full report in Codecov by Sentry.

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a438759...84e3ae3. Read the comment docs.


@henrybear327 henrybear327 force-pushed the pr-20282 branch 3 times, most recently from ab6ee28 to cd94024, January 3, 2026 04:14
@henrybear327 henrybear327 requested a review from serathius January 3, 2026 04:15
@henrybear327 (Contributor, Author) commented:

/retest

@henrybear327 henrybear327 changed the title from "Watch: make watch ID unique per member" to "[Watch] Make watch ID unique per member" Jan 3, 2026
@serathius (Member) commented:

LGTM

/cc @fuweid @ahrtr

@k8s-ci-robot k8s-ci-robot requested a review from ahrtr January 3, 2026 07:33
Comment on lines 131 to 138
+	// ensure that the nextWatchID check is done in a critical section
+	nextWatchIDMutex.Lock()
+	for ws.watchers[WatchID(nextWatchID)] != nil {
+		nextWatchID++
+	}
-	id = ws.nextID
-	ws.nextID++
+	id = WatchID(nextWatchID)
+	nextWatchID++
+	nextWatchIDMutex.Unlock()
Review comment (Member):

The logic is problematic. The nextWatchID is global, but ws.watchers is local to each watchStream. It can't guarantee unique watchID across all watch streams.

Review comment (Member):

Also, beyond that, multiple etcd servers may still end up with the same watch IDs.

The follow-up commit message describes the fix:
The previous implementation checked ws.watchers (per-stream) against
the global nextWatchID counter, which could not guarantee unique watch
IDs across all streams on a member.

Bug scenario:
- Stream A: User specifies ID=5 → added to ws_A.watchers[5]
- Stream B: Auto-assign when counter=5 → checks ws_B.watchers[5] (nil)
- Result: Both streams have ID=5, violating per-member uniqueness

Fix: Add a global registry (assignedWatchIDs) with reference counting
that tracks all watch IDs across all streams. The auto-assignment loop
now checks this global registry instead of the per-stream map.

Reference counting is needed because user-specified IDs can be
duplicated across streams (for backward compatibility), so we only
mark an ID as free when no stream is using it.

Signed-off-by: Chun-Hung Tseng <henrytseng@google.com>
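
A minimal sketch of the reference-counted registry this commit message describes, assuming hypothetical helper names (acquireWatchID, releaseWatchID, autoAssignWatchID) and a trimmed WatchID type; only assignedWatchIDs and nextWatchID are named in the source:

```go
package main

import (
	"fmt"
	"sync"
)

// WatchID is a trimmed placeholder for the mvcc watch ID type.
type WatchID int64

// Member-wide state: assignedWatchIDs and nextWatchID are the names used in
// the commit message; the mutex and helper functions are illustrative.
var (
	watchIDMu        sync.Mutex
	assignedWatchIDs = map[WatchID]int{} // id -> number of streams using it
	nextWatchID      int64
)

// acquireWatchID registers a user-specified ID. Duplicates across streams are
// allowed for backward compatibility, hence the reference count.
func acquireWatchID(id WatchID) {
	watchIDMu.Lock()
	defer watchIDMu.Unlock()
	assignedWatchIDs[id]++
}

// releaseWatchID drops one reference; the ID only becomes free for
// auto-assignment once no stream uses it anymore.
func releaseWatchID(id WatchID) {
	watchIDMu.Lock()
	defer watchIDMu.Unlock()
	if assignedWatchIDs[id]--; assignedWatchIDs[id] <= 0 {
		delete(assignedWatchIDs, id)
	}
}

// autoAssignWatchID advances the member-wide counter past every ID currently
// in use on any stream, then claims the first free one.
func autoAssignWatchID() WatchID {
	watchIDMu.Lock()
	defer watchIDMu.Unlock()
	for assignedWatchIDs[WatchID(nextWatchID)] > 0 {
		nextWatchID++
	}
	id := WatchID(nextWatchID)
	nextWatchID++
	assignedWatchIDs[id] = 1
	return id
}

// main replays the bug scenario from the commit message: a user-specified
// ID 5 on stream A must not be handed out again by auto-assignment on stream B.
func main() {
	acquireWatchID(5)                // stream A: user-specified ID 5
	nextWatchID = 5                  // suppose the counter has reached 5
	fmt.Println(autoAssignWatchID()) // stream B: skips the in-use 5, prints 6
	releaseWatchID(5)                // stream A closes; ID 5 is free again
}
```

Checking the global registry instead of the per-stream ws.watchers map is what closes the Stream A / Stream B collision described above.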
@k8s-ci-robot commented:

@henrybear327: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-etcd-coverage-report | 84e3ae3 | link | true | /test pull-etcd-coverage-report

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
