@Nikita-Shupletsov
Contributor

This PR fixes a bug where Kafka Streams (KS) doesn't close stores if shutdown is triggered during a rebalance in which an active task is converted to a standby task and put into pendingTasksToInit.

  • Added logic to close pending tasks to init.
  • Made standby task closure similar to the one for active tasks.
  • Added a separate method for getting standby tasks from task registry.
  • Added an integration test that reproduces the issue.

Reviewers: Matthias J. Sax <matthias@confluent.io>
Conflicts:
streams/src/test/java/org/apache/kafka/streams/processor/internals/TaskManagerTest.java
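The shutdown behaviour described above can be illustrated with a minimal sketch. It is not the actual Kafka Streams code: `SketchTask`, `SketchTaskRegistry`, and all of their methods are hypothetical stand-ins, and only the name `pendingTasksToInit` comes from the description. The sketch assumes that a task recycled from active to standby sits in a pending-to-init collection until it is initialized, so shutdown has to drain and close that collection as well as the running tasks.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a task; not Kafka Streams' real Task interface.
final class SketchTask {
    final String id;
    final boolean active;
    boolean storesClosed = false;

    SketchTask(String id, boolean active) {
        this.id = id;
        this.active = active;
    }

    // Stands in for closing the task and its state stores.
    void closeClean() {
        storesClosed = true;
    }
}

// Hypothetical stand-in for a task registry; names are illustrative only.
final class SketchTaskRegistry {
    private final Map<String, SketchTask> running = new LinkedHashMap<>();
    private final List<SketchTask> pendingTasksToInit = new ArrayList<>();

    void addRunning(SketchTask task) {
        running.put(task.id, task);
    }

    void addPendingToInit(SketchTask task) {
        pendingTasksToInit.add(task);
    }

    // Separate accessor for standby tasks, mirroring the idea in the description.
    List<SketchTask> standbyTasks() {
        List<SketchTask> result = new ArrayList<>();
        for (SketchTask task : running.values()) {
            if (!task.active) {
                result.add(task);
            }
        }
        return result;
    }

    // Removes and returns every task that is still waiting to be initialized.
    List<SketchTask> drainPendingTasksToInit() {
        List<SketchTask> drained = new ArrayList<>(pendingTasksToInit);
        pendingTasksToInit.clear();
        return drained;
    }

    // Closes pending-to-init tasks first, then all running tasks.
    void shutdown() {
        for (SketchTask task : drainPendingTasksToInit()) {
            task.closeClean();
        }
        for (SketchTask task : running.values()) {
            task.closeClean();
        }
        running.clear();
    }
}
```

In this sketch, a task added through `addPendingToInit` would be skipped if `shutdown()` only iterated over `running`, which is the kind of leak the description refers to.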

@Nikita-Shupletsov
Contributor Author

cherry pick of #21365

Co-authored-by: Matthias J. Sax <mjsax@apache.org>
@clolov merged commit f50ddc4 into apache:4.2 on Feb 9, 2026
3 checks passed
@clolov
Contributor

clolov commented Feb 10, 2026

For visibility: after this PR, StreamThreadTest.shouldRecordCommitLatency(boolean, boolean) with stateUpdaterEnabled=false, processingThreadsEnabled=false is failing on 4.2. I will try to fix it today to unblock 4.2.

@clolov
Contributor

clolov commented Feb 10, 2026

I applied f9969a0 inspired by 50e2ffb. I then fixed an issue I introduced 😅 (76a7e18). We should be good to go for 4.2, but do let me know in case I have missed something obvious!

@mjsax
Member

mjsax commented Feb 10, 2026

Thanks -- but this PR did pass the CI run, right? -- Was the test failing 100% or was it flaky?

@Nikita-Shupletsov
Contributor Author

@clolov thank you for fixing this after me.
@mjsax the test is not flaky; it looks like we didn't run any tests in the approval workflow. I'm taking a look at why.
