Fix bug in CHASM activity schedule-to-close timer task validation #8720

dandavison · 2025-11-29T01:26:34Z

What changed?

- Fix bug: the schedule-to-close timer task validator incorrectly required the activity attempt at task execution time to equal the activity attempt at task creation time.
- Add a test of schedule-to-close timeout that fails when the bug fix is reverted.
- Do not set an empty struct as the outcome failure on attempt failure when retries are exhausted.
- Improve doc comments.

Why?

- The standalone activity schedule-to-close timeout was incorrect: without this fix it would not have fired once the activity had moved past attempt 1.
- Setting an empty struct on attempt failure when retries are exhausted should not be necessary, and it is fragile to introduce special values that code might start to rely on.

How did you test it?

built
added new functional test(s)

Note

Decouples schedule-to-close timeout from attempt matching, updates proto to empty task payload, and adds a retry-based timeout test.

Activity execution:
- Relax schedule-to-close timeout validation to only require TransitionTimedOut to be possible (remove attempt check) in activity_tasks.go.
- When scheduling ScheduleToCloseTimeoutTask, stop setting Attempt in statemachine.go.
Proto/Generated code:
- Make activity.proto.v1.ScheduleToCloseTimeoutTask an empty message; regenerate Go (tasks.pb.go).
Tests:
- Add TestScheduleToCloseTimeout_WithRetry to verify schedule-to-close timeout across a retry.

Written by Cursor Bugbot for commit 58f13fc.

dandavison · 2025-11-29T01:28:02Z

chasm/lib/activity/library.go

 func (l *library) Tasks() []*chasm.RegistrableTask {
 	return []*chasm.RegistrableTask{
-		chasm.NewRegistrableSideEffectTask[*Activity, *activitypb.ActivityDispatchTask](
+		chasm.NewRegistrableSideEffectTask(


I removed this because these types are inferred.

I had added it cuz my IDE had trouble inferring it, though everything compiles.

Ah OK. Well shout out if you think we should keep it. I'd prefer to have the code be in line with Go rather than tracking IDE deficiencies, but then my IDE doesn't have a problem with it :)

yea we can remove it. Just a slight annoyance for me, hopefully the IDE will address the issue soon.
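For readers unfamiliar with the language feature under discussion, here is a minimal sketch of Go type-argument inference. `register`, `activity`, `dispatchTask`, and `dispatch` are hypothetical stand-ins and not the real `chasm` API; the sketch only illustrates that explicit and inferred type arguments produce the same instantiation when the arguments determine the type parameters.

```go
package main

import "fmt"

// registration is a hypothetical record of what was registered.
type registration struct {
	name string
	kind string
}

// register is a hypothetical generic constructor. C and T can be inferred
// from the type of the execute argument, so callers may omit them.
func register[C, T any](name string, execute func(C, T) error) registration {
	var c C
	var t T
	return registration{name: name, kind: fmt.Sprintf("%T/%T", c, t)}
}

// Hypothetical component and task types.
type activity struct{}
type dispatchTask struct{}

// dispatch has the shape func(C, T) error with C=*activity, T=*dispatchTask.
func dispatch(*activity, *dispatchTask) error { return nil }

func main() {
	explicit := register[*activity, *dispatchTask]("dispatch", dispatch)
	inferred := register("dispatch", dispatch) // type arguments inferred
	fmt.Println(explicit == inferred)          // prints "true"
}
```

Both calls compile to the same instantiation, so dropping the explicit type arguments is a readability choice rather than a behavior change.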

dandavison · 2025-11-29T01:28:40Z

chasm/lib/activity/activity_tasks.go

-
-	valid := TransitionTimedOut.Possible(activity) && task.Attempt == attempt.Count
-	return valid, nil
+	return TransitionTimedOut.Possible(activity), nil


This is the bug fix

I believe you're right, since there could be multiple retry attempts within a single scheduleToClose task's lifetime. Good catch.
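The effect of the removed attempt check can be sketched in isolation. Everything below is a simplified model under stated assumptions: `activity`, `timeoutTask`, and `canTimeOut` are hypothetical stand-ins (the last one for `TransitionTimedOut.Possible`), not the real CHASM types.

```go
package main

import "fmt"

// activity is a hypothetical model holding only what the validators need.
type activity struct {
	attempt  int32 // current attempt count
	terminal bool  // true once the activity has reached a closed state
}

// timeoutTask is a hypothetical timer task; attempt is the attempt count
// captured when the task was created.
type timeoutTask struct {
	attempt int32
}

// canTimeOut stands in for TransitionTimedOut.Possible(activity).
// Assumption: timing out is possible only while the activity is not closed.
func canTimeOut(a activity) bool { return !a.terminal }

// validateWithAttemptCheck mirrors the old rule: the task is dropped as
// invalid whenever the activity has retried since the task was created.
func validateWithAttemptCheck(a activity, t timeoutTask) bool {
	return canTimeOut(a) && t.attempt == a.attempt
}

// validateWithoutAttemptCheck mirrors the fixed rule: the task stays valid
// across retries, because the timeout spans the whole sequence of attempts.
func validateWithoutAttemptCheck(a activity, _ timeoutTask) bool {
	return canTimeOut(a)
}

func main() {
	task := timeoutTask{attempt: 1} // created when the activity was scheduled
	current := activity{attempt: 2} // one retry later, when the timer fires

	fmt.Println(validateWithAttemptCheck(current, task))    // prints "false"
	fmt.Println(validateWithoutAttemptCheck(current, task)) // prints "true"
}
```

With the attempt check in place, the timer task created at attempt 1 is rejected as soon as the activity retries, which matches the PR description's statement that the timeout would not have fired after attempt 1.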

fretz12 · 2025-11-29T16:47:34Z

chasm/lib/activity/activity_tasks.go

I believe you're right, since there could be multiple retry attempts within a single scheduleToClose task's lifetime. Good catch.


fretz12 · 2025-11-29T16:51:46Z

chasm/lib/activity/library.go

I had added it cuz my IDE had trouble inferring it, though everything compiles.

fretz12 · 2025-11-29T16:54:08Z

chasm/lib/activity/proto/v1/tasks.proto


+// ScheduleToCloseTimeoutTask is a pure task that enforces a timeout across the sequence of activity
+// attempts.
 message ScheduleToCloseTimeoutTask {


I believe you can remove this altogether now. In place of the interface arg you can use _ any

Keep it because we will have validation later when we support activity resets.

fretz12 · 2025-11-29T16:55:09Z

Thanks for catching this.

dandavison · 2025-11-29T22:33:54Z

chasm/lib/activity/activity.go

-	// If the activity has exhausted retries, mark the outcome failure as well but don't store duplicate failure info.
-	// Also reset the retry interval as there won't be any more retries.
 	if noRetriesLeft {
-		outcome.Variant = &activitypb.ActivityOutcome_Failed_{}


@fretz12 thanks for reviewing. After you reviewed, I added this commit which gets rid of this set-to-empty-struct on this line. Would you mind looking and seeing if you agree that it's unnecessary? I'd prefer not to do it because I don't want code relying on a special value set here -- I feel that the code should just be able to take the failure from the right place without any special empty value here. 0cbe6a9

I'm ok with that as long as the outcome is filled out on any GET API responses if an activity has reached terminal state, whether from the Attempt or Outcome field stored internally.

I've removed this change from the PR so that we can discuss it separately.

bergundy

The test you added is okay but I am slightly worried that it will be flaky. I would recommend unit testing instead where you have more control over timing.

bergundy · 2025-12-01T14:44:55Z

chasm/lib/activity/proto/v1/tasks.proto


Keep it because we will have validation later when we support activity resets.

bergundy · 2025-12-01T14:52:18Z

tests/standalone_activity_test.go

 	require.Error(t, err)
 }

+func (s *standaloneActivityTestSuite) Test_ScheduleToCloseTimeout_WithRetry() {


This test could be flaky when CI is under load.
2 seconds is fairly short to ensure at least one attempt is issued.

bergundy · 2025-12-01T14:53:22Z

tests/standalone_activity_test.go

+			Message: "Retryable failure",
+			FailureInfo: &failurepb.Failure_ApplicationFailureInfo{ApplicationFailureInfo: &failurepb.ApplicationFailureInfo{
+				NonRetryable:   false,
+				NextRetryDelay: durationpb.New(1 * time.Second),


Depending on timing, this may prevent the schedule to close timeout from firing because we would know there's not enough time for the next attempt and avoid scheduling it.

We should have a test for this behavior if we don't yet.
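The timing concern can be made concrete with a small sketch. `shouldScheduleRetry` is a hypothetical helper, not a function from the codebase, and the rule it encodes (skip the retry when the next attempt could not start before the schedule-to-close deadline) is an assumption based on the comment above. The 2-second value is assumed to be the schedule-to-close timeout referred to in the earlier flakiness comment.

```go
package main

import (
	"fmt"
	"time"
)

// shouldScheduleRetry is a hypothetical helper. elapsed is the time since the
// activity was scheduled. It reports whether another attempt could start
// before the schedule-to-close deadline.
func shouldScheduleRetry(elapsed, scheduleToClose, nextRetryDelay time.Duration) bool {
	if scheduleToClose <= 0 {
		// Assumption: a non-positive value means no schedule-to-close timeout.
		return true
	}
	return elapsed+nextRetryDelay < scheduleToClose
}

func main() {
	const scheduleToClose = 2 * time.Second // assumed timeout from the test
	const retryDelay = 1 * time.Second      // NextRetryDelay from the test

	// First attempt fails quickly: the retry fits before the deadline.
	fmt.Println(shouldScheduleRetry(500*time.Millisecond, scheduleToClose, retryDelay)) // prints "true"

	// First attempt fails late (for example under CI load): the retry would
	// start after the deadline, so it would not be scheduled.
	fmt.Println(shouldScheduleRetry(1500*time.Millisecond, scheduleToClose, retryDelay)) // prints "false"
}
```

Under this model, which branch the test exercises depends on how quickly the first attempt fails, which is the source of the timing sensitivity.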

bergundy · 2025-12-01T14:55:15Z

tests/standalone_activity_test.go

+
 // TestStartToCloseTimeout tests that a start-to-close timeout is recorded after the activity is started.
-func (s *standaloneActivityTestSuite) TestStartToCloseTimeout() {
+func (s *standaloneActivityTestSuite) Test_StartToCloseTimeout() {


We typically don't put an underscore after the word Test in the codebase.

Suggested change

-func (s *standaloneActivityTestSuite) Test_StartToCloseTimeout() {
+func (s *standaloneActivityTestSuite) TestStartToCloseTimeout() {

dandavison · 2025-12-01T15:08:01Z

> The test you added is okay but I am slightly worried that it will be flaky. I would recommend unit testing instead where you have more control over timing.

Agreed, I'm aware that some of the functional tests I've been writing involving timer tasks may be flaky. Let's address this in follow-on PRs. (They are doing their immediate job of verifying that the intended algorithm has been implemented.)
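One way a follow-on unit test could gain control over timing is to inject a manually advanced clock. This is a sketch only: `manualClock` and `timedOut` are hypothetical names, and the codebase may already provide its own fake time source.

```go
package main

import (
	"fmt"
	"time"
)

// manualClock is a hypothetical fake time source that moves only when told to.
type manualClock struct{ now time.Time }

func (c *manualClock) Now() time.Time          { return c.now }
func (c *manualClock) Advance(d time.Duration) { c.now = c.now.Add(d) }

// timedOut reports whether the schedule-to-close deadline has been reached
// according to the injected clock.
func timedOut(c *manualClock, scheduledAt time.Time, scheduleToClose time.Duration) bool {
	return !c.Now().Before(scheduledAt.Add(scheduleToClose))
}

func main() {
	start := time.Date(2025, 12, 1, 0, 0, 0, 0, time.UTC)
	clock := &manualClock{now: start}

	clock.Advance(1900 * time.Millisecond)
	fmt.Println(timedOut(clock, start, 2*time.Second)) // prints "false"

	clock.Advance(100 * time.Millisecond)
	fmt.Println(timedOut(clock, start, 2*time.Second)) // prints "true"
}
```

Because the clock never moves on its own, the assertions do not depend on scheduler latency or CI load.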

dandavison requested review from a team as code owners November 29, 2025 01:26

dandavison commented Nov 29, 2025

dandavison requested review from bergundy and fretz12 November 29, 2025 01:30

dandavison mentioned this pull request Nov 29, 2025

Add timeout tasks for Standalone Activities #8573

Merged

5 tasks

fretz12 approved these changes Nov 29, 2025

dandavison commented Nov 29, 2025

bergundy approved these changes Dec 1, 2025

Base automatically changed from saa-id-policy to standalone-activity December 5, 2025 23:52

Fix ScheduleToClose bug

4325002

dandavison force-pushed the saa-schedule-to-close-bug branch from 333da11 to 4325002 Compare December 6, 2025 00:11

Lint

58f13fc

dandavison merged commit 8ec58d8 into standalone-activity Dec 6, 2025
11 checks passed

dandavison deleted the saa-schedule-to-close-bug branch December 6, 2025 00:21
