
docs: update blob size limit error troubleshooting page#4379

Open
lennessyy wants to merge 3 commits into main from feat/blob-size-limit-error-update

Conversation

lennessyy (Contributor) commented Apr 2, 2026

Summary

  • Updates the blob size limit error troubleshooting page to recommend External Storage as the primary solution for large payloads
  • Reorders resolution strategies to put External Storage first, followed by compression and batching
  • Adds link to the Python SDK large payload storage guide

Split from #4333 (PR 3 of 3).

Test plan

  • Verify the blob-size-limit-error.mdx page renders correctly
  • Confirm links to External Storage and Python SDK pages resolve properly

🤖 Generated with Claude Code

Attachments: EDU-6149 docs: update blob size limit error troubleshooting page

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@lennessyy lennessyy requested a review from a team as a code owner April 2, 2026 01:22
vercel bot commented Apr 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| temporal-documentation | Error | Error | Apr 3, 2026 8:17pm |

github-actions bot (Contributor) commented Apr 2, 2026

📖 Docs PR preview links

There are multiple strategies you can use to avoid this error:

1. Use compression with a [custom payload codec](/payload-codec) for large payloads.
1. (Recommended) Use [External Storage](/external-storage) to offload large payloads to an object store like S3.
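The compression strategy mentioned above can be sketched as follows. This is an illustrative stand-in, not the real Temporal API: payloads are modeled as plain dicts, whereas an actual custom Payload Codec would subclass `temporalio.converter.PayloadCodec` and operate on protobuf `Payload` objects.

```python
import zlib

# Illustrative sketch of a compressing codec. Payloads are modeled as
# {"metadata": dict, "data": bytes}; the "binary/zlib" encoding key is
# a hypothetical marker, not an official Temporal convention.

def encode(payloads):
    """Compress each payload's data and tag it so decode can recognize it."""
    return [
        {
            "metadata": {**p["metadata"], "encoding": "binary/zlib"},
            "data": zlib.compress(p["data"]),
        }
        for p in payloads
    ]

def decode(payloads):
    """Reverse the encoding for payloads tagged as compressed; pass others through."""
    out = []
    for p in payloads:
        if p["metadata"].get("encoding") == "binary/zlib":
            meta = {k: v for k, v in p["metadata"].items() if k != "encoding"}
            out.append({"metadata": meta, "data": zlib.decompress(p["data"])})
        else:
            out.append(p)
    return out
```

Note that compression only buys headroom: a sufficiently large or incompressible payload will still exceed the limit, which is why the page orders External Storage first.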
A Contributor commented:

"Recommended" is a bit strong given that it's in pre-release. But if you add a pre-release note, that could address gaps in terms of language support and stability

lennessyy (Contributor, Author) replied:

I'll keep this as the first item but remove (Recommended) for now. We can add it back either during Public Preview or GA

title: Troubleshoot the blob size limit error
sidebar_label: Blob size limit error
description: The BlobSizeLimitError occurs when a Workflow's payload exceeds the 2 MB request limit or the 4 MB Event History transaction limit set by Temporal. Reduce blob size via compression or batching.
drewhoskins-temporal (Contributor) commented Apr 2, 2026:

Hmm, I know you aren't adding this, so no need to act now, but I wonder if anything about this is still accurate. If it's referring to server-side errors, I think we throw GrpcMessageTooLarge in the latter case and BadAttributes in the former case (but we can check with eng to be sure). At least that's what I discussed with @simvlad and @jmaeagle99, but maybe I'm confused about what this means.

Update: according to https://github.com/search?q=org%3Atemporalio%20BlobSizeLimitError&type=code, I think "BlobSizeLimitError" is the actual size limit rather than the error that's thrown...

simvlad commented Apr 2, 2026:

If a payload exceeds 2 MB, we terminate the workflow with one of the `Bad*` errors, like the following:

BadScheduleActivityAttributes: ScheduleActivityTaskCommandAttributes.Input exceeds size limit.

but this is specific to the large payloads in the activity input. For example, we also throw WORKFLOW_TASK_FAILED_CAUSE_BAD_UPDATE_WORKFLOW_EXECUTION_MESSAGE.

As for the GrpcMessageTooLarge, it would be on the client and look like this:

rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5767664 vs. 4194304)

- [Python: Large payload storage](/develop/python/data-handling/large-payload-storage)

2. Break larger batches of commands into smaller batch sizes:
2. Use compression with a [custom Payload Codec](/payload-codec) for large payloads. This addresses the immediate issue,
A Contributor suggested:

Suggested change
2. Use compression with a [custom Payload Codec](/payload-codec) for large payloads. This addresses the immediate issue,
2. Use compression with a [custom Payload Codec](/payload-codec) for large payloads. But even if this addresses the immediate issue,

1. (Recommended) Use [External Storage](/external-storage) to offload large payloads to an object store like S3.
Currently available in the [Python SDK](/develop/python/data-handling/large-payload-storage). When a payload exceeds a size threshold,
a storage driver uploads it to your external store and replaces it with a small reference token in the Event History.
Your Workflow and Activity code doesn't need to change. Even if your payloads are within the limit today, consider
A Contributor commented:

nit: "Even if your payloads are within the limit today" -- the user probably wouldn't be in this doc if that were the case. =) But I like the advice in general.

lennessyy (Contributor, Author) replied:

I was thinking they may have multiple Workflows. They ran into this error on one of them, but this could nudge them to refactor others even if those aren't reporting errors now.

implementing External Storage if their size could grow over time.
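The reference-token mechanism the snippet describes (upload large payloads, keep a small token in the Event History) is the classic claim check pattern. A minimal sketch, using an in-memory dict as a stand-in for an object store like S3 and a hypothetical 2 MB threshold; all names here are illustrative, not the Python SDK driver's actual interface:

```python
import hashlib

THRESHOLD = 2 * 1024 * 1024  # hypothetical payload size threshold (2 MB)

object_store = {}  # stand-in for an external object store such as S3

def offload_if_large(data: bytes) -> bytes:
    """If the payload exceeds the threshold, upload it to the store and
    return a small reference token; otherwise pass it through unchanged."""
    if len(data) <= THRESHOLD:
        return data
    key = hashlib.sha256(data).hexdigest()
    object_store[key] = data
    return b"ref:" + key.encode()

def resolve(data: bytes) -> bytes:
    """Fetch the original payload back when a reference token is seen."""
    if data.startswith(b"ref:"):
        return object_store[data[4:].decode()]
    return data
```

In the real feature this interception happens inside the SDK's data conversion layer, which is why Workflow and Activity code doesn't need to change.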

- This addresses the immediate issue of the blob size limit; however, if blob sizes continue to grow this problem can arise again.
For SDK-specific guides, see:
A Contributor commented:

I think it's too early to wipe the old advice: (a) because it's pre-release with limited language support, and (b) since we don't provide the ability to create payload handles yet, the advice to "Pass references to the stored payloads within the Workflow instead of the actual data" is still valid no matter what.

Maybe we just list these as alternatives?

2. Retrieve the payloads from the object store when needed during execution.
2. Introduce brief pauses or sleeps between batches.
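The batching advice in the snippet above can be sketched as follows. `process_in_batches` is a hypothetical helper, not an SDK API; in a real Temporal Python Workflow you would await a durable timer (`asyncio.sleep` inside the Workflow) rather than blocking with `time.sleep()`:

```python
import time

def process_in_batches(items, batch_size=50, pause_s=0.0, handle=print):
    """Split a large set of commands into smaller batches, with a brief
    pause between batches so no single Workflow Task response carries
    all of them at once. Returns the number of batches processed."""
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    for i, batch in enumerate(batches):
        for item in batch:
            handle(item)
        # Pause between batches (not after the last one).
        if pause_s and i < len(batches) - 1:
            time.sleep(pause_s)
    return len(batches)
```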

## Workflow termination due to oversized response
simvlad commented Apr 2, 2026:

Actually, I tried it again. When the workflow task response exceeds the 4 MB gRPC limit, the workflow keeps retrying with:

{
  "cause": "WORKFLOW_TASK_FAILED_CAUSE_GRPC_MESSAGE_TOO_LARGE",
  "failure": {
    "message": "rpc error: code = ResourceExhausted desc = grpc: received message larger than max (5243837 vs. 4194304)",
    "applicationFailureInfo": {
      "type": "GrpcMessageTooLargeError"
    }
  }
}

So this is not correct

Keep only substantive content changes, remove line-wrapping noise.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Split into two sections: payload size limit and gRPC message size limit
- Fix incorrect claim that gRPC oversized response terminates the Workflow
- Add error message examples for both limit types
- Add External Storage and claim check pattern as resolution
- Clarify gRPC limit applies to Client-Service and Worker-Service communication
- Note payload size limit is configurable on self-hosted

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
See the [gRPC Message Too Large error reference](/references/errors#grpc-message-too-large) for more details.
### Error messages

- `WORKFLOW_TASK_FAILED_CAUSE_GRPC_MESSAGE_TOO_LARGE`: When a Workflow Worker completes a Workflow Task, it sends all the commands the Workflow produced (such as Activity schedules and their inputs) back to the Temporal Service. If that response exceeds 4 MB, the SDK catches the gRPC error and sends a failed Workflow Task response with this cause. Because replay produces the same oversized response, the Workflow gets stuck in a retry loop that isn't visible in the Event History.
A Contributor commented:

There is a new error message analogous to this one, called `WORKFLOW_TASK_FAILED_CAUSE_PAYLOADS_TOO_LARGE`.

The SDK gets the payload limits from the server and intentionally fails Workflow Tasks that contain payloads over the payload size limit. This lets the Workflow be retried instead of failed, so that other solutions (such as External Storage) can be applied to alleviate the problem and allow the Workflow to continue.

lennessyy (Contributor, Author) replied:

Huh, I'm a little confused by this. What makes this error different from the other payload errors (in the above section)? Why does this allow retries while other payload errors terminate the workflow per Vlad's testing?

Or do you mean this is a new error message we are adding together with the release of external storage?

A Contributor replied:

It's a new error code. Prior to adding this code and the new behavior to the SDK, if the SDK submitted payloads that were greater than 2 MB but less than 4 MB (the gRPC limit), the server would fail the Workflow (not just the Workflow Task, but the entire Workflow). The SDK now does a best-effort size check on the worker and intentionally fails the Workflow Task instead of uploading a large payload that would fail the Workflow.
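The worker-side check described here can be sketched as below. This is a hypothetical illustration only: the class name and the hard-coded limit are invented, and the real SDK fetches the limit from the server rather than hard-coding it.

```python
PAYLOAD_SIZE_LIMIT = 2 * 1024 * 1024  # illustrative; the SDK gets this from the server

class PayloadsTooLarge(Exception):
    """Stand-in for the failure that yields WORKFLOW_TASK_FAILED_CAUSE_PAYLOADS_TOO_LARGE."""

def check_payloads(payloads):
    """Best-effort worker-side size check: fail the Workflow Task (so it can
    be retried once the payloads shrink) instead of submitting payloads the
    server would reject by failing the entire Workflow."""
    for p in payloads:
        if len(p) > PAYLOAD_SIZE_LIMIT:
            raise PayloadsTooLarge(
                f"payload of {len(p)} bytes exceeds the "
                f"{PAYLOAD_SIZE_LIMIT}-byte limit"
            )
```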
