@maxdml maxdml commented Oct 27, 2025

This PR:

  • Allows users to plug in a custom encoder for workflow and step inputs/outputs
  • Switches the default encoder from Gob to JSON

The rationale for switching to JSON by default is that Gob requires users to explicitly register their data types, and DBOS is limited in how much of that registration it can automate. If an application reads a workflow or step input/output whose type has not been registered by the running code, the read fails because the value cannot be decoded. This happens, for example, with data written by a previous application version, or when a recovered workflow calls ListWorkflows and gets back workflows the runtime has not seen yet.

The only way to handle these corner cases is for the user to manually register their types, which makes for a poor UX and bad surprises.

PR details

  • New serializer property on DBOSContext
  • JSON serializer
  • More automated registration for Gob serializer
  • More tests
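To make the pluggable serializer idea concrete, here is a minimal sketch of what such an extension point could look like. Everything in it is an assumption for illustration: the `Serializer` interface, its method signatures, `Config`, and `newConfig` are hypothetical and may not match the names or shapes this PR actually adds.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Serializer is a hypothetical interface; the PR's real method set may differ.
type Serializer interface {
	Encode(v any) (string, error)
	Decode(data string) (any, error)
}

// JSONSerializer encodes values with encoding/json.
type JSONSerializer struct{}

func (JSONSerializer) Encode(v any) (string, error) {
	b, err := json.Marshal(v)
	if err != nil {
		return "", fmt.Errorf("encode: %w", err)
	}
	return string(b), nil
}

func (JSONSerializer) Decode(data string) (any, error) {
	var v any
	if err := json.Unmarshal([]byte(data), &v); err != nil {
		return nil, fmt.Errorf("decode: %w", err)
	}
	return v, nil
}

// Config is a hypothetical stand-in for whatever carries the serializer
// into the context.
type Config struct {
	Serializer Serializer
}

// newConfig falls back to JSON when no serializer is supplied, mirroring
// the new default described in this PR.
func newConfig(s Serializer) Config {
	if s == nil {
		s = JSONSerializer{}
	}
	return Config{Serializer: s}
}

func main() {
	cfg := newConfig(nil)
	out, err := cfg.Serializer.Encode(map[string]int{"qty": 2})
	fmt.Println(out, err)
}
```

A user-supplied encoder would implement the same two methods and be passed in place of `nil`.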

JSON serializer

JSON loses schema information when encoding a data structure, so when decoding without a target type it can only return generic values such as map[string]interface{}. We address this by automating a marshal/unmarshal round trip wherever the data type is known. This is not possible on the ListWorkflows and GetWorkflowSteps paths, which are not generic. There, users must convert the input/output back to their known type if they want typed values, either with another marshaling round or with github.com/mitchellh/mapstructure.

More automated registration for Gob serializer

  • We now lazily perform a registration step when calling serialize, which fixes the cases where a workflow signature has an interface input and/or output. At workflow registration time, we do not know the underlying concrete value of the interface and cannot register it properly.
  • We also perform a registration step in the wrapped function stored in the registry. This means that recovered and dequeued workflows can learn the underlying concrete type dynamically, even if the workflow did not execute in this process first.
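The two registration points above can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: `lazyRegister`, `serialize`, `deserialize`, `wrap`, and `Payment` are hypothetical names, and the base64 string encoding is chosen only to keep the example self-contained.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"encoding/gob"
	"fmt"
)

// Payment is a hypothetical workflow input/output type.
type Payment struct{ Amount int }

// lazyRegister registers v's concrete type with gob. gob.Register panics on
// conflicting registrations, so the panic is turned into an error.
func lazyRegister(v any) (err error) {
	if v == nil {
		return nil // a nil interface carries no concrete type to register
	}
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("gob registration failed: %v", r)
		}
	}()
	gob.Register(v)
	return nil
}

// serialize registers the concrete type just before encoding, so interface
// typed values work without manual registration.
func serialize(v any) (string, error) {
	if err := lazyRegister(v); err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(&v); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf.Bytes()), nil
}

func deserialize(s string) (any, error) {
	raw, err := base64.StdEncoding.DecodeString(s)
	if err != nil {
		return nil, err
	}
	var out any
	err = gob.NewDecoder(bytes.NewReader(raw)).Decode(&out)
	return out, err
}

// wrap registers the input and output types before decoding, so a recovered
// or dequeued workflow can decode data this process never encoded itself.
func wrap[In, Out any](fn func(In) (Out, error)) func(string) (string, error) {
	return func(encoded string) (string, error) {
		var zeroIn In
		var zeroOut Out
		if err := lazyRegister(zeroIn); err != nil {
			return "", err
		}
		if err := lazyRegister(zeroOut); err != nil {
			return "", err
		}
		decoded, err := deserialize(encoded)
		if err != nil {
			return "", err
		}
		in, ok := decoded.(In)
		if !ok {
			return "", fmt.Errorf("unexpected input type %T", decoded)
		}
		out, err := fn(in)
		if err != nil {
			return "", err
		}
		return serialize(out)
	}
}

func main() {
	double := wrap(func(p Payment) (Payment, error) {
		return Payment{Amount: p.Amount * 2}, nil
	})
	in, _ := serialize(Payment{Amount: 5})
	out, err := double(in)
	if err != nil {
		fmt.Println("workflow failed:", err)
		return
	}
	v, err := deserialize(out)
	fmt.Println(v, err)
}
```

Registering the same type more than once is harmless with gob, which is what makes it safe to repeat the step on every call.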

Tests

We test both serializers on all paths where they are used:

  • Normal workflow execution
  • Polling handles
  • List workflows, Retrieve workflow, Get workflow steps
  • Send/Recv and set/get event, which also encode the message
  • Workflow recovery & queued workflows (tests that the JSON re-encoding / gob registration works in the wrapped workflows)

With the JSON serializer, it is now possible to have workflow signatures that use any. We test this case, in which nil values should be stored as empty strings in the database.
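One way to picture the nil handling is the sketch below. It is an assumption about the behavior, not the PR's code: `encodeNullable` and `decodeNullable` are hypothetical helpers that map a nil value to an empty string and back, instead of the `null` literal that `json.Marshal` would otherwise produce.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// encodeNullable stores nil as an empty string and everything else as JSON.
func encodeNullable(v any) (string, error) {
	if v == nil {
		return "", nil
	}
	b, err := json.Marshal(v)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// decodeNullable maps the empty string back to nil.
func decodeNullable(s string) (any, error) {
	if s == "" {
		return nil, nil
	}
	var v any
	if err := json.Unmarshal([]byte(s), &v); err != nil {
		return nil, err
	}
	return v, nil
}

func main() {
	stored, err := encodeNullable(nil)
	fmt.Printf("%q %v\n", stored, err)

	restored, err := decodeNullable(stored)
	fmt.Println(restored == nil, err)
}
```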

Comment on lines -21 to -26
	if inputStr, ok := workflow.Input.(string); ok {
		if strings.Contains(inputStr, "Failed to decode") {
			ctx.logger.Warn("Skipping workflow recovery due to input decoding failure", "workflow_id", workflow.ID, "name", workflow.Name)
			continue
		}
	}

Unused

@maxdml maxdml force-pushed the custom-serializer branch from f9c14ce to c609278 Compare October 28, 2025 20:20