@icbd icbd commented Oct 4, 2025

Description

Task outputs cannot contain the unicode null character \\u0000
Please see this Discord thread: https://discord.com/channels/1088927970518909068/1384324576166678710/1386714014565928992
Relevant Postgres documentation: https://www.postgresql.org/docs/current/datatype-json.html
Use `hatchet_sdk.{remove_null_unicode_character.__name__}` to sanitize your output if you'd like to remove the character.

The message above explains why this check exists, but the check itself is implemented incorrectly.
This PR fixes the value that the check looks for.

Fixes # (issue)

When the object a task returns normally contains "\u0000", the task is switched to failed.

Source Code:

if request.EventType == contracts.StepActionEventType_STEP_EVENT_TYPE_COMPLETED {
	if err := repository.ValidateJSONB([]byte(request.EventPayload), "taskOutput"); err != nil {
		request.EventPayload = err.Error()
		request.EventType = contracts.StepActionEventType_STEP_EVENT_TYPE_FAILED
	}
}

When the error a task returns contains "\u0000", the error content is displayed as usual, because ValidateJSONB is not called on that path.
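The two cases above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the engine's code: `Event`, `EventType`, `applyOutputCheck`, `rejectAll`, and `acceptAll` are hypothetical names, and the validator is passed in as a parameter so the sketch takes no position on what ValidateJSONB should look for.

```go
package main

import (
	"errors"
	"fmt"
)

// EventType is a hypothetical stand-in for contracts.StepActionEventType.
type EventType int

const (
	EventCompleted EventType = iota
	EventFailed
)

// Event is a hypothetical stand-in for the incoming request.
type Event struct {
	Type    EventType
	Payload string
}

var errInvalid = errors.New("invalid task output")

// rejectAll and acceptAll are hypothetical validators used only for the demo.
func rejectAll(_ []byte) error { return errInvalid }
func acceptAll(_ []byte) error { return nil }

// applyOutputCheck mirrors the snippet above: only completed events are
// validated, and a validation error turns the event into a failed one whose
// payload is the error message. Failed events pass through unchecked.
func applyOutputCheck(e Event, validate func([]byte) error) Event {
	if e.Type != EventCompleted {
		return e
	}
	if err := validate([]byte(e.Payload)); err != nil {
		e.Payload = err.Error()
		e.Type = EventFailed
	}
	return e
}

func main() {
	completed := Event{Type: EventCompleted, Payload: `{"ok":true}`}
	failed := Event{Type: EventFailed, Payload: "boom"}

	fmt.Printf("%+v\n", applyOutputCheck(completed, acceptAll))
	fmt.Printf("%+v\n", applyOutputCheck(completed, rejectAll))
	fmt.Printf("%+v\n", applyOutputCheck(failed, rejectAll))
}
```

Passing the validator as a function keeps the control flow separate from the question of which byte sequence counts as invalid, which is the point under discussion in this PR.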

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Documentation change (pure documentation change)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Refactor (non-breaking changes to code which doesn't change any behaviour)
  • CI (any automation pipeline changes)
  • Chore (changes which are not directly related to any business logic)
  • Test changes (add, refactor, improve or change a test)
  • This change requires a documentation update

What's Changed

  • Add a list of tasks or features here...

vercel bot commented Oct 4, 2025

@icbd is attempting to deploy a commit to the Hatchet Team on Vercel.

A member of the Team first needs to authorize it.

s.l.Warn().Msg("retry count is nil, using task's current retry count")
}

if request.EventType == contracts.StepActionEventType_STEP_EVENT_TYPE_COMPLETED {
Contributor Author

Need to double check here

Contributor

I think we probably only want this for completed / failed?

Updated

I don't have a full development environment, so I've only run unit tests on this change.

Could you help run an E2E test to confirm the change? Thanks

Contributor

yeah, will do tomorrow AM - CI should be a pretty good indicator though 👍

Contributor Author

@icbd icbd Oct 7, 2025

I think the core of this PR is changing \\u0000 to \u0000.

Using bytes just improves performance.

I'm not sure whether the couple of new commits are directly related to this PR.
If they aren't, we can submit this change first and rebase the rest.

WDYT?

Contributor

For me, the issue here isn't that the Validate method isn't working (it is) but we just weren't calling it every place we needed to. Is that not your understanding? It's easy to verify - if you change the string thing to bytes, those python tests will fail and you'll see errors on the engine. I'm not sure why it doesn't work, but it doesn't 🤷

Also, I'm not too worried about the bytes -> string conversion for performance here, although it's a fair point
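For reference, the string and bytes forms of the check can be compared outside the engine. The sketch below is a standalone experiment, not a diagnosis of the failing tests: `containsViaString` and `containsViaBytes` are hypothetical helper names, and the payloads are made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// containsViaString converts the payload to a string before searching.
func containsViaString(payload []byte, needle string) bool {
	return strings.Contains(string(payload), needle)
}

// containsViaBytes searches the payload bytes directly.
func containsViaBytes(payload []byte, needle string) bool {
	return bytes.Contains(payload, []byte(needle))
}

func main() {
	payloads := [][]byte{
		[]byte(`{"msg":"plain"}`),
		[]byte(`{"msg":"a\u0000b"}`),     // raw literal: six-character escape sequence
		[]byte("{\"msg\":\"a\u0000b\"}"), // interpreted literal: one NUL byte
	}
	needles := []string{"\u0000", "\\u0000"}

	for _, p := range payloads {
		for _, n := range needles {
			fmt.Printf("payload=%q needle=%q string=%v bytes=%v\n",
				p, n, containsViaString(p, n), containsViaBytes(p, n))
		}
	}
}
```

With the same needle, both helpers are expected to return the same result for every payload, so a difference in behavior between two versions of the check would more likely come from the needle literal than from the switch between `strings` and `bytes`.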

Contributor Author

@icbd icbd Oct 8, 2025

b0bee2b

This commit is what I want, everything else is optional.

A JSON encoder escapes NUL: it converts \u0000 to \\u0000.

\u0000 is a single character, NUL, and Postgres will also reject it.

\\u0000 is six characters, which form an ordinary string.

So the check should look for the invalid NUL character rather than an ordinary string.
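The difference between the two literals can be seen in a short Go program. This is a sketch assuming Go's standard `encoding/json` encoder; `encodeOutput`, `hasRawNUL`, and `hasEscapedNUL` are hypothetical helper names, and whether the engine receives payloads encoded this way is an assumption, since that depends on how each SDK serializes task output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// encodeOutput marshals a made-up task output with encoding/json.
func encodeOutput(v interface{}) []byte {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return b
}

// hasRawNUL reports whether b contains the single NUL byte.
func hasRawNUL(b []byte) bool {
	return bytes.Contains(b, []byte("\u0000"))
}

// hasEscapedNUL reports whether b contains the six-character
// sequence backslash, 'u', '0', '0', '0', '0'.
func hasEscapedNUL(b []byte) bool {
	return bytes.Contains(b, []byte("\\u0000"))
}

func main() {
	// In Go source, "\u0000" is one byte and "\\u0000" is six bytes.
	fmt.Println(len("\u0000"), len("\\u0000"))

	// A string value that holds a real NUL character.
	encoded := encodeOutput(map[string]string{"msg": "a\u0000b"})
	fmt.Printf("%s\n", encoded)
	fmt.Println("raw NUL:", hasRawNUL(encoded), "escaped NUL:", hasEscapedNUL(encoded))
}
```

Under these assumptions the encoded payload carries the six-character escape rather than the raw byte, so which needle a check should use depends on whether it runs before or after JSON encoding.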

@icbd icbd force-pushed the fix/ValidateJSONB branch from 1377b16 to 381272c Compare October 5, 2025 14:00

vercel bot commented Oct 7, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
hatchet-docs Ready Ready Preview Comment Oct 7, 2025 3:03am
hatchet-v0-docs Ready Ready Preview Comment Oct 7, 2025 3:03am

}

if strings.Contains(string(jsonb), "\u0000") {
if bytes.Contains(jsonb, []byte("\u0000")) {
Contributor

fyi @icbd this didn't work so I reverted it - not completely sure why though tbh, it seemed fine at first glance

Contributor

I think this broke a unit test btw, but I validated the behavior with an E2E test. Let me know if you disagree!
