Skip to content

Fix _dump_str crash and wrong encoding for hex escape sequences#453

Open
bysiber wants to merge 1 commit intouiri:masterfrom
bysiber:fix/dump-str-hex-escape
Open

Fix _dump_str crash and wrong encoding for hex escape sequences#453
bysiber wants to merge 1 commit intouiri:masterfrom
bysiber:fix/dump-str-hex-escape

Conversation

@bysiber
Copy link

@bysiber bysiber commented Feb 20, 2026

Summary

_dump_str in encoder.py crashes with an IndexError when encoding strings that start with a control character (like \x01), and silently produces invalid TOML for strings with control characters in the middle.

Problem

The \x escape handling in _dump_str uses a split-and-rejoin approach that has two issues:

  1. When the \x escape occurs at the very start of the string, split("\\x") produces an empty first element. The code removes this empty element with v = v[1:], but then unconditionally accesses v[1] which no longer exists → IndexError.

  2. The joiner logic is inverted. When the character before \x is NOT a backslash (meaning it's a real hex escape from repr()), the code uses "x" as the joiner — which just drops the backslash and puts the literal character x back. This produces wrong output like "ax01b" instead of "a\u0001b".

>>> import toml
>>> toml.dumps({"key": "\x01hello"})
# IndexError: list index out of range

>>> toml.dumps({"key": "a\x01b"})
'key = "ax01b"\n'
# Expected: 'key = "a\\u0001b"\n'

>>> toml.loads(toml.dumps({"key": "a\x01b"}))
{"key": "ax01b"}
# Roundtrip is broken — data is corrupted

Fix

Replace the fragile split/loop approach with a straightforward character-by-character scan that processes backslash sequences left-to-right:

  • \\ (escaped backslash pair) → kept as-is
  • \x (hex escape) → converted to \u00 for TOML compatibility
  • Other \X sequences → kept as-is

This correctly handles all combinations including \\x (literal backslash + x), \\\x (literal backslash + hex escape), and strings starting with \x.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant