Test Key Encoding and Unicode in Parser #319

@lalitc375

Background

When replacing the JSON parsing engine in PR #292, one of the riskiest areas is the handling of text encoding and Unicode, particularly within JSON keys.

Description

We must create extensive test cases to cover various key encoding and Unicode scenarios. These tests must run against both the legacy (json-iterator) and the new (json/v2) implementations to guarantee they produce the exact same in-memory values.

The scenarios that need coverage include:

  • Escaped/encoded keys: e.g., keys written as "\u0041", which must decode to the same key as the literal "A".
  • Multi-byte Unicode keys: e.g., "Iñtërnâtiônàlizætiøn,💝🐹🌇⛔".
  • Invalid Unicode characters: keys containing byte sequences that are not valid UTF-8. We need to verify whether each implementation accepts, repairs, or rejects these, and whether the two agree.

Code Pointers

Metadata

  • Priority: P0
  • Complexity: Easy
