Detailed format rules, syntax, and examples for TOON (Token-Oriented Object Notation).
TOON uses an indentation-based structure, like YAML, for nested objects and a tabular format, like CSV, for uniform arrays. This document explains the complete syntax and formatting rules.
Objects use key: value pairs with indentation for nesting.
{"name": "Alice", "age": 30, "active": True}

name: Alice
age: 30
active: true
{
    "user": {
        "name": "Alice",
        "settings": {
            "theme": "dark"
        }
    }
}

user:
  name: Alice
  settings:
    theme: dark
Keys follow identifier rules or must be quoted:
{
    "simple_key": 1,
    "with-dash": 2,
    "123": 3,          # Numeric key
    "with space": 4,   # Spaces require quotes
    "": 5              # Empty key requires quotes
}

simple_key: 1
with-dash: 2
"123": 3
"with space": 4
"": 5
All arrays include a length indicator [N] for validation.
Arrays of primitives use inline format with comma separation:
[1, 2, 3, 4, 5]

[5]: 1,2,3,4,5

["alpha", "beta", "gamma"]

[3]: alpha,beta,gamma

Note: In primitive arrays the default comma delimiter is not shown in the header: write [5]: and not [5,]:
Uniform objects with primitive-only fields use CSV-like format:
[
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Charlie", "age": 35}
]

[3,]{id,name,age}:
  1,Alice,30
  2,Bob,25
  3,Charlie,35
Tabular Format Rules:
- All objects must have identical keys
- All values must be primitives (no nested objects/arrays)
- Field order in header determines column order
- Delimiter appears in header: [N,] or [N|] or [N\t]
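
The tabular rules above can be checked mechanically before choosing this format. The following is a minimal sketch, not the library's implementation: `is_primitive`, `is_tabular`, `format_cell`, and `encode_tabular` are hypothetical helper names, only the default comma delimiter is handled, and string cells are assumed not to need quoting.

```python
def is_primitive(value):
    """True for the scalar types that can sit in a table cell."""
    return value is None or isinstance(value, (bool, int, float, str))


def is_tabular(rows):
    """Apply the tabular rules: dicts only, identical keys, primitive values."""
    if not rows or not all(isinstance(row, dict) for row in rows):
        return False
    fields = set(rows[0])
    if not fields:
        return False
    return all(
        set(row) == fields and all(is_primitive(v) for v in row.values())
        for row in rows
    )


def format_cell(value):
    """Render one cell; bool is tested before falling back to str()."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)  # assumption: strings here need no quoting


def encode_tabular(rows, key=""):
    """Emit the header, then one indented, comma-separated line per row."""
    if not is_tabular(rows):
        raise ValueError("rows do not satisfy the tabular format rules")
    fields = list(rows[0])  # field order of the first row sets the column order
    header = f"{key}[{len(rows)},]" + "{" + ",".join(fields) + "}:"
    lines = ["  " + ",".join(format_cell(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


users = [
    {"id": 1, "name": "Alice", "age": 30},
    {"id": 2, "name": "Bob", "age": 25},
    {"id": 3, "name": "Charlie", "age": 35},
]
print(encode_tabular(users))
```

Arrays that fail `is_tabular` (mixed types, differing keys, nested values) would fall back to the list format described next.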
Non-uniform or nested arrays use list format with - markers:
[
    {"name": "Alice"},
    42,
    "hello"
]

[3]:
  - name: Alice
  - 42
  - hello
{
    "matrix": [
        [1, 2, 3],
        [4, 5, 6]
    ]
}

matrix[2]:
  - [3]: 1,2,3
  - [3]: 4,5,6
{"items": []}

items[0]:
Three delimiter options for array values:
encode([1, 2, 3])  # Default delimiter

[3]: 1,2,3

For tabular arrays, the delimiter is shown in the header:
users[2,]{id,name}:
  1,Alice
  2,Bob

encode([1, 2, 3], {"delimiter": "\t"})

[3	]: 1	2	3

Tabular with tab:
users[2	]{id,name}:
  1	Alice
  2	Bob

encode([1, 2, 3], {"delimiter": "|"})

[3|]: 1|2|3

Tabular with pipe:
users[2|]{id,name}:
  1|Alice
  2|Bob
Strings are quoted only when necessary to avoid ambiguity.
"hello"        # Simple identifier
"hello world"  # Internal spaces OK
"user_name"    # Underscores OK
"hello-world"  # Hyphens OK

hello
hello world
user_name
hello-world
Empty strings:
""

""
Reserved keywords:
"null"
"true"
"false"

"null"
"true"
"false"
Numeric-looking strings:
"42"
"-3.14"
"1e5"
"0123"  # Leading zero

"42"
"-3.14"
"1e5"
"0123"
Leading/trailing whitespace:
" hello"
"hello "
" hello "

" hello"
"hello "
" hello "
Structural characters:
"key: value"  # Colon
"[array]"     # Brackets
"{object}"    # Braces
"- item"      # Leading hyphen

"key: value"
"[array]"
"{object}"
"- item"
Delimiter characters:
# When using comma delimiter
"a,b"

"a,b"
Control characters:
"line1\nline2"
"tab\there"

"line1\nline2"
"tab\there"
Inside quoted strings:
| Sequence | Meaning |
|---|---|
| \" | Double quote |
| \\ | Backslash |
| \n | Newline |
| \r | Carriage return |
| \t | Tab |
| \uXXXX | Unicode character (4 hex digits) |
Example:
{
    "text": "Hello \"world\"\nNew line",
    "path": "C:\\Users\\Alice"
}

text: "Hello \"world\"\nNew line"
path: "C:\\Users\\Alice"
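
The quoting and escaping rules above can be combined into one helper. This is a sketch under stated assumptions, not the encoder's actual code: `needs_quotes` and `format_string` are hypothetical names, the numeric-looking test is deliberately conservative (it also catches strings such as "1.0.0"), strings containing a double quote or backslash are assumed to require quoting, and control characters outside the escape table are assumed to be written as \uXXXX.

```python
import re

# Assumption: a conservative "looks numeric" test (sign, digits, dots, exponent).
_NUMERIC_LIKE = re.compile(r"^[+-]?\d[\d.]*(?:[eE][+-]?\d+)?$")
_ESCAPES = {"\\": "\\\\", '"': '\\"', "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def needs_quotes(s, delimiter=","):
    """Return True if `s` would be ambiguous as a bare value."""
    if s == "" or s in ("null", "true", "false"):
        return True  # empty string, reserved keywords
    if _NUMERIC_LIKE.match(s):
        return True  # "42", "-3.14", "1e5", "0123"
    if s != s.strip() or s.startswith("-"):
        return True  # leading/trailing whitespace, leading hyphen
    if any(ch in s for ch in ':[]{}"\\') or delimiter in s:
        return True  # structural characters, quote/backslash, active delimiter
    return any(ord(ch) < 0x20 for ch in s)  # control characters


def format_string(s, delimiter=","):
    """Return `s` bare when safe, otherwise quoted with escapes applied."""
    if not needs_quotes(s, delimiter):
        return s
    escaped = "".join(
        _ESCAPES.get(ch) or (f"\\u{ord(ch):04x}" if ord(ch) < 0x20 else ch)
        for ch in s
    )
    return f'"{escaped}"'


for sample in ["hello world", "", "0123", "a,b", "C:\\Users\\Alice"]:
    print(format_string(sample))
```

Quoting more often than strictly necessary is safe for round-tripping, which is why the sketch errs on the conservative side.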
Integers:
42
-17
0

42
-17
0

Floats:
3.14
-0.5
0.0

3.14
-0.5
0
Special Numbers:
- Scientific notation accepted in decoding: 1e5, -3.14E-2
- Encoders must NOT use scientific notation; always use decimal form
- Negative zero normalized: -0.0 → 0
- Non-finite values → null: Infinity, -Infinity, NaN → null
Large integers (>2^53-1):
9007199254740993  # Exceeds JS safe integer

"9007199254740993"  # Quoted for JS compatibility
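
The number rules above can be collected into a single normalization step. The sketch below is illustrative only: `format_number` and `MAX_SAFE_INTEGER` are hypothetical names, and rendering whole-valued floats without a fractional part is inferred from the `0.0` example rather than taken from the encoder's source.

```python
import math
from decimal import Decimal

MAX_SAFE_INTEGER = 2**53 - 1  # largest integer JavaScript represents exactly


def format_number(value):
    """Render an int or float following the number rules described above."""
    if isinstance(value, bool):
        raise TypeError("booleans are not numbers here")
    if isinstance(value, int):
        if abs(value) > MAX_SAFE_INTEGER:
            return f'"{value}"'  # quoted for JS compatibility
        return str(value)
    if not math.isfinite(value):
        return "null"  # Infinity, -Infinity, NaN
    if value == 0:
        return "0"  # also normalizes -0.0
    # Decimal's "f" format gives plain decimal digits, never an exponent.
    text = format(Decimal(repr(value)), "f")
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    return text


for number in [42, 3.14, -0.0, 1e5, 1e-7, float("nan"), 9007199254740993]:
    print(format_number(number))
```

Going through `repr` keeps the shortest digits that round-trip the float, so `3.14` stays `3.14` instead of expanding to its full binary value.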
True   # true in TOON (lowercase)
False  # false in TOON (lowercase)

true
false

None  # null in TOON (lowercase)

null
Default: 2 spaces per level (configurable)
{
    "level1": {
        "level2": {
            "level3": "value"
        }
    }
}

level1:
  level2:
    level3: value

With 4-space indent:
encode(data, {"indent": 4})

level1:
    level2:
        level3: value
Strict mode rules:
- Indentation must be a consistent multiple of the indent value
- Tabs not allowed in indentation
- Mixing spaces and tabs causes errors
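
The strict-mode rules above amount to a per-line check on leading whitespace. A minimal sketch follows, assuming a hypothetical `check_indentation` helper that reports problems instead of raising; a real decoder may behave differently.

```python
def check_indentation(text, indent=2):
    """Return (line_number, message) pairs for lines that break the rules."""
    errors = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # blank lines are covered by a separate rule
        leading = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in leading:
            errors.append((lineno, "tab in indentation"))
        elif len(leading) % indent != 0:
            errors.append(
                (lineno, f"indent of {len(leading)} is not a multiple of {indent}")
            )
    return errors


good = "level1:\n  level2:\n    level3: value"
bad = "level1:\n   level2: value"
print(check_indentation(good))
print(check_indentation(bad))
```

The same text can pass with one `indent` setting and fail with another, so the value used for decoding has to match the one used for encoding.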
All arrays include [N] to indicate element count for validation.
items[3]: a,b,c
users[2,]{id,name}:
  1,Alice
  2,Bob

encode(data, {"lengthMarker": "#"})

items[#3]: a,b,c
users[#2,]{id,name}:
  1,Alice
  2,Bob
The # prefix makes length indicators more explicit for validation-focused use cases.
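
To illustrate what the length indicator buys, here is a sketch of a validator for inline primitive arrays. The name `inline_length_ok` and the header pattern are assumptions for illustration; the sketch accepts the optional # marker and the three delimiters, rejects tabular headers, and does not handle quoted values that contain the delimiter.

```python
import re

# key, optional "#" marker, declared count, optional delimiter, then the values
_HEADER = re.compile(
    r"^(?P<key>[^\[\]]*)\[#?(?P<count>\d+)(?P<delim>[,|\t]?)\]:(?P<values>.*)$"
)


def inline_length_ok(line):
    """Compare the declared [N] with the number of delimited values."""
    match = _HEADER.match(line)
    if match is None:
        raise ValueError(f"not an inline array header: {line!r}")
    delimiter = match["delim"] or ","  # comma is implicit in primitive arrays
    values = match["values"].strip()
    actual = len(values.split(delimiter)) if values else 0
    return actual == int(match["count"])


print(inline_length_ok("items[3]: a,b,c"))
print(inline_length_ok("items[2]: a,b,c"))
```

A mismatch like the second call is the kind of truncated or padded array that the indicator is meant to expose.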
Within arrays: Blank lines are not allowed in strict mode
# Invalid (blank line in array)
items[3]:
  - a

  - b
  - c

# Valid (no blank lines)
items[3]:
  - a
  - b
  - c

Between top-level keys: Blank lines are allowed and ignored
# Valid (blank lines between objects)
name: Alice

age: 30
TOON does not support comments. The format prioritizes minimal syntax for token efficiency.
If you need to document TOON data, use surrounding markdown or separate documentation files.
Trailing whitespace on lines is allowed and ignored.
Leading/trailing whitespace in string values requires quoting:
{"text": " value "}

text: " value "
Object key order and array element order are always preserved during encoding and decoding.
from collections import OrderedDict
data = OrderedDict([("z", 1), ("a", 2), ("m", 3)])
toon = encode(data)

z: 1
a: 2
m: 3

Decoding preserves order:
decoded = decode(toon)
list(decoded.keys())  # ['z', 'a', 'm']

{
    "app": "myapp",
    "version": "1.0.0",
    "debug": False,
    "port": 8080
}

app: myapp
version: "1.0.0"
debug: false
port: 8080
{
    "metadata": {
        "version": 2,
        "author": "Alice"
    },
    "items": [
        {"id": 1, "name": "Item1", "qty": 10},
        {"id": 2, "name": "Item2", "qty": 5}
    ],
    "tags": ["alpha", "beta", "gamma"]
}

metadata:
  version: 2
  author: Alice
items[2,]{id,name,qty}:
  1,Item1,10
  2,Item2,5
tags[3]: alpha,beta,gamma
{
    "data": [
        {"type": "user", "id": 1},
        {"type": "user", "id": 2, "extra": "field"},  # Non-uniform
        42,
        "hello"
    ]
}

data[4]:
  - type: user
    id: 1
  - type: user
    id: 2
    extra: field
  - 42
  - hello
JSON (177 chars):
{"users": [{"id": 1, "name": "Alice", "age": 30, "active": true}, {"id": 2, "name": "Bob", "age": 25, "active": true}, {"id": 3, "name": "Charlie", "age": 35, "active": false}]}

TOON (85 chars, 52% reduction):
users[3,]{id,name,age,active}:
  1,Alice,30,true
  2,Bob,25,true
  3,Charlie,35,false
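
The character counts in this comparison can be reproduced with the standard library alone. The sketch below hard-codes the TOON text from the example rather than calling an encoder; the 177-character figure matches `json.dumps` with its default separators (a space after each comma and colon), while fully compact JSON for the same data is 153 characters.

```python
import json

data = {
    "users": [
        {"id": 1, "name": "Alice", "age": 30, "active": True},
        {"id": 2, "name": "Bob", "age": 25, "active": True},
        {"id": 3, "name": "Charlie", "age": 35, "active": False},
    ]
}

json_text = json.dumps(data)  # default separators: ", " and ": "
compact_json = json.dumps(data, separators=(",", ":"))
toon_text = "\n".join(
    [
        "users[3,]{id,name,age,active}:",
        "  1,Alice,30,true",
        "  2,Bob,25,true",
        "  3,Charlie,35,false",
    ]
)

reduction = 1 - len(toon_text) / len(json_text)
print(len(json_text), len(toon_text), f"{reduction:.0%}")  # 177 85 52%
print(len(compact_json))  # 153
```

Character counts are only a proxy for token counts, so the savings seen with a particular tokenizer may differ from these figures.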
- API Reference - Complete function documentation
- LLM Integration - Best practices for LLM usage
- Official Specification - Normative spec