Commit 39d8db6

Update 2025-03-20-tiny-allocationless-json-parser-in-c.md
1 parent 3382aad commit 39d8db6

1 file changed: +5 -124 lines changed

@@ -1,129 +1,10 @@
11
---
22
layout: post
3-
title: "Tiny allocationless JSON parser in C"
3+
title: "Allocationless C89 JSON parser"
44
date: 2025-03-20 12:00:00 +0100
5-
redirect_from: /2025/03/20/tiny-allocationless-json-parser-in-c.html
5+
redirect_from:
6+
- /2025/03/20/tiny-allocationless-json-parser-in-c.html
7+
- https://mynameistrez.github.io/blog/tiny-allocationless-json-parser-in-c
68
---
79

8-
I wrote the library [Tiny allocationless JSON parser in C](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c), which parses a subset of [JSON](https://en.wikipedia.org/wiki/JSON) in 533 lines of C. Only arrays, objects and strings are handled.
9-
10-
I wrote this JSON parser for my tiny programming language called [grug](https://mynameistrez.github.io/2024/02/29/creating-the-perfect-modding-language.html).
11-
12-
I was inspired by null program's [Minimalist C Libraries](https://nullprogram.com/blog/2018/06/10/) blog post, which describes how C libraries never really need to allocate any memory themselves. The trick is to expect the user to pass `void *buffer` and `size_t buffer_capacity`:
13-
14-
```c
15-
int main() {
16-
char buffer[420];
17-
18-
// If json_init() fails, just increase the starting size
19-
if (json_init(buffer, sizeof(buffer))) exit(EXIT_FAILURE); // Not assert(): the call would vanish under NDEBUG
20-
21-
struct json_node node;
22-
23-
enum json_status status = json("foo.json", &node, buffer, sizeof(buffer));
24-
if (status) {
25-
// Handle error here
26-
exit(EXIT_FAILURE);
27-
}
28-
29-
// You can now recursively walk the JSON data in the node variable here
30-
}
31-
```
32-
33-
Instead of using a fixed size buffer, you can use `realloc()` to keep retrying the call with a bigger buffer:
34-
35-
```c
36-
int main() {
37-
size_t size = 420;
38-
void *buffer = malloc(size);
39-
40-
// If json_init() fails, just increase the starting size
41-
if (json_init(buffer, size)) exit(EXIT_FAILURE); // Not assert(): the call would vanish under NDEBUG
42-
43-
struct json_node node;
44-
45-
enum json_status status;
46-
do {
47-
status = json("foo.json", &node, buffer, size);
48-
if (status == JSON_OUT_OF_MEMORY) {
49-
size *= 2;
50-
buffer = realloc(buffer, size);
51-
}
52-
} while (status == JSON_OUT_OF_MEMORY);
53-
54-
if (status) {
55-
// Handle error here
56-
exit(EXIT_FAILURE);
57-
}
58-
59-
// You can now recursively walk the JSON data in the node variable here
60-
}
61-
```
62-
63-
## How it works
64-
65-
The `json_init()` function puts an internal struct at the start of the buffer [here](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/blob/7d5bb76d11aa32da22c39a186ed2f721959abf64/json.c#L539-L543). `json()` uses the remaining buffer bytes to allocate the arrays it needs for parsing [here](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/blob/7d5bb76d11aa32da22c39a186ed2f721959abf64/json.c#L465).
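The general idea of keeping bookkeeping at the start of a caller-supplied buffer, and carving allocations out of what remains, can be sketched as below. This is not the library's actual code: `arena_init()`, `arena_alloc()`, `struct arena_header` and `ARENA_ALIGN` are hypothetical names made up for illustration, and the sketch assumes the caller's buffer is suitably aligned (as `malloc()` results are).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical alignment unit: every allocation is rounded up to this size. */
union arena_align { long l; double d; void *p; };
#define ARENA_ALIGN sizeof(union arena_align)

/* Hypothetical bookkeeping header that lives at the start of the buffer. */
struct arena_header {
    size_t capacity; /* total bytes in the caller's buffer */
    size_t used;     /* bytes handed out so far, header included */
};

static size_t arena_round_up(size_t n) {
    return (n + ARENA_ALIGN - 1) / ARENA_ALIGN * ARENA_ALIGN;
}

/* Returns 0 on success, 1 if the buffer can't even hold the header. */
int arena_init(void *buffer, size_t capacity) {
    struct arena_header *header = buffer;
    size_t header_size = arena_round_up(sizeof(*header));
    assert(buffer != NULL);
    if (capacity < header_size) {
        return 1;
    }
    header->capacity = capacity;
    header->used = header_size;
    return 0;
}

/* Returns a pointer into the buffer, or NULL when the remaining space is too small. */
void *arena_alloc(void *buffer, size_t size) {
    struct arena_header *header = buffer;
    size_t rounded = arena_round_up(size);
    void *result;
    if (rounded < size || rounded > header->capacity - header->used) {
        return NULL;
    }
    result = (char *)buffer + header->used;
    header->used += rounded;
    return result;
}
```

Because the header lives inside the buffer, the caller only has to pass the same pointer to every call, and a `NULL` result is the cue to retry with a bigger buffer.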
66-
67-
If one of the internal arrays is too small, it'll double the array's capacity [here](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/blob/c02215b1239f9a9c2f832f817ea5e6bab7eb6a19/json.c#L99-L123).
68-
69-
The parser uses an [array-based hash table](https://mynameistrez.github.io/2024/06/19/array-based-hash-table-in-c.html) to detect duplicate object keys, and `longjmp()` to [keep the clutter of error handling at bay](https://mynameistrez.github.io/2024/03/21/setjmp-plus-longjmp-equals-goto-but-awesome.html).
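To illustrate why `longjmp()` keeps error handling out of the way in a recursive parser, here is a small sketch that only validates nested empty arrays such as `[[],[]]`. It is an assumption-laden toy, not the library's code: `parse_nested_arrays()`, `parse_error()`, `enum parse_status` and `MAX_DEPTH` are all hypothetical names.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

#define MAX_DEPTH 4 /* hypothetical nesting limit */

enum parse_status { PARSE_OK, PARSE_UNEXPECTED_CHAR, PARSE_TOO_DEEP };

static jmp_buf error_jmp;
static enum parse_status error_status;
static const char *cursor;

/* Unwinds straight back to parse_nested_arrays(), however deep the recursion is. */
static void parse_error(enum parse_status status) {
    error_status = status;
    longjmp(error_jmp, 1);
}

/* Recursive descent: no return codes need to be checked or propagated here. */
static void parse_array(int depth) {
    if (depth > MAX_DEPTH) {
        parse_error(PARSE_TOO_DEEP);
    }
    if (*cursor != '[') {
        parse_error(PARSE_UNEXPECTED_CHAR);
    }
    cursor++;
    if (*cursor == ']') {
        cursor++;
        return;
    }
    for (;;) {
        parse_array(depth + 1);
        if (*cursor == ',') {
            cursor++;
        } else if (*cursor == ']') {
            cursor++;
            return;
        } else {
            parse_error(PARSE_UNEXPECTED_CHAR);
        }
    }
}

enum parse_status parse_nested_arrays(const char *text) {
    assert(text != NULL);
    if (setjmp(error_jmp)) {
        return error_status; /* reached via longjmp() in parse_error() */
    }
    cursor = text;
    parse_array(0);
    if (*cursor != '\0') {
        parse_error(PARSE_UNEXPECTED_CHAR);
    }
    return PARSE_OK;
}
```

The status is stored in a variable instead of being passed through `longjmp()`, so that `setjmp()` only appears as the whole condition of an `if`, which is one of the contexts the C standard permits.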
70-
71-
The [JSON spec](https://www.json.org/json-en.html) also defines the value types `number`, `true`, `false` and `null`. This parser doesn't handle them, since they can all be stored as strings, but you could easily support them by adding just a few dozen lines to `json.c` and a handful of tests, so feel free to. This JSON parser also does not allow the `\` character to escape the `"` character in strings.
72-
73-
## Simpler version: restart on reallocation
74-
75-
If you don't mind the first JSON file taking a bit longer to be parsed, you can use the branch called [restart-on-reallocation](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/tree/restart-on-reallocation). It is 473 lines of code.
76-
77-
If one of the internal arrays is too small, its capacity is doubled and the parsing automatically restarts from the beginning [here](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/blob/1e5dd1ae77e3f247f28026cc10abedd876aa43f0/json.c#L375-L376). So the first parsed JSON file will take a few iterations to parse successfully, while the JSON files after that will usually take just a single iteration.
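A rough sketch of the restart idea, under my own assumptions rather than the branch's real code: two arrays share one caller-supplied buffer, and when either overflows, its capacity is doubled and the whole scan starts over with the new layout. The names `struct restart_parser`, `scan()` and `scan_once()` are hypothetical, and the "parsing" here just records the offsets of `[` and `"` characters.

```c
#include <assert.h>
#include <stddef.h>

enum scan_status { SCAN_OK, SCAN_OUT_OF_MEMORY };

/* Hypothetical state that survives between calls, so later files start big enough. */
struct restart_parser {
    size_t bracket_capacity;
    size_t quote_capacity;
    size_t attempts; /* passes the most recent scan() call needed */
};

void restart_parser_init(struct restart_parser *p) {
    assert(p != NULL);
    p->bracket_capacity = 1;
    p->quote_capacity = 1;
    p->attempts = 0;
}

/* One pass. Returns 1 on success, or 0 after doubling whichever array overflowed. */
static int scan_once(struct restart_parser *p, const char *text,
                     size_t *brackets, size_t *quotes,
                     size_t *bracket_count, size_t *quote_count) {
    size_t i;
    *bracket_count = 0;
    *quote_count = 0;
    for (i = 0; text[i] != '\0'; i++) {
        if (text[i] == '[') {
            if (*bracket_count == p->bracket_capacity) {
                p->bracket_capacity *= 2;
                return 0;
            }
            brackets[(*bracket_count)++] = i;
        } else if (text[i] == '"') {
            if (*quote_count == p->quote_capacity) {
                p->quote_capacity *= 2;
                return 0;
            }
            quotes[(*quote_count)++] = i;
        }
    }
    return 1;
}

/* Lays both arrays out in buffer (buffer_len elements) and restarts until they fit. */
enum scan_status scan(struct restart_parser *p, const char *text,
                      size_t *buffer, size_t buffer_len,
                      size_t *bracket_count, size_t *quote_count) {
    p->attempts = 0;
    for (;;) {
        if (p->bracket_capacity > buffer_len
         || p->quote_capacity > buffer_len - p->bracket_capacity) {
            return SCAN_OUT_OF_MEMORY;
        }
        p->attempts++;
        /* The quotes array starts right after the brackets array. */
        if (scan_once(p, text, buffer, buffer + p->bracket_capacity,
                      bracket_count, quote_count)) {
            return SCAN_OK;
        }
    }
}
```

Restarting avoids having to move the second array when the first one grows, at the cost of redoing work. Since the capacities are remembered in the struct, a second call on a similar input should succeed on its first pass.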
78-
79-
## Even simpler version: structless
80-
81-
If you don't need to have several JSON files open at the same time, and so don't mind the code being stateful, you can use the branch called [structless](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/tree/structless):
82-
83-
```c
84-
int main() {
85-
char buffer[420];
86-
87-
json_init();
88-
89-
struct json_node node;
90-
91-
enum json_status status = json("foo.json", &node, buffer, sizeof(buffer));
92-
if (status) {
93-
// Handle error here
94-
exit(EXIT_FAILURE);
95-
}
96-
97-
// You can now recursively walk the JSON data in the node variable here
98-
}
99-
```
100-
101-
Its `json_init()` can't fail, and it is 461 lines of code.
102-
103-
## Simplest version: static arrays
104-
105-
Originally `json.c` was 397 lines of code, which you can still view in the branch called [static-arrays](https://github.com/MyNameIsTrez/tiny-allocationless-json-parser-in-c/tree/static-arrays):
106-
107-
```c
108-
int main() {
109-
struct json_node node;
110-
if (json("foo.json", &node)) {
111-
// Handle error here
112-
exit(EXIT_FAILURE);
113-
}
114-
115-
// You can now recursively walk the JSON data in the node variable here
116-
}
117-
```
118-
119-
It used static arrays with hardcoded sizes, whose advantages I described in my blog post titled [Static arrays are the best vectors](https://mynameistrez.github.io/2024/04/09/static-arrays-are-the-best-vectors.html).
120-
121-
There were two problems with it:
122-
1. It didn't give the user control over how the memory was allocated. If you wanted to increase, say, the maximum number of tokens that a JSON file is allowed to contain, you'd have to manually edit `#define` statements in `json.c`.
123-
2. Whenever `json_parse()` was called, its static arrays would be reset. This meant that calling the function a second time would overwrite the previous call's JSON result. This was fine if you didn't need to open more than one JSON file at a time, though. But even if you did, you could just manually copy around the arrays containing the JSON data.
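Both problems are easy to see in a stripped-down sketch of the static-array pattern. The names `MAX_TOKENS`, `tokens_push()`, `tokens_get()`, `tokens_count()` and `tokens_reset()` are hypothetical and only stand in for whatever the branch actually defines.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hardcoded limit: raising it means editing this line and recompiling. */
#define MAX_TOKENS 4

static size_t tokens[MAX_TOKENS];
static size_t token_count;

/* Called at the start of every parse, which discards the previous parse's tokens. */
void tokens_reset(void) {
    token_count = 0;
}

/* Returns 0 on success, 1 when the hardcoded limit has been reached. */
int tokens_push(size_t offset) {
    if (token_count >= MAX_TOKENS) {
        return 1;
    }
    tokens[token_count++] = offset;
    return 0;
}

size_t tokens_count(void) {
    return token_count;
}

size_t tokens_get(size_t index) {
    assert(index < token_count);
    return tokens[index];
}
```

The upside is that there is no buffer to pass around and no capacity arithmetic at all, which is a large part of why this version is the shortest.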
124-
125-
At the moment, [grug](https://mynameistrez.github.io/2024/02/29/creating-the-perfect-modding-language.html) uses this static arrays approach.
126-
127-
## Final thoughts
128-
129-
Most of the work went into adding tons of tests to ensure it has as close to 100% coverage as possible; `tests.c` is almost as large as `json.c`! The tests also act as documentation. I've fuzzed the program with libFuzzer and documented it in [the repository](https://github.com/MyNameIsTrez/tiny-json-parser-in-c)'s README. Enjoy! 🙂
10+
I wrote the library [Allocationless C89 JSON parser](https://github.com/MyNameIsTrez/allocationless-c89-json-parser). See its readme. Enjoy! 🙂
