[CHIA-3823] Add interning API. by richardkiss · Pull Request #684 · Chia-Network/clvm_rs

richardkiss · 2026-01-26T21:03:00Z

This PR adds an interning API that will be needed for the hard fork, which changes generator identity and cost to be based on the contents of the generator rather than on its serialization.

More information about the ideas behind this hard fork can be read here: https://github.com/richardkiss/generator-identity-hf-analysis/

This is the first of several PRs. The next PR is in chia_rs, which will depend on a release of clvm_rs that includes this new API. See https://github.com/richardkiss/generator-identity-hf-analysis/#installation for more explanation of the various PRs.

Note

Medium Risk
Although mostly additive, this introduces a new public API that manipulates allocator/node identity and is intended for consensus-adjacent cost/identity work, so subtle correctness/performance issues could have downstream impact.

Overview
Adds a new serde::intern API (intern + InternedTree) that rebuilds a CLVM tree into a fresh allocator while deduplicating identical atoms (by bytes) and pairs (by interned child tuple), and exposes the interned root plus ordered lists of unique atoms/pairs.
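The deduplication described above can be illustrated with a minimal, standalone sketch. It does not use the clvm_rs API: `Tree`, `Node`, `Interner`, and `demo` are hypothetical names made up for this example, and the recursion is only suitable for small trees (a production implementation would presumably traverse iteratively).

```rust
use std::collections::HashMap;

/// Toy tree standing in for a CLVM tree (hypothetical, not clvm_rs's NodePtr).
enum Tree {
    Atom(Vec<u8>),
    Pair(Box<Tree>, Box<Tree>),
}

/// Handle to an interned node: an index into the unique atom or pair list.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
enum Node {
    Atom(usize),
    Pair(usize),
}

#[derive(Default)]
struct Interner {
    atoms: Vec<Vec<u8>>,      // unique atoms, in first-seen order
    pairs: Vec<(Node, Node)>, // unique pairs, children always precede parents
    atom_index: HashMap<Vec<u8>, usize>,
    pair_index: HashMap<(Node, Node), usize>,
}

impl Interner {
    /// Returns the interned handle for `tree`, reusing existing nodes.
    fn intern(&mut self, tree: &Tree) -> Node {
        match tree {
            // Atoms are deduplicated by their bytes.
            Tree::Atom(bytes) => {
                if let Some(&i) = self.atom_index.get(bytes) {
                    return Node::Atom(i);
                }
                let i = self.atoms.len();
                self.atoms.push(bytes.clone());
                self.atom_index.insert(bytes.clone(), i);
                Node::Atom(i)
            }
            // Pairs are deduplicated by the tuple of their interned children.
            Tree::Pair(left, right) => {
                let key = (self.intern(left), self.intern(right));
                if let Some(&i) = self.pair_index.get(&key) {
                    return Node::Pair(i);
                }
                let i = self.pairs.len();
                self.pairs.push(key);
                self.pair_index.insert(key, i);
                Node::Pair(i)
            }
        }
    }
}

/// Interns ((1 . 2) . (1 . 2)) and reports (root, unique atoms, unique pairs).
fn demo() -> (Node, usize, usize) {
    let leaf = || Tree::Pair(Box::new(Tree::Atom(vec![1])), Box::new(Tree::Atom(vec![2])));
    let tree = Tree::Pair(Box::new(leaf()), Box::new(leaf()));
    let mut interner = Interner::default();
    let root = interner.intern(&tree);
    (root, interner.atoms.len(), interner.pairs.len())
}
```

In this toy model the input has four atoms and three pairs, while the interned form keeps two unique atoms and two unique pairs, because both children of the root resolve to the same interned pair.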

Introduces coverage for this behavior via unit tests (including hex fixtures and ordering expectations), a new libFuzzer target that checks serialization/tree-hash invariants and that interning doesn’t increase unique/allocated node counts, and a Criterion benchmark (benches/intern.rs) wired into Cargo.toml.

Written by Cursor Bugbot for commit 3a520de.

Copilot

Pull request overview

This PR introduces a CLVM tree interning API to deduplicate atoms and pairs, expose structured statistics for cost calculation, and provide associated tests and fuzzing to support an upcoming generator-identity hard fork.

Changes:

Added a new serde::intern module with InternedTree, InternedStats, and an intern function that builds a deduplicated allocator plus helper APIs (stats, tree hash, indices).
Expanded serde exports and tests to cover the new interning behavior, including hex-based structural tests that assert serialization and tree-hash equivalence, and atom/pair dedup counts.
Added a dedicated fuzz target for the interning API and wired it into the fuzz crate, checking serialization equality and deduplication invariants under random trees.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| `src/serde/intern.rs` | Implements the core interning algorithm, stats helpers, tree hashing, node index mapping, and unit tests for correctness and deduplication behavior. |
| `src/serde/mod.rs` | Wires the new `intern` module into the serde public API and exposes `Bytes32`, while registering the new test module. |
| `src/serde/test_intern.rs` | Adds hex-based integration tests that deserialize trees, intern them, and verify serialization equality, tree-hash equality, and expected unique atom/pair counts across various shapes. |
| `fuzz/fuzz_targets/intern.rs` | Introduces a fuzz target for interning that generates random trees, asserts serialization invariants and deduplication properties, and exercises the tree-hash path. |
| `fuzz/Cargo.toml` | Registers the new `intern` fuzz target as a binary for inclusion in the fuzzing suite. |

Copilot · 2026-01-26T21:17:38Z

@richardkiss I've opened a new pull request, #685, to work on those changes. Once the pull request is ready, I'll request review from you.

coveralls-official · 2026-01-26T21:26:04Z

Pull Request Test Coverage Report for Build 22120814463

Details

149 of 158 (94.3%) changed or added relevant lines in 2 files are covered.
No unchanged relevant lines lost coverage.
Overall coverage increased (+0.2%) to 90.66%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| --- | --- | --- | --- |
| src/serde/intern.rs | 123 | 126 | 97.62% |
| src/serde/test_intern.rs | 26 | 32 | 81.25% |

| Totals | |
| --- | --- |
| Change from base Build 22099149151: | 0.2% |
| Covered Lines: | 6824 |
| Relevant Lines: | 7527 |

💛 - Coveralls

arvidn

I'm worried about the following DoS vectors:

a tree with a very large number of small (and different) atoms, making bookkeeping costly
a tree with many atoms of moderate size (say 100 bytes or so), which makes them costly to hash in order to look them up in the hash map
a tree that's large, much larger than we allow, but still causes atoms to be duplicated, using 2x the RAM
a large tree with small (different) atoms, with no deduplication opportunities. Computing the tree hash would cause an (almost) 32x memory usage, assuming I understand correctly that every node's tree hash is cached.
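The "(almost) 32x" figure in the last point can be checked with a back-of-the-envelope estimate. The sketch below assumes that every atom has a 32-byte SHA-256 tree hash cached and that no atoms are deduplicated; it ignores pairs, allocator overhead, and hash-map overhead. `hash_cache_blowup` is a hypothetical helper, not part of clvm_rs.

```rust
/// Ratio of cached-hash bytes to atom payload bytes, assuming one cached
/// 32-byte hash per atom and no deduplication. `atom_len` must be non-zero.
fn hash_cache_blowup(atom_count: u64, atom_len: u64) -> f64 {
    const HASH_LEN: u64 = 32; // size of a SHA-256 digest in bytes
    let payload_bytes = atom_count * atom_len;
    let cache_bytes = atom_count * HASH_LEN;
    cache_bytes as f64 / payload_bytes as f64
}
```

Under these assumptions, one million distinct 1-byte atoms carry 1 MB of payload but 32 MB of cached hashes, a ratio of 32. For 100-byte atoms the ratio falls to 0.32, so the concern is specific to trees dominated by very small atoms.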

Copilot · 2026-02-07T01:35:20Z

@richardkiss I've opened a new pull request, #692, to work on those changes. Once the pull request is ready, I'll request review from you.

arvidn · 2026-02-12T15:16:56Z

Sorry, I broke this by changing some names. atom_count() and pair_count() now return the total counts, including "ghost" ones. These are the counters that are constrained by the limits.

Now you can also ask for allocated_atom_count() and allocated_pair_count(), which tell you how much RAM we're using. These counters do not affect consensus.
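The distinction between the two kinds of counter can be sketched with a toy model. This is not the clvm_rs `Allocator`: `ToyCounters` and its methods are hypothetical, and the sketch assumes that a "ghost" atom is one that counts toward the limit without being stored.

```rust
/// Toy model of the two counters (hypothetical, NOT the clvm_rs Allocator).
#[derive(Default)]
struct ToyCounters {
    allocated_atoms: usize, // atoms that actually occupy memory
    ghost_atoms: usize,     // atoms counted toward the limit but not stored
}

impl ToyCounters {
    fn add_atom(&mut self) {
        self.allocated_atoms += 1;
    }

    fn add_ghost_atom(&mut self) {
        self.ghost_atoms += 1;
    }

    /// Total count, including ghosts: the value a limit is checked against.
    fn atom_count(&self) -> usize {
        self.allocated_atoms + self.ghost_atoms
    }

    /// Only the atoms that are really stored: a measure of RAM use.
    fn allocated_atom_count(&self) -> usize {
        self.allocated_atoms
    }

    /// Limit check uses the total count, not the allocated count.
    fn within_limit(&self, max_atoms: usize) -> bool {
        self.atom_count() <= max_atoms
    }
}
```

With one stored atom and two ghost atoms, the total count in this model is 3 while the allocated count is 1, so a limit of 2 would be exceeded even though only one atom uses memory.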

arvidn

would you mind adding a benchmark for intern() as well?
we have a few generators that we benchmark treehash on, you could use those.

I don't see any major problems with this

cursor

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Copilot AI review requested due to automatic review settings January 26, 2026 21:03

Copilot started reviewing on behalf of richardkiss January 26, 2026 21:03

richardkiss requested a review from arvidn January 26, 2026 21:04

Copilot AI reviewed Jan 26, 2026

richardkiss force-pushed the generator-identity-hf branch from 7a82fda to 4241748 Compare January 26, 2026 21:17

Copilot AI mentioned this pull request Jan 26, 2026

Verify post-order traversal in test_pairs_in_post_order #685

Draft

arvidn reviewed Jan 27, 2026

cursor bot reviewed Feb 6, 2026

richardkiss force-pushed the generator-identity-hf branch from a0ae7ed to c8328c3 Compare February 7, 2026 00:47

Copilot AI mentioned this pull request Feb 7, 2026

Clarify tree hash comparison is already implemented in intern fuzzer #692

Closed

richardkiss force-pushed the generator-identity-hf branch from aabf999 to fd8698b Compare February 9, 2026 23:10

cursor bot reviewed Feb 10, 2026

danieljperry changed the title ~~Add interning API.~~ [CHIA-3823] Add interning API. Feb 12, 2026

arvidn reviewed Feb 12, 2026

cursor bot reviewed Feb 16, 2026

richardkiss force-pushed the generator-identity-hf branch 2 times, most recently from cc35f6b to 55f1753 Compare February 16, 2026 23:32

arvidn reviewed Feb 17, 2026

arvidn previously approved these changes Feb 17, 2026

richardkiss force-pushed the generator-identity-hf branch from 55f1753 to 0f29983 Compare February 17, 2026 19:55

richardkiss dismissed arvidn’s stale review via 3251a4e February 18, 2026 00:00

richardkiss force-pushed the generator-identity-hf branch from 0f29983 to 3251a4e Compare February 18, 2026 00:00

Add interning API.

3a520de

richardkiss force-pushed the generator-identity-hf branch from 3251a4e to 3a520de Compare February 18, 2026 00:05

arvidn approved these changes Feb 18, 2026

richardkiss merged commit f80ab73 into main Feb 19, 2026
32 checks passed

richardkiss deleted the generator-identity-hf branch February 19, 2026 02:26

richardkiss mentioned this pull request Feb 25, 2026

Generator identity hf richardkiss/clvm_rs#42

Closed
