[CHIA-3823] rename intern → intern_tree; add intern_tree_with_flags to propagate allocator flags #738

richardkiss wants to merge 5 commits into main
Conversation
Pull request overview
Renames the CLVM tree interning API to be more explicit and extends it to accept a caller-provided destination allocator, enabling heap-limited interning (important for mempool/DoS hardening scenarios like LIMIT_HEAP).
Changes:
- Rename `intern` → `intern_tree` and update re-exports/call sites.
- Add a `dest: Allocator` parameter so callers can enforce allocator limits during interning.
- Update tests, fuzz target, and benchmarks to use the new API.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/serde/intern.rs | Renames the function and adds a destination allocator parameter used to allocate the interned tree. |
| src/serde/mod.rs | Updates public re-export from intern to intern_tree. |
| src/serde/test_intern.rs | Updates test helper to call intern_tree(..., Allocator::new()). |
| fuzz/fuzz_targets/intern.rs | Updates fuzz target to use intern_tree with an explicit destination allocator. |
| benches/intern.rs | Updates benchmark to call intern_tree with an explicit destination allocator. |
src/serde/intern.rs
Outdated
```diff
 /// # Errors
 ///
 /// Returns an error if allocator limits are exceeded when creating new nodes.
-pub fn intern(allocator: &Allocator, node: NodePtr) -> Result<InternedTree> {
-    let mut new_allocator = Allocator::new();
+pub fn intern_tree(source: &Allocator, node: NodePtr, dest: Allocator) -> Result<InternedTree> {
+    let mut new_allocator = dest;
```
The doc comment still implies this function always "build[s] a new allocator" and the error docs mention "allocator limits" generically, but the implementation now uses the caller-provided dest allocator. Please update the docs to describe source vs dest and clarify that allocation-limit errors come from dest when creating interned nodes.
src/serde/intern.rs
Outdated
```diff
 let node = allocator.new_atom(&[1, 2, 3]).unwrap();

-let tree = intern(&allocator, node).unwrap();
+let tree = intern_tree(&allocator, node, Allocator::new()).unwrap();
```
Given the motivation is enforcing heap limits via a caller-supplied allocator, it would be good to add a regression test that passes Allocator::new_limited(...) as dest and asserts intern_tree fails with OutOfMemory once the limit is exceeded. This ensures the DoS fix remains covered.
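The suggested regression test boils down to the pattern below. This is a self-contained sketch with toy stand-ins: the real crate's `Allocator`, `NodePtr`, and `intern_tree` are replaced by placeholder types, and the `new_limited` constructor and `OutOfMemory` error name are assumptions about the real API, not quotes from it.

```rust
// Toy sketch of the limited-destination regression test. All types here are
// placeholders for the real clvm_rs ones (assumed, not actual, API).

#[derive(Debug, PartialEq)]
enum InternError {
    OutOfMemory,
}

// Toy destination allocator with a node budget, mimicking Allocator::new_limited.
struct Arena {
    nodes: usize,
    limit: usize,
}

impl Arena {
    fn new() -> Self {
        Arena { nodes: 0, limit: usize::MAX }
    }
    fn new_limited(limit: usize) -> Self {
        Arena { nodes: 0, limit }
    }
    fn alloc(&mut self) -> Result<usize, InternError> {
        if self.nodes >= self.limit {
            return Err(InternError::OutOfMemory);
        }
        self.nodes += 1;
        Ok(self.nodes - 1)
    }
}

// Toy source tree; every source node costs one destination node to intern.
enum Tree {
    Atom(Vec<u8>),
    Pair(Box<Tree>, Box<Tree>),
}

fn intern_tree(node: &Tree, dest: &mut Arena) -> Result<usize, InternError> {
    match node {
        Tree::Atom(_) => dest.alloc(),
        Tree::Pair(left, right) => {
            intern_tree(left, dest)?;
            intern_tree(right, dest)?;
            dest.alloc()
        }
    }
}

fn main() {
    // A pair of two atoms needs three destination nodes.
    let tree = Tree::Pair(
        Box::new(Tree::Atom(vec![1])),
        Box::new(Tree::Atom(vec![2, 3])),
    );

    // Unlimited destination succeeds, preserving the old behavior.
    assert!(intern_tree(&tree, &mut Arena::new()).is_ok());

    // A destination capped at 2 nodes must fail, which is the property
    // the suggested regression test would pin down.
    assert_eq!(
        intern_tree(&tree, &mut Arena::new_limited(2)),
        Err(InternError::OutOfMemory)
    );
}
```

The point of the test is the second assertion: once a caller supplies a limited destination, exceeding the cap must surface as an error rather than unbounded allocation.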
src/serde/mod.rs
Outdated
| pub use identity_hash::RandomState; | ||
| pub use incremental::{Serializer, UndoState}; | ||
| pub use intern::{InternedTree, intern}; | ||
| pub use intern::{InternedTree, intern_tree}; |
serde::intern was previously re-exported as a public API and is now removed/renamed. If this crate intends to avoid breaking downstream users in a patch release, consider keeping a #[deprecated] pub fn intern(...) wrapper that forwards to intern_tree(..., Allocator::new()), or ensure the next release/versioning clearly signals the breaking change.
Suggested change:
```rust
pub use intern::{InternedTree, intern_tree};
#[deprecated(note = "Use serde::intern_tree instead")]
pub use intern::intern_tree as intern;
```
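Note that the comment's wrapper-function approach and the suggestion's re-export approach differ: once `intern_tree` takes a `dest` argument, a plain `pub use ... as intern` cannot preserve the old two-argument signature, but a deprecated wrapper can. A self-contained sketch with placeholder types (the real `Allocator`, `NodePtr`, `InternedTree`, and error type are assumptions here):

```rust
// Sketch of a back-compat deprecated wrapper; all types are toy stand-ins
// for the real clvm_rs ones.

struct Allocator;
impl Allocator {
    fn new() -> Self {
        Allocator
    }
}
type NodePtr = usize;
type InternedTree = Vec<NodePtr>;
type Result<T> = std::result::Result<T, String>;

// New three-argument API (body is a placeholder).
pub fn intern_tree(_source: &Allocator, node: NodePtr, _dest: Allocator) -> Result<InternedTree> {
    Ok(vec![node])
}

// Deprecated shim preserving the old two-argument signature; a plain
// `pub use intern_tree as intern` would not compile for old callers,
// since the arity changed.
#[deprecated(note = "use intern_tree and pass a destination allocator")]
pub fn intern(source: &Allocator, node: NodePtr) -> Result<InternedTree> {
    intern_tree(source, node, Allocator::new())
}

fn main() {
    // Old-style two-argument call still works, with a deprecation warning.
    #[allow(deprecated)]
    let tree = intern(&Allocator, 7).unwrap();
    assert_eq!(tree, vec![7]);
}
```

Whether to carry such a shim depends on whether the crate treats the next release as breaking; the PR discussion notes no downstream users of this API yet.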
Pull Request Test Coverage Report for Build 22917665711 (Coveralls)
- Rename `intern()` → `intern_tree()` for clarity (pairs naturally with `InternedTree`)
- Add `dest: Allocator` parameter so callers can supply a heap-limited allocator instead of always getting an unlimited one (`Allocator::new()` internally)
- Update all callers to pass `Allocator::new()` to preserve existing behavior
- All tests pass (787 tests)

In chia_rs mempool validation, LIMIT_HEAP enforces a 500MB allocator cap. With this change, chia_rs can pass `make_allocator(flags)` as `dest` to enforce the same limit consistently, preventing DoS via backrefs-expanded generators.

Made-with: Cursor
…tion

Callers that need a heap-limited or otherwise configured destination allocator (e.g. `make_allocator(flags)`) can now pass a `FnOnce() -> Allocator` factory instead of a pre-constructed `Allocator`. The direct `intern_tree` API is unchanged.

Made-with: Cursor
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Factory function pattern overcomplicates simple allocator parameter
- Changed `intern_tree_with_factory` to accept a direct `Allocator` and updated the default caller to pass `Allocator::new()` so no closure factory is required.
Or push these changes by commenting:
@cursor push 7473761bf9
Preview (7473761bf9)
```diff
diff --git a/src/serde/intern.rs b/src/serde/intern.rs
--- a/src/serde/intern.rs
+++ b/src/serde/intern.rs
@@ -56,15 +56,12 @@
 /// # Errors
 ///
 /// Returns an error if allocator limits are exceeded when creating new nodes.
-pub fn intern_tree_with_factory<F>(
+pub fn intern_tree_with_factory(
     source: &Allocator,
     node: NodePtr,
-    make_dest: F,
-) -> Result<InternedTree>
-where
-    F: FnOnce() -> Allocator,
-{
-    let mut new_allocator = make_dest();
+    dest: Allocator,
+) -> Result<InternedTree> {
+    let mut new_allocator = dest;
     let mut atoms: Vec<NodePtr> = Vec::new();
     let mut pairs: Vec<NodePtr> = Vec::new();
@@ -141,7 +138,7 @@
 /// Use `intern_tree_with_factory` when you need a heap-limited or otherwise
 /// configured allocator.
 pub fn intern_tree(source: &Allocator, node: NodePtr) -> Result<InternedTree> {
-    intern_tree_with_factory(source, node, Allocator::new)
+    intern_tree_with_factory(source, node, Allocator::new())
 }
 #[cfg(test)]
```
arvidn
left a comment
> A malicious spend bundle with a backrefs-expanded generator could build an arbitrarily large interned tree before cost is checked.

Can you explain how that's possible? When we parse a back-ref serialization, we don't duplicate the trees.
If there is a problem, it seems like a simpler solution would be to parse and intern in a single pass.
src/serde/intern.rs
Outdated
```rust
pub fn intern_tree_with_factory<F>(
    source: &Allocator,
    node: NodePtr,
    make_dest: F,
```
This looks like a questionable solution. The LIMIT_HEAP flag, which is used in mempool mode, is specifically not consensus-critical. But if the hard fork starts requiring it, you're making the specific limit we use another consensus constant. Even if a new consensus constant that limits the allocator is correct, it would be best to use a separate constant: one can change without breaking consensus, the other cannot.
Replace `intern_tree_with_factory(source, node, || make_allocator(flags))` with `intern_tree_with_flags(source, node, flags)`. The new API is cleaner and constructs the destination allocator internally based on the LIMIT_HEAP flag (500MB limit when set, unlimited otherwise). The old factory-based function is kept for backward compatibility but marked as deprecated.

Made-with: Cursor
That motivation was completely wrong. I hadn't read it until now. I edited it with the real motivation. Sorry about that.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Replace `intern_tree_with_factory(source, node, || make_allocator(flags))` with `intern_tree_limited(source, node, heap_limit)`. The new API is cleaner and doesn't require clvm_rs to know about policy decisions like the 500MB mempool limit. Callers pass the heap limit size directly.

- `intern_tree(source, node)`: unlimited (uses `u32::MAX`)
- `intern_tree_limited(source, node, heap_limit)`: custom limit

The old factory-based function is kept for backward compatibility.

Made-with: Cursor
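The layering described in this commit message can be sketched as below. Everything here is a toy stand-in: the byte-slice "tree", the `String` error, and the length-based size check are placeholders, not the crate's real types; the point is only how the unlimited entry point forwards to the limited one so the crate carries no policy constant.

```rust
// Sketch of the two-entry-point layering: intern_tree forwards to
// intern_tree_limited with an effectively-unlimited cap. Toy types throughout.

type Result<T> = std::result::Result<T, String>;

fn intern_tree_limited(data: &[u8], heap_limit: u32) -> Result<usize> {
    // Stand-in for building into a heap-limited destination allocator:
    // fail once the interned output would exceed the cap.
    if data.len() as u64 > heap_limit as u64 {
        return Err("out of memory: heap limit exceeded".to_string());
    }
    Ok(data.len())
}

// Default entry point keeps the old unlimited behavior by passing u32::MAX,
// so no policy decision (like a 500MB mempool cap) lives in this crate.
fn intern_tree(data: &[u8]) -> Result<usize> {
    intern_tree_limited(data, u32::MAX)
}

fn main() {
    let generator = vec![0u8; 1024];

    // Default path: effectively unlimited.
    assert!(intern_tree(&generator).is_ok());

    // A mempool caller supplies its own limit and sees the failure.
    assert!(intern_tree_limited(&generator, 512).is_err());
}
```

This shape also answers the earlier consensus-constant concern: the limit is an argument chosen by the caller, not a constant baked into clvm_rs.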
Force-pushed from 1e9c521 to bf6b8b3.
The name of this PR still reflects the previous patch. I agree with renaming the function. The motivation for the heap limit isn't really explained. The heap size is already constrained by the input size; how can this ever cause a heap size explosion? The motivation in the PR description describes the current state. We already have this API.

Is the issue that after interning, we want to execute the program in mempool mode, and it will use the same allocator?

Summary

- Rename `intern` → `intern_tree` for clarity (`intern` was ambiguous as verb/noun; `intern_tree` pairs naturally with `InternedTree`)
- Add a `dest: Allocator` parameter so callers can supply a heap-limited allocator (e.g. `make_allocator(flags)` in chia_rs) instead of always getting an unlimited one

Motivation

Provide an API to chia_rs so it can intern deserialized generators to canonicalize their cost, which is now based only on counts for interned content.

Migration

All existing callers pass `Allocator::new()` as `dest` to preserve previous behavior. (Plus, no one uses this yet.)

Made with Cursor
Note

Medium Risk

Public API rename and new allocator-limiting path could break downstream callers and introduce limit-related failures if misused, though core logic changes are localized to interning.

Overview

Renames the CLVM interning entrypoint from `intern` to `intern_tree` and introduces `intern_tree_limited(source, node, heap_limit)` to build the interned tree into a heap-limited destination allocator (via `Allocator::new_limited`).

Updates public exports (`src/serde/mod.rs`), benchmarks, fuzz target, and tests to use the new API while preserving the default behavior via `intern_tree` (calls the limited version with an effectively-unlimited limit).

Written by Cursor Bugbot for commit 2420910. This will update automatically on new commits.