feat(zkvm): self-describing framed proof serialization#1267
feat(zkvm): self-describing framed proof serialization#1267
Conversation
tracer/src/emulator/mmu.rs
Outdated
| u64::from_le_bytes(pre_value_bytes) | ||
| }; | ||
|
|
||
| if effective_address <= RAM_START_ADDRESS - 8 { |
There was a problem hiding this comment.
let's remove these changes since these examples should be working now without crashing. A write outside our defined memory regions is almost always just a write to a null ptr, which means something is actually wrong.
Introduce a self-describing framed encoding for proofs and the minimal transport helpers. This removes dependence on brittle enum counts and enables strict parsing with length caps. Co-authored-by: Cursor <cursoragent@cursor.com>
Hoist File import to the top-level import block. Co-authored-by: Cursor <cursoragent@cursor.com>
Add write_framed_section!/framed_section_size!/read_singleton! macros to eliminate repetitive stage serialization in JoltProof. Collapse 6 per-tag size caps into a single MAX_SECTION_LEN, inline thin transport wrapper functions, compact VirtualPolynomial serialized_size to a wildcard match, and move bundle framing constants to transport.rs. Co-authored-by: Cursor <cursoragent@cursor.com>
ba45a3c to
05243f9
Compare
…ntext - Embed format version (v1) in the proof signature so future format changes produce a clear mismatch instead of opaque InvalidData. - Restore per-tag section caps (params: 16 KiB, commitments/claims: 256 MiB, stages: 512 MiB) instead of a single 512 MiB cap. - Add descriptive error messages to all deserialization failure paths: duplicate sections, unknown tags, trailing bytes, missing fields. - Remove unused bundle framing constants from transport.rs (will be re-added when the recursion example PR lands). Co-authored-by: Cursor <cursoragent@cursor.com>
3110af8 to
f3112b3
Compare
Resolve conflict in proof_serialization.rs: update framed format to support C: JoltCurve type param, cfg-gated blindfold_proof/ opening_claims, UniSkipFirstRoundProofVariant, and remove bytecode_K. Made-with: Cursor
…caps, hardened deserialization Replace TLV tag-dispatch with sequential length-prefixed format: no tags, no Option temporaries, no require! macro. Extract shared payload-length helpers to eliminate serialize/serialized_size duplication. Restore exhaustive VirtualPolynomial::serialized_size() match. Harden deserialization: reject trace_length=0, reject duplicate opening-claim keys, tighten all section caps to 128 KiB (params 1 KiB), entry count cap 10K. Separate magic from version byte for specific "unsupported version" errors. Simplify transport.rs: remove dead code (skip_exact, read_u8_opt, frame headers), add read_section helper, add unit tests. Remove leaked SDK re-export of transport module. Made-with: Cursor
There was a problem hiding this comment.
Pull request overview
This PR refactors zkVM proof serialization by introducing a dedicated transport layer (magic/version + varint section lengths) and moving JoltProof/OpeningId serialization from derive-macro-based encoding to explicit, framed encoding intended to be stricter and more DoS-resistant.
Changes:
- Added
zkvm::transportwith varint u64 read/write, magic+version header helpers, and section-length capping. - Rewrote
JoltProofcanonical ser/de to a manual, length-delimited section format with per-section caps and intra-section trailing-byte checks. - Reworked
OpeningIdencoding into a packed header byte with an escape hatch for larger sumcheck IDs; added unit tests.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
jolt-core/src/zkvm/transport.rs |
New transport helpers for magic/version and varint-length-delimited sections (plus tests). |
jolt-core/src/zkvm/proof_serialization.rs |
Manual proof framing + stricter parsing checks; new packed OpeningId encoding; added tests. |
jolt-core/src/zkvm/mod.rs |
Exposes the new transport module. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| pub fn read_varint_u64<R: Read>(r: &mut R) -> io::Result<u64> { | ||
| let mut x = 0u64; | ||
| let mut shift = 0u32; | ||
| for _ in 0..VARINT_U64_MAX_BYTES { | ||
| let mut b = [0u8; 1]; | ||
| r.read_exact(&mut b)?; | ||
| let byte = b[0]; | ||
| x |= ((byte & 0x7F) as u64) << shift; | ||
| if (byte & 0x80) == 0 { | ||
| return Ok(x); | ||
| } | ||
| shift += 7; | ||
| } |
There was a problem hiding this comment.
read_varint_u64 only rejects inputs longer than 10 bytes, but it does not detect numeric overflow within 10 bytes (e.g., a 10th byte with bits beyond the single allowed MSB can wrap when shifted by 63). Add an explicit overflow check for the final byte/shift so invalid encodings that exceed u64::MAX are rejected instead of silently wrapping.
| transport::write_magic_version(&mut writer, PROOF_MAGIC, PROOF_VERSION).map_err(io_err)?; | ||
|
|
||
| let params_len = self.params_payload_len(compress); | ||
| transport::write_varint_u64(&mut writer, params_len).map_err(io_err)?; | ||
| transport::write_varint_u64(&mut writer, self.trace_length as u64).map_err(io_err)?; | ||
| transport::write_varint_u64(&mut writer, self.ram_K as u64).map_err(io_err)?; | ||
| self.rw_config.serialize_with_mode(&mut writer, compress)?; | ||
| self.one_hot_config | ||
| .serialize_with_mode(&mut writer, compress)?; | ||
| self.dory_layout | ||
| .serialize_with_mode(&mut writer, compress)?; | ||
|
|
||
| let commitments_len = self.commitments_payload_len(compress); | ||
| transport::write_varint_u64(&mut writer, commitments_len).map_err(io_err)?; | ||
| transport::write_varint_u64(&mut writer, self.commitments.len() as u64).map_err(io_err)?; | ||
| for c in &self.commitments { | ||
| c.serialize_with_mode(&mut writer, compress)?; | ||
| } | ||
| match &self.untrusted_advice_commitment { | ||
| None => writer.write_all(&[0]).map_err(io_err)?, | ||
| Some(c) => { | ||
| writer.write_all(&[1]).map_err(io_err)?; | ||
| c.serialize_with_mode(&mut writer, compress)?; | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
The PR description and wire-format spec describe a TLV framing ([tag: u8][len: varint][payload]) with unknown-tag/duplicate detection, but the implementation here writes a fixed sequence of varint-length-prefixed sections without any per-section tag. Either update the implementation to include tags and parse frames by tag (enabling strict unknown/duplicate handling), or update the PR description/title/spec to match the actual on-wire format.
| const MAX_PARAMS_LEN: u64 = 1024; | ||
| const MAX_SECTION_LEN: u64 = 128 * 1024; | ||
|
|
There was a problem hiding this comment.
MAX_SECTION_LEN is set to 128 * 1024 (128 KiB), which doesn’t match the PR description’s stated 512 MiB per-section cap and may be too small for real proofs as trace sizes grow. Please reconcile the cap with the intended wire-format spec (and ideally make caps per-section / documented) so valid proofs aren’t rejected during deserialization.
| Ok(Self { | ||
| commitments, | ||
| stage1_uni_skip_first_round_proof, | ||
| stage1_sumcheck_proof, | ||
| stage2_uni_skip_first_round_proof, | ||
| stage2_sumcheck_proof, | ||
| stage3_sumcheck_proof, | ||
| stage4_sumcheck_proof, | ||
| stage5_sumcheck_proof, | ||
| stage6_sumcheck_proof, | ||
| stage7_sumcheck_proof, | ||
| #[cfg(feature = "zk")] | ||
| blindfold_proof, | ||
| joint_opening_proof, | ||
| untrusted_advice_commitment, | ||
| #[cfg(not(feature = "zk"))] | ||
| opening_claims, | ||
| trace_length, | ||
| ram_K, | ||
| rw_config, | ||
| one_hot_config, | ||
| dory_layout, | ||
| }) | ||
| } |
There was a problem hiding this comment.
deserialize_with_mode enforces trailing-byte consumption within each length-delimited section, but it doesn’t validate that the overall proof input ends exactly after the final section (extra bytes after the last section would be ignored). If the goal is a strict, malformed-input-resistant wire format, add a final EOF check after parsing all expected sections.
…fication, ZK discriminator - Propagate I/O errors in EOF check instead of silently swallowing them - Unify standalone Claims ser/de with JoltProof's varint-based format; delegate via write_section!/read_single_section! instead of duplicating - Add debug_assert! in write_section! to catch ser/de cap mismatches - Add flags byte to wire header (bit 0 = is_zk) with clear cross-mode error messages; shorten magic from b"JOLTPRF" to b"JOLT" - Move `use std::fs::File` back inside function scope - Expand OpeningId roundtrip tests to cover all parameterized variants (CommittedPolynomial, VirtualPolynomial) x multiple SumcheckId values Made-with: Cursor
|
let's see if this works: /claude-review |
|
@claude review this PR |
|
I'll analyze this and get back to you. |
|
@claude review please. you seem to be stuck. |
|
I'll analyze this and get back to you. |
| macro_rules! write_section { | ||
| ($w:expr, $c:expr, $($item:expr),+ $(,)?) => {{ | ||
| let len: u64 = 0 $(+ $item.serialized_size($c) as u64)+; | ||
| debug_assert!(len <= MAX_SECTION_LEN, "section size {len} exceeds MAX_SECTION_LEN"); |
There was a problem hiding this comment.
A release-mode prover can produce proofs it can't read back. Should be assert! or return Err.
| let n = transport::read_varint_u64(&mut limited).map_err(io_err)?; | ||
| let n_usize = usize::try_from(n).map_err(|_| SerializationError::InvalidData)?; | ||
| if n_usize > 10_000 { | ||
| return Err(SerializationError::InvalidData); |
There was a problem hiding this comment.
Should be a named constant like MAX_CLAIMS_COUNT. Also 10,000 × 384 bytes/GT = ~3.8 MB which exceeds the 128 KiB section cap enforced by read_section above. The section cap is the real constraint, no? (~341 commitments max).
| let small = header & 0x3F; | ||
|
|
||
| let sumcheck_u64 = if small == OPENING_ID_SUMCHECK_ESCAPE { | ||
| transport::read_varint_u64(&mut reader).map_err(io_err)? |
There was a problem hiding this comment.
Accepts non-canonical encodings. i.e sumcheck_id=0 can be encoded inline (1 byte) or via escape+varint (2+ bytes). Add after the varint read:
if sumcheck_u64 < OPENING_ID_SUMCHECK_ESCAPE as u64 {
return Err(SerializationError::InvalidData);
} | pub struct Claims<F: JoltField>(pub Openings<F>); | ||
|
|
||
| #[cfg(not(feature = "zk"))] | ||
| const MAX_CLAIMS_COUNT: u64 = 10_000; |
There was a problem hiding this comment.
10,000 × ~33 bytes/claim = ~330 KB which exceeds MAX_SECTION_LEN (128 KiB). The section cap is the binding constraint (~3,700 claims max). These should be harmonized.
jolt-core/src/zkvm/mod.rs
Outdated
| pub mod ram; | ||
| pub mod registers; | ||
| pub mod spartan; | ||
| pub mod transport; |
There was a problem hiding this comment.
should be pub(crate)? it's internal?
jolt-core/src/zkvm/transport.rs
Outdated
|
|
||
| #[inline] | ||
| pub fn read_magic_version<R: Read>(r: &mut R, magic: &[u8]) -> io::Result<u8> { | ||
| let mut buf = vec![0u8; magic.len()]; |
There was a problem hiding this comment.
Heap-allocates for a 4-byte read. Use let mut buf = [0u8; 4] instead.
| } | ||
|
|
||
| #[cfg(test)] | ||
| mod tests { |
There was a problem hiding this comment.
We should have proper e2e tests - maybe also modify examples to use this new formatting.
Side note: the Dory crate does Vec::with_capacity(num_rounds) where num_rounds is an untrusted u32 from the stream, before any data hits the Take reader. An attacker can trigger a ~7.8 TB allocation. The section cap limits I/O but not the upfront allocation. Worth adding a sanity check on the num_rounds in the proof.
|
Posted by Codex assistant (model: GPT-5.4, reasoning effort: xhigh) on behalf of the user (Quang Dao) with approval. I pushed a pass addressing the active review comments. Changes:
Checks run:
I didn’t touch the old |
…rmat # Conflicts: # jolt-core/src/zkvm/proof_serialization.rs

Summary
Replaces derive-macro-based
CanonicalSerialize/CanonicalDeserializeforJoltProofwith a manual, length-delimited section format intended for strictness and DoS resistance.Wire format
`[magic: 4B "JOLT"][version: 1B][flags: 1B][section₀][section₁]…`
Each section is `[varint payload_len][payload bytes]`. Sections are sequential and untagged — the deserializer reads them in the fixed order defined by the proof schema.
Header (6 bytes):
b"JOLT"(4 bytes)1(1 byte)is_zk(ZK mode vs standard), bits 1–7 reserved (must be 0)Changes
zkvm::transport— new module with varint u64 read/write, magic+version header helpers, and capped section-length readingJoltProofser/de — manual, length-delimited section format with:debug_assert!on write to catch cap mismatches during developmentOpeningIdencoding — packed header byte (2-bit kind + 6-bit sumcheck_id) with varint escape for sumcheck_id ≥ 63Claimsunification — standaloneClaimsser/de now uses varint count (matching JoltProof's inline format); JoltProof delegates to it viawrite_section!/read_single_section!Security / DoS properties
Test plan
cargo clippyclean in both--features hostand--features host,zkmuldive2e test passes in both modes