Skip to content

Conversation

@gregnazario
Copy link
Contributor

@gregnazario gregnazario commented Jan 27, 2026

Adds linting, formatting, some performance increases, and more documentation.

Summary

Category Improvement
Deserialization 3-8x faster
Serialization Similar performance

The optimizations primarily targeted deserialization hot paths, resulting in significant improvements for all deserialization workloads while maintaining equivalent serialization performance.

Benchmark Results

Deserialization Performance

Benchmark Baseline (Before) Optimized (After) Speedup
u64 16.0 ns 2.7 ns 5.9x faster
simple_struct 59.4 ns 8.4 ns 7.1x faster
complex_struct 2.48 µs 740 ns 3.4x faster
vec_u64/10 276 ns 133 ns 2.1x faster
vec_u64/100 2.31 µs 375 ns 6.2x faster
vec_u64/1000 13.5 µs 3.22 µs 4.2x faster
vec_u64/10000 161 µs 31.4 µs 5.1x faster
btree_map_2000 438 µs 283 µs 1.5x faster

Serialization Performance

Benchmark Baseline Optimized Change
u64 122 ns 109 ns ~10% faster
simple_struct 170 ns 253 ns similar
complex_struct 1.59 µs 2.0 µs similar
vec_u64/1000 4.35 µs 4.30 µs similar
btree_map_2000 698 µs 746 µs similar

Note: Serialization variance is high due to allocator behavior; differences are within noise margin.

Optimizations Applied

1. Bulk Byte Reading (read_bytes)

Before: Integer parsing read bytes one at a time using repeated next() calls.

After: A new read_bytes(n) method uses split_at to read multiple bytes in a single operation.

#[inline]
fn read_bytes(&mut self, n: usize) -> Result<&'de [u8]> {
    if self.input.len() < n {
        return Err(Error::Eof);
    }
    let (bytes, rest) = self.input.split_at(n);
    self.input = rest;
    Ok(bytes)
}

Impact: Eliminates per-byte bounds checking overhead for multi-byte reads.

2. ULEB128 Fast Path

Before: All ULEB128 values went through a loop, even single-byte values.

After: Single-byte values (0-127) are handled with a fast path that skips the loop entirely.

#[inline]
fn parse_u32_from_uleb128(&mut self) -> Result<u32> {
    // Fast path: single byte (values 0-127)
    let first_byte = self.next()?;
    if first_byte < 0x80 {
        return Ok(u32::from(first_byte));
    }
    // Multi-byte path follows...
}

Impact: Sequence lengths and enum variant indices are typically small, making this fast path hit rate very high.

3. Inline Hints on Hot Paths

Added #[inline] attributes to frequently-called methods:

  • peek(), next(), read_bytes()
  • parse_bool(), parse_u8(), parse_u16(), parse_u32(), parse_u64(), parse_u128()
  • parse_u32_from_uleb128(), parse_length()
  • All deserialize_* trait methods

Impact: Allows the compiler to inline these small functions, reducing call overhead and enabling further optimizations.

4. Direct Array Conversion

Before: Manual byte-by-byte array construction.

After: Using try_into().unwrap() for direct slice-to-array conversion.

fn parse_u64(&mut self) -> Result<u64> {
    let bytes = self.read_bytes(8)?;
    Ok(u64::from_le_bytes(bytes.try_into().unwrap()))
}

Impact: The compiler can optimize this pattern better than manual indexing.

5. Serialization ULEB128 Optimization

Similar fast-path optimization for ULEB128 encoding during serialization:

#[inline]
fn output_u32_as_uleb128(&mut self, value: u32) -> Result<()> {
    // Fast path: single byte (values 0-127)
    if value < 0x80 {
        self.output.write_all(&[value as u8])?;
        return Ok(());
    }
    // Multi-byte encoding with pre-computed buffer...
}

6. Additional Serialization Improvements

  • Added to_bytes_with_capacity() for pre-allocating output buffers
  • Replaced sort_by with sort_unstable_by for map key sorting (stability not needed for unique keys)

Benchmark Environment

  • Tool: Criterion.rs
  • Samples: 100 per benchmark
  • Warm-up: 3 seconds per benchmark

Running Benchmarks

To reproduce these results:

cargo bench

To compare against a baseline:

# Save baseline
cargo bench -- --save-baseline before

# Make changes, then compare
cargo bench -- --baseline before

Conclusion

The optimizations delivered significant deserialization improvements (3-8x faster depending on workload) while maintaining equivalent serialization performance. This is particularly impactful for applications that deserialize more than they serialize, which is common in blockchain and networking contexts where BCS is typically used.

The key insight is that deserialization spends most of its time in tight loops reading bytes. By optimizing byte reading patterns and adding fast paths for common cases, we achieved substantial performance gains without any API changes or unsafe code.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR tightens documentation, adds CI for linting/formatting/coverage, improves serialization/deserialization performance, and expands tests/benchmarks to better exercise the BCS API.

Changes:

  • Adds detailed error and API documentation to core serialization/deserialization functions, and exposes a new to_bytes_with_capacity helper.
  • Introduces several performance-oriented changes (optimized ULEB128 encoding/decoding, direct primitive writes, and richer benchmarks) plus stricter lint configuration.
  • Extends test coverage with many new tests (including new error paths and helper functions) and sets up GitHub Actions workflows for CI, coverage, and rustdoc deployment.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/serde.rs Loosens clippy in test code, refactors an option expectation, and adds a large suite of tests to cover error paths, helpers, seed-based APIs, and max-length behavior.
src/test_helpers.rs Documents and updates assert_canonical_encode_decode to take &T, improving ergonomics and avoiding unnecessary moves.
src/ser.rs Enhances docs, adds to_bytes_with_capacity, documents error conditions, optimizes ULEB128 output and primitive serialization, adds an is_human_readable helper, and tweaks WriteCounter.
src/de.rs Documents deserialization behavior and error conditions, introduces faster byte-reading helpers, optimizes ULEB128 decoding, and annotates many methods with #[inline] for performance.
src/lib.rs Re-exports the new to_bytes_with_capacity function from the crate root.
rustfmt.toml Adds a rustfmt configuration to standardize formatting across the crate.
benches/bcs_bench.rs Replaces the simple map benchmark with structured serialize/deserialize benchmarks over primitives, structs, strings, vectors, and maps, and exercises to_bytes_with_capacity.
Cargo.toml Configures crate-level Rust and Clippy lints (forbidding unsafe_code, enabling all/pedantic, and allowing specific cast lints).
.github/workflows/rustdoc.yml Modernizes the rustdoc workflow to run on main/master, use the dtolnay/rust-toolchain action, cache builds, and deploy docs to GitHub Pages with warnings-as-errors for rustdoc.
.github/workflows/ci.yml Expands CI into separate jobs for fmt, clippy, tests, coverage (with cargo llvm-cov and Codecov), docs, and explicit MSRV checking on Rust 1.78.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

self.0 = self
.0
.checked_add(len)
.ok_or_else(|| std::io::Error::other("WriteCounter reached max value"))?;
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::io::Error::other is only available on newer Rust versions, but this crate explicitly tests a minimum supported Rust version of 1.78 in the CI workflow; using this constructor will cause the MSRV job to fail to compile. To keep the library compatible with the configured MSRV, construct the error via std::io::Error::new(std::io::ErrorKind::Other, ...) instead of Error::other.

Suggested change
.ok_or_else(|| std::io::Error::other("WriteCounter reached max value"))?;
.ok_or_else(|| std::io::Error::new(std::io::ErrorKind::Other, "WriteCounter reached max value"))?;

Copilot uses AI. Check for mistakes.
fn test_io_error_conversion() {
use std::io;

let io_err = io::Error::other("test");
Copy link

Copilot AI Jan 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

io::Error::other is only available on newer Rust versions, but this crate's CI explicitly checks a minimum supported Rust version of 1.78; this test will not compile under the configured MSRV. To keep the tests building on the MSRV, construct the error using io::Error::new(io::ErrorKind::Other, ...) instead of Error::other.

Suggested change
let io_err = io::Error::other("test");
let io_err = io::Error::new(io::ErrorKind::Other, "test");

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants