Skip to content

treap, ffldb: improve pool recycling, add batched Delete#2513

Open
Roasbeef wants to merge 4 commits intomasterfrom
treap-pool-improvements
Open

treap, ffldb: improve pool recycling, add batched Delete#2513
Roasbeef wants to merge 4 commits intomasterfrom
treap-pool-improvements

Conversation

@Roasbeef
Copy link
Copy Markdown
Member

Change Description

In this PR, we improve the sync.Pool-based node recycling introduced in #2425 and extend the same optimization to Delete operations.

Faster Recycling Check in Put()

The original Put() recycling loop called newTreap.get(node.key) for each intermediate node to check if it was still live in the latest treap version. This worked correctly but cost O(depth × log n) key comparisons per batch iteration. We replace this with a pointer-set approach: put() now returns both the set of newly created nodes and the set of original source nodes that were cloned (the "replaced" set). The recycling loop checks pointer identity via nodeInSet(), which is just a linear scan of a [staticDepth]*treapNode array. This brings the cost down to O(depth²) pointer comparisons, i.e. ~400 pointer equality checks for a typical depth of 20, versus ~8000 key comparisons before.

We also document the MVCC snapshot safety argument directly in the Put() godoc. The core insight is that intermediate clones are always fresh pool allocations, never shared with any pre-existing snapshot. Only clones that were subsequently re-cloned (and thus replaced in the new treap) are candidates for recycling. Clones still structurally shared with the new treap are left alone.

To gain confidence in this safety argument, we built a Quint formal model (TLA+-inspired) that models the treap as a set of node IDs, non-deterministically applies batched puts, deletes, snapshot captures, and snapshot releases, then checks three invariants across 30,000 randomized traces: no recycled node appears in any snapshot, in the current treap, or anywhere live. All three invariants passed. The model confirmed that fresh clone IDs are always >= nextId at allocation time, so they can never collide with any pre-existing snapshot, which is exactly what allows us to use the simpler pointer-set check instead of a full treap traversal.

Batched Delete() with Pool Recycling

Previously, Delete accepted a single key and allocated cloned nodes from the pool via cloneTreapNode without ever returning them. During IBD, commitTx issued many individual Delete calls inside ForEach loops, creating a one-way pool drain where nodes were pulled from the pool but never recycled back.

We extend Delete to accept variadic keys (Delete(keys ...[]byte)), mirroring Put's batched API. The internal delete() method returns the same (created, replaced) arrays, enabling the same inter-step recycling. The commitTx callsite in dbcache.go now collects deletion keys into slices and issues single batched calls. We also fix the pendingRemove path to pass nil as the value in KVPair rather than carrying the actual value, since the remove treap only tracks keys.

Test Fixes

We fix three test error messages that printed i+1 as the expected value when the actual expected value was (i+1)*keyCount. We also add single-key and small-batch (3 keys) snapshot immutability tests that exercise the recycling logic at the granularity where intermediate node recycling actually occurs. The small-batch test additionally verifies that all 1000 original keys in the snapshot remain accessible with correct values after a Put with recycling.

Benchmarks

goos: darwin
goarch: arm64
cpu: Apple M4 Max

                                   │    old     │              new               │
                                   │   B/op     │    B/op      vs base           │
ImmutableDeleteBatch/batch=10      │  2.932Ki   │  1.251Ki   -57.33% (p=0.002)  │
ImmutableDeleteBatch/batch=100     │  67.66Ki   │  42.03Ki   -37.89% (p=0.002)  │
ImmutableDeleteBatch/batch=1000    │  1323.5Ki  │  925.6Ki   -30.07% (p=0.002)  │
ImmutableMixedPutDelete            │  605.8Ki   │  524.3Ki   -13.46% (p=0.002)  │

                                   │  allocs/op │  allocs/op   vs base           │
ImmutableDeleteBatch/batch=10      │    44.50   │    23.00   -48.31% (p=0.002)  │
ImmutableDeleteBatch/batch=100     │    935.5   │    607.5   -35.06% (p=0.002)  │
ImmutableDeleteBatch/batch=1000    │   17.63k   │   12.53k   -28.90% (p=0.002)  │
ImmutableMixedPutDelete            │   8.274k   │   7.230k   -12.61% (p=0.002)  │

The batched Delete shows 30-57% reduction in bytes allocated and 29-48% fewer allocations. The mixed Put+Delete benchmark (which mirrors the commitTx pattern) shows a 13% reduction in both memory and allocations.

Put performance is roughly neutral, as expected: the pointer-set approach trades O(log n) key comparisons for O(depth) pointer comparisons, which are similar in practice for typical treap sizes.

See each commit message for a detailed description w.r.t the incremental changes.

Steps to Test

go test ./database/internal/treap/ -v -race -count=1
go test ./database/ffldb/ -v -race -short -count=1
go test ./database/internal/treap/ -bench=BenchmarkImmutable -benchmem

Pull Request Checklist

Testing

  • Your PR passes all CI checks.
  • Tests covering the positive and negative (error paths) are included.

Code Style and Documentation

  • The change obeys the Code Documentation and Commenting guidelines, and lines wrap at 80.
  • Commits follow the Ideal Git Commit Structure.

In this commit, we improve the node recycling mechanism in the immutable
treap's batched Put operation and extend the same optimization to Delete.

The original recycling check called newTreap.get(node.key) for each
intermediate node to determine whether it was still live in the latest
treap. This involved O(depth * log n) key comparisons per batch
iteration. We replace this with a pointer-set approach: put() now
returns both the set of newly created nodes and the set of original
source nodes that were cloned (the "replaced" set). The recycling loop
then checks pointer identity via nodeInSet(), reducing the cost to
O(depth^2) pointer comparisons (~400 ops for a typical depth of 20,
versus ~8000 key comparisons previously).

Additionally, we document the MVCC snapshot safety argument directly in
the Put() godoc. The key insight is that intermediate clones are always
fresh pool allocations, never shared with any pre-existing snapshot.
Only clones that were subsequently re-cloned (and thus replaced) are
recycled, while clones still structurally shared with the new treap are
left untouched.

We also extend Delete to accept variadic keys, mirroring Put's batched
API. The internal delete() method returns the same (created, replaced)
arrays, enabling the same inter-step recycling. Previously, Delete only
accepted a single key and allocated cloned nodes from the pool without
ever returning them, creating asymmetric pool drain during IBD when
commitTx issued many individual Delete calls.

Finally, Put now guards against zero-length input with an early return.
In this commit, we update commitTx to collect deletion keys into slices
and issue single batched Delete calls rather than calling Delete
individually inside the ForEach loop. This allows the treap's
inter-step node recycling to reclaim intermediate clones that were
previously leaked back to the GC.

We also fix the pendingRemove path to pass nil as the value in KVPairs
rather than the actual value from the mutable treap. The remove treap
only tracks keys for deletion and never uses the stored values, so
retaining references to them was unnecessary.
In this commit, we fix three test error messages that printed the wrong
expected value on failure. The Len check compared against (i+1)*keyCount
but the error message only printed i+1, which would have been misleading
if the assertion ever fired.

We also add two new snapshot immutability tests to the existing
TestImmutableSnapshot function:

A single-element Put test verifies that inserting one key via
Put(KVPair{k,v}) does not corrupt a snapshot taken before the
insertion. This exercises the recycling path where prevCreated is empty
and no recycling should occur.

A small-batch (3 keys) test exercises the recycling logic at a
granularity where intermediate node recycling occurs between the 2nd
and 3rd insertions. After the batch Put, we verify that all 1000
original keys in the snapshot remain accessible with correct values,
confirming that recycling does not corrupt structurally shared nodes.
In this commit, we add four benchmark functions to measure the
performance of the immutable treap's pool-based node recycling:

BenchmarkImmutablePutBatch exercises batched Put at batch sizes of 1,
10, 100, and 1000 keys using SHA-256 hashed keys to produce unordered
insertion patterns.

BenchmarkImmutableDeleteBatch measures batched Delete at the same
scales, first building a treap and then deleting all keys in one batch.

BenchmarkImmutablePutSequential simulates the IBD hot path by
performing 10 rounds of 100-key batch inserts into a growing treap,
exercising recycling across multiple Put calls.

BenchmarkImmutableMixedPutDelete mirrors the commitTx pattern in
dbcache.go by inserting 500 keys then deleting 250, measuring the
combined allocation overhead of interleaved batch operations.
@coveralls
Copy link
Copy Markdown

Pull Request Test Coverage Report for Build 23569826387

Details

  • 119 of 120 (99.17%) changed or added relevant lines in 2 files are covered.
  • 6 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.07%) to 56.299%

Changes Missing Coverage Covered Lines Changed/Added Lines %
database/internal/treap/immutable.go 108 109 99.08%
Files with Coverage Reduction New Missed Lines %
peer/peer.go 3 72.41%
rpcclient/infrastructure.go 3 47.89%
Totals Coverage Status
Change from base Build 23569410477: 0.07%
Covered Lines: 32040
Relevant Lines: 56910

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants