fix: catch-up stores during init #947

Merged
scarmuega merged 3 commits into main from fix/catch-up-store-during-init
Mar 18, 2026

Conversation

scarmuega (Member) commented Mar 18, 2026

Summary by CodeRabbit

  • New Features
    • Enhanced block indexing with richer transaction metadata, inputs/outputs, scripts, datums, certificates, and redeemers.
    • Added catch-up computation to produce combined UTxO and index deltas for recovery/bootstrapping.
  • Refactor
    • Centralized per-block indexing into a single shared indexing path and commit step.
  • Tests
    • Added integration test validating bootstrap catch-up restores archive and index state.
  • Chores
    • Updated test targets and feature flags for tooling and debugging.

coderabbitai bot commented Mar 18, 2026

📝 Walkthrough

Consolidates per-block indexing into a new CardanoIndexDeltaBuilder::index_block and wires it into a catch-up path (CardanoLogic::compute_catchup, bootstrap store replay, and the WorkBatch index commit) that synchronizes the archive and index stores from the WAL during startup.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Chain interface & types: crates/core/src/lib.rs | Adds CatchUpBlockData struct and ChainLogic::compute_catchup trait method to support computing catch-up payloads from WAL blocks. |
| Cardano index builder: crates/cardano/src/indexes/delta.rs | Adds pub fn index_block(&mut self, block, resolved_inputs) which indexes whole-block data: tx hashes, metadata labels, spent inputs (with resolutions), outputs (addresses, assets, datums, script refs), script hashes (native & Plutus v1/v2/v3), plutus data hashes, certificates, and redeemers. |
| Cardano chain logic: crates/cardano/src/lib.rs | Implements compute_catchup for CardanoLogic: decodes block & inputs, computes UTxO apply-delta, builds index delta with CardanoIndexDeltaBuilder, and returns CatchUpBlockData. |
| Batching / Work batch: crates/cardano/src/roll/batch.rs | Replaces manual per-transaction archive/index assembly with a single builder.index_block(&block, &self.utxos_decoded) call; removes duplicated per-tx indexing and adds WorkBatch::commit_indexes to apply index deltas. |
| Bootstrap catch-up: crates/core/src/bootstrap.rs, tests/bootstrap.rs | Adds catch-up functions (catch_up_stores, catch_up_archive, catch_up_indexes) to replay WAL entries and sync archive/index during bootstrap; includes an integration test validating catch-up behavior. |
| Misc / WAL bounds: crates/redb3/src/wal/mod.rs, Cargo.toml | Safe boundary handling and inclusive range adjustments in WAL locate/slot logic; adds test target and new feature flags in Cargo.toml. |
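
To make the shape of the new interface more concrete, here is a minimal sketch of how a catch-up payload and the trait method that produces it could fit together. Only the names CatchUpBlockData, ChainLogic, compute_catchup, and the three field names (taken from the sequence diagram) come from this PR; the field types, the error type, the method signature, and the ToyLogic implementation are assumptions for illustration and are not the definitions in crates/core/src/lib.rs.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real delta types (assumptions).
type UtxoDelta = Vec<String>;
type IndexDelta = HashMap<String, Vec<u64>>;

/// Payload produced for one WAL block during catch-up.
/// Field names follow the sequence diagram; the types are assumed.
#[derive(Debug, Default, PartialEq)]
pub struct CatchUpBlockData {
    pub utxo_delta: UtxoDelta,
    pub index_delta: IndexDelta,
    pub tx_hashes: Vec<String>,
}

/// Simplified view of the chain-logic trait; the signature is an assumption.
pub trait ChainLogic {
    fn compute_catchup(
        &self,
        block: &[u8],
        inputs: &HashMap<String, Vec<u8>>,
        slot: u64,
    ) -> Result<CatchUpBlockData, String>;
}

/// Toy implementation: pretends every byte of the block is one transaction.
pub struct ToyLogic;

impl ChainLogic for ToyLogic {
    fn compute_catchup(
        &self,
        block: &[u8],
        _inputs: &HashMap<String, Vec<u8>>,
        slot: u64,
    ) -> Result<CatchUpBlockData, String> {
        if block.is_empty() {
            return Err("empty block".to_string());
        }
        // "Decode" the block into transaction hashes.
        let tx_hashes: Vec<String> = block.iter().map(|b| format!("tx{b:02x}")).collect();
        // One produced output per transaction stands in for the UTxO delta.
        let utxo_delta: UtxoDelta = tx_hashes.iter().map(|h| format!("{h}#0")).collect();
        // Index each transaction hash under the slot where it appeared.
        let mut index_delta = IndexDelta::new();
        for hash in &tx_hashes {
            index_delta.entry(hash.clone()).or_default().push(slot);
        }
        Ok(CatchUpBlockData { utxo_delta, index_delta, tx_hashes })
    }
}
```

Returning both deltas from one call is what lets the bootstrap code replay a WAL block into either store without re-decoding it, which appears to be the motivation for bundling them in a single struct.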

Sequence Diagram(s)

sequenceDiagram
    participant Bootstrap as Bootstrap
    participant WAL as WAL Store
    participant ChainLogic as CardanoLogic
    participant Archive as Archive Store
    participant Index as Index Store

    Bootstrap->>WAL: Read next WAL block + inputs (from cursor)
    WAL-->>Bootstrap: Block, Inputs, Point
    Bootstrap->>ChainLogic: compute_catchup(block, inputs, point)
    activate ChainLogic
    ChainLogic->>ChainLogic: decode block & inputs -> resolved map
    ChainLogic->>ChainLogic: compute UTxO apply-delta
    ChainLogic->>ChainLogic: build index delta via index_block(block, resolved_inputs)
    ChainLogic-->>Bootstrap: CatchUpBlockData {utxo_delta, index_delta, tx_hashes}
    deactivate ChainLogic
    Bootstrap->>Archive: apply utxo_delta & commit
    Archive-->>Bootstrap: ack
    Bootstrap->>Index: apply index_delta & commit
    Index-->>Bootstrap: ack
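
The replay flow in the diagram can be sketched as a small loop over WAL entries. This is a simplified model under stated assumptions: Store, apply_and_commit, and catch_up are hypothetical names, the stores are in-memory, and bare slot numbers stand in for chain points for brevity (the review comments further down discuss why the real code should compare full points).

```rust
/// Hypothetical in-memory store that remembers the last slot it committed.
#[derive(Default)]
struct Store {
    cursor: Option<u64>,
    applied: Vec<u64>,
}

impl Store {
    /// Applies the delta for one slot and advances the cursor.
    fn apply_and_commit(&mut self, slot: u64) {
        self.applied.push(slot);
        self.cursor = Some(slot);
    }
}

/// Replays WAL slots into a lagging store, stopping at the state cursor.
/// Returns the number of entries that were actually replayed.
fn catch_up(wal: &[u64], state_cursor: u64, store: &mut Store) -> usize {
    let mut replayed = 0;
    for &slot in wal {
        // Skip entries the store already holds.
        if store.cursor.map_or(false, |c| slot <= c) {
            continue;
        }
        // Never advance a secondary store past the primary state.
        if slot > state_cursor {
            break;
        }
        store.apply_and_commit(slot);
        replayed += 1;
    }
    replayed
}
```

Calling catch_up once for the archive and once for the index mirrors the two commit steps in the diagram; a second call on an up-to-date store replays nothing.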

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰
Hopping through WAL with a nose for a trace,
One block, one index — I tidy the space.
Scripts, datums, redeemers all neatly in line,
Bootstrap now hums as the cursors align.
Thump — the archive and index are caught up in time.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title 'fix: catch-up stores during init' directly describes the main change: implementing catch-up functionality for archive and index stores during bootstrap initialization, which is the primary focus across multiple files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/cardano/src/lib.rs`:
- Around line 389-390: Remove the needless borrow on the first argument to
builder.index_block: call builder.index_block(blockv, &decoded_inputs) instead
of builder.index_block(&blockv, &decoded_inputs) because blockv is already a
reference (from view()) and the method expects a single reference.

In `@crates/core/src/bootstrap.rs`:
- Around line 135-159: Change archive_tip to keep the full ChainPoint
(slot+hash) returned by get_tip() instead of dropping the hash, and use that
ChainPoint everywhere as the replay boundary: construct archive_tip as
Option<ChainPoint> from get_tip(), pass that Option<ChainPoint> into
domain.wal().locate_point(...) and into the iter_blocks start logic, and when
skipping in the loop compare the block's ChainPoint to the archive_tip
ChainPoint (using the full ChainPoint equality/ordering) so same-slot reforks
are not treated as already caught up; also preserve the None behavior but ensure
locate_point only receives a full ChainPoint when Some so recreated archives
won't incorrectly skip required blocks.
- Around line 179-203: The index replay currently reduces index_cursor to
index_slot and compares only slots, which loses fork identity and can skip
necessary replay; change the logic to keep and use the full index_cursor Point
(not just its slot) for both locating the WAL start and skipping logs: compute a
cloned index_cursor_point from domain.indexes().cursor(), call
domain.wal().locate_point using the full point context (or otherwise derive the
correct start from index_cursor_point), replace the skip condition inside the
for (point, log) loop with a full-point comparison (e.g., if
index_cursor_point.as_ref().map_or(false, |c| point <= *c) { continue }), and
ensure you do not treat None as “complete” — only advance writer (writer from
domain.indexes().start_writer()) to state_cursor after you have actually
replayed all required logs up to state_cursor.
- Around line 153-168: The archive catch-up loop uses
domain.archive().start_writer() and ArchiveStoreWriter::apply()/commit() but
leaves a window where apply() buffers and commit() appends flatfiles before the
DB transaction commits, allowing duplicates if the process dies; fix by making
the append atomic or idempotent: either (A) perform a final re-check of the
archive tip just before calling writer.commit() and skip/apply only blocks with
slot > the refreshed tip (use the same point/slot checks used earlier), or (B)
change the writer usage so commit() is part of the same durable transaction
(e.g., obtain a writer variant or API that takes/uses the redb transaction so
flatfile appends are only visible when the DB commit succeeds) so
ArchiveStoreWriter::commit() cannot append on its own outside the DB
transaction. Ensure you update catch_up_archive(), the writer usage around
domain.archive().start_writer(), and the block filtering logic to reference
point.slot() and archive_tip consistently to prevent duplicate appends.
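
Option (A) from the last finding, re-checking the tip at commit time so a repeated append becomes a no-op, can be sketched as follows. Archive, tip, and commit here are hypothetical in-memory stand-ins, not the ArchiveStoreWriter API, and the sketch assumes the tip always reflects what has actually been appended.

```rust
/// Hypothetical append-only archive of (slot, payload) entries.
#[derive(Default)]
struct Archive {
    blocks: Vec<(u64, String)>,
}

impl Archive {
    /// Slot of the most recently appended block, if any.
    fn tip(&self) -> Option<u64> {
        self.blocks.last().map(|(slot, _)| *slot)
    }

    /// Appends only blocks strictly newer than the tip read at commit time,
    /// so replaying the same batch after a crash cannot create duplicates.
    fn commit(&mut self, pending: Vec<(u64, String)>) -> usize {
        let mut tip = self.tip(); // refreshed here, not cached from earlier
        let mut appended = 0;
        for (slot, body) in pending {
            if tip.map_or(false, |t| slot <= t) {
                continue; // already present
            }
            self.blocks.push((slot, body));
            tip = Some(slot);
            appended += 1;
        }
        appended
    }
}
```

Committing the same batch twice appends it once. Option (B), tying the flatfile append to the database transaction, would need support from the storage layer and is not modeled here.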

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0894cb51-d4b5-44e7-8847-5935d1e5fbde

📥 Commits

Reviewing files that changed from the base of the PR and between 3282678 and 5a7759c.

📒 Files selected for processing (5)
  • crates/cardano/src/indexes/delta.rs
  • crates/cardano/src/lib.rs
  • crates/cardano/src/roll/batch.rs
  • crates/core/src/bootstrap.rs
  • crates/core/src/lib.rs


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
Cargo.toml (1)

145-145: ⚠️ Potential issue | 🟠 Major

Add feature guards to inquire imports in bootstrap modules.

inquire is marked as optional and gated behind the utils feature in Cargo.toml, but src/bin/dolos/bootstrap/mod.rs and src/bin/dolos/bootstrap/snapshot.rs import inquire::list_option::ListOption unconditionally. Any build without the utils feature will fail. Guard these imports with #[cfg(feature = "utils")] or make the dependency unconditional.
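
A minimal sketch of the suggested guard is shown below. The import path inquire::list_option::ListOption is the one named in the comment; describe_choice and interactive_enabled are hypothetical helpers added only to show how dependent code is gated along with the import.

```rust
// Compiled only when the crate is built with `--features utils`,
// so builds without the optional `inquire` dependency still succeed.
#[cfg(feature = "utils")]
use inquire::list_option::ListOption;

/// Hypothetical helper that depends on `inquire`; it must carry the same
/// guard as the import, otherwise the type would be unresolved.
#[cfg(feature = "utils")]
fn describe_choice(choice: &ListOption<String>) -> String {
    format!("{}: {}", choice.index, choice.value)
}

/// Reports whether the interactive helpers were compiled in.
fn interactive_enabled() -> bool {
    cfg!(feature = "utils")
}
```

Every item that mentions the guarded type needs the same attribute; cfg! is useful where only a runtime branch differs and both arms compile without the dependency.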

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Cargo.toml` at line 145, The code imports inquire::list_option::ListOption
unconditionally, but inquire is optional under the "utils" feature; wrap those
imports with #[cfg(feature = "utils")] (or #[cfg_attr(..., allow(...))] if
needed) in the bootstrap module files that reference ListOption (e.g., the
bootstrap mod and snapshot modules) so the ListOption import and any code using
it are compiled only when the "utils" feature is enabled, or alternatively
remove the cfg and make the inquire dependency unconditional in Cargo.toml.
♻️ Duplicate comments (2)
crates/core/src/bootstrap.rs (2)

179-203: ⚠️ Potential issue | 🟠 Major

Resume index replay from the exact cursor, not just its slot.

Lines 185-203 reduce index_cursor to index_slot before locating the WAL start and filtering entries. That skips same-slot reforks, and a recreated index database can be advanced from an incomplete WAL window while still looking current. Keep the full ChainPoint for both the replay boundary and the skip check.

Based on learnings, "The index database is kept separate from primary storage (state, chain, wal) to enable independent scaling, tuning, and rebuilding without affecting primary data".
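
The difference between the two skip checks is easiest to see on a same-slot refork. The sketch below uses a hypothetical ChainPoint struct with a shortened hash, and skip_by_point is only one possible reading of "compare the full point"; it is not the project's actual type or fix, and it does not address forks that diverge at an earlier slot.

```rust
/// Hypothetical chain point: slot plus (shortened) block hash.
#[derive(Debug, Clone, PartialEq, Eq)]
struct ChainPoint {
    slot: u64,
    hash: [u8; 4],
}

/// Slot-only check as described in the review: anything at or before the
/// cursor slot is treated as already indexed.
fn skip_by_slot(point: &ChainPoint, cursor: Option<&ChainPoint>) -> bool {
    cursor.map_or(false, |c| point.slot <= c.slot)
}

/// Full-point check: skip only strictly earlier slots or the exact cursor.
/// A different hash at the cursor slot is replayed rather than skipped.
fn skip_by_point(point: &ChainPoint, cursor: Option<&ChainPoint>) -> bool {
    cursor.map_or(false, |c| point.slot < c.slot || point == c)
}
```

With a cursor at slot 100 and a WAL entry at slot 100 carrying a different hash, skip_by_slot returns true while skip_by_point returns false. With no cursor, neither check skips anything, so an empty index is replayed from the start of the available WAL.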

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/core/src/bootstrap.rs` around lines 179 - 203, The code reduces
index_cursor to index_slot and uses only the slot for locate_point and skip
checks, which misses same-slot reforks; update logic to keep and use the full
ChainPoint (index_cursor) when computing the WAL start (pass index_cursor to
domain.wal().locate_point or equivalent) and when filtering replayed entries
(compare point == index_cursor or use full ChainPoint ordering instead of only
point.slot()), replacing uses of index_slot in start computation and the if
Some(point.slot()) <= index_slot check so the replay boundary and skip condition
operate on the complete ChainPoint values (references: index_cursor,
domain.wal().locate_point, logs iteration variables point, index_slot).

135-147: ⚠️ Potential issue | 🟠 Major

Use the full ChainPoint when resuming archive replay.

This collapses the archive boundary to a slot for the early return, WAL resume point, and skip check. Same-slot reforks or a recreated archive database can then be treated as already caught up even when the hash lineage differs. Keep the full (slot, hash) tip as the replay boundary instead of comparing only point.slot().

Also applies to: 156-159

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/core/src/bootstrap.rs` around lines 135 - 147, The code is only
comparing slots when resuming archive replay; instead use the full ChainPoint
(slot, hash) returned by domain.archive().get_tip() for the early return, for
computing the WAL resume point, and for the skip check (also update the similar
logic around the later 156-159 region). Change archive_tip to hold the full
ChainPoint (e.g., Some((slot, hash)) or a ChainPoint type), compare it against
the state cursor’s ChainPoint (instead of state_cursor.slot()), and call
domain.wal().locate_point with the full archive tip ChainPoint rather than only
the slot so reforks or rebuilt archives with differing hashes won’t be treated
as up-to-date.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c5e5d5f3-d273-4c2b-b9bc-27031c61f437

📥 Commits

Reviewing files that changed from the base of the PR and between 5a7759c and 52160ed.

📒 Files selected for processing (5)
  • Cargo.toml
  • crates/cardano/src/lib.rs
  • crates/core/src/bootstrap.rs
  • crates/redb3/src/wal/mod.rs
  • tests/bootstrap.rs

@scarmuega scarmuega merged commit 014fbf4 into main Mar 18, 2026
12 checks passed
@scarmuega scarmuega deleted the fix/catch-up-store-during-init branch March 18, 2026 13:13