Conversation
📝 WalkthroughWalkthroughConsolidates per-block indexing into a new Changes
Sequence Diagram(s)sequenceDiagram
participant Bootstrap as Bootstrap
participant WAL as WAL Store
participant ChainLogic as CardanoLogic
participant Archive as Archive Store
participant Index as Index Store
Bootstrap->>WAL: Read next WAL block + inputs (from cursor)
WAL-->>Bootstrap: Block, Inputs, Point
Bootstrap->>ChainLogic: compute_catchup(block, inputs, point)
activate ChainLogic
ChainLogic->>ChainLogic: decode block & inputs -> resolved map
ChainLogic->>ChainLogic: compute UTxO apply-delta
ChainLogic->>ChainLogic: build index delta via index_block(block, resolved_inputs)
ChainLogic-->>Bootstrap: CatchUpBlockData {utxo_delta, index_delta, tx_hashes}
deactivate ChainLogic
Bootstrap->>Archive: apply utxo_delta & commit
Archive-->>Bootstrap: ack
Bootstrap->>Index: apply index_delta & commit
Index-->>Bootstrap: ack
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment Tip You can disable poems in the walkthrough.Disable the |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/cardano/src/lib.rs`:
- Around line 389-390: Remove the needless borrow on the first argument to
builder.index_block: call builder.index_block(blockv, &decoded_inputs) instead
of builder.index_block(&blockv, &decoded_inputs) because blockv is already a
reference (from view()) and the method expects a single reference.
In `@crates/core/src/bootstrap.rs`:
- Around line 135-159: Change archive_tip to keep the full ChainPoint
(slot+hash) returned by get_tip() instead of dropping the hash, and use that
ChainPoint everywhere as the replay boundary: construct archive_tip as
Option<ChainPoint> from get_tip(), pass that Option<ChainPoint> into
domain.wal().locate_point(...) and into the iter_blocks start logic, and when
skipping in the loop compare the block's ChainPoint to the archive_tip
ChainPoint (using the full ChainPoint equality/ordering) so same-slot reforks
are not treated as already caught up; also preserve the None behavior but ensure
locate_point only receives a full ChainPoint when Some so recreated archives
won't incorrectly skip required blocks.
- Around line 179-203: The index replay currently reduces index_cursor to
index_slot and compares only slots, which loses fork identity and can skip
necessary replay; change the logic to keep and use the full index_cursor Point
(not just its slot) for both locating the WAL start and skipping logs: compute a
cloned index_cursor_point from domain.indexes().cursor(), call
domain.wal().locate_point using the full point context (or otherwise derive the
correct start from index_cursor_point), replace the skip condition inside the
for (point, log) loop with a full-point comparison (e.g., if
index_cursor_point.as_ref().map_or(false, |c| point <= *c) { continue }), and
ensure you do not treat None as “complete” — only advance writer (writer from
domain.indexes().start_writer()) to state_cursor after you have actually
replayed all required logs up to state_cursor.
- Around line 153-168: The archive catch-up loop uses
domain.archive().start_writer() and ArchiveStoreWriter::apply()/commit() but
leaves a window where apply() buffers and commit() appends flatfiles before the
DB transaction commits, allowing duplicates if the process dies; fix by making
the append atomic or idempotent: either (A) perform a final re-check of the
archive tip just before calling writer.commit() and skip/apply only blocks with
slot > the refreshed tip (use the same point/slot checks used earlier), or (B)
change the writer usage so commit() is part of the same durable transaction
(e.g., obtain a writer variant or API that takes/uses the redb transaction so
flatfile appends are only visible when the DB commit succeeds) so
ArchiveStoreWriter::commit() cannot append on its own outside the DB
transaction. Ensure you update catch_up_archive(), the writer usage around
domain.archive().start_writer(), and the block filtering logic to reference
point.slot() and archive_tip consistently to prevent duplicate appends.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0894cb51-d4b5-44e7-8847-5935d1e5fbde
📒 Files selected for processing (5)
crates/cardano/src/indexes/delta.rscrates/cardano/src/lib.rscrates/cardano/src/roll/batch.rscrates/core/src/bootstrap.rscrates/core/src/lib.rs
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
Cargo.toml (1)
145-145:⚠️ Potential issue | 🟠 MajorAdd feature guards to
inquireimports in bootstrap modules.
inquireis marked as optional and gated behind theutilsfeature in Cargo.toml, butsrc/bin/dolos/bootstrap/mod.rsandsrc/bin/dolos/bootstrap/snapshot.rsimportinquire::list_option::ListOptionunconditionally. Any build without theutilsfeature will fail. Guard these imports with#[cfg(feature = "utils")]or make the dependency unconditional.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Cargo.toml` at line 145, The code imports inquire::list_option::ListOption unconditionally, but inquire is optional under the "utils" feature; wrap those imports with #[cfg(feature = "utils")] (or #[cfg_attr(..., allow(...))] if needed) in the bootstrap module files that reference ListOption (e.g., the bootstrap mod and snapshot modules) so the ListOption import and any code using it are compiled only when the "utils" feature is enabled, or alternatively remove the cfg and make the inquire dependency unconditional in Cargo.toml.
♻️ Duplicate comments (2)
crates/core/src/bootstrap.rs (2)
179-203:⚠️ Potential issue | 🟠 MajorResume index replay from the exact cursor, not just its slot.
Lines 185-203 reduce
index_cursortoindex_slotbefore locating the WAL start and filtering entries. That skips same-slot reforks, and a recreated index database can be advanced from an incomplete WAL window while still looking current. Keep the fullChainPointfor both the replay boundary and the skip check.Based on learnings, "The index database is kept separate from primary storage (state, chain, wal) to enable independent scaling, tuning, and rebuilding without affecting primary data".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/bootstrap.rs` around lines 179 - 203, The code reduces index_cursor to index_slot and uses only the slot for locate_point and skip checks, which misses same-slot reforks; update logic to keep and use the full ChainPoint (index_cursor) when computing the WAL start (pass index_cursor to domain.wal().locate_point or equivalent) and when filtering replayed entries (compare point == index_cursor or use full ChainPoint ordering instead of only point.slot()), replacing uses of index_slot in start computation and the if Some(point.slot()) <= index_slot check so the replay boundary and skip condition operate on the complete ChainPoint values (references: index_cursor, domain.wal().locate_point, logs iteration variables point, index_slot).
135-147:⚠️ Potential issue | 🟠 MajorUse the full
ChainPointwhen resuming archive replay.This collapses the archive boundary to a slot for the early return, WAL resume point, and skip check. Same-slot reforks or a recreated archive database can then be treated as already caught up even when the hash lineage differs. Keep the full
(slot, hash)tip as the replay boundary instead of comparing onlypoint.slot().Also applies to: 156-159
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/core/src/bootstrap.rs` around lines 135 - 147, The code is only comparing slots when resuming archive replay; instead use the full ChainPoint (slot, hash) returned by domain.archive().get_tip() for the early return, for computing the WAL resume point, and for the skip check (also update the similar logic around the later 156-159 region). Change archive_tip to hold the full ChainPoint (e.g., Some((slot, hash)) or a ChainPoint type), compare it against the state cursor’s ChainPoint (instead of state_cursor.slot()), and call domain.wal().locate_point with the full archive tip ChainPoint rather than only the slot so reforks or rebuilt archives with differing hashes won’t be treated as up-to-date.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@Cargo.toml`:
- Line 145: The code imports inquire::list_option::ListOption unconditionally,
but inquire is optional under the "utils" feature; wrap those imports with
#[cfg(feature = "utils")] (or #[cfg_attr(..., allow(...))] if needed) in the
bootstrap module files that reference ListOption (e.g., the bootstrap mod and
snapshot modules) so the ListOption import and any code using it are compiled
only when the "utils" feature is enabled, or alternatively remove the cfg and
make the inquire dependency unconditional in Cargo.toml.
---
Duplicate comments:
In `@crates/core/src/bootstrap.rs`:
- Around line 179-203: The code reduces index_cursor to index_slot and uses only
the slot for locate_point and skip checks, which misses same-slot reforks;
update logic to keep and use the full ChainPoint (index_cursor) when computing
the WAL start (pass index_cursor to domain.wal().locate_point or equivalent) and
when filtering replayed entries (compare point == index_cursor or use full
ChainPoint ordering instead of only point.slot()), replacing uses of index_slot
in start computation and the if Some(point.slot()) <= index_slot check so the
replay boundary and skip condition operate on the complete ChainPoint values
(references: index_cursor, domain.wal().locate_point, logs iteration variables
point, index_slot).
- Around line 135-147: The code is only comparing slots when resuming archive
replay; instead use the full ChainPoint (slot, hash) returned by
domain.archive().get_tip() for the early return, for computing the WAL resume
point, and for the skip check (also update the similar logic around the later
156-159 region). Change archive_tip to hold the full ChainPoint (e.g.,
Some((slot, hash)) or a ChainPoint type), compare it against the state cursor’s
ChainPoint (instead of state_cursor.slot()), and call domain.wal().locate_point
with the full archive tip ChainPoint rather than only the slot so reforks or
rebuilt archives with differing hashes won’t be treated as up-to-date.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: c5e5d5f3-d273-4c2b-b9bc-27031c61f437
📒 Files selected for processing (5)
Cargo.tomlcrates/cardano/src/lib.rscrates/core/src/bootstrap.rscrates/redb3/src/wal/mod.rstests/bootstrap.rs
Summary by CodeRabbit