feat: lazy chain config init via ChainConfigWatch with epoch-boundary… #490

ginnun wants to merge 1 commit into feat/tx-evaluation-rebased
Conversation
Deploying blockfrost-platform with Cloudflare Pages

| | |
|---|---|
| Latest commit: | 2ea2983 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://1e118ba4.blockfrost-platform.pages.dev |
| Branch Preview URL: | https://feat-chain-config-from-node.blockfrost-platform.pages.dev |
```rust
use crate::model::api::Value;
use crate::model::api::ValueV6;

// TODO: This evaluator fetches genesis_config_and_pp() from the node on every
```
Replying to #378 (comment) from @michalrus:

Let's continue the discussion here regarding that comment.
The problem you referenced is left open as a TODO in this sub-PR.
The native evaluator should use the new caching mechanism.
Hmmm, now I'm not sure if it wouldn't be easier to review, if it was just a single PR
> Hmmm, now I'm not sure if it wouldn't be easier to review, if it was just a single PR

@ginnun, @vladimirvolek, WDYT?
Pull request overview
This PR introduces lazy, background initialization and refresh of chain configuration (genesis + protocol parameters) so transaction evaluation stays correct while the node syncs and across epoch-boundary parameter updates.
Changes:
- Add `ChainConfigWatch` to wait for node readiness and refresh protocol parameters at epoch boundaries, publishing updates via a `watch` channel.
- Make `ExternalEvaluator` pull-based and lazily spawn/re-spawn `testgen-hs` based on cached config availability/changes; evaluation returns 503 until the config is ready.
- Update the platform server bootstrap and tests to accommodate non-blocking startup and the new watcher; add `BlockfrostError::service_unavailable()` (503).
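The publish/consume pattern described above can be sketched with the standard library. This is a minimal illustration, not the real implementation (which uses an async `watch` channel); the `ChainConfig` type and all names here are stand-ins:

```rust
use std::sync::{Arc, Condvar, Mutex};

// Illustrative stand-in for the real chain configuration type.
#[derive(Debug)]
struct ChainConfig {
    epoch: u64,
}

// A minimal "watch" cell: the background watcher publishes the latest
// config, and the evaluator reads it on demand (or waits until it exists).
#[derive(Clone)]
struct ConfigWatch {
    inner: Arc<(Mutex<Option<Arc<ChainConfig>>>, Condvar)>,
}

impl ConfigWatch {
    fn new() -> Self {
        Self { inner: Arc::new((Mutex::new(None), Condvar::new())) }
    }

    // Called by the background watcher once the node is ready,
    // and again at each epoch boundary.
    fn publish(&self, config: ChainConfig) {
        let (lock, cvar) = &*self.inner;
        *lock.lock().unwrap() = Some(Arc::new(config));
        cvar.notify_all();
    }

    // Non-blocking read; `None` would map to a 503 in the HTTP handler.
    fn current(&self) -> Option<Arc<ChainConfig>> {
        self.inner.0.lock().unwrap().clone()
    }

    // Block until a config has been published at least once.
    fn wait_ready(&self) -> Arc<ChainConfig> {
        let (lock, cvar) = &*self.inner;
        let mut guard = lock.lock().unwrap();
        while guard.is_none() {
            guard = cvar.wait(guard).unwrap();
        }
        guard.clone().unwrap()
    }
}

fn main() {
    let watch = ConfigWatch::new();
    assert!(watch.current().is_none()); // before the node is ready: 503

    let publisher = watch.clone();
    let handle = std::thread::spawn(move || publisher.publish(ChainConfig { epoch: 500 }));
    let cfg = watch.wait_ready();
    handle.join().unwrap();
    assert_eq!(cfg.epoch, 500);
}
```

The key property is the same as in the PR: consumers never block startup, and readiness is observable both as a non-blocking read (for request handlers) and as a blocking wait (for tests).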
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| crates/tx_evaluator/src/native.rs | Adds note about future migration to cached chain config. |
| crates/tx_evaluator/src/external.rs | Refactors evaluator to use ChainConfigWatch and lazy (re)spawning of testgen-hs. |
| crates/tx_evaluator/Cargo.toml | Adds tokio + tracing workspace deps for new async/logging usage. |
| crates/platform/src/server.rs | Switches from eager init_caches to ChainConfigWatch and new evaluator construction; returns watcher to callers. |
| crates/platform/src/main.rs | Adapts to new build() return tuple shape. |
| crates/platform/tests/common/mod.rs | Exposes watcher from app builder and adds initialize_app() that waits for readiness when needed. |
| crates/platform/tests/{root,metrics,submit,icebreakers,data_node}_test.rs | Updates destructuring to match new build tuple. |
| crates/node/src/lib.rs | Exposes new chain_config_watch module. |
| crates/node/src/ledger.rs | Removes now-redundant one-off protocol/genesis query helpers. |
| crates/node/src/chain_config_watch.rs | Implements the watcher (sync gating, epoch-boundary scheduling, refresh loop) plus unit tests for scheduling. |
| crates/node/src/chain_config.rs | Minor refactor of error mapping for genesis/pp init. |
| crates/common/src/errors.rs | Adds service_unavailable() and maps 503 to StatusCode::SERVICE_UNAVAILABLE. |
| crates/common/src/chain_config.rs | Improves docs and introduces ChainConfigCache::CONWAY_ERA constant. |
| Cargo.lock | Records new dependencies pulled into tx_evaluator. |
```rust
if info.era != CONWAY_ERA {
    SyncStatus::NotReady(format!(
        "Node is in era {} (need Conway era {CONWAY_ERA})",
        info.era
    ))
} else if info.sync_progress < SYNC_THRESHOLD {
    SyncStatus::NotReady(format!(
```
`ChainConfigCache::CONWAY_ERA` is documented as the minimum supported era, but the sync gate treats any era other than exactly Conway as "not ready". This will block forever once the node moves to a future era (>6), even if the platform could still operate. Consider using an `info.era < CONWAY_ERA` check (or otherwise defining the intended policy explicitly).
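A minimal sketch of the suggested policy; `SyncStatus` and `CONWAY_ERA` mirror the snippet above, while the threshold value and the exact messages are illustrative:

```rust
const CONWAY_ERA: u64 = 6;
const SYNC_THRESHOLD: f64 = 0.999; // illustrative

#[derive(Debug, PartialEq)]
enum SyncStatus {
    Ready,
    NotReady(String),
}

// Treat Conway as the *minimum* supported era, so future eras (> 6)
// don't block the watcher forever.
fn sync_status(era: u64, sync_progress: f64) -> SyncStatus {
    if era < CONWAY_ERA {
        SyncStatus::NotReady(format!(
            "Node is in era {era} (need at least Conway era {CONWAY_ERA})"
        ))
    } else if sync_progress < SYNC_THRESHOLD {
        SyncStatus::NotReady(format!("Node is syncing ({:.1}%)", sync_progress * 100.0))
    } else {
        SyncStatus::Ready
    }
}

fn main() {
    assert_eq!(sync_status(7, 1.0), SyncStatus::Ready); // a future era still works
    assert!(matches!(sync_status(5, 1.0), SyncStatus::NotReady(_)));
    assert!(matches!(sync_status(6, 0.5), SyncStatus::NotReady(_)));
}
```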
```rust
// Slots into the current epoch.
let slot_in_epoch = (current_slot - slot_config.zero_slot) % slot_config.epoch_length;
let slots_until_next = slot_config.epoch_length - slot_in_epoch;
```
`duration_until_next_epoch` waits a full epoch when `slot_in_epoch == 0` (exactly on an epoch boundary), because `slots_until_next` becomes `epoch_length`. That can delay the epoch-boundary refresh by an entire epoch in the boundary case. Consider computing `slots_until_next` so that a boundary yields 0 (or 1) and then adding the buffer, e.g. `(epoch_length - slot_in_epoch) % epoch_length`.
```diff
-    let slots_until_next = slot_config.epoch_length - slot_in_epoch;
+    // Compute slots until next epoch boundary; if we're exactly on a boundary,
+    // this yields 0 instead of a full epoch.
+    let slots_until_next =
+        (slot_config.epoch_length - slot_in_epoch) % slot_config.epoch_length;
```
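The boundary case can be checked in isolation. This is a self-contained sketch of the arithmetic in the suggestion above (field names follow the snippet; the mainnet epoch length is used as example data):

```rust
// Slots until the next epoch boundary; the modulo makes an exact
// boundary (slot_in_epoch == 0) yield 0 rather than a full epoch.
fn slots_until_next_epoch(current_slot: u64, zero_slot: u64, epoch_length: u64) -> u64 {
    let slot_in_epoch = (current_slot - zero_slot) % epoch_length;
    (epoch_length - slot_in_epoch) % epoch_length
}

fn main() {
    let epoch_length = 432_000; // Cardano mainnet epoch length in slots
    // Mid-epoch: a normal countdown.
    assert_eq!(slots_until_next_epoch(431_999, 0, epoch_length), 1);
    // Exactly on the boundary: 0, not a full epoch.
    assert_eq!(slots_until_next_epoch(864_000, 0, epoch_length), 0);
    // Without the modulo, the boundary case would return epoch_length (432_000).
}
```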
```rust
if self.rx.changed().await.is_err() {
    break;
```
`wait_ready` returns silently if the watch sender is dropped (e.g., the background task panics/exits) even though the config is still `None`, which contradicts the method's intent and can make tests proceed with an unready app. Consider returning a `Result`/error on channel closure (or at least logging) so callers can fail fast instead of continuing without config.
```diff
-            if self.rx.changed().await.is_err() {
-                break;
+            match self.rx.changed().await {
+                Ok(_) => {
+                    // Value changed; the loop condition will re-check for readiness.
+                }
+                Err(err) => {
+                    // Sender was dropped before config became available; this indicates
+                    // that the background task has terminated and readiness will never be reached.
+                    tracing::error!(
+                        "ChainConfigWatch: watch channel closed before configuration became available: {}",
+                        err
+                    );
+                    break;
+                }
```
```rust
}
state.testgen = Some(testgen.clone());
state.last_config = Some(current_config);
tracing::info!("ExternalEvaluator: testgen-hs initialized successfully");
```
When reinitializing, the previous `Testgen` instance is dropped and replaced. In `bf_testgen::Testgen`, dropping all senders closes the channel, and the worker thread treats "request channel closed" as an error and keeps restarting the child in an infinite loop (spawning a new `testgen-hs` repeatedly). With epoch-boundary respawns this can leak processes/threads over time. Consider adding an explicit shutdown path for `Testgen` (and exiting the worker thread on channel close), or otherwise ensuring the old process is terminated cleanly before dropping the handle.
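The failure mode described here — a worker treating channel closure as a retryable error — can be sketched with `std::sync::mpsc`. Exiting the loop when `recv()` returns `RecvError` is one way to avoid the restart spin; all names here are illustrative, not the real `bf_testgen` API:

```rust
use std::sync::mpsc;
use std::thread;

// Illustrative worker: serves requests until every sender is dropped,
// then exits cleanly and reports how many requests it served.
fn spawn_worker(rx: mpsc::Receiver<u32>) -> thread::JoinHandle<u32> {
    thread::spawn(move || {
        let mut served = 0;
        loop {
            match rx.recv() {
                Ok(_request) => served += 1,
                // All senders dropped: treat this as a clean shutdown signal,
                // not an error to retry, so we break instead of respawning.
                Err(mpsc::RecvError) => break,
            }
        }
        served
    })
}

fn main() {
    let (tx, rx) = mpsc::channel();
    let worker = spawn_worker(rx);
    tx.send(1).unwrap();
    tx.send(2).unwrap();
    drop(tx); // dropping the last sender closes the channel
    assert_eq!(worker.join().unwrap(), 2);
}
```

In the real code the analogous fix would also terminate the child `testgen-hs` process before the worker exits, so that epoch-boundary respawns don't accumulate orphans.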
```diff
         }
+        // Config changed: take the old Testgen so we can shut it down explicitly.
+        let old_testgen = if let Some(ref last) = state.last_config {
+            if !Arc::ptr_eq(last, &current_config) {
+                state.testgen.take()
+            } else {
+                None
+            }
+        } else {
+            None
+        };
         state.testgen = Some(testgen.clone());
         state.last_config = Some(current_config);
         tracing::info!("ExternalEvaluator: testgen-hs initialized successfully");
+        // Drop the lock before awaiting shutdown to avoid holding the mutex across .await.
+        drop(state);
+        if let Some(old) = old_testgen {
+            if let Err(err) = old.shutdown().await {
+                tracing::warn!(
+                    "ExternalEvaluator: failed to cleanly shut down previous testgen-hs instance: {err}"
+                );
+            }
+        }
```
```diff
-/// This structure is used to share server-wide cachables
+/// Cached chain configuration queried from the Cardano node at startup and
+/// refreshed at epoch boundaries by [`ChainConfigWatch`].
```
This doc comment uses an intra-doc link to `ChainConfigWatch`, but that type lives in `bf_node` and isn't in scope in the `bf_common` crate. This can produce broken intra-doc-link warnings during `cargo doc`/docs.rs builds. Consider removing the link formatting and referencing it as plain text (or using an external URL) to avoid unresolved links.
```diff
-/// refreshed at epoch boundaries by [`ChainConfigWatch`].
+/// refreshed at epoch boundaries by `ChainConfigWatch`.
```
|
@michalrus Let's wait a bit to review this PR. I want to understand if it would be cheaper to implement #482 (replacing all these binary invocations), compared to me addressing all the AI feedback and you reviewing all the work.
Oh, I don't know! I'm assuming it would be quite a feat to get it working; I never tried that. I'm not even sure if this is the correct idea — after all, we were supposed to write it in Rust 😅 And now we'd link the Haskell binary with ours.
Resolves #489

Summary

- Replaces eager `init_caches` at startup with `ChainConfigWatch`, a background watcher that waits for the node to reach the Conway era with ≥99.9% sync, then refreshes protocol parameters at each epoch boundary
- `ExternalEvaluator` is now pull-based: it reads config on demand from the cache, spawns `testgen-hs` lazily on the first evaluate request, and re-spawns it when the config changes (detected via `Arc::ptr_eq`)
- Adds `BlockfrostError::service_unavailable()` (503)

Motivation

Previously, `init_caches` ran once at startup. If the node was still syncing (wrong era or incomplete state), the evaluator got the wrong genesis config and protocol parameters for its entire lifetime. Protocol parameter changes at epoch boundaries (via governance) were also missed, requiring a full restart.
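The `Arc::ptr_eq` change detection mentioned in the summary works because each refresh publishes a fresh allocation, so pointer identity distinguishes "same cached config" from "newly published config". A minimal, self-contained illustration (the `Vec` is a stand-in for the real config type):

```rust
use std::sync::Arc;

fn main() {
    let config_v1 = Arc::new(vec![1, 2, 3]); // stand-in for the cached chain config
    let same = Arc::clone(&config_v1);       // a clone shares the same allocation
    let config_v2 = Arc::new(vec![1, 2, 3]); // a refresh creates a new allocation

    // Pointer identity, not value equality: a clone counts as "unchanged" ...
    assert!(Arc::ptr_eq(&config_v1, &same));
    // ... while a newly published config triggers a re-spawn, even if the
    // contents happen to be identical.
    assert!(!Arc::ptr_eq(&config_v1, &config_v2));
}
```

This is cheap (a pointer comparison per evaluate request) and avoids requiring `PartialEq` on the config type, at the cost of re-spawning if an identical config is ever republished under a new allocation.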