Conversation
📝 Walkthrough

Adds Prometheus-based metrics and recording to anchor-evm and anchor-service: new metrics modules, workspace/prometheus deps, optional Metrics injection points, instrumentation across EVM transaction flow and AnchorService batch pipeline, and timing/gas fields on AnchorResult.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Service as AnchorService
    participant TM as EvmTransactionManager
    participant Node as EthereumNode
    participant Metrics as MetricsRegistry
    participant Store as EventStore
    rect rgba(200,200,255,0.5)
        Service->>Metrics: record(ServiceStarted)
    end
    rect rgba(200,255,200,0.5)
        Service->>Service: fetch & deduplicate requests
        Service->>Metrics: record(BatchStarted)
    end
    rect rgba(255,200,200,0.5)
        Service->>TM: submit_and_wait(batch_tx)
        TM->>Metrics: record(TransactionSubmitted)
        TM->>Node: send transaction
        Node-->>TM: tx_hash / error
        TM->>Node: poll confirmations
        TM->>Metrics: record(TransactionSucceeded / TransactionFailed / RetryAttempt)
        TM-->>Service: AnchorResult { total_duration, confirmation_duration, gas_cost_wei, ... }
    end
    rect rgba(200,255,200,0.5)
        Service->>Store: store time events
        Store-->>Service: ack
        Service->>Metrics: record(BatchSucceeded / HighWaterMarkUpdated)
    end
    rect rgba(200,200,255,0.5)
        Service->>Metrics: record(ServiceStopped)
    end
```
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 2
🧹 Nitpick comments (2)
anchor-evm/src/metrics.rs (1)
255-311: Strengthen metrics tests with value assertions.

Lines 255-311 currently confirm only that record paths execute. Please assert observable counter/gauge/histogram outcomes to protect event-to-metric wiring.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-evm/src/metrics.rs` around lines 255 - 311, Tests currently only call Metrics::register and Metrics.record for various EvmEvent variants but do not assert any metric values; update each test (e.g., test_record_transaction_submitted, test_record_transaction_succeeded, test_record_transaction_failed, test_record_retry_attempt, test_record_wallet_balance) to query the Registry after invoking Metrics.record and assert the expected metric changes — for example check the appropriate counter increment for TransactionSubmitted/RetryAttempt, error-type labeled counter for TransactionFailed, histogram buckets or observed value for TransactionSucceeded durations and gas_cost_wei, and gauge value for WalletBalanceUpdated — by using the same Registry instance returned to Metrics::register to gather metrics and comparing the concrete values/labels to the expected outcomes.

anchor-service/src/metrics.rs (1)
319-375: Add assertions for metric values in tests.

Lines 319-375 only verify that calls don't panic. They won't catch regressions like wrong counters/histograms being updated. Please assert emitted values (e.g., via encoded registry output) for each event path.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-service/src/metrics.rs` around lines 319 - 375, Tests currently only ensure methods don't panic; update each test to assert the registry actually recorded the expected metrics after calling Metrics::register(...) and metrics.record(...). For each test, call registry.gather() (or encode with prometheus::TextEncoder) and assert the presence and expected values for the relevant metric families: ServiceStarted/ServiceStopped should increment their counters; BatchSucceeded should increment counters for requests_received/requests_after_dedup and observe histogram values/duration metrics (total_duration, fetch_duration, tree_build_duration, anchor_duration, store_duration); BatchFailed should increment the failure counter with the error_type label; BatchEmpty should increment its counter; HighWaterMarkUpdated should set/update the gauge to the provided value. Use the Metrics::register return and AnchorEvent variants (ServiceStarted, ServiceStopped, BatchSucceeded, BatchFailed, BatchEmpty, HighWaterMarkUpdated) to locate and validate the exact metric names/labels produced by your Metrics implementation.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@anchor-evm/src/metrics.rs`:
- Around line 239-245: EvmEvent::WalletBalanceUpdated currently converts a u128
balance_wei to i64 with a plain as cast which can wrap for values > i64::MAX;
change the conversion to attempt a safe conversion (use i64::try_from or
equivalent) and clamp overflow to i64::MAX (or 0 for underflow) before calling
wallet_balance_gwei.get_or_create(&ChainLabels { chain_id: *chain_id
}).set(...); ensure you first divide balance_wei by 1_000_000_000 to get gwei,
then perform the try_from/clamp and pass the resulting i64 to set to avoid
negative wraparound.
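The conversion described above can be sketched with std-only code (the function name is illustrative, not from the codebase):

```rust
/// Convert a wei balance (u128) to a gwei gauge value, clamping instead of
/// wrapping when the result exceeds i64::MAX.
fn wei_to_gwei_gauge(balance_wei: u128) -> i64 {
    let gwei = balance_wei / 1_000_000_000;
    i64::try_from(gwei).unwrap_or(i64::MAX)
}

fn main() {
    // 1 ETH = 10^18 wei = 10^9 gwei.
    assert_eq!(wei_to_gwei_gauge(1_000_000_000_000_000_000), 1_000_000_000);

    // A balance whose gwei value is exactly 2^63 overflows i64 and clamps.
    let overflowing = (i64::MAX as u128 + 1) * 1_000_000_000;
    assert_eq!(wei_to_gwei_gauge(overflowing), i64::MAX);

    // The plain `as` cast the review flags would have wrapped to i64::MIN.
    assert_eq!((overflowing / 1_000_000_000) as i64, i64::MIN);
}
```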
In `@anchor-service/src/anchor_batch.rs`:
- Around line 179-190: The code currently calls
self.high_water_mark_store.high_water_mark().await.expect(...), which will panic
on transient DB failures; change this to handle the Result instead: call
self.high_water_mark_store.high_water_mark().await and match on
Ok(high_water_mark) to continue and record AnchorEvent::HighWaterMarkUpdated as
before, but on Err(e) record the failure metric (e.g. emit a BatchFailed or
similar failure label via self.metrics) and return Err(e) (or map it into the
function's error type) instead of panicking so the caller can handle the error
and metrics are emitted; update references to
high_water_mark_store.high_water_mark(), AnchorEvent::HighWaterMarkUpdated, and
the BatchFailed metric emission accordingly.
---
Nitpick comments:
In `@anchor-evm/src/metrics.rs`:
- Around line 255-311: Tests currently only call Metrics::register and
Metrics.record for various EvmEvent variants but do not assert any metric
values; update each test (e.g., test_record_transaction_submitted,
test_record_transaction_succeeded, test_record_transaction_failed,
test_record_retry_attempt, test_record_wallet_balance) to query the Registry
after invoking Metrics.record and assert the expected metric changes — for
example check the appropriate counter increment for
TransactionSubmitted/RetryAttempt, error-type labeled counter for
TransactionFailed, histogram buckets or observed value for TransactionSucceeded
durations and gas_cost_wei, and gauge value for WalletBalanceUpdated — by using
the same Registry instance returned to Metrics::register to gather metrics and
comparing the concrete values/labels to the expected outcomes.
In `@anchor-service/src/metrics.rs`:
- Around line 319-375: Tests currently only ensure methods don't panic; update
each test to assert the registry actually recorded the expected metrics after
calling Metrics::register(...) and metrics.record(...). For each test, call
registry.gather() (or encode with prometheus::TextEncoder) and assert the
presence and expected values for the relevant metric families:
ServiceStarted/ServiceStopped should increment their counters; BatchSucceeded
should increment counters for requests_received/requests_after_dedup and observe
histogram values/duration metrics (total_duration, fetch_duration,
tree_build_duration, anchor_duration, store_duration); BatchFailed should
increment the failure counter with the error_type label; BatchEmpty should
increment its counter; HighWaterMarkUpdated should set/update the gauge to the
provided value. Use the Metrics::register return and AnchorEvent variants
(ServiceStarted, ServiceStopped, BatchSucceeded, BatchFailed, BatchEmpty,
HighWaterMarkUpdated) to locate and validate the exact metric names/labels
produced by your Metrics implementation.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (8)
- anchor-evm/Cargo.toml
- anchor-evm/src/evm_transaction_manager.rs
- anchor-evm/src/lib.rs
- anchor-evm/src/metrics.rs
- anchor-service/Cargo.toml
- anchor-service/src/anchor_batch.rs
- anchor-service/src/lib.rs
- anchor-service/src/metrics.rs
anchor-service/src/anchor_batch.rs
Outdated
```rust
let high_water_mark = self
    .high_water_mark_store
    .high_water_mark()
    .await
    .expect("error getting high water mark from database");

// Update high water mark metric
if let Some(ref metrics) = self.metrics {
    metrics.record(&AnchorEvent::HighWaterMarkUpdated {
        value: high_water_mark,
    });
}
```
Handle high-water-mark read errors without panicking.
Line 183 uses .expect(...); a transient DB failure will crash batch processing and skip BatchFailed metrics emission. This path should return Err and record a failure label instead.
Proposed fix
```diff
-        let high_water_mark = self
-            .high_water_mark_store
-            .high_water_mark()
-            .await
-            .expect("error getting high water mark from database");
+        let high_water_mark = match self.high_water_mark_store.high_water_mark().await {
+            Ok(value) => value,
+            Err(e) => {
+                if let Some(ref metrics) = self.metrics {
+                    metrics.record(&AnchorEvent::BatchFailed {
+                        error_type: "high_water_mark_error".to_string(),
+                    });
+                }
+                return Err(e);
+            }
+        };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@anchor-service/src/anchor_batch.rs` around lines 179 - 190, The code
currently calls self.high_water_mark_store.high_water_mark().await.expect(...),
which will panic on transient DB failures; change this to handle the Result
instead: call self.high_water_mark_store.high_water_mark().await and match on
Ok(high_water_mark) to continue and record AnchorEvent::HighWaterMarkUpdated as
before, but on Err(e) record the failure metric (e.g. emit a BatchFailed or
similar failure label via self.metrics) and return Err(e) (or map it into the
function's error type) instead of panicking so the caller can handle the error
and metrics are emitted; update references to
high_water_mark_store.high_water_mark(), AnchorEvent::HighWaterMarkUpdated, and
the BatchFailed metric emission accordingly.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
anchor-evm/src/evm_transaction_manager.rs (1)
219-237: ⚠️ Potential issue | 🟠 Major

Compute gas-cost baseline after pending transactions are cleared.

`starting_balance` is sampled before `wait_for_pending_transactions(...)`. If old pending txs get mined during that wait, current gas-cost metrics include unrelated spend.

Proposed fix

```diff
-        // Log starting wallet balance
-        let starting_balance = provider
-            .get_balance(wallet_address)
-            .await
-            .map_err(|e| anyhow!("Failed to get wallet balance: {}", e))?;
-        info!("Starting wallet balance: {} wei", starting_balance);
-
-        // Update wallet balance metric
-        if let Some(ref metrics) = self.metrics {
-            // U256 to u128: saturate at u128::MAX (a very large balance)
-            let balance_u128: u128 = starting_balance.try_into().unwrap_or(u128::MAX);
-            metrics.record(&EvmEvent::WalletBalanceUpdated {
-                chain_id: self.config.chain_id,
-                balance_wei: balance_u128,
-            });
-        }
-
         // Wait for any pending transactions from previous runs to clear
         Self::wait_for_pending_transactions(&provider, wallet_address).await;
+
+        // Log starting wallet balance (after pending txs are cleared)
+        let starting_balance = provider
+            .get_balance(wallet_address)
+            .await
+            .map_err(|e| anyhow!("Failed to get wallet balance: {}", e))?;
+        info!("Starting wallet balance: {} wei", starting_balance);
+
+        if let Some(ref metrics) = self.metrics {
+            let balance_u128: u128 = starting_balance.try_into().unwrap_or(u128::MAX);
+            metrics.record(&EvmEvent::WalletBalanceUpdated {
+                chain_id: self.config.chain_id,
+                balance_wei: balance_u128,
+            });
+        }
```

Also applies to: 316-333
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-evm/src/evm_transaction_manager.rs` around lines 219 - 237, The starting_balance is sampled before pending transactions are awaited, so if prior txs mine during Self::wait_for_pending_transactions(...) the recorded WalletBalanceUpdated (EvmEvent::WalletBalanceUpdated) will include unrelated gas spend; move the balance sampling and metrics.record block (the code that sets starting_balance, converts to balance_u128, and calls metrics.record) to after Self::wait_for_pending_transactions(&provider, wallet_address). Apply the same change to the analogous block around lines 316-333 to ensure balances are always taken after pending transactions clear.
♻️ Duplicate comments (2)
anchor-evm/src/metrics.rs (1)
255-257: ⚠️ Potential issue | 🟠 Major

Prevent signed wraparound when recording wallet balance gauge.

On Line 256, `as i64` can wrap for large balances and emit negative gauge values. Use checked conversion and clamp before `.set(...)`.

Proposed fix

```diff
-        // Convert wei to gwei (divide by 10^9) to fit in i64
-        let balance_gwei = (*balance_wei / 1_000_000_000) as i64;
+        // Convert wei to gwei (divide by 10^9) and clamp to i64 range
+        let balance_gwei_u128 = *balance_wei / 1_000_000_000;
+        let balance_gwei = i64::try_from(balance_gwei_u128).unwrap_or(i64::MAX);
         self.wallet_balance_gwei
             .get_or_create(&labels)
             .set(balance_gwei);
```

```bash
#!/bin/bash
# Verify current conversion site and inspect surrounding code.
rg -n -C3 "WalletBalanceUpdated|balance_gwei|as i64" anchor-evm/src/metrics.rs
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-evm/src/metrics.rs` around lines 255 - 257, The current cast "(*balance_wei / 1_000_000_000) as i64" can wrap and produce negative gauge values; change the conversion to use a checked/try conversion from the unsigned gwei value to i64 and clamp to the valid range before calling self.wallet_balance_gwei.set(...). Locate the conversion site around balance_gwei and the metric setter (references: balance_wei, balance_gwei, self.wallet_balance_gwei, WalletBalanceUpdated) and replace the direct as i64 cast with a safe conversion (e.g., compute gwei as an unsigned integer, attempt try_into::<i64>() or use i64::try_from, and on Err or overflow set to i64::MAX or 0 as appropriate) then pass the clamped i64 to .set().

anchor-service/src/anchor_batch.rs (1)
179-183: Handle high-water-mark read errors without panicking.
`.expect()` will crash batch processing on a transient DB error, bypassing `BatchFailed` metric emission entirely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-service/src/anchor_batch.rs` around lines 179 - 183, Replace the panic-causing expect on the high water mark read with proper error handling: call self.high_water_mark_store.high_water_mark().await and match the Result; on Err(e) log the error, increment/emit the BatchFailed metric and return/propagate an Err from the enclosing function instead of panicking, and on Ok(val) continue using val as high_water_mark. Update any function signature if necessary to return a Result so the error can be propagated; reference the high_water_mark_store.high_water_mark() call and the variable high_water_mark so reviewers can locate the change.
🧹 Nitpick comments (4)
anchor-evm/src/metrics.rs (1)
269-325: Strengthen metrics tests with value assertions.

These tests currently only call `record(...)`; they don't assert counters/gauges/histograms changed as expected. A recorder regression could pass unnoticed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-evm/src/metrics.rs` around lines 269 - 325, Update each test that calls Metrics::register and metrics.record(...) to assert the metric changes: after recording an EvmEvent (TransactionSubmitted, TransactionSucceeded, TransactionFailed, RetryAttempt, WalletBalanceUpdated), call registry.gather() and locate the specific metric by name (e.g., transaction_submitted_total, transaction_succeeded_duration_seconds histogram, transaction_failed_total with label error_type, retry_attempts_total, wallet_balance_wei gauge) and assert the sample value(s) reflect the expected increment or observed value; use the Metrics::register return to know metric label keys (like chain_id) and assert label matches chain_id=100 as well as expected histogram buckets/counts and gauge value for balance_wei and scalar counters for submissions/failures/retries. Ensure each test replaces the no-op record call with these concrete assertions so regressions will fail.

anchor-service/src/metrics.rs (3)
83-86: `requests_deduplicated` counter name is semantically inverted.

The counter stores `requests_after_dedup` — the count of requests surviving deduplication — but the name `requests_deduplicated` implies the count of requests removed by deduplication. These are inverses. The doc comment is correct; the identifier is not.

♻️ Proposed rename

```diff
-    /// Requests remaining after deduplication.
-    requests_deduplicated: Counter,
+    /// Requests remaining after deduplication.
+    requests_after_dedup: Counter,
```

And in the `register!` invocation, `record()`, and `Self { }` construction:

```diff
-        register!(
-            requests_deduplicated,
-            "Anchor requests remaining after deduplication",
-            Counter::default(),
-            sub_registry
-        );
+        register!(
+            requests_after_dedup,
+            "Anchor requests remaining after deduplication",
+            Counter::default(),
+            sub_registry
+        );
```

```diff
-        self.requests_deduplicated.inc_by(*requests_after_dedup);
+        self.requests_after_dedup.inc_by(*requests_after_dedup);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-service/src/metrics.rs` around lines 83 - 86, The field name requests_deduplicated is semantically inverted: it actually counts requests_after_dedup (requests surviving deduplication). Rename the identifier to requests_after_dedup (and update all usages) so it matches the doc comment; update its declaration, the register! invocation, any record() calls, and the Self { ... } construction to use requests_after_dedup consistently.
107-108: `dedup_ratio` semantics conflict with its name.

The histogram stores `after_dedup / received` (retention ratio: 1.0 = nothing removed). The name `dedup_ratio` typically implies how much was deduped (1 - retention), which is the opposite. A consumer building a dashboard alert on this metric would likely interpret it backwards.

Consider renaming to `dedup_retention_ratio` and updating the doc comment to say "Ratio of requests surviving deduplication (1.0 = no deduplication)" to make the direction explicit.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-service/src/metrics.rs` around lines 107 - 108, The field and its docs are misleading: rename the Histogram field dedup_ratio to dedup_retention_ratio and update its doc comment to "Ratio of requests surviving deduplication (1.0 = no deduplication)"; also update every usage/registration/metric label that references dedup_ratio (e.g., where the Histogram is constructed, recorded to, or exported) to use dedup_retention_ratio so callers and dashboards reflect the retention semantics consistently.
313-374: Tests only verify no panic; they don't assert recorded values.

None of the tests inspect the resulting counter/gauge/histogram values after recording events. A regression in the `record()` impl (e.g., incrementing the wrong field) would pass undetected.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-service/src/metrics.rs` around lines 313 - 374, Tests currently only ensure Metrics::register and Metrics::record don't panic but never assert the resulting metric values; update the tests that call Metrics::record (e.g., test_record_service_lifecycle, test_record_batch_succeeded, test_record_batch_failed, test_record_batch_empty, test_record_high_water_mark) to gather metrics from the Registry after each record and assert the expected counter/gauge/histogram values were mutated correctly for the corresponding AnchorEvent variant. Specifically, after calling Metrics::record on a given event, use the Registry::gather() output to find metric families by their names (the names defined in Metrics::register), then assert counts, gauge values, and histogram buckets/sums match the expected increments or values for ServiceStarted/ServiceStopped, BatchSucceeded fields (requests_received, requests_after_dedup, durations), BatchFailed error counter, BatchEmpty counter, and HighWaterMarkUpdated gauge; this will catch regressions where the wrong metric is updated.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@anchor-evm/src/evm_transaction_manager.rs`:
- Around line 171-172: The transaction_duration metric is being started too
early by setting total_start = Instant::now() before provider setup and
preflight; change the timing so the Instant::now() is taken immediately before
the actual transaction submission (right after preflight checks and provider
readiness) and use that timestamp for measuring submission→confirmation; update
all other occurrences where transaction_duration is measured (the other
total_start usages) to the same pattern so the metric consistently reflects
submission-to-confirmation only, and rename or reuse the variable (e.g.,
transaction_start) near the submit_transaction / send_and_confirm code paths to
make intent clear.
In `@anchor-service/src/anchor_batch.rs`:
- Around line 234-292: process_next_batch duplicates the pipeline implemented in
anchor_batch; refactor process_next_batch to call anchor_batch (or extracted
sub-methods) instead of inlining build_merkle_tree, tx_manager.anchor_root, and
build_time_events, and ensure you map errors back to metrics by catching each
phase's Result and emitting metrics.record(&AnchorEvent::BatchFailed {
error_type: "<phase>_error".to_string() }) before returning the propagated
error; use the same returned values
(root_cid/proof/remote_merkle_nodes/chain_inclusion/time_events or the
refactored TimeEventBatch) to construct the TimeEventBatch/continue processing
so timing/metrics are preserved, referencing functions/ids: process_next_batch,
anchor_batch, build_merkle_tree, tx_manager.anchor_root, build_time_events,
TimeEventBatch, and AnchorEvent::BatchFailed.
- Around line 185-190: The pre-batch emission of
AnchorEvent::HighWaterMarkUpdated is incorrect because the high water mark is
only read, not modified; remove the metrics.record call that emits
HighWaterMarkUpdated from the pre-batch block (the one surrounding
metrics.record(&AnchorEvent::HighWaterMarkUpdated { value: high_water_mark }))
and rely on the existing post-store emission from store_time_events, or
alternatively replace it with a new read-specific event (e.g.,
AnchorEvent::HighWaterMarkRead) if you need to capture the starting offset;
update any metric name/enum usage accordingly so HighWaterMarkUpdated is only
emitted after store_time_events completes.
---
Outside diff comments:
In `@anchor-evm/src/evm_transaction_manager.rs`:
- Around line 219-237: The starting_balance is sampled before pending
transactions are awaited, so if prior txs mine during
Self::wait_for_pending_transactions(...) the recorded WalletBalanceUpdated
(EvmEvent::WalletBalanceUpdated) will include unrelated gas spend; move the
balance sampling and metrics.record block (the code that sets starting_balance,
converts to balance_u128, and calls metrics.record) to after
Self::wait_for_pending_transactions(&provider, wallet_address). Apply the same
change to the analogous block around lines 316-333 to ensure balances are always
taken after pending transactions clear.
---
Duplicate comments:
In `@anchor-evm/src/metrics.rs`:
- Around line 255-257: The current cast "(*balance_wei / 1_000_000_000) as i64"
can wrap and produce negative gauge values; change the conversion to use a
checked/try conversion from the unsigned gwei value to i64 and clamp to the
valid range before calling self.wallet_balance_gwei.set(...). Locate the
conversion site around balance_gwei and the metric setter (references:
balance_wei, balance_gwei, self.wallet_balance_gwei, WalletBalanceUpdated) and
replace the direct as i64 cast with a safe conversion (e.g., compute gwei as an
unsigned integer, attempt try_into::<i64>() or use i64::try_from, and on Err or
overflow set to i64::MAX or 0 as appropriate) then pass the clamped i64 to
.set().
In `@anchor-service/src/anchor_batch.rs`:
- Around line 179-183: Replace the panic-causing expect on the high water mark
read with proper error handling: call
self.high_water_mark_store.high_water_mark().await and match the Result; on
Err(e) log the error, increment/emit the BatchFailed metric and return/propagate
an Err from the enclosing function instead of panicking, and on Ok(val) continue
using val as high_water_mark. Update any function signature if necessary to
return a Result so the error can be propagated; reference the
high_water_mark_store.high_water_mark() call and the variable high_water_mark so
reviewers can locate the change.
---
Nitpick comments:
In `@anchor-evm/src/metrics.rs`:
- Around line 269-325: Update each test that calls Metrics::register and
metrics.record(...) to assert the metric changes: after recording an EvmEvent
(TransactionSubmitted, TransactionSucceeded, TransactionFailed, RetryAttempt,
WalletBalanceUpdated), call registry.gather() and locate the specific metric by
name (e.g., transaction_submitted_total, transaction_succeeded_duration_seconds
histogram, transaction_failed_total with label error_type, retry_attempts_total,
wallet_balance_wei gauge) and assert the sample value(s) reflect the expected
increment or observed value; use the Metrics::register return to know metric
label keys (like chain_id) and assert label matches chain_id=100 as well as
expected histogram buckets/counts and gauge value for balance_wei and scalar
counters for submissions/failures/retries. Ensure each test replaces the no-op
record call with these concrete assertions so regressions will fail.
In `@anchor-service/src/metrics.rs`:
- Around line 83-86: The field name requests_deduplicated is semantically
inverted: it actually counts requests_after_dedup (requests surviving
deduplication). Rename the identifier to requests_after_dedup (and update all
usages) so it matches the doc comment; update its declaration, the register!
invocation, any record() calls, and the Self { ... } construction to use
requests_after_dedup consistently.
- Around line 107-108: The field and its docs are misleading: rename the
Histogram field dedup_ratio to dedup_retention_ratio and update its doc comment
to "Ratio of requests surviving deduplication (1.0 = no deduplication)"; also
update every usage/registration/metric label that references dedup_ratio (e.g.,
where the Histogram is constructed, recorded to, or exported) to use
dedup_retention_ratio so callers and dashboards reflect the retention semantics
consistently.
- Around line 313-374: Tests currently only ensure Metrics::register and
Metrics::record don't panic but never assert the resulting metric values; update
the tests that call Metrics::record (e.g., test_record_service_lifecycle,
test_record_batch_succeeded, test_record_batch_failed, test_record_batch_empty,
test_record_high_water_mark) to gather metrics from the Registry after each
record and assert the expected counter/gauge/histogram values were mutated
correctly for the corresponding AnchorEvent variant. Specifically, after calling
Metrics::record on a given event, use the Registry::gather() output to find
metric families by their names (the names defined in Metrics::register), then
assert counts, gauge values, and histogram buckets/sums match the expected
increments or values for ServiceStarted/ServiceStopped, BatchSucceeded fields
(requests_received, requests_after_dedup, durations), BatchFailed error counter,
BatchEmpty counter, and HighWaterMarkUpdated gauge; this will catch regressions
where the wrong metric is updated.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- anchor-evm/src/evm_transaction_manager.rs
- anchor-evm/src/metrics.rs
- anchor-service/src/anchor_batch.rs
- anchor-service/src/metrics.rs
```rust
// Build Merkle tree and anchor
let tree_build_start = Instant::now();
let MerkleTree {
    root_cid,
    nodes: local_merkle_nodes,
    count,
} = match build_merkle_tree(&anchor_requests) {
    Ok(tree) => tree,
    Err(e) => {
        if let Some(ref metrics) = self.metrics {
            metrics.record(&AnchorEvent::BatchFailed {
                error_type: "tree_build_error".to_string(),
            });
        }
        return Err(e);
    }
};
let tree_build_duration = tree_build_start.elapsed();

// Anchor the root via TransactionManager
let anchor_start = Instant::now();
let RootTimeEvent {
    proof,
    detached_time_event,
    mut remote_merkle_nodes,
    chain_inclusion,
} = match self.tx_manager.anchor_root(root_cid).await {
    Ok(result) => result,
    Err(e) => {
        if let Some(ref metrics) = self.metrics {
            metrics.record(&AnchorEvent::BatchFailed {
                error_type: "anchor_error".to_string(),
            });
        }
        return Err(e);
    }
};
let anchor_duration = anchor_start.elapsed();

// Build time events
let time_events = match build_time_events(&anchor_requests, &detached_time_event, count) {
    Ok(events) => events,
    Err(e) => {
        if let Some(ref metrics) = self.metrics {
            metrics.record(&AnchorEvent::BatchFailed {
                error_type: "time_event_build_error".to_string(),
            });
        }
        return Err(e);
    }
};

remote_merkle_nodes.extend(local_merkle_nodes);
let time_event_batch = TimeEventBatch {
    merkle_nodes: remote_merkle_nodes,
    proof,
    raw_time_events: time_events,
    chain_inclusion,
};
```
🛠️ Refactor suggestion | 🟠 Major
process_next_batch duplicates the entire anchor_batch pipeline.
The inlined logic at Lines 235–292 (build tree → anchor root → build time events → assemble TimeEventBatch) is identical to anchor_batch at Lines 332–351, now maintained in two places. A future change to one won't automatically apply to the other.
Refactor process_next_batch to delegate to anchor_batch, wrapping each step's ? with the appropriate BatchFailed emission:
♻️ Proposed refactor sketch
```diff
-        // Build Merkle tree and anchor
-        let tree_build_start = Instant::now();
-        let MerkleTree { ... } = match build_merkle_tree(&anchor_requests) { ... };
-        let tree_build_duration = tree_build_start.elapsed();
-        let anchor_start = Instant::now();
-        let RootTimeEvent { ... } = match self.tx_manager.anchor_root(root_cid).await { ... };
-        let anchor_duration = anchor_start.elapsed();
-        let time_events = match build_time_events(...) { ... };
-        remote_merkle_nodes.extend(local_merkle_nodes);
-        let time_event_batch = TimeEventBatch { ... };
+        let time_event_batch = match self.anchor_batch(&anchor_requests).await {
+            Ok(batch) => batch,
+            Err(e) => {
+                if let Some(ref metrics) = self.metrics {
+                    metrics.record(&AnchorEvent::BatchFailed {
+                        error_type: "anchor_error".to_string(),
+                    });
+                }
+                return Err(e);
+            }
+        };
```
+ };Note: Per-phase timing (tree_build, anchor, store) currently tracked inline would need to be exposed by refactoring anchor_batch or split into sub-methods if granular durations are still required.
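The "wrap each phase's error with a `BatchFailed` emission" pattern suggested above can be factored into one small helper instead of repeating the `match` at every call site. The sketch below is a self-contained illustration, not the project's real API: `Metrics`, `AnchorEvent`, and `tag_failure` are stand-in names invented here.

```rust
use std::cell::RefCell;

// Stand-ins for the project's metrics types (assumptions, not the real API),
// kept minimal so the pattern is runnable in isolation.
enum AnchorEvent {
    BatchFailed { error_type: String },
}

struct Metrics {
    recorded: RefCell<Vec<String>>,
}

impl Metrics {
    fn record(&self, e: &AnchorEvent) {
        let AnchorEvent::BatchFailed { error_type } = e;
        self.recorded.borrow_mut().push(error_type.clone());
    }
}

// Tag a phase's Result with a BatchFailed metric on error, then let the
// caller propagate it with `?` — one place instead of one match per phase.
fn tag_failure<T, E>(r: Result<T, E>, m: Option<&Metrics>, phase: &str) -> Result<T, E> {
    if r.is_err() {
        if let Some(m) = m {
            m.record(&AnchorEvent::BatchFailed {
                error_type: format!("{phase}_error"),
            });
        }
    }
    r
}

fn main() {
    let metrics = Metrics {
        recorded: RefCell::new(Vec::new()),
    };
    let phase: Result<u32, &str> = Err("boom");
    // A failing phase records exactly one metric, then propagates the error.
    assert!(tag_failure(phase, Some(&metrics), "time_event_build").is_err());
    assert_eq!(metrics.recorded.borrow()[0], "time_event_build_error");
    println!("recorded: {:?}", metrics.recorded.borrow());
}
```

With a helper like this, `process_next_batch` can delegate to `anchor_batch` and keep each phase's metric tagging to a single wrapped call.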
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@anchor-service/src/anchor_batch.rs` around lines 234 - 292,
process_next_batch duplicates the pipeline implemented in anchor_batch; refactor
process_next_batch to call anchor_batch (or extracted sub-methods) instead of
inlining build_merkle_tree, tx_manager.anchor_root, and build_time_events, and
ensure you map errors back to metrics by catching each phase's Result and
emitting metrics.record(&AnchorEvent::BatchFailed { error_type:
"<phase>_error".to_string() }) before returning the propagated error; use the
same returned values
(root_cid/proof/remote_merkle_nodes/chain_inclusion/time_events or the
refactored TimeEventBatch) to construct the TimeEventBatch/continue processing
so timing/metrics are preserved, referencing functions/ids: process_next_batch,
anchor_batch, build_merkle_tree, tx_manager.anchor_root, build_time_events,
TimeEventBatch, and AnchorEvent::BatchFailed.
Actionable comments posted: 1
♻️ Duplicate comments (1)
anchor-evm/src/evm_transaction_manager.rs (1)
242-243: ⚠️ Potential issue | 🟡 Minor

`transaction_duration` still includes retry/backoff time. At Line 242, the timer starts before the retry loop, so the values recorded at Lines 605-609 include backoff delays and prior failed attempts. That still conflicts with the "submission → confirmation" semantics.
Please either (a) start timing immediately before each submission call, or (b) rename/document this as end-to-end operation duration.
Also applies to: 311-313, 398-399, 605-609
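Option (a) above — timing each attempt individually — can be sketched with a minimal retry loop. This is an illustration only: `submit_with_retry` and its closure-based submission stand-in are hypothetical names, not the real `EvmTransactionManager` API.

```rust
use std::time::{Duration, Instant};

// Hypothetical retry loop illustrating option (a): the timer starts per
// attempt, so the reported duration excludes backoff between retries and
// the time spent in earlier failed attempts.
fn submit_with_retry<F>(mut attempt: F, max_retries: u32) -> Option<Duration>
where
    F: FnMut() -> bool, // stand-in for the real submit + confirm call
{
    for n in 0..=max_retries {
        let attempt_start = Instant::now(); // per-attempt, not once up front
        if attempt() {
            // submission -> confirmation duration of the winning attempt only
            return Some(attempt_start.elapsed());
        }
        // backoff happens outside the measured window
        std::thread::sleep(Duration::from_millis(10 * u64::from(n + 1)));
    }
    None
}

fn main() {
    let mut calls = 0;
    // fails twice, succeeds on the third attempt
    let d = submit_with_retry(
        || {
            calls += 1;
            calls >= 3
        },
        5,
    )
    .expect("should succeed");
    // the measured duration excludes the ~30ms of backoff before success
    assert!(d < Duration::from_millis(10));
    println!("attempts: {calls}, measured: {d:?}");
}
```

If end-to-end duration (including retries) is the intended meaning instead, renaming the field to something like `total_operation_duration` and documenting it would resolve the conflict without moving the timer.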
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-evm/src/evm_transaction_manager.rs` around lines 242 - 243, The timer total_start/transaction_duration currently begins before the retry loop, so logged transaction_duration includes retry/backoff time; either move the Instant::now() to just before each actual submission call (i.e., start timing immediately before the call that sends/submits the transaction) so transaction_duration measures submission→confirmation, or else rename/document the variable to reflect end-to-end operation duration; update all occurrences that use total_start/transaction_duration (the variables named total_start and transaction_duration in evm_transaction_manager.rs and the places that log them around the submission/confirmation flow) to follow the chosen approach for consistency.
🧹 Nitpick comments (1)
anchor-evm/src/metrics.rs (1)
270-327: Add assertions to metric recording tests (currently smoke-only). At Lines 270-327, the tests call `record(...)` but never validate that counters, gauges, or histograms changed as expected. Adding assertions here will catch silent regressions in metric wiring.

Proposed test hardening

```diff
 #[test]
 fn test_record_transaction_submitted() {
     let mut registry = Registry::default();
     let metrics = Metrics::register(&mut registry);
     metrics.record(&EvmEvent::TransactionSubmitted { chain_id: 100 });
+    let labels = ChainLabels { chain_id: 100 };
+    assert_eq!(metrics.transactions_total.get_or_create(&labels).get(), 1);
 }

 #[test]
 fn test_record_wallet_balance() {
     let mut registry = Registry::default();
     let metrics = Metrics::register(&mut registry);
     // 1 ETH in wei
     metrics.record(&EvmEvent::WalletBalanceUpdated {
         chain_id: 100,
         balance_wei: 1_000_000_000_000_000_000,
     });
+    let labels = ChainLabels { chain_id: 100 };
+    assert_eq!(metrics.wallet_balance_gwei.get_or_create(&labels).get(), 1_000_000_000);
 }
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@anchor-evm/src/metrics.rs` around lines 270 - 327, The tests currently call Metrics::register and Metrics::record but don't assert metrics changed; add assertions after each metrics.record(...) to validate the expected metric families and values. For each test: call Registry::gather() (or Registry::get_metric_family_by_name) to find the specific metric names produced by Metrics (e.g., "transactions_submitted_total", "transactions_succeeded_total", "transaction_duration_seconds", "transaction_confirmation_seconds", "transaction_gas_cost_wei", "transactions_failed_total", "retry_attempts_total", "wallet_balance_wei"), assert the counter/gauge/histogram value incremented or set and that labels like chain_id (and error_type for TransactionFailed) match the event, and for TransactionSucceeded assert histogram buckets/sum/count reflect the provided durations and gas_cost_wei; update tests for TransactionSubmitted, TransactionSucceeded, TransactionFailed, RetryAttempt, and WalletBalanceUpdated accordingly using the same Metrics::register and Metrics::record calls to locate metrics via those metric names and label checks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@anchor-service/src/anchor_batch.rs`:
- Around line 216-223: The dedup step builds anchor_requests from raw_requests
but does not guarantee the largest resume_token is at the tail, which can
regress the high-water-mark; after creating anchor_requests (the
Vec<AnchorRequest> produced from IndexMap::from_iter) sort it by each request's
resume_token (ascending) so the greatest resume_token is last before downstream
processing and before computing requests_after_dedup; locate the anchor_requests
Vec, call a stable sort (e.g. sort_by_key or sort_by on request.resume_token) to
enforce monotonicity and then continue with requests_after_dedup and subsequent
HWM update.
---
Duplicate comments:
In `@anchor-evm/src/evm_transaction_manager.rs`:
- Around line 242-243: The timer total_start/transaction_duration currently
begins before the retry loop, so logged transaction_duration includes
retry/backoff time; either move the Instant::now() to just before each actual
submission call (i.e., start timing immediately before the call that
sends/submits the transaction) so transaction_duration measures
submission→confirmation, or else rename/document the variable to reflect
end-to-end operation duration; update all occurrences that use
total_start/transaction_duration (the variables named total_start and
transaction_duration in evm_transaction_manager.rs and the places that log them
around the submission/confirmation flow) to follow the chosen approach for
consistency.
---
Nitpick comments:
In `@anchor-evm/src/metrics.rs`:
- Around line 270-327: The tests currently call Metrics::register and
Metrics::record but don't assert metrics changed; add assertions after each
metrics.record(...) to validate the expected metric families and values. For
each test: call Registry::gather() (or Registry::get_metric_family_by_name) to
find the specific metric names produced by Metrics (e.g.,
"transactions_submitted_total", "transactions_succeeded_total",
"transaction_duration_seconds", "transaction_confirmation_seconds",
"transaction_gas_cost_wei", "transactions_failed_total", "retry_attempts_total",
"wallet_balance_wei"), assert the counter/gauge/histogram value incremented or
set and that labels like chain_id (and error_type for TransactionFailed) match
the event, and for TransactionSucceeded assert histogram buckets/sum/count
reflect the provided durations and gas_cost_wei; update tests for
TransactionSubmitted, TransactionSucceeded, TransactionFailed, RetryAttempt, and
WalletBalanceUpdated accordingly using the same Metrics::register and
Metrics::record calls to locate metrics via those metric names and label checks.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- anchor-evm/src/evm_transaction_manager.rs
- anchor-evm/src/metrics.rs
- anchor-service/src/anchor_batch.rs
```rust
let anchor_requests: Vec<AnchorRequest> = IndexMap::<Cid, AnchorRequest>::from_iter(
    raw_requests
        .into_iter()
        .map(|request| (request.id, request)),
)
.into_values()
.collect();
let requests_after_dedup = anchor_requests.len() as u64;
```
Ensure deduped requests preserve high-water-mark monotonicity.
At Line 216, dedup can leave anchor_requests in an order where the greatest resume_token is not last. Since downstream high-water-mark update is derived from tail position, this can regress HWM and cause reprocessing of already-anchored events.
🐛 Proposed fix
```diff
-    let anchor_requests: Vec<AnchorRequest> = IndexMap::<Cid, AnchorRequest>::from_iter(
+    let mut anchor_requests: Vec<AnchorRequest> = IndexMap::<Cid, AnchorRequest>::from_iter(
         raw_requests
             .into_iter()
             .map(|request| (request.id, request)),
     )
     .into_values()
     .collect();
+    anchor_requests.sort_by_key(|request| request.resume_token);
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@anchor-service/src/anchor_batch.rs` around lines 216 - 223, The dedup step
builds anchor_requests from raw_requests but does not guarantee the largest
resume_token is at the tail, which can regress the high-water-mark; after
creating anchor_requests (the Vec<AnchorRequest> produced from
IndexMap::from_iter) sort it by each request's resume_token (ascending) so the
greatest resume_token is last before downstream processing and before computing
requests_after_dedup; locate the anchor_requests Vec, call a stable sort (e.g.
sort_by_key or sort_by on request.resume_token) to enforce monotonicity and then
continue with requests_after_dedup and subsequent HWM update.
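The HWM regression described in this comment can be reproduced with a small stdlib-only model. `IndexMap::from_iter` keeps the position of the *first* occurrence of a key while a later duplicate overwrites its value in place; the sketch below mimics that with a `HashMap` index (names like `Req` and `dedup_first_position` are hypothetical, not project types).

```rust
use std::collections::HashMap;

// resume_token stands in for the high-water-mark source.
#[derive(Clone, Debug, PartialEq)]
struct Req {
    id: u64,
    resume_token: u64,
}

// Minimal model of IndexMap-style dedup: first occurrence fixes the
// position, a later duplicate overwrites the value without moving it.
fn dedup_first_position(raw: Vec<Req>) -> Vec<Req> {
    let mut index: HashMap<u64, usize> = HashMap::new();
    let mut out: Vec<Req> = Vec::new();
    for r in raw {
        match index.get(&r.id) {
            Some(&i) => out[i] = r, // overwrite in place, order unchanged
            None => {
                index.insert(r.id, out.len());
                out.push(r);
            }
        }
    }
    out
}

fn main() {
    // id=1 appears first with a low token, then again with the HIGHEST token.
    let raw = vec![
        Req { id: 1, resume_token: 10 },
        Req { id: 2, resume_token: 20 },
        Req { id: 1, resume_token: 30 }, // duplicate lands back at position 0
    ];
    let mut deduped = dedup_first_position(raw);
    // Tail token is 20, not the true maximum 30: the HWM would regress.
    assert_eq!(deduped.last().unwrap().resume_token, 20);
    // The proposed fix: sort so the greatest resume_token is last.
    deduped.sort_by_key(|r| r.resume_token);
    assert_eq!(deduped.last().unwrap().resume_token, 30);
    println!("tail after sort: {:?}", deduped.last().unwrap());
}
```

The stable `sort_by_key` preserves fetch order among equal tokens while guaranteeing the tail carries the maximum, which is what the high-water-mark update needs.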
Summary by CodeRabbit