Skip to content

Comments

fix(spv): add distinct error state for SPV sync failures#650

Open
lklimek wants to merge 4 commits intov1.0-devfrom
fix/spv-sync-error-status
Open

fix(spv): add distinct error state for SPV sync failures#650
lklimek wants to merge 4 commits intov1.0-devfrom
fix/spv-sync-error-status

Conversation

@lklimek
Copy link
Contributor

@lklimek lklimek commented Feb 24, 2026

Issue being fixed or feature implemented

SPV sync gets permanently stuck in "Syncing" status when a fatal error occurs
(e.g., masternode QRInfo failure). The connectivity icon stays orange instead
of turning red, and no error message is shown to the user.

Symptoms observed in logs:

ERROR dash_spv::sync::masternodes::sync_manager: QRInfo failed after 4 retries: Required rotated chain lock sig at h - 0 not present for masternode diff block hash: ...
ERROR dash_spv::sync::sync_manager: Masternode error handling message: Masternode sync failed: ...

Despite these errors, the UI showed no indication of failure.

Root cause: Two gaps in error detection:

  1. spawn_sync_event_handler ignored SyncEvent::ManagerError events entirely.
  2. spawn_progress_watcher only checked is_synced() and defaulted everything
    else to SpvStatus::Syncing, never checking for SyncState::Error.

Additionally, an upstream bug in dash-spv (dashpay/rust-dashcore#469) means
the progress channel never receives the error state update, making gap #2
a dead path until the library is fixed.

What was done?

Error detection (manager.rs)

Added two error detection paths in src/spv/manager.rs:

  1. spawn_sync_event_handler (primary path): Now handles
    SyncEvent::ManagerError — transitions SpvStatus to Error and stores
    the error message in last_error. This works reliably because dash-spv
    always emits this event on manager failure.

  2. spawn_progress_watcher (defense-in-depth): Now checks
    watch_progress.state() == SyncState::Error and transitions accordingly.
    Currently blocked by upstream bug but will activate once
    bug: SyncManager does not emit progress update on manager error path rust-dashcore#469 is fixed.

Distinct Error UI state (connection_status.rs, top_panel.rs, theme.rs)

Previously, SpvStatus::Error mapped to OverallConnectionState::Disconnected
the same red static circle as "never connected". Users couldn't distinguish
between the two. Now:

  • Added OverallConnectionState::Error variant (repr 3)
  • SpvStatus::Error maps to OverallConnectionState::Error explicitly
  • Visual treatment: magenta circle with slow alarm pulse + white "!" glyph
  • Tooltip: shows "SPV sync error: {detail}" with the specific error message
    from spv_last_error (populated from SpvManager::status().last_error)
  • Added DashColors::sync_error_color() for the magenta-red indicator color
  • RPC mode includes a forward-compatible Error arm ("Connection error")

Related upstream issues

How has this been tested?

Breaking Changes

None

🤖 Co-authored by Claudius the Magnificent AI Agent

Summary by CodeRabbit

  • New Features

    • Connection indicator now distinctly displays SPV synchronization errors using a magenta color with an exclamation mark overlay.
    • Detailed error messages and diagnostic information are displayed in connection status tooltips when sync failures occur.
  • Documentation

    • Added manual test scenarios for SPV sync error state verification, including icon transitions and error message validation.

The SpvManager's progress watcher and sync event handler did not detect
fatal sync errors, leaving the UI stuck in "Syncing" status indefinitely.

Two detection paths are now in place:

1. spawn_sync_event_handler handles SyncEvent::ManagerError to transition
   SpvStatus to Error and store the error message. This is the primary
   path since dash-spv always emits this event on manager failure.

2. spawn_progress_watcher checks SyncState::Error from the progress
   channel as defense-in-depth (currently blocked by upstream bug
   dashpay/rust-dashcore#469 where try_emit_progress is not called on
   error paths).

Once SpvStatus::Error is set, the existing ConnectionStatus logic maps
it to OverallConnectionState::Disconnected, showing a red indicator icon
with "SPV: Error" in the tooltip.

Related upstream issues:
- dashpay/rust-dashcore#469 (missing progress emit on error)
- dashpay/rust-dashcore#470 (QRInfo chain lock retry robustness)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3b6e3a2 and c7f66d4.

📒 Files selected for processing (5)
  • docs/ai-design/2026-02-24-spv-sync-error-status/manual-test-scenarios.md
  • src/context/connection_status.rs
  • src/spv/manager.rs
  • src/ui/components/top_panel.rs
  • src/ui/theme.rs

📝 Walkthrough

Walkthrough

Adds SPV sync error detection and propagation from the SPV manager into connection status and UI, plus a manual test scenario doc for verifying SPV error display and diagnostic logging.

Changes

Cohort / File(s) Summary
SPV Manager
src/spv/manager.rs
Detects SyncState::Error in progress watcher, handles SyncEvent::ManagerError, sets SPV status to Error, and records detailed last_error messages (Arc-backed) for downstream consumption.
Connection Status / State
src/context/connection_status.rs
Adds OverallConnectionState::Error, stores spv_last_error: Mutex<Option<String>>, maps SpvStatus::ErrorOverallConnectionState::Error, surfaces SPV error text in tooltip/header, and resets/updates spv_last_error during refreshes.
UI Indicator
src/ui/components/top_panel.rs
Extends connection indicator rendering for the new Error state, adds magenta sync error color and pulsation, and draws an exclamation mark overlay when Error.
Theme Colors
src/ui/theme.rs
Introduces DashColors::sync_error_color(dark_mode: bool) returning magenta tones for the error indicator.
Documentation
docs/ai-design/2026-02-24-spv-sync-error-status/manual-test-scenarios.md
New manual test scenarios documenting steps to verify SPV error state, icon/tooltip transitions, expected messages, and where to find diagnostic log entries.

Sequence Diagram(s)

sequenceDiagram
    rect rgba(200,200,255,0.5)
    participant SPV as SPV Library
    end
    rect rgba(200,255,200,0.5)
    participant Manager as SPV Manager
    end
    rect rgba(255,200,200,0.5)
    participant Status as ConnectionStatus
    end
    rect rgba(255,230,200,0.5)
    participant UI as TopPanel/UI
    end

    SPV->>Manager: progress update (SyncState)
    alt SyncState::Error
        Manager->>Manager: set last_error (Arc<Mutex>)
        Manager->>Status: update SpvStatus::Error (+error detail)
        Status->>UI: refresh overall state -> OverallConnectionState::Error
        UI->>UI: render indicator (magenta + "!")
        Manager->>Status: emit diagnostic log (detailed message)
    else SyncState::Running/Complete
        Manager->>Status: update SpvStatus::Running/Synced
        Status->>UI: refresh overall state -> Synced/Syncing
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

Poem

🐰 I thumped my foot on sync today,
A magenta blink to show dismay.
Manager whispers what went wrong,
Tooltips, logs sing the error song.
Hooray — now bugs can't hide away!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(spv): add distinct error state for SPV sync failures' directly and clearly summarizes the main change: adding a distinct error state for SPV sync failures, which is the core objective of the PR.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/spv-sync-error-status

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spv/manager.rs`:
- Around line 1082-1087: The progress-watcher branch currently unconditionally
sets last_error to "Sync failed (reported by SPV library)" when is_error is
true, which can overwrite a more detailed error previously set by the sync event
handler; change the logic in the is_error branch (where SpvStatus::Error is
assigned) to open last_error.write(), inspect whether it is None, and only
assign the generic message if err_guard.is_none(), leaving an existing detailed
message intact (keep the existing status_guard = SpvStatus::Error assignment).

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 79e1479 and e409c29.

📒 Files selected for processing (2)
  • docs/ai-design/2026-02-24-spv-sync-error-status/manual-test-scenarios.md
  • src/spv/manager.rs

Only set the generic "Sync failed" message in last_error when no
detailed message has already been stored by the sync event handler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/spv/manager.rs (1)

1082-1089: Past concern resolved — is_none() guard correctly preserves detailed error message.

The err_guard.is_none() check ensures the generic fallback message does not overwrite a detailed message already stored by the sync event handler, which is the fix requested in the prior review.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spv/manager.rs` around lines 1082 - 1089, The code correctly avoids
overwriting a detailed error by checking last_error.write() and
err_guard.is_none() before setting a generic message; no code change
needed—ensure the conditional around SpvStatus::Error uses the same pattern
(is_error branch setting *status_guard = SpvStatus::Error and only writing to
last_error when err_guard.is_none()) to preserve previously stored detailed
errors from the sync event handler.
🧹 Nitpick comments (1)
src/spv/manager.rs (1)

1082-1089: Consider adding inline unit tests for the new error-transition paths.

Neither the is_error branch in spawn_progress_watcher nor the SyncEvent::ManagerError handler in spawn_sync_event_handler has test coverage. The status/last_error mutation logic can be tested independently of the async spawn by extracting it into a small helper or using tokio::test. As per coding guidelines, unit tests should be written inline using #[test].

Also applies to: 1155-1163

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spv/manager.rs` around lines 1082 - 1089, Add inline unit tests covering
the error-transition logic by extracting the status/last_error mutation into a
small helper (e.g., set_spv_error_status or update_status_on_error) and then
write #[test] (or #[tokio::test] if async) cases that exercise the is_error
branch used by spawn_progress_watcher and the SyncEvent::ManagerError handling
used by spawn_sync_event_handler; tests should call the helper (or invoke the
handler functions directly) to assert SpvStatus becomes SpvStatus::Error and
last_error is set to Some(...) when an error condition is simulated, and also
include a case where last_error is already Some to ensure it is not overwritten.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/spv/manager.rs`:
- Around line 1082-1089: The code correctly avoids overwriting a detailed error
by checking last_error.write() and err_guard.is_none() before setting a generic
message; no code change needed—ensure the conditional around SpvStatus::Error
uses the same pattern (is_error branch setting *status_guard = SpvStatus::Error
and only writing to last_error when err_guard.is_none()) to preserve previously
stored detailed errors from the sync event handler.

---

Nitpick comments:
In `@src/spv/manager.rs`:
- Around line 1082-1089: Add inline unit tests covering the error-transition
logic by extracting the status/last_error mutation into a small helper (e.g.,
set_spv_error_status or update_status_on_error) and then write #[test] (or
#[tokio::test] if async) cases that exercise the is_error branch used by
spawn_progress_watcher and the SyncEvent::ManagerError handling used by
spawn_sync_event_handler; tests should call the helper (or invoke the handler
functions directly) to assert SpvStatus becomes SpvStatus::Error and last_error
is set to Some(...) when an error condition is simulated, and also include a
case where last_error is already Some to ensure it is not overwritten.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e409c29 and 3b6e3a2.

📒 Files selected for processing (1)
  • src/spv/manager.rs

Distinguish "connected but sync failed" from "never connected" by
adding an `OverallConnectionState::Error` variant. SPV sync errors
now show a magenta pulsating circle with "!" glyph instead of the
same red static circle used for disconnected state. Tooltip shows
the specific SPV error message for easier debugging.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lklimek lklimek changed the title fix(spv): transition to error status when sync manager fails fix(spv): add distinct error state for SPV sync failures Feb 24, 2026
- Move last_error write outside status write guard in
  spawn_progress_watcher to maintain consistent lock ordering
  (status → release → last_error), eliminating latent deadlock risk.
- Update manual test scenarios to match actual implementation:
  magenta pulsating "!" indicator (not red/static/Disconnected).
- Add Scenario 4 to explicitly verify Error vs Disconnected distinction.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@lklimek
Copy link
Contributor Author

lklimek commented Feb 24, 2026

Audit Summary

Reviewed by: Claude Code with a 2-agent team:

  • code-reviewer — correctness, conventions, state machine, UI logic, test doc accuracy
  • security-engineer — concurrency safety, OWASP classification, information disclosure, resource consumption

Overall Risk: LOW — well-scoped status propagation change with no new attack surface.

Findings

# Severity OWASP Location Description Status
1 MEDIUM manual-test-scenarios.md Test doc described old behavior (red/static/Disconnected) instead of new (magenta/pulsing/"SPV sync error") ✅ Fixed in c7f66d4
2 MEDIUM A04 manager.rs:1079-1089 Nested lock: last_error write acquired inside status write guard — inconsistent ordering with event handler path (CWE-833) ✅ Fixed in c7f66d4
3 MEDIUM A04 connection_status.rs:227-278 TOCTOU: tooltip_text() reads overall_state() then spv_last_error separately — state can change between reads (CWE-367). Impact cosmetic only, corrects on next frame. ℹ️ Accepted
4 LOW A09 manager.rs:1161 Error string from dash-spv displayed verbatim in tooltip — could expose peer IPs/block hashes during screen sharing (CWE-209). Desktop app, local user already has log access. ℹ️ Note
5 LOW top_panel.rs:123,166 Error pulsation runs indefinitely since Error is a terminal state — continuous repaints (CWE-400). Minor CPU/GPU concern for long sessions. ℹ️ Note
6 LOW connection_status.rs:350 Error state polls every 2s (same as Disconnected). Since Error is terminal without user action, longer interval would reduce wasted work. ℹ️ Note
7 LOW connection_status.rs:252 RPC-mode Error tooltip reads generic "Connection error" — currently unreachable (RPC never produces Error). Forward-compat placeholder, harmless. ℹ️ Note
8 LOW connection_status.rs:53 spv_last_error uses Mutex while SpvManager.last_error uses RwLock — stylistic inconsistency, no bug. ℹ️ Note

Pre-existing / Out-of-scope

  • Existing CodeRabbit review comment about last_error overwrite was already addressed by the is_none() guard + nested lock fix in c7f66d4.

Positive Observations

  • All match arms on OverallConnectionState are exhaustive across the codebase — no missed patterns.
  • network_chooser_screen.rs != Disconnected checks naturally and correctly include Error.
  • Excellent explanatory comment in event handler about why direct ManagerError handling is needed (dash-spv doesn't update progress channel on error path).
  • reset() properly clears spv_last_error — good state hygiene.
  • Clean state machine: Error reachable only via explicit SpvStatus::Error, cleared on start()/reset().
  • Consistent poisoned-lock handling (if let Ok(...)) throughout.
  • Commit structure is logical and atomic across 4 commits.

Redundancy

Both agents flagged the nested lock issue (code-reviewer as "overwrite asymmetry", security as "nested lock acquisition"). Security's framing was more precise — the lock ordering is the root concern. Redundancy ratio: 1 finding out of 8 unique = 12.5%.

Verdict: Approve. The two MEDIUM findings have been fixed. Remaining LOW/INFO items are acceptable trade-offs for a desktop application.

🤖 Co-authored by Claudius the Magnificent AI Agent

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant