Skip to content

blockchain+netsync: follow-up fixes for headers-first IBD review#2508

Open
Roasbeef wants to merge 4 commits intomasterfrom
fix-pr-2428-review-items
Open

blockchain+netsync: follow-up fixes for headers-first IBD review#2508
Roasbeef wants to merge 4 commits intomasterfrom
fix-pr-2428-review-items

Conversation

@Roasbeef
Copy link
Copy Markdown
Member

NOTE: This PR is stacked on top of #2428 and should be reviewed/merged
after that PR lands. The diff shown here includes #2428's changes; the new
commits are the last four on this branch.

In this PR, we address a set of issues found during code review of #2428
(the headers-first IBD rework). The changes here are correctness fixes,
thread safety improvements, and performance/cleanup items that sit on top
of the new IBD pipeline. See each commit message for a detailed description
w.r.t the incremental changes.

Correctness Fixes

The new fetchHigherPeers helper dropped the segwit peer filtering that
the old startSync had. Post-segwit activation, this meant btcd could
pick a non-witness peer for sync and fail to fully validate the chain. We
restore the IsDeploymentActive + IsWitnessEnabled check inside
fetchHigherPeers (and the new fetchEqualPeers) so witness-only sync
is enforced whenever segwit is active.

The old startSync also had an equalPeers fallback for peers at the
same height, which is critical for regtest where both nodes start at
genesis (height 0) and neither is strictly higher. We add
fetchEqualPeers and wire it into the block-download path of startSync
to restore this behavior.

We also bring back syncCandidate demotion: peers whose last advertised
block has fallen below our height get syncCandidate = false so they
aren't repeatedly considered in future sync rounds. The demotion uses <
(not <=) to preserve the existing behavior where equal-height peers
remain candidates.

Thread Safety

IsValidHeader, HeaderHashByHeight, and HeaderHeightByHash all read
from bestHeader without holding chainLock, while
maybeAcceptBlockHeader modifies bestHeader under chainLock.Lock().
We acquire chainLock.RLock() in each accessor. The redundant Contains
check in HeaderHashByHeight is also removed since NodeByHeight already
guarantees membership in the chain view.

Performance

buildBlockRequest previously iterated from forkHeight+1 to
bestHeaderHeight on every refill call, i.e. O(n) per invocation. We
introduce a lastBlockRequested cursor that tracks the highest scanned
height, so subsequent calls skip already-requested ranges. The cursor
resets on peer disconnect and IBD completion.

In maybeAcceptBlockHeader, known non-invalid side-chain headers were
falling through to re-run CheckBlockHeaderSanity and
CheckBlockHeaderContext. Since these headers passed validation when first
added, the re-validation is pure overhead. We return early for known
side-chain headers.

checkHeadersList was calling IsValidHeader then HeaderHeightByHash,
each doing their own LookupNode. We add ValidHeaderHeight that
combines both into a single index lookup.

Cleanup

The unused headerNode type (left over from the old headerList-based
sync) is removed. The duplicate nil guard in buildBlockRequest is
removed since the only caller (fetchHeaderBlocks) already checks. A
BIP130 design note is added to handleHeadersMsg explaining why
unsolicited headers are now accepted (peers can proactively push headers
via sendheaders). The crash-recovery behavior of flushToDB's
header-only skip is documented: a restart during header sync loses header
progress, which is an acceptable trade-off given the small size and fast
re-validation of headers.

In maybeAcceptBlockHeader, when a header already exists in the block
index and is not invalid, the function previously fell through to
re-run CheckBlockHeaderSanity and CheckBlockHeaderContext for
side-chain headers.  Since these headers were already validated when
first added, this re-validation is pure overhead.

Return early with (false, nil) for known non-invalid side-chain
headers, and simplify the subsequent node creation path which is now
guaranteed to only execute for genuinely new headers.

Also documents the crash-recovery behavior of flushToDB's header-only
skip optimization and fixes a typo in process_test.go.
IsValidHeader, HeaderHashByHeight, and HeaderHeightByHash all read
from bestHeader without holding chainLock, while
maybeAcceptBlockHeader modifies bestHeader under chainLock.Lock().
This creates a potential data race on concurrent access.

Acquire chainLock.RLock() in each accessor before reading
bestHeader.  Also removes the redundant Contains check from
HeaderHashByHeight since NodeByHeight already returns from the
chain view's internal array, guaranteeing membership.

Introduces ValidHeaderHeight as a single-lookup alternative to
calling IsValidHeader + HeaderHeightByHash separately.  This
avoids two redundant LookupNode calls in netsync's
checkHeadersList hot path during IBD.
Several behavioral regressions were introduced when the old
checkpoint-based headers-first mode was replaced with the new IBD
pipeline.  This commit addresses them along with performance and
code quality improvements.

Restore segwit peer filtering in fetchHigherPeers so that post-
activation, only witness-enabled peers are selected for sync.
Without this, btcd could download blocks from a non-witness peer
and fail to fully validate the chain.

Restore syncCandidate demotion for peers whose last advertised block
has fallen behind our height, preventing them from being repeatedly
considered in future sync rounds.

Add fetchEqualPeers and wire it into startSync's block-download
fallback path.  This restores the equalPeers behavior needed for
regtest where both nodes start at height 0 and neither is strictly
higher than the other.

Introduce lastBlockRequested as a cursor in buildBlockRequest to
skip already-scanned height ranges.  The old code re-iterated from
forkHeight+1 on every refill call, which is O(n) per invocation
during IBD with large header chains.  The cursor is reset on peer
disconnect and IBD completion.

Remove the duplicate nil check in buildBlockRequest (the only caller,
fetchHeaderBlocks, already guards against nil).  Remove the unused
headerNode type left over from the old headerList-based sync.
Consolidate checkHeadersList to use the new ValidHeaderHeight for a
single index lookup instead of two.  Add a BIP130 design note to
handleHeadersMsg explaining why unsolicited headers are accepted.
TestFetchHigherPeersDemotesStalePeers verifies that fetchHigherPeers
sets syncCandidate to false for peers whose last block is strictly
below the given height, preventing them from being repeatedly
considered in subsequent sync rounds.

TestStartSyncEqualPeersFallback verifies that when no peer is
strictly higher than our block height but peers at the same height
exist, startSync falls back to selecting one of those equal-height
peers for block download.  This is critical for regtest where both
nodes start at genesis.
@coveralls
Copy link
Copy Markdown

Pull Request Test Coverage Report for Build 23468855169

Details

  • 73 of 96 (76.04%) changed or added relevant lines in 3 files are covered.
  • 14 unchanged lines in 2 files lost coverage.
  • Overall coverage increased (+0.06%) to 56.083%

Changes Missing Coverage Covered Lines Changed/Added Lines %
blockchain/chain.go 17 26 65.38%
netsync/manager.go 53 67 79.1%
Files with Coverage Reduction New Missed Lines %
mempool/mempool.go 1 66.56%
blockchain/chain.go 13 85.26%
Totals Coverage Status
Change from base Build 23468048213: 0.06%
Covered Lines: 31863
Relevant Lines: 56814

💛 - Coveralls

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants