blockchain+netsync: follow-up fixes for headers-first IBD review#2508
Open
In maybeAcceptBlockHeader, when a header already exists in the block index and is not invalid, the function previously fell through to re-run CheckBlockHeaderSanity and CheckBlockHeaderContext for side-chain headers. Since these headers were already validated when first added, this re-validation is pure overhead. Return early with (false, nil) for known non-invalid side-chain headers, and simplify the subsequent node creation path which is now guaranteed to only execute for genuinely new headers. Also documents the crash-recovery behavior of flushToDB's header-only skip optimization and fixes a typo in process_test.go.
IsValidHeader, HeaderHashByHeight, and HeaderHeightByHash all read from bestHeader without holding chainLock, while maybeAcceptBlockHeader modifies bestHeader under chainLock.Lock(). This creates a potential data race on concurrent access. Acquire chainLock.RLock() in each accessor before reading bestHeader. Also removes the redundant Contains check from HeaderHashByHeight since NodeByHeight already returns from the chain view's internal array, guaranteeing membership. Introduces ValidHeaderHeight as a single-lookup alternative to calling IsValidHeader + HeaderHeightByHash separately. This avoids two redundant LookupNode calls in netsync's checkHeadersList hot path during IBD.
Several behavioral regressions were introduced when the old checkpoint-based headers-first mode was replaced with the new IBD pipeline. This commit addresses them along with performance and code quality improvements. Restore segwit peer filtering in fetchHigherPeers so that post-activation, only witness-enabled peers are selected for sync. Without this, btcd could download blocks from a non-witness peer and fail to fully validate the chain. Restore syncCandidate demotion for peers whose last advertised block has fallen behind our height, preventing them from being repeatedly considered in future sync rounds. Add fetchEqualPeers and wire it into startSync's block-download fallback path. This restores the equalPeers behavior needed for regtest where both nodes start at height 0 and neither is strictly higher than the other. Introduce lastBlockRequested as a cursor in buildBlockRequest to skip already-scanned height ranges. The old code re-iterated from forkHeight+1 on every refill call, which is O(n) per invocation during IBD with large header chains. The cursor is reset on peer disconnect and IBD completion. Remove the duplicate nil check in buildBlockRequest (the only caller, fetchHeaderBlocks, already guards against nil). Remove the unused headerNode type left over from the old headerList-based sync. Consolidate checkHeadersList to use the new ValidHeaderHeight for a single index lookup instead of two. Add a BIP130 design note to handleHeadersMsg explaining why unsolicited headers are accepted.
TestFetchHigherPeersDemotesStalePeers verifies that fetchHigherPeers sets syncCandidate to false for peers whose last block is strictly below the given height, preventing them from being repeatedly considered in subsequent sync rounds. TestStartSyncEqualPeersFallback verifies that when no peer is strictly higher than our block height but peers at the same height exist, startSync falls back to selecting one of those equal-height peers for block download. This is critical for regtest where both nodes start at genesis.
NOTE: This PR is stacked on top of #2428 and should be reviewed/merged
after that PR lands. The diff shown here includes #2428's changes; the new
commits are the last four on this branch.
In this PR, we address a set of issues found during code review of #2428
(the headers-first IBD rework). The changes here are correctness fixes,
thread safety improvements, and performance/cleanup items that sit on top
of the new IBD pipeline. See each commit message for a detailed description
of the incremental changes.
Correctness Fixes
The new `fetchHigherPeers` helper dropped the segwit peer filtering that
the old `startSync` had. Post-segwit activation, this meant btcd could
pick a non-witness peer for sync and fail to fully validate the chain. We
restore the `IsDeploymentActive` + `IsWitnessEnabled` check inside
`fetchHigherPeers` (and the new `fetchEqualPeers`) so witness-only sync
is enforced whenever segwit is active.
The old `startSync` also had an `equalPeers` fallback for peers at the
same height, which is critical for regtest where both nodes start at
genesis (height 0) and neither is strictly higher. We add
`fetchEqualPeers` and wire it into the block-download path of
`startSync` to restore this behavior.
We also bring back `syncCandidate` demotion: peers whose last advertised
block has fallen below our height get `syncCandidate = false` so they
aren't repeatedly considered in future sync rounds. The demotion uses
`<` (not `<=`) to preserve the existing behavior where equal-height peers
remain candidates.
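
The three selection rules above can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the `peer` struct, its fields, the `eligible` helper, and the function signatures are all hypothetical stand-ins, and the order of the checks (witness filter first, then demotion) is an assumption.

```go
package main

// peer is a hypothetical, stripped-down stand-in for the sync manager's
// per-peer state. The real netsync types are not reproduced here.
type peer struct {
	lastBlock      int32 // last block height the peer advertised
	witnessEnabled bool  // peer signalled witness support
	syncCandidate  bool  // still eligible for future sync rounds
}

// eligible reports whether p may be used for sync. Once segwit is active,
// only witness-enabled peers pass the filter.
func eligible(p *peer, segwitActive bool) bool {
	if !p.syncCandidate {
		return false
	}
	return !segwitActive || p.witnessEnabled
}

// fetchHigherPeers returns eligible peers strictly above height. Eligible
// peers strictly below height are demoted; equal-height peers are left
// untouched so they remain candidates.
func fetchHigherPeers(peers []*peer, height int32, segwitActive bool) []*peer {
	var higher []*peer
	for _, p := range peers {
		if !eligible(p, segwitActive) {
			continue
		}
		switch {
		case p.lastBlock < height: // strictly below: demote
			p.syncCandidate = false
		case p.lastBlock > height:
			higher = append(higher, p)
		}
	}
	return higher
}

// fetchEqualPeers returns eligible peers at exactly height. It is the
// fallback for cases such as regtest, where every node starts at height 0.
func fetchEqualPeers(peers []*peer, height int32, segwitActive bool) []*peer {
	var equal []*peer
	for _, p := range peers {
		if eligible(p, segwitActive) && p.lastBlock == height {
			equal = append(equal, p)
		}
	}
	return equal
}
```

In this sketch a caller would try `fetchHigherPeers` first and fall back to `fetchEqualPeers` only when the first call returns nothing, which mirrors the fallback described for `startSync`.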
Thread Safety
`IsValidHeader`, `HeaderHashByHeight`, and `HeaderHeightByHash` all read
from `bestHeader` without holding `chainLock`, while
`maybeAcceptBlockHeader` modifies `bestHeader` under `chainLock.Lock()`.
We acquire `chainLock.RLock()` in each accessor. The redundant `Contains`
check in `HeaderHashByHeight` is also removed since `NodeByHeight` already
guarantees membership in the chain view.
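
The locking pattern can be illustrated with a minimal sketch. The `headerChain` type, its slice-backed `bestHeader`, and the `acceptHeader`, `Height`, and `syncHeaders` helpers are hypothetical; only the idea of pairing `Lock()` on the write path with `RLock()` in the accessors comes from the description above.

```go
package main

import "sync"

// headerChain is a hypothetical stand-in for the chain's header state.
type headerChain struct {
	chainLock  sync.RWMutex
	bestHeader []string // header hash at each height (illustrative only)
}

// acceptHeader is the write path: it extends bestHeader under the
// exclusive lock.
func (c *headerChain) acceptHeader(hash string) {
	c.chainLock.Lock()
	defer c.chainLock.Unlock()
	c.bestHeader = append(c.bestHeader, hash)
}

// HeaderHashByHeight is a read-only accessor: it takes the shared lock so
// it never observes bestHeader in the middle of an update.
func (c *headerChain) HeaderHashByHeight(height int) (string, bool) {
	c.chainLock.RLock()
	defer c.chainLock.RUnlock()
	if height < 0 || height >= len(c.bestHeader) {
		return "", false
	}
	return c.bestHeader[height], true
}

// Height returns the number of stored headers under the shared lock.
func (c *headerChain) Height() int {
	c.chainLock.RLock()
	defer c.chainLock.RUnlock()
	return len(c.bestHeader)
}

// syncHeaders appends n headers while a second goroutine polls the
// accessor, then returns the final header count.
func syncHeaders(c *headerChain, n int) int {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			c.acceptHeader("header")
		}
	}()
	go func() {
		defer wg.Done()
		for i := 0; i < n; i++ {
			c.HeaderHashByHeight(i)
		}
	}()
	wg.Wait()
	return c.Height()
}
```

Running a program like this under Go's race detector (`go run -race`) with the `RLock()` calls removed is one way to see the kind of report the fix is meant to avoid.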
Performance
`buildBlockRequest` previously iterated from `forkHeight+1` to
`bestHeaderHeight` on every refill call, i.e. O(n) per invocation. We
introduce a `lastBlockRequested` cursor that tracks the highest scanned
height, so subsequent calls skip already-requested ranges. The cursor
resets on peer disconnect and IBD completion.
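
A rough sketch of the cursor idea, under stated assumptions: the `blockFetcher` type, the `have` map, the `max` batch limit, and the use of 0 as the "no cursor" value are all hypothetical and chosen only to keep the example small.

```go
package main

// blockFetcher is a hypothetical holder for the scan cursor.
type blockFetcher struct {
	lastBlockRequested int32 // highest height already scanned; 0 means none
}

// buildBlockRequest returns up to max heights in (forkHeight,
// bestHeaderHeight] that still need to be requested. It resumes after the
// cursor instead of rescanning from forkHeight+1 on every call.
func (f *blockFetcher) buildBlockRequest(forkHeight, bestHeaderHeight int32,
	max int, have map[int32]bool) []int32 {

	start := forkHeight + 1
	if f.lastBlockRequested >= start {
		start = f.lastBlockRequested + 1
	}

	var heights []int32
	for h := start; h <= bestHeaderHeight && len(heights) < max; h++ {
		// Every visited height advances the cursor, whether or not it
		// ends up in the request.
		f.lastBlockRequested = h
		if have[h] {
			continue // block already present, nothing to request
		}
		heights = append(heights, h)
	}
	return heights
}

// resetCursor would be called on peer disconnect and on IBD completion so
// the next scan starts from the fork point again.
func (f *blockFetcher) resetCursor() {
	f.lastBlockRequested = 0
}
```

With the cursor in place, the total work across all refill calls is proportional to the number of heights scanned once, rather than to the number of calls times the length of the header chain.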
In `maybeAcceptBlockHeader`, known non-invalid side-chain headers were
falling through to re-run `CheckBlockHeaderSanity` and
`CheckBlockHeaderContext`. Since these headers passed validation when first
added, the re-validation is pure overhead. We return early for known
side-chain headers.
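
The early return can be sketched as below. This is a toy model rather than the real function: the `headerIndex` type, the string hashes, the `validate` callback, and the handling of known-invalid headers are assumptions. Only the `(false, nil)` result for known non-invalid headers is taken from the commit message.

```go
package main

import "errors"

// status is a hypothetical validation state for an indexed header.
type status int

const (
	statusValid status = iota
	statusInvalid
)

// headerIndex is a hypothetical stand-in for the block index.
type headerIndex struct {
	nodes map[string]status
}

var errKnownInvalid = errors.New("header is known to be invalid")

// maybeAcceptHeader runs validate only for headers that are not yet in the
// index. A known header that is not invalid returns (false, nil) without
// being validated again.
func (idx *headerIndex) maybeAcceptHeader(hash string,
	validate func(string) error) (bool, error) {

	if st, ok := idx.nodes[hash]; ok {
		if st == statusInvalid {
			return false, errKnownInvalid // assumed handling
		}
		return false, nil // already validated when first added
	}

	// From here on the header is guaranteed to be new.
	if err := validate(hash); err != nil {
		return false, err
	}
	idx.nodes[hash] = statusValid
	return true, nil
}
```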
`checkHeadersList` was calling `IsValidHeader` then `HeaderHeightByHash`,
each doing their own `LookupNode`. We add `ValidHeaderHeight` that
combines both into a single index lookup.
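
To make the saving concrete, here is a small sketch that counts index lookups. The `index` and `node` types, the string hashes, the `lookups` counter, and the return signatures are hypothetical. The real methods operate on the chain's block index and may differ in shape.

```go
package main

// node is a hypothetical index entry.
type node struct {
	height int32
	valid  bool
}

// index is a hypothetical block index that counts its lookups.
type index struct {
	nodes   map[string]*node
	lookups int
}

// lookupNode stands in for the index's LookupNode call.
func (i *index) lookupNode(hash string) *node {
	i.lookups++
	return i.nodes[hash]
}

// IsValidHeader performs one lookup.
func (i *index) IsValidHeader(hash string) bool {
	n := i.lookupNode(hash)
	return n != nil && n.valid
}

// HeaderHeightByHash performs another lookup for the same hash.
func (i *index) HeaderHeightByHash(hash string) (int32, bool) {
	n := i.lookupNode(hash)
	if n == nil {
		return 0, false
	}
	return n.height, true
}

// ValidHeaderHeight answers both questions with a single lookup: it
// returns the height only if the header is known and valid.
func (i *index) ValidHeaderHeight(hash string) (int32, bool) {
	n := i.lookupNode(hash)
	if n == nil || !n.valid {
		return 0, false
	}
	return n.height, true
}
```

A caller that previously wrote `IsValidHeader(h)` followed by `HeaderHeightByHash(h)` would replace both with one `ValidHeaderHeight(h)` call, halving the lookups per header in the loop.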
Cleanup
The unused `headerNode` type (left over from the old `headerList`-based
sync) is removed. The duplicate nil guard in `buildBlockRequest` is
removed since the only caller (`fetchHeaderBlocks`) already checks. A
BIP130 design note is added to `handleHeadersMsg` explaining why
unsolicited headers are now accepted (peers can proactively push headers
via `sendheaders`). The crash-recovery behavior of `flushToDB`'s
header-only skip is documented: a restart during header sync loses header
progress, which is an acceptable trade-off given the small size and fast
re-validation of headers.