db_stress: constrain batched prefix scans by xingbowang · Pull Request #14512 · facebook/rocksdb

xingbowang · 2026-03-26T02:34:36Z

Summary

BatchedOpsStressTest::TestPrefixScan opens 10 iterators with different prefixes and compares their results in lockstep. About half of the iterators randomly receive an iterate_upper_bound, while the other half had no range constraint at all. Without prefix_same_as_start, unconstrained iterators could return keys beyond their seek prefix, producing more entries than bounded iterators and triggering assertion failures in the verification loop.

Fix

Set prefix_same_as_start = true on all 10 ReadOptions so every iterator stops at the seek prefix boundary, making the lockstep comparison valid regardless of whether an upper bound is also set.

Batched snapshot mode does not verify the full DB state against the expected-state oracle. Instead, it relies on local invariants over the 10-key family written atomically by BatchedOpsStressTest::TestPut(). For a base key K and value body V, TestPut() writes: "0" + K -> V + "0" ... "9" + K -> V + "9" in one write batch. TestPrefixScan() then picks one base key, derives 10 seek prefixes 0+P .. 9+P where P is the first FLAGS_prefix_size - 1 bytes of K, and opens 10 iterators on the same snapshot. The verifier advances those iterators in lockstep and checks that: * each iterator stays in its own seek prefix; * no iterator exhausts its prefix before the others; * at position n, all values match after stripping the trailing digit; * the trailing digit still matches the corresponding prefix stream; and * value() and columns() remain mutually consistent for wide-column / entity reads. That lockstep oracle assumes every iterator is explicitly bounded to the seek prefix before we assert same-length, same-index progress across all 10 streams. Before this change, the batched test only installed an iterate_upper_bound to the next prefix half of the time. In the other half, Seek(prefix) could legitimately continue into the next prefix, while the verifier still assumed the iterator would stop at the prefix boundary. That made the verifier depend on behavior it had not actually requested from RocksDB. Set ReadOptions::prefix_same_as_start = true for every iterator used by this batched prefix verifier. This matches the semantics the verifier needs: RocksDB records the seek prefix and invalidates the iterator once Next() / Prev() leave that prefix, and the memtable path also switches to dynamic prefix iteration under the same condition. Relevant option interactions for this verifier: * test_batches_snapshots requires prefix_size > 0. * db_stress installs NewFixedPrefixTransform(prefix_size) whenever prefix_size >= 0, so the semantic precondition here is the configured prefix_extractor, not specifically memtablerep=prefix_hash. * prefixpercent only controls how often this verifier runs. * auto_refresh_iterator_with_snapshot is irrelevant here because the batched verifier pins one explicit snapshot for all 10 iterators. * When a next-prefix upper bound is installed, the test still exercises iterate_upper_bound and SQFC range-query filtering. * allow_unprepared_value is still handled via PrepareValue(). * use_put_entity_one_in, use_attribute_group, use_timed_put_one_in, and use_merge only change how the values/entities are encoded; the verifier still checks trailing-digit agreement and value()/columns() consistency. * memtablerep=prefix_hash, memtable_prefix_bloom_size_ratio, memtable_whole_key_filtering, and enable_memtable_insert_with_hint_prefix_extractor affect prefix-aware filtering/performance paths but do not change the verifier's lockstep invariant. * user_timestamp_size is out of scope because batched snapshot mode does not support timestamps. This intentionally makes the batched prefix-scan oracle validate explicitly prefix-bounded iteration every time, while the non-batched stress path continues to provide separate randomized coverage of iterate_upper_bound-only behavior.

github-actions · 2026-03-26T02:36:40Z

⚠️ clang-tidy: 2 warning(s) on changed lines

Completed in 1461.9s.

Summary by check

Check	Count
`performance-unnecessary-copy-initialization`	2
Total	2

Details

db/blob/db_blob_basic_test.cc (2 warning(s))

db/blob/db_blob_basic_test.cc:2612:17: warning: local copy 'r1_end' of the variable 'mid' is never modified; consider avoiding the copy [performance-unnecessary-copy-initialization]
db/blob/db_blob_basic_test.cc:2613:17: warning: local copy 'r2_start' of the variable 'mid' is never modified; consider avoiding the copy [performance-unnecessary-copy-initialization]

github-actions · 2026-03-26T02:52:13Z

✅ Claude Code Review

Auto-triggered after CI passed — reviewing commit 5514e9d

I now have a thorough understanding of the change. Let me write the review.

Summary

Correct, minimal fix for a real assertion failure in BatchedOpsStressTest::TestPrefixScan. The lockstep comparison loop assumes all 10 iterators produce the same number of entries per prefix, but iterators without iterate_upper_bound could scan past their prefix boundary, violating this invariant. Setting prefix_same_as_start = true on all iterators constrains them properly.

Issues Found

No critical or significant issues.

🟢 Nitpick: The while-loop at line 606 already manually checks iters[0]->key().starts_with(prefix_slices[0]), which is redundant when prefix_same_as_start is now always true (the iterator itself will become invalid at the prefix boundary). This is harmless — the manual check is now a belt-and-suspenders defense — but worth noting that the outer while-loop condition could be simplified to just iters[0]->Valid(). However, keeping the explicit prefix check is arguably more defensive and readable, so no change is necessary.

🟢 Nitpick: The comment (lines 588-590 in the post-change file) is good and explains the "why." No issues there.

Cross-Component Analysis

Context	Assessment
Prefix seek / total order seek	`prefix_same_as_start` requires a prefix extractor. The function is gated by `FLAGS_prefix_size > 0` (line 565) and `BatchedOpsStressTest` requires prefix mode, so the precondition is satisfied.
`iterate_upper_bound` interaction	When both `prefix_same_as_start` and `iterate_upper_bound` are set, the iterator stops at whichever boundary comes first. Since `iterate_upper_bound` is set to "next prefix" via `GetNextPrefix`, and `prefix_same_as_start` also stops at the prefix boundary, these are semantically equivalent — setting both is safe and consistent.
Comparison with `no_batched_ops_stress.cc`	The sibling implementation (line 1628-1629) uses `prefix_same_as_start` randomly and independently per single iterator. The batched version correctly makes it unconditional because it needs all 10 iterators to agree. This is the right design choice.
Performance	This is stress test code, not a hot path. No performance concerns.

Positive Observations

The fix is surgical: one line of functional change, well-targeted at the root cause.
The comment clearly explains why the setting is needed (lockstep comparison), not just what it does.
The approach is consistent with how no_batched_ops_stress.cc already uses the same option.

ℹ️ About this response

Generated by Claude Code.
Review methodology: claude_md/code_review.md

Limitations:

Claude may miss context from files not in the diff
Large PRs may be truncated
Always apply human judgment to AI suggestions

Commands:

/claude-review [context] — Request a code review
/claude-query <question> — Ask about the PR or codebase

meta-codesync · 2026-03-26T14:02:57Z

@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D98302018.

meta-codesync · 2026-03-28T22:57:55Z

@xingbowang has imported this pull request. If you are a Meta employee, you can view this in D98302018.

meta-cla bot added the CLA Signed label Mar 26, 2026

xingbowang added 5 commits March 27, 2026 17:44

db_crashtest: disable BlobDB in best-efforts recovery

79e5b6c

db_crashtest: preserve expected-state dirs across restarts

e33d0f5

Skip invalid cross-prefix iterator stress checks

5f1605c

Add trie UDI iterator regression coverage

ed37508

Fix MSVC C4244 warning: cast char arithmetic to char in trie test

f46dc8c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

db_stress: constrain batched prefix scans#14512

db_stress: constrain batched prefix scans#14512
xingbowang wants to merge 6 commits intofacebook:mainfrom
xingbowang:2026_03_25_stress_prefix_fix2

xingbowang commented Mar 26, 2026

Uh oh!

github-actions bot commented Mar 26, 2026 •

edited

Loading

Uh oh!

github-actions bot commented Mar 26, 2026

Uh oh!

meta-codesync bot commented Mar 26, 2026

Uh oh!

meta-codesync bot commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

xingbowang commented Mar 26, 2026

Summary

Fix

Uh oh!

github-actions bot commented Mar 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ clang-tidy: 2 warning(s) on changed lines

Summary by check

Details

Uh oh!

github-actions bot commented Mar 26, 2026

✅ Claude Code Review

Summary

Issues Found

Cross-Component Analysis

Positive Observations

Uh oh!

meta-codesync bot commented Mar 26, 2026

Uh oh!

meta-codesync bot commented Mar 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions bot commented Mar 26, 2026 •

edited

Loading