-
Notifications
You must be signed in to change notification settings - Fork 116
Make Lucene partitioner thread-safe #3699
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make Lucene partitioner thread-safe #3699
Conversation
This changes the concurrentUpdate to run with a single partition, and adds usages of AsyncLock to ensure that updates to the partition metadata work correctly. Before calling the issue complete an additional test should be added that does concurrent inserts, and one for concurrent deletes.
...cord-layer-core/src/main/java/com/apple/foundationdb/record/RecordCoreInternalException.java
Outdated
Show resolved
Hide resolved
...d-layer-lucene/src/main/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintainer.java
Outdated
Show resolved
Hide resolved
...d-layer-lucene/src/main/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintainer.java
Outdated
Show resolved
Hide resolved
...d-layer-lucene/src/main/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintainer.java
Outdated
Show resolved
Hide resolved
...d-layer-lucene/src/main/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintainer.java
Outdated
Show resolved
Hide resolved
...er-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintenanceTest.java
Show resolved
Hide resolved
...er-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintenanceTest.java
Outdated
Show resolved
Hide resolved
...er-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintenanceTest.java
Outdated
Show resolved
Hide resolved
...ayer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexTestValidator.java
Show resolved
Hide resolved
...ayer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexTestValidator.java
Show resolved
Hide resolved
…ord/RecordCoreInternalException.java Co-authored-by: Scott Dugas <scott.dugas@gmail.com>
Code restructuring to makemore readable Verify that partition is empty before removal Added tests
...er-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintenanceTest.java
Outdated
Show resolved
Hide resolved
...er-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintenanceTest.java
Outdated
Show resolved
Hide resolved
fdb-record-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexTest.java
Outdated
Show resolved
Hide resolved
...d-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LucenePartitionerTest.java
Show resolved
Hide resolved
...d-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LucenePartitionerTest.java
Show resolved
Hide resolved
...d-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LucenePartitionerTest.java
Show resolved
Hide resolved
...d-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LucenePartitionerTest.java
Outdated
Show resolved
Hide resolved
...d-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LucenePartitionerTest.java
Show resolved
Hide resolved
| assertThat(partitionCounts, Matchers.contains(5, 3, 4))); | ||
| } | ||
|
|
||
| static Stream<Arguments> removeEmptyPartitions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the new assertions there are good, but that is different from the test I was describing.
concurrentDelete only does one merge, after deleting all the records, and I was suggesting a loop:
while (records.size() > 0) {
delete(random.nextInt(records.size()));
commit
merge
validate
}
This will give better coverage of the relationship between merge and various partial states, as opposed to all partitions being gone.
| } else { | ||
| timerSnapshot = null; | ||
| } | ||
| final StoreTimerSnapshot timerSnapshot = getStoreTimerSnapshot(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should mark this teamscale warning as tolerated.
The line at which we call getStoreTimerSnapshot is important, and calling it later would not give us timing information.
added test to randomly remove records and assert empty partitions.
This changes the concurrentUpdate to run with a single partition, and adds usages of AsyncLock to ensure that updates to the partition metadata work correctly. Before calling the issue complete an additional test should be added that does concurrent inserts, and one for concurrent deletes.
…ord/RecordCoreInternalException.java Co-authored-by: Scott Dugas <scott.dugas@gmail.com>
Code restructuring to makemore readable Verify that partition is empty before removal Added tests
added test to randomly remove records and assert empty partitions.
…read-safe-partitioner # Conflicts: # fdb-record-layer-lucene/src/test/java/com/apple/foundationdb/record/lucene/LuceneIndexMaintenanceTest.java
📊 Metrics Diff Analysis ReportSummary
ℹ️ About this analysisThis automated analysis compares query planner metrics between the base branch and this PR. It categorizes changes into:
The last category in particular may indicate planner regressions that should be investigated. |
| @Tag(Tags.RequiresFDB) | ||
| public class LuceneRepartitionPlannerTest { | ||
|
|
||
| private static Stream<Arguments> luceneRepartitionPlannerTest() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm surprised this can be private and junit can handle it, but it seems like it can.
Lucene partitioner is not thread safe when records are updated concurrently.
The partition
countand boundaries are not thread safe and can skew in multi threaded updates.The changes include protecting the partition metadata changes with a
keyspacelock.The
LuceneIndexMaintenanceTest.concurrent*tests have been extended to run in partitioned setup and are now passing.In addition, a bug where empty partitions were not removed was uncovered, and a fix was introduced, with accompanying tests. Empty partitions (first, middle and last) are removed with their index data, ensuring at least one partition is always left behind.
Resolves #2990