-
Notifications
You must be signed in to change notification settings - Fork 1k
PHOENIX-7524: Fix IndexOutOfBoundsException in queries with OFFSET when rows are exhausted #2307
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
PHOENIX-7524: Fix IndexOutOfBoundsException in queries with OFFSET when rows are exhausted #2307
Conversation
…en rows are exhausted
| * @param region The region being scanned | ||
| * @return A valid row key derived from scan or region boundaries | ||
| */ | ||
| public static byte[] deriveRowKeyFromScanOrRegionBoundaries(Scan scan, Region region) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Naming convention for this method is similar to getScanStartRowKeyFromScanOrRegionBoundaries method in the ServerUtil.java
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use getScanStartRowKeyFromScanOrRegionBoundaries() instead of this new method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@virajjasani,
No, we cannot use getScanStartRowKeyFromScanOrRegionBoundaries() because the two methods serve different purposes -
getScanStartRowKeyFromScanOrRegionBoundaries() - Returns only the start row key of the scan
while we need getLargestPossibleRowKeyInRange() which analyzes the entire range. Which is implemented in deriveRowKeyFromScanOrRegionBoundaries(). It basically derives a representative row key within the scan range for the OFFSET queries.
palashc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1, thank you!
| } else if (scan.includeStopRow()) { | ||
| rowKey = endKey; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this generates correct rowkey: if scan start rowkey is not inclusive, we need to find the shortest possible next rowkey?
I think we should have this logic elsewhere, @TheNamesRai could you please check once?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@virajjasani ,
The existing code uses the same pattern as i used - usecase1, usecase2, ..
But actually you are right, we can use the nextkey before jumping to the endkey here. Something like this should be better? What do you think?
if (rowKey == null) {
if (scan.includeStartRow()) {
rowKey = startKey;
} else {
byte[] nextStartKey = ByteUtil.nextKey(startKey);
if (nextStartKey != null) {
rowKey = nextStartKey;
} else if (scan.includeStopRow()) {
rowKey = endKey;
} else {
rowKey = HConstants.EMPTY_END_ROW;
}
}
}
PHOENIX-7524: Fix
IndexOutOfBoundsExceptionin queries with OFFSET when rows are exhaustedDesign doc - link
When executing queries with OFFSET where the available rows are exhausted:
The server-side scanner would encounter an empty tuple after exhausting all available rows and attempt to call
getOffsetKvWithLastScannedRowKey(), which internally calls tuple.getKey(). This causedIndexOutOfBoundsExceptioninMultiKeyValueTuple.getKey()when accessing an empty tuple.Solution - Added a check in NonAggregateRegionScannerFactory.java to detect empty tuples before calling
getOffsetKvWithLastScannedRowKey()Modified Files:
Test Files: