CASSANALYTICS-147: BufferingInputStream fails to read last chunk #193
lukasz-antoniak wants to merge 3 commits into apache:trunk
Conversation
int bytesToRead = chunkSize * numChunks;
- long skipAhead = size - bytesToRead + 1;
+ long skipAhead = size - bytesToRead;
I am not sure how the change in BufferingInputStream affects skip() used during BIG index reading. All integration tests pass though, and I think the unit test here is just a simulation.
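For illustration, a standalone arithmetic sketch (plain Java, not the project code; the sizes are made-up example values) of why dropping the + 1 matters when the intent is to read exactly the last bytesToRead bytes:

// Standalone illustration of the skip arithmetic, not taken from BufferingInputStream.
public class SkipAheadExample
{
    public static void main(String[] args)
    {
        long size = 100;                          // total file size in bytes (example value)
        int chunkSize = 20;
        int numChunks = 2;
        int bytesToRead = chunkSize * numChunks;  // 40 bytes expected at the end of the file

        long skipOld = size - bytesToRead + 1;    // 61: only 39 bytes remain (positions 61..99)
        long skipNew = size - bytesToRead;        // 60: exactly 40 bytes remain (positions 60..99)

        System.out.println("readable after old skip: " + (size - skipOld)); // 39
        System.out.println("readable after new skip: " + (size - skipNew)); // 40
    }
}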
// Deliver data in chunks until request is fulfilled
- while (position < actualEnd)
+ while (position <= actualEnd) // range boundaries are inclusive
According to the JavaDoc below, ranges should be considered inclusive.
/**
* Asynchronously request bytes for the SSTable file component in the range start-end, and pass on to the StreamConsumer when available.
* The start-end range is inclusive.
*
* @param start the start of the bytes range
* @param end the end of the bytes range
* @param consumer the StreamConsumer to return the bytes to when the request is complete
*/
void request(long start, long end, StreamConsumer consumer);
Seems like with this fix, end is now at most …
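To make the inclusive semantics concrete, here is a standalone sketch of chunked delivery over an inclusive range (not the project code; chunk contents are omitted):

// Standalone sketch: deliver the inclusive range [start, actualEnd] in chunks.
import java.util.ArrayList;
import java.util.List;

public class InclusiveRangeDelivery
{
    static List<byte[]> deliver(long start, long actualEnd, int chunkSize)
    {
        List<byte[]> chunks = new ArrayList<>();
        long position = start;
        // With inclusive bounds the last valid offset is actualEnd itself,
        // so the loop must keep running while position <= actualEnd.
        while (position <= actualEnd)
        {
            int len = (int) Math.min(chunkSize, actualEnd - position + 1);
            chunks.add(new byte[len]);
            position += len;
        }
        return chunks;
    }

    public static void main(String[] args)
    {
        // Range 0..99 inclusive is 100 bytes; with chunkSize 40 we expect chunk sizes 40, 40, 20.
        deliver(0, 99, 40).forEach(chunk -> System.out.println(chunk.length));
    }
}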
}

@Test
public void testUnalignedEndReading() throws IOException
Minor: We might want to assert on returnedBuffers.size() == 2 to catch regressions where extra or missing requests are issued
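Something along these lines, assuming the test collects the delivered buffers in the existing returnedBuffers list (sketch only):

// Hypothetical assertion for testUnalignedEndReading(); assertEquals from org.junit.Assert.
assertEquals("expected exactly two requests for the unaligned tail read", 2, returnedBuffers.size());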
this.rangeStart = position;
// adjusting bytes read and written so that isFinished() returns correct output and we know
// when we reached the end of the stream
this.bytesRead = position == 0 ? 0 : (position + 1);
Is this calculation still correct after this fix? For example, if position is 50, does it mean BufferingInputStream should start reading from offset 50? If so, don't we need to set both bytesRead and bytesWritten to 50 in that case? This code will set both of them to 51. Is this expected?
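If reads really should resume from offset position, the counters would presumably look like this (a sketch, not verified against the fixture's intended semantics):

// Hypothetical alternative: treat position as the number of bytes already consumed,
// so both counters equal position when resuming at that offset (e.g. 50, not 51).
this.rangeStart = position;
this.bytesRead = position;
this.bytesWritten = position; // keeps isFinished() consistent with bytesRead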
// Configure source to provide sequential data
testSource.setRequestHandler(call -> {
    int dataSize = (int) (call.end - call.start);
If both start and end are inclusive, don't we need to change this to dataSize = end - start + 1? Perhaps we can catch this off-by-one byte by adding content/length verification below.
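With inclusive bounds that would be (sketch only; call and testSource are the test's own names, the verification line is just an example):

// Number of bytes covered by an inclusive start/end range is end - start + 1.
int dataSize = (int) (call.end - call.start + 1);
// A length check on the delivered data would catch the off-by-one, e.g.:
// assertEquals(call.end - call.start + 1, delivered.length); // 'delivered' is a hypothetical name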
@@ -157,7 +157,7 @@
// Backward seek path has a bug: requests buffer.remaining() + 1 bytes
The existing bug requests 1 byte extra. Looks like after this change, CdcRandomAccessReader's rebuffer requests 2 bytes extra in the backward seek case. Assume offset is 0 and buffer remaining is 50; then we will request the range 0 to 51, which is 2 bytes extra. Don't we need to fix this code in CdcRandomAccessReader?
// attempting to read bytes previously read
// in practice we read the Commit Logs sequentially
// but we still need to respect random access reader API, it will just require blocking
int requestLen = buffer.remaining() + 1;
long end = offset + requestLen;
BlockingStreamConsumer streamConsumer = new BlockingStreamConsumer();
source.request(offset, end, streamConsumer);
Looks like this code was always off by 2; this PR didn't change this.
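If request() is indeed inclusive on both ends, a possible shape of the fix in CdcRandomAccessReader could be the following (a sketch, assuming buffer.remaining() is the exact number of bytes wanted):

// Sketch: to read exactly buffer.remaining() bytes starting at offset,
// the last requested offset with inclusive bounds is offset + remaining - 1.
int requestLen = buffer.remaining();
long end = offset + requestLen - 1;
BlockingStreamConsumer streamConsumer = new BlockingStreamConsumer();
source.request(offset, end, streamConsumer);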
Fixes CASSANALYTICS-147.