Fix failure when reading deep or shallow cloned Delta Lake tables #27098
Reviewer's Guide
This PR adjusts the checkpoint metadata validation to accept version 0 on Delta Lake table clones and supplements it with comprehensive unit tests covering valid, zero, and negative version scenarios as well as JSON serialization.
Class diagram for updated CheckpointMetadataEntry validation
classDiagram
class CheckpointMetadataEntry {
+long version
+Optional<Map<String, String>> tags
+CheckpointMetadataEntry(long version, Optional<Map<String, String>> tags)
}
CheckpointMetadataEntry : version >= 0 validation
CheckpointMetadataEntry : tags are copied as ImmutableMap
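The validation summarized in the diagram can be sketched roughly as follows. This is a minimal illustration, not the actual Trino class: the name `CheckpointMetadataEntrySketch`, the exception message, and the use of the JDK's `Map.copyOf` in place of Guava's `ImmutableMap` are all assumptions made to keep the example self-contained.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

// Hypothetical stand-in for CheckpointMetadataEntry; not the actual Trino class.
record CheckpointMetadataEntrySketch(long version, Optional<Map<String, String>> tags)
{
    CheckpointMetadataEntrySketch
    {
        // Version 0 is accepted, since cloned tables may start there; only negatives are rejected.
        // The message text is an assumption, not the one used in Trino.
        if (version < 0) {
            throw new IllegalArgumentException("version is negative: " + version);
        }
        Objects.requireNonNull(tags, "tags is null");
        // Defensive immutable copy. The diagram mentions ImmutableMap;
        // Map.copyOf is a JDK-only substitute used here for illustration.
        tags = tags.map(Map::copyOf);
    }
}
```

Under these assumptions, `new CheckpointMetadataEntrySketch(0, Optional.empty())` constructs normally, while a version of `-1` is rejected with an `IllegalArgumentException`.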
Class diagram for new TestCheckpointMetadataEntry unit tests
classDiagram
class TestCheckpointMetadataEntry {
+testValidVersion()
+testZeroVersion()
+testNegativeVersionThrows()
+testJsonSerialization()
}
TestCheckpointMetadataEntry --> CheckpointMetadataEntry
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/transactionlog/TestCheckpointMetadataEntry.java:55-64` </location>
<code_context>
+ }
+
+ @Test
+ void testInvalidCheckpointMetadataEntry()
+ {
+ @Language("JSON")
+ String jsonWithNegativeVersion = "{\"version\":-1,\"tags\":{\"sidecarNumActions\":\"1\",\"sidecarSizeInBytes\":\"20965\",\"numOfAddFiles\":\"1\",\"sidecarFileSchema\":\"\"}}";
+ assertThatThrownBy(() -> codec.fromJson(jsonWithNegativeVersion))
+ .isInstanceOf(IllegalArgumentException.class)
+ .hasMessageContaining("Invalid JSON string for");
+
+ @Language("JSON")
+ String jsonWithoutTags = "{\"version\":-1}";
+ assertThatThrownBy(() -> codec.fromJson(jsonWithoutTags))
+ .isInstanceOf(IllegalArgumentException.class)
</code_context>
<issue_to_address>
**suggestion (testing):** Missing test for valid CheckpointMetadataEntry with absent 'tags' field.
Please add a test for deserializing a valid CheckpointMetadataEntry with a non-negative version and no 'tags' field to confirm correct handling of this case.
</issue_to_address>
### Comment 2
<location> `plugin/trino-delta-lake/src/test/java/io/trino/plugin/deltalake/transactionlog/TestCheckpointMetadataEntry.java:70-89` </location>
<code_context>
+ }
+
+ @Test
+ void testCheckpointMetadataEntryToJson()
+ {
+ assertThat(codec.toJson(new CheckpointMetadataEntry(
+ 100,
+ Optional.of(ImmutableMap.of(
+ "sidecarNumActions", "1",
+ "sidecarSizeInBytes", "20965",
+ "numOfAddFiles", "1",
+ "sidecarFileSchema", "")))))
+ .isEqualTo("{\n" +
+ " \"version\" : 100,\n" +
+ " \"tags\" : {\n" +
+ " \"sidecarNumActions\" : \"1\",\n" +
+ " \"sidecarSizeInBytes\" : \"20965\",\n" +
+ " \"numOfAddFiles\" : \"1\",\n" +
+ " \"sidecarFileSchema\" : \"\"\n" +
+ " }\n" +
+ "}");
+ }
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding a test for serialization with version 0 and absent 'tags'.
Please add a test case for serializing a CheckpointMetadataEntry with version 0 and no 'tags', and verify the resulting JSON structure.
```suggestion
    @Test
    void testCheckpointMetadataEntryToJson()
    {
        assertThat(codec.toJson(new CheckpointMetadataEntry(
                100,
                Optional.of(ImmutableMap.of(
                        "sidecarNumActions", "1",
                        "sidecarSizeInBytes", "20965",
                        "numOfAddFiles", "1",
                        "sidecarFileSchema", "")))))
                .isEqualTo("{\n" +
                        " \"version\" : 100,\n" +
                        " \"tags\" : {\n" +
                        " \"sidecarNumActions\" : \"1\",\n" +
                        " \"sidecarSizeInBytes\" : \"20965\",\n" +
                        " \"numOfAddFiles\" : \"1\",\n" +
                        " \"sidecarFileSchema\" : \"\"\n" +
                        " }\n" +
                        "}");
    }

    @Test
    void testCheckpointMetadataEntryToJsonWithVersionZeroAndNoTags()
    {
        assertThat(codec.toJson(new CheckpointMetadataEntry(
                0,
                Optional.empty())))
                .isEqualTo("{\n" +
                        " \"version\" : 0\n" +
                        "}");
    }
```
</issue_to_address>
https://trino.io/development/process#pull-request-and-commit-guidelines
|
14dd843 to d1593a9
Fixed |
199baac to 3be6244
reminder |
chenjian2664
left a comment
Looks good
3be6244 to 78997c3
This pull request has gone a while without any activity. Ask for help on #core-dev on Trino slack.
Hi @chenjian2664, I got the alert above. Any update on this? Thanks!
@yangshangqing95 Hi, I think the PR looks good overall. My only concern is test coverage: we should still cover both the deep and shallow clone cases, since they are different scenarios, and adding dedicated tests won't add much overhead.
sure, I'll update |
32be4a9 to 9027663
Hi @chenjian2664, shallow cloned table test case added |
9027663 to 4e3f2a9
/test-with-secrets sha=4e3f2a99fdbbe8882aa654626b05c2569e3b5b24
The CI workflow run with tests that require additional secrets has been started: https://github.com/trinodb/trino/actions/runs/19720679824
Description
For cloned Delta Lake tables (either deep or shallow clones), the checkpoint version may start at `0`. The previous validation in the `CheckpointMetadataEntry` constructor required the version to be positive, which caused the following exception:
Root cause is:
Additional context and related issues
Fixes #27097
Release notes
( ) This is not user-visible or is docs only, and no release notes are required.
( ) Release notes are required. Please propose a release note for me.
(x) Release notes are required, with the following suggested text: