Skip to content

Conversation

@LiebingYu
Copy link
Contributor

Purpose

Linked issue: #846

This PR only tolerate Paimon lake table existent if the schema and properties matches. Other lake storage should implement thier own logic to tolerate lake table existent.

Brief change log

Tests

API and Format

Documentation

@LiebingYu LiebingYu force-pushed the reuse-lake-table-new branch from c161b84 to c79c62f Compare October 21, 2025 07:13
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements tolerance for existing Paimon lake tables during creation and alteration operations if the schema and properties match. This enables idempotent table creation for eventual consistency, addressing issue #846.

  • Introduces PaimonSchemaValidation utility to validate existing Paimon tables against new table specifications
  • Updates table creation logic to tolerate existing tables when schemas match rather than failing immediately
  • Refactors alter table logic to always sync lake tables even when datalake is disabled

Reviewed Changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
fluss-test-coverage/pom.xml Updates test coverage exclusion path for relocated FlussDataTypeToPaimonDataType class
fluss-server/src/test/java/org/apache/fluss/server/lakehouse/TestingPaimonStoragePlugin.java Updates test plugin to tolerate existing tables with matching descriptors
fluss-server/src/test/java/org/apache/fluss/server/coordinator/LakeTableManagerITCase.java Updates test expectations to verify TableAlreadyExistException thrown by Fluss instead of lake-specific exception
fluss-server/src/main/java/org/apache/fluss/server/coordinator/MetadataManager.java Refactors alter table logic to always sync with lake catalog when available, removing premature lake table creation logic
fluss-server/src/main/java/org/apache/fluss/server/coordinator/CoordinatorService.java Simplifies error handling to delegate validation to lake catalog implementation
fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/PaimonLakeCatalogTest.java Renames test method and uses Collections.singletonList for better semantics
fluss-lake/fluss-lake-paimon/src/test/java/org/apache/fluss/lake/paimon/LakeEnabledTableCreateITCase.java Adds comprehensive test coverage for table creation with existing lake tables and alter operations
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/utils/PaimonSchemaValidation.java Introduces new validation utility to compare existing and new Paimon table schemas
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/utils/PaimonConversions.java Makes FLUSS_CONF_PREFIX public and adds PATH to unsettable options list
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/utils/FlussDataTypeToPaimonDataType.java Relocates class to utils package for better organization
fluss-lake/fluss-lake-paimon/src/main/java/org/apache/fluss/lake/paimon/PaimonLakeCatalog.java Implements tolerance for existing tables by validating schema compatibility instead of failing
fluss-flink/fluss-flink-common/src/test/java/org/apache/fluss/flink/catalog/FlinkCatalogTest.java Updates test to verify successful table creation with matching existing lake table

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiebingYu Thanks for the pr. The another problem will be:

  1. user create table fluss_t1 with lake enabled, a lake table lake_t1 is created
  2. write some data to fluss_t1, and data has been tiered to lake_t1
  3. then user drop table fluss_t1, note lake_t1 won't be drop
  4. user create fluss_t1, create successfully, with lake_t1 which is for prevous fluss_t1 in step 1

But I do understand the problem you want to solve. I have one idea about that to pass a flag to mark whether the create lake table context is creating fluss table or just alter fluss table.

In method CoordinatorService#createTable, set is_creating_fluss_table(may be some better name) to true, then LakeCatalog#createTable will check schema match and no records just as jark said in the issue. It can solve the problem I describe.

In method CoordinatorService#alterTable, if it to enable lake table for an existing table, set is_creating_fluss_table = false, then LakeCatalog#createTable just need to check schema match or not. It'll solve the problem that user first disable it and then enable it again.
WDTY?

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LiebingYu Thanks for the pr. Only left minor comments. PTAL

@LiebingYu LiebingYu force-pushed the reuse-lake-table-new branch from 83449d3 to 615f3f8 Compare November 12, 2025 12:42
Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@luoyuxia luoyuxia merged commit 32003fc into apache:main Nov 12, 2025
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants