-
Notifications
You must be signed in to change notification settings - Fork 121
SQLAlchemy: JOIN modifiers (ALL/ANY/ASOF, GLOBAL) #635
Description
Is your feature request related to a problem? Please describe.
ClickHouse supports strictness and distribution modifiers on JOIN clauses that are not available in the SQLAlchemy dialect. The dialect's generates standard SQL joins (INNER JOIN, LEFT OUTER JOIN, etc.) with no way to add these ClickHouse-specific modifiers.
Strictness modifiers control how multiple matches are handled:
ALL- return all matching rowsANY- return only the first matchASOF- time-series join, finds closest match
Distribution modifier:
GLOBAL- for distributed queries, broadcasts the right table to all nodes
These are important for controlling query behavior and performance, especially in distributed ClickHouse setups.
Describe the solution you'd like
Support for strictness and distribution modifiers in the SQLAlchemy dialect's join(), for example:
stmt = select(users.c.name, orders.c.product).join(
orders, users.c.user_id == orders.c.user_id,
isouter=True, strictness="ALL", distribution="GLOBAL"
)Generating:
SELECT ... FROM users GLOBAL ALL LEFT JOIN orders ON ...Describe alternatives you've considered
Using client.raw_query() with raw SQL instead of the SQLAlchemy dialect.
Additional context
Reproduction showing the difference between ALL and ANY (v0.10.0):
import clickhouse_connect
client = clickhouse_connect.get_client(host="localhost", port=8123, username="default", password="")
client.command("DROP TABLE IF EXISTS test_orders")
client.command("DROP TABLE IF EXISTS test_users")
client.command("CREATE TABLE test_users (user_id UInt32, name String) ENGINE = MergeTree() ORDER BY user_id")
client.command("CREATE TABLE test_orders (user_id UInt32, product String) ENGINE = MergeTree() ORDER BY user_id")
client.command("INSERT INTO test_users VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie')")
client.command("INSERT INTO test_orders VALUES (1, 'Phone'), (1, 'Laptop'), (2, 'Tablet')")
# ALL LEFT JOIN - returns all matching rows (Alice appears twice)
result = client.query("SELECT u.name, o.product FROM test_users u ALL LEFT JOIN test_orders o ON u.user_id = o.user_id")
print(f"ALL: {result.result_rows}")
# Output: ALL: [('Alice', 'Phone'), ('Alice', 'Laptop'), ('Bob', 'Tablet'), ('Charlie', '')]
# ANY LEFT JOIN - returns first match only (Alice appears once)
result = client.query("SELECT u.name, o.product FROM test_users u ANY LEFT JOIN test_orders o ON u.user_id = o.user_id")
print(f"ANY: {result.result_rows}")
# Output: ANY: [('Alice', 'Phone'), ('Bob', 'Tablet'), ('Charlie', '')]
client.command("DROP TABLE test_orders")
client.command("DROP TABLE test_users")