Skip to content

SQLAlchemy: JOIN modifiers (ALL/ANY/ASOF, GLOBAL) #635

@rafalmisiarek

Description

@rafalmisiarek

Is your feature request related to a problem? Please describe.

ClickHouse supports strictness and distribution modifiers on JOIN clauses that are not available in the SQLAlchemy dialect. The dialect's generates standard SQL joins (INNER JOIN, LEFT OUTER JOIN, etc.) with no way to add these ClickHouse-specific modifiers.

Strictness modifiers control how multiple matches are handled:

  • ALL - return all matching rows
  • ANY - return only the first match
  • ASOF - time-series join, finds closest match

Distribution modifier:

  • GLOBAL - for distributed queries, broadcasts the right table to all nodes

These are important for controlling query behavior and performance, especially in distributed ClickHouse setups.

Describe the solution you'd like

Support for strictness and distribution modifiers in the SQLAlchemy dialect's join(), for example:

stmt = select(users.c.name, orders.c.product).join(
    orders, users.c.user_id == orders.c.user_id,
    isouter=True, strictness="ALL", distribution="GLOBAL"
)

Generating:

SELECT ... FROM users GLOBAL ALL LEFT JOIN orders ON ...

Describe alternatives you've considered

Using client.raw_query() with raw SQL instead of the SQLAlchemy dialect.

Additional context

Reproduction showing the difference between ALL and ANY (v0.10.0):

import clickhouse_connect

client = clickhouse_connect.get_client(host="localhost", port=8123, username="default", password="")

client.command("DROP TABLE IF EXISTS test_orders")
client.command("DROP TABLE IF EXISTS test_users")
client.command("CREATE TABLE test_users (user_id UInt32, name String) ENGINE = MergeTree() ORDER BY user_id")
client.command("CREATE TABLE test_orders (user_id UInt32, product String) ENGINE = MergeTree() ORDER BY user_id")
client.command("INSERT INTO test_users VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Charlie')")
client.command("INSERT INTO test_orders VALUES (1, 'Phone'), (1, 'Laptop'), (2, 'Tablet')")

# ALL LEFT JOIN - returns all matching rows (Alice appears twice)
result = client.query("SELECT u.name, o.product FROM test_users u ALL LEFT JOIN test_orders o ON u.user_id = o.user_id")
print(f"ALL:  {result.result_rows}")
# Output: ALL:  [('Alice', 'Phone'), ('Alice', 'Laptop'), ('Bob', 'Tablet'), ('Charlie', '')]

# ANY LEFT JOIN - returns first match only (Alice appears once)
result = client.query("SELECT u.name, o.product FROM test_users u ANY LEFT JOIN test_orders o ON u.user_id = o.user_id")
print(f"ANY:  {result.result_rows}")
# Output: ANY:  [('Alice', 'Phone'), ('Bob', 'Tablet'), ('Charlie', '')]

client.command("DROP TABLE test_orders")
client.command("DROP TABLE test_users")

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions