Conversation
There was a problem hiding this comment.
4 issues found across 16 files
Prompt for AI agents (all issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cli/nao_core/templates/defaults/databases/indexes.md.j2">
<violation number="1" location="cli/nao_core/templates/defaults/databases/indexes.md.j2:16">
P2: **Duplicate database query:** `db.indexes()` is called twice — once in the `{% if %}` guard and once to render the content. Since the ClickHouse implementation executes `SHOW CREATE TABLE` via `raw_sql` on each call (no caching), this doubles the number of queries per table during sync. Use Jinja2's `{% set %}` to call it once and reuse the result.</violation>
</file>
<file name="cli/nao_core/config/databases/clickhouse.py">
<violation number="1" location="cli/nao_core/config/databases/clickhouse.py:228">
P1: Missing `GROUP BY` clause when `AggregateFunction` columns are present. The `-Merge` combinators (e.g. `uniqMerge`, `sumMerge`) are aggregate functions. When mixed with non-aggregated columns in SELECT, ClickHouse requires a `GROUP BY` on all non-aggregated columns; otherwise the query will fail. Consider detecting when aggregate expressions are present and adding `GROUP BY` over the plain columns.</violation>
</file>
<file name=".env.example">
<violation number="1" location=".env.example:53">
P2: Port/secure mismatch: `CLICKHOUSE_PORT=8443` is the HTTPS port but `CLICKHOUSE_SECURE=false`. For a non-secure connection the default HTTP port should be `8123` (consistent with the integration test `.env.example` and code defaults).</violation>
</file>
<file name="cli/nao_core/config/databases/base.py">
<violation number="1" location="cli/nao_core/config/databases/base.py:87">
P2: `hasattr(cursor, "description")` is `True` even when `cursor.description is None` (DB-API 2.0 sets it to `None` for non-returning statements). This will raise `TypeError` when iterating over `None`. Check that `description` is not `None`.</violation>
</file>
Since this is your first cubic review, here's how it works:
- cubic automatically reviews your code and comments on bugs and improvements
- Teach cubic by replying to its comments. cubic learns from your replies and gets better over time
- Add one-off context when rerunning by tagging
@cubic-dev-aiwith guidance or docs links (includingllms.txt) - Ask questions if you need clarification on any suggestion
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
|
Hello @poupou-web3 thank you so much for the Clickhouse addition! I will do some tests, run the integration tests once, do a small review and it should be merged asap |
Bl3f
left a comment
There was a problem hiding this comment.
Small question and change around indexes.md, but apart from it LGTM. Thank you so much integrations tests worked well!
Welcome as a new contributor ✨
There was a problem hiding this comment.
1 issue found across 5 files (changes from recent commits).
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="cli/nao_core/config/databases/clickhouse.py">
<violation number="1" location="cli/nao_core/config/databases/clickhouse.py:199">
P1: Bug: `columns()` returns `[]` when `_direct_select_disallowed` is True, but `column_count()` still returns the actual count from `system.columns` for the same condition. This means stream-like engines (Kafka, RabbitMQ, FileLog) will have no column metadata in generated docs, and `column_count` vs `columns` will be inconsistent. The previous code correctly fell back to `_columns_from_system()` here.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
|
I realized that the index.md is also a duplicate of the columns.md, and we could just put everything in columns.md. |
|
@poupou-web3 yes, it was a bit was I tried to say. Could you change it in the PR pls? |
Add ClickHouse support
Summary
ClickHouseConfig,DatabaseType.CLICKHOUSE) usingibis.clickhouseand wired it intoAnyDatabaseConfig/parsing.indexes.mdgeneration via a newDatabaseAccessor.INDEXESandindexes.md.j2template so agents can seeORDER BY,PRIMARY KEY, andPARTITION BYwhen available for clickhouse only.AggregateFunctioncolumns and safer handling of engines that disallow directSELECT(e.g. Kafka, stream-like engines).Details
ClickHouse context
ClickHouseDatabaseContextbuilds preview queries from the table schema, selecting plain columns directly and using*-Merge(e.g.uniqMerge) forAggregateFunctioncolumns.ibis-clickhouseSHOW CREATE TABLE+system.columnsso sync still completes.indexes()returns full DDL (SHOW CREATE TABLE) used byindexes.mdso the agent knows how tables are ordered/partitioned for efficient querying.Generic improvements
DatabaseConfig.raw_sql_to_dataframenow understands multiple cursor/result shapes, including ClickHouse'sresult_rows/column_names.DatabaseContext.indexes()hook powersindexes.mdwhile keeping non-ClickHouse databases unchanged (they simply don't emit indexes docs by default).Testing
Added ClickHouse integration tests (
test_clickhouse.py+clickhouse.sql) covering:indexes.md).MergeTree,SummingMergeTree,ReplacingMergeTree,AggregatingMergeTree, Kafka engine, materialized views, and dictionaries.Added
docker-compose.test.ymland expanded.env.examplefor Postgres, ClickHouse, and MSSQL so integration tests can be run locally with: