Added separate executor #252

Open
kosstbarz wants to merge 2 commits into main from kosst/separate-executor

Conversation

@kosstbarz
Contributor

No description provided.

@kosstbarz kosstbarz marked this pull request as ready for review March 11, 2026 15:13
@kosstbarz kosstbarz requested a review from a team March 11, 2026 15:13
@SokolovYaroslav SokolovYaroslav requested a review from Copilot March 11, 2026 15:26
Contributor

Copilot AI left a comment

Pull request overview

Adds a new “separate” execution mode that runs SQL directly against each datasource’s native SQLAlchemy engine (instead of routing through DuckDB), including Snowflake engine creation and schema introspection to populate the system prompt.

Changes:

  • Introduce SeparateExecutor + SeparateGraph with run_sql_query/submit_result tools and a dedicated system prompt template.
  • Add SQLAlchemy-based Snowflake schema inspection (information_schema.tables/columns) to generate TableInfo/ColumnInfo for prompt schema.
  • Extend database adapter plumbing with try_create_sqlalchemy_engine() and a Snowflake create_sqlalchemy_engine() implementation.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| databao/agent/sqlalchemy/schema_inspection.py | New SQLAlchemy schema inspector (currently Snowflake-only) producing TableInfo/ColumnInfo. |
| databao/agent/sqlalchemy/__init__.py | Initializes databao.agent.sqlalchemy package. |
| databao/agent/executors/separate/system_prompt.jinja | New system prompt template for the separate executor (includes Snowflake quoting guidance). |
| databao/agent/executors/separate/separate_executor.py | New executor that builds per-datasource SQLAlchemy engines and injects schema into the prompt. |
| databao/agent/executors/separate/graph.py | New LangGraph tool loop to run SQL via SQLAlchemy engines and submit results. |
| databao/agent/databases/snowflake_adapter.py | Adds Snowflake SQLAlchemy engine creation from connection config (password/keypair/SSO). |
| databao/agent/databases/databases.py | Adds try_create_sqlalchemy_engine() helper to delegate engine creation to adapters. |
| databao/agent/databases/database_adapter.py | Adds optional create_sqlalchemy_engine() hook on adapters (default None). |

Comment on lines +123 to +147
        url_kwargs: dict[str, str] = {"account": config.account}
        if config.user:
            url_kwargs["user"] = config.user
        if config.database:
            url_kwargs["database"] = config.database
        if config.warehouse:
            url_kwargs["warehouse"] = config.warehouse
        if config.role:
            url_kwargs["role"] = config.role

        connect_args: dict[str, Any] = {}
        auth = config.auth
        if isinstance(auth, SnowflakePasswordAuth):
            url_kwargs["password"] = auth.password
        elif isinstance(auth, SnowflakeKeyPairAuth):
            connect_args["private_key"] = cls._load_private_key_bytes(auth)
        elif isinstance(auth, SnowflakeSSOAuth):
            url_kwargs["authenticator"] = auth.authenticator
        else:
            return None

        if connect_args:
            return create_engine(URL(**url_kwargs), connect_args=connect_args)
        return create_engine(URL(**url_kwargs))

Copilot AI Mar 11, 2026

create_sqlalchemy_engine() drops config.additional_properties entirely. Existing Snowflake config parsing stores non-core URL/query settings in additional_properties (and _create_connection_string() includes them), so omitting them here can silently change connection behavior (e.g., session/driver params). Include additional_properties when constructing the SQLAlchemy URL / query params (or document why they must be ignored).
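One possible shape for carrying the extra settings over, sketched with plain dictionaries rather than the real adapter: merge additional_properties into the keyword arguments before the URL is built, and let the explicitly configured core fields win on conflict. The helper name `merge_url_kwargs` and the sample property are hypothetical, and whether the Snowflake URL helper accepts every such key as a keyword argument is an assumption that would need checking against the installed version.

```python
def merge_url_kwargs(core: dict[str, str], additional_properties: dict[str, str]) -> dict[str, str]:
    """Combine extra connection settings with the core URL fields.

    Core fields (account, user, ...) take precedence, so an extra property
    can never silently override an explicitly configured value.
    """
    merged = dict(additional_properties)  # copy so the config object is not mutated
    merged.update(core)  # core keys overwrite any clashing extras
    return merged


# Hypothetical sample values, for illustration only.
core = {"account": "my_account", "user": "alice"}
extras = {"client_session_keep_alive": "true", "user": "ignored"}
url_kwargs = merge_url_kwargs(core, extras)
```

Giving the core fields precedence is a design choice, not something the PR states; the opposite order would let additional_properties act as overrides.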

Copilot uses AI. Check for mistakes.
Comment on lines +116 to +146
    @classmethod
    def create_sqlalchemy_engine(cls, config: DBConnectionConfig) -> Engine | None:
        if not isinstance(config, SnowflakeConnectionProperties):
            return None

        from snowflake.sqlalchemy import URL  # type: ignore[import-untyped]

        url_kwargs: dict[str, str] = {"account": config.account}
        if config.user:
            url_kwargs["user"] = config.user
        if config.database:
            url_kwargs["database"] = config.database
        if config.warehouse:
            url_kwargs["warehouse"] = config.warehouse
        if config.role:
            url_kwargs["role"] = config.role

        connect_args: dict[str, Any] = {}
        auth = config.auth
        if isinstance(auth, SnowflakePasswordAuth):
            url_kwargs["password"] = auth.password
        elif isinstance(auth, SnowflakeKeyPairAuth):
            connect_args["private_key"] = cls._load_private_key_bytes(auth)
        elif isinstance(auth, SnowflakeSSOAuth):
            url_kwargs["authenticator"] = auth.authenticator
        else:
            return None

        if connect_args:
            return create_engine(URL(**url_kwargs), connect_args=connect_args)
        return create_engine(URL(**url_kwargs))

Copilot AI Mar 11, 2026

New create_sqlalchemy_engine() behavior is not covered by tests. There is already comprehensive coverage for SnowflakeAdapter helpers in tests/test_snowflake_adapter.py; please add unit tests that validate URL/connect_args generation for password, key-pair, and SSO auth (and that additional_properties are preserved).
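A rough sketch of what such tests could look like if the auth branching were pulled out into a pure function, so that no Snowflake connection is needed. The dataclasses below are simplified stand-ins for the real auth classes, and `build_auth_args` is a hypothetical helper, not something present in the PR.

```python
from __future__ import annotations

import unittest
from dataclasses import dataclass
from typing import Any


# Stand-ins for the real auth classes; only the fields used below are modelled.
@dataclass
class PasswordAuth:
    password: str


@dataclass
class KeyPairAuth:
    private_key: bytes


@dataclass
class SSOAuth:
    authenticator: str


def build_auth_args(auth: object) -> tuple[dict[str, str], dict[str, Any]] | None:
    """Return (url_kwargs, connect_args) for an auth object, or None if unsupported."""
    if isinstance(auth, PasswordAuth):
        return {"password": auth.password}, {}
    if isinstance(auth, KeyPairAuth):
        return {}, {"private_key": auth.private_key}
    if isinstance(auth, SSOAuth):
        return {"authenticator": auth.authenticator}, {}
    return None


class BuildAuthArgsTest(unittest.TestCase):
    def test_password_goes_into_url(self) -> None:
        self.assertEqual(build_auth_args(PasswordAuth("s3cret")), ({"password": "s3cret"}, {}))

    def test_key_pair_goes_into_connect_args(self) -> None:
        self.assertEqual(build_auth_args(KeyPairAuth(b"key")), ({}, {"private_key": b"key"}))

    def test_sso_sets_authenticator(self) -> None:
        self.assertEqual(
            build_auth_args(SSOAuth("externalbrowser")),
            ({"authenticator": "externalbrowser"}, {}),
        )

    def test_unknown_auth_is_rejected(self) -> None:
        self.assertIsNone(build_auth_args(object()))
```

Testing the argument construction separately from create_engine() keeps the tests fast and independent of the Snowflake driver; the engine call itself could then be covered by a single mocked test.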

        if engine is not None:
            self._sa_engines[name] = engine
        else:
            _LOGGER.warning("Cannot create SQLAlchemy engine for '%s': unsupported config type", name)

Copilot AI Mar 11, 2026

This warning message is misleading: try_create_sqlalchemy_engine() can return None even for a supported config type (it just means SQLAlchemy engine creation isn't implemented for that adapter yet). Consider changing the log to reflect unsupported engine creation (and ideally include the config/db type) to avoid confusion during debugging.

Suggested change
- _LOGGER.warning("Cannot create SQLAlchemy engine for '%s': unsupported config type", name)
+ db_type = get_db_type(db_source.config)
+ _LOGGER.warning(
+     "SQLAlchemy engine creation not implemented for database '%s' (type '%s'); "
+     "continuing without SQLAlchemy engine",
+     name,
+     db_type,
+ )

"""
Call this tool with the ID of the query you want to submit to the user.
This will return control to the user and must always be the last tool call.
The user will see the full query result, not just the first 12 rows. Returns a confirmation message.

Copilot AI Mar 11, 2026

submit_result's docstring promises the user will see the full query result, but run_sql_query executes with limit_max_rows (via fetchmany(limit)), so the stored dataframe is already truncated. Either update the docstring to match the actual behavior or change execution so the submitted result can include the full result set (while still truncating what is echoed back into tool output).

Suggested change
- The user will see the full query result, not just the first 12 rows. Returns a confirmation message.
+ The user will see the query result up to the configured maximum row limit (which may be larger than the
+ 12-row preview shown in tool output). Returns a confirmation message.
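To illustrate the truncation concern, here is a small sketch that uses the standard-library sqlite3 module as a stand-in for a datasource engine. It fetches one row more than the limit so the caller can tell whether the stored result is complete. The function `fetch_limited` is hypothetical and is not the PR's run_sql_query implementation.

```python
import sqlite3


def fetch_limited(conn: sqlite3.Connection, sql: str, limit: int) -> tuple[list[tuple], bool]:
    """Fetch at most `limit` rows and report whether more rows were available."""
    cursor = conn.execute(sql)
    rows = cursor.fetchmany(limit + 1)  # one extra row reveals truncation
    truncated = len(rows) > limit
    return rows[:limit], truncated


# In-memory table with 20 rows, more than the limit used below.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(20)])

rows, truncated = fetch_limited(conn, "SELECT n FROM t ORDER BY n", limit=12)
```

With a flag like `truncated` available, the tool output or the docstring could state accurately whether the submitted result is the full set or only the first rows up to the limit.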

Comment on lines +1 to +9
import logging
from collections import defaultdict

from sqlalchemy import Connection, Engine, text

from databao.agent.duckdb.schema_inspection import ColumnInfo, TableInfo

_LOGGER = logging.getLogger(__name__)


Copilot AI Mar 11, 2026

_LOGGER is defined but never used in this module. Consider removing it (and import logging) or using it for unexpected dialect / query failures to avoid dead code.

- Cross joins are allowed only for tables that are guaranteed small (< 5 rows), such as enums or static dictionaries.
- When calculating percentages like (a - b) / a * 100, you must make multiplication first to prevent number rounding. Use 100 * (a - b) / a.
- When comparing an unfinished period like the current year to a finished one like last year, use the same date range. Never compare unfinished periods to finished one.
- Make sure the submitted result answers the user's question and it is not-empty

Copilot AI Mar 11, 2026

Grammar: "it is not-empty" should be "it is not empty" / "it is non-empty".

Suggested change
- - Make sure the submitted result answers the user's question and it is not-empty
+ - Make sure the submitted result answers the user's question and it is not empty

- Pay attention to SQL dialect specific commands
- Cross joins are allowed only for tables that are guaranteed small (< 5 rows), such as enums or static dictionaries.
- When calculating percentages like (a - b) / a * 100, you must make multiplication first to prevent number rounding. Use 100 * (a - b) / a.
- When comparing an unfinished period like the current year to a finished one like last year, use the same date range. Never compare unfinished periods to finished one.

Copilot AI Mar 11, 2026

Grammar: "Never compare unfinished periods to finished one" should be plural ("finished ones") to read correctly.

Suggested change
- - When comparing an unfinished period like the current year to a finished one like last year, use the same date range. Never compare unfinished periods to finished one.
+ - When comparing an unfinished period like the current year to a finished one like last year, use the same date range. Never compare unfinished periods to finished ones.

@SokolovYaroslav
Collaborator

@kosstbarz please address the copilot's comments somehow :)

I personally don't look into the code until AI is happy

3 participants