Skip to content

Conversation

@betodealmeida
Copy link
Member

@betodealmeida betodealmeida commented Mar 14, 2024

SUMMARY

This PR is part of SIP-117:

  • Introduce SQLScript and SQLStatement, and replace any simple calls to sqlparse with calls to the new classes (chore: improve SQL parsing #26767).
  • Implement a custom SQLStatement class for Kusto, since it doesn't use conventional SQL (this PR).
  • Implement insert_rls_in_predicate and insert_rls_as_subquery inside SQLStatement.
  • Implement get_cte_query inside SQLStatement.
  • Implement apply_top_to_sql in SQLStatement.
  • Replace ParsedQuery with SQLScript and SQLStatement.
  • Remove sqlparse from Superset.

One of the goals of SIP-117 is to encapsulate all SQL parsing and manipulation logic inside two classes: SQLScript and SQLStatement (mostly the latter; SQLScript is just a lightweight collection of SQLStatement objects). This includes logic that currently lives inside the DB engine specs (eg, parse_sql, apply_top_to_sql, get_cte_query, etc.).

Unfortunately, one of the DB engine specs currently supported by Superset is very unusual. The KustoKqlEngineSpec engine spec uses a language called KQL that is very different from SQL. In order to move the parse_sql method out of the DB engine spec this PR refactors the SQLStatement class, introducing a BaseSQLStatement base class that is used by SQLStatement (for dialects supported by sqlglot) and KustoKQLStatement (for KQL).

Note that this is not a path forward for supporting non-SQL databases! I only added support for KustoKqlEngineSpec because it's already there, but IMHO we should deprecate it in the future. Supporting non-SQL databases adds a lot of complexity to Superset, and is a problem that is better solved orthogonally by writing a SQL adapter.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@betodealmeida betodealmeida marked this pull request as ready for review March 14, 2024 22:54
@betodealmeida betodealmeida force-pushed the remove-sqlparse-kusto branch 2 times, most recently from 02e0c52 to dbe087f Compare March 15, 2024 13:25
@codecov
Copy link

codecov bot commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 89.90826% with 11 lines in your changes missing coverage. Please review.

Project coverage is 69.85%. Comparing base (5083ca0) to head (f22648e).
Report is 1378 commits behind head on master.

Files with missing lines Patch % Lines
superset/sql_parse.py 89.90% 11 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #27522      +/-   ##
==========================================
+ Coverage   69.83%   69.85%   +0.01%     
==========================================
  Files        1910     1910              
  Lines       74839    74924      +85     
  Branches     8351     8351              
==========================================
+ Hits        52263    52337      +74     
- Misses      20524    20535      +11     
  Partials     2052     2052              
Flag Coverage Δ
hive 48.98% <49.54%> (-0.01%) ⬇️
mysql 77.96% <55.04%> (-0.05%) ⬇️
postgres 78.07% <58.71%> (-0.08%) ⬇️
presto 53.70% <53.21%> (-0.02%) ⬇️
python 83.18% <89.90%> (+<0.01%) ⬆️
sqlite 77.51% <55.04%> (-0.07%) ⬇️
unit 56.78% <88.07%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

return {}


class SQLScript:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this already has tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is an existing class that already has tests, it looks like new because the file changed too much.

INSIDE_MULTILINE_STRING = enum.auto()


def split_kql(kql: str) -> list[str]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a test for this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@betodealmeida betodealmeida force-pushed the remove-sqlparse-kusto branch from bc7a159 to 43034d6 Compare March 22, 2024 15:56
Copy link
Member

@craig-rueda craig-rueda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@betodealmeida betodealmeida force-pushed the remove-sqlparse-kusto branch from 43034d6 to f22648e Compare March 22, 2024 16:19
@betodealmeida betodealmeida merged commit cd7972d into master Mar 22, 2024
@rusackas rusackas deleted the remove-sqlparse-kusto branch March 22, 2024 18:48
EandrewJones pushed a commit to UMD-ARLIS/superset that referenced this pull request Apr 5, 2024
EnxDev pushed a commit to EnxDev/superset that referenced this pull request Apr 12, 2024
qleroy pushed a commit to qleroy/superset that referenced this pull request Apr 28, 2024
vinothkumar66 pushed a commit to vinothkumar66/superset that referenced this pull request Nov 11, 2024
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 4.1.0 First shipped in 4.1.0 labels Nov 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 4.1.0 First shipped in 4.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants