-
Notifications
You must be signed in to change notification settings - Fork 16.5k
chore(sql_parse): Provide more meaningful SQLGlot errors #27858
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
chore(sql_parse): Provide more meaningful SQLGlot errors #27858
Conversation
a2d566e to
3f7cee3
Compare
superset/sql_parse.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I included the work may as SQLGlot may result in false positives whilst parsing. See here as an example.
superset/sql_parse.py
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simply doing str(ex) doesn't work as SQLGlot includes ANSI codes which are helpful when printing in the terminal, but simply render as [4m and [0m in HTML.
3f7cee3 to
d70dcfa
Compare
d70dcfa to
3593acf
Compare
|
It feels to me that the SQL IDE should really just pass the SQL to the underlying connection and return the results or error message in most cases, and not get in the way of things. Now I'm wondering if |
|
@mistercrunch that was the other option I considered and had discussed internally, i.e., the error being somewhat context aware. Rather than Thus in the context of SQL Lab—which doesn't leverage a cache and where institutions may have a third party security manager, e.g. Apache Ranger—then allowing the The challenge arrises when:
Furthermore the current implement of
then we may augment our security manager to simply not raise when the |
74cce94 to
2f8d080
Compare
That's a good idea! Part of the problem today is that |
betodealmeida
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
4544dde to
f3219c7
Compare
f3219c7 to
43f6f84
Compare
michael-s-molina
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow! That's what I call a well-defined PR description 👏🏼
Thanks for the improvement and explanations.
(cherry picked from commit c385297)
(cherry picked from commit c385297)
SUMMARY
#26767 replaced the non-validating
sqlparseparser with the pseudo-validatingSQLGlotparser to aid with SQL parsing. The main use case for SQL parsing—from a security perspective—is to identify/extract the underlying tables/views referenced in the statement and checking whether the user is authorized to access them.Previously, given the non-validating nature of
sqlparse, virtually all SQL statements could be parsed—irrespective of whether they were valid—which meant that any syntax errors were reported by the underlying engine in a pseudo-meaningful manner. Now however, given thatSQLGlotis more opinionated, syntax errors are detected bySQLGlotduring the table/view extraction phase meaning that the query is never run and thus the engine's parser is never run resulting in a non-actionable error, i.e., it merely stated that the SQL could not be parsed. This is especially problematic in SQL Lab where freeform SQL is the norm and can be rather lengthy.The TL;DR is now SQL statements which contain a syntax error are likely going to be captured by
SQLGlotupstream of being executed by the engine.This PR proposes leveraging the messaging from
SQLGlot'sParseErrororTokenError(example) to provide a more meaningful (and thus actionable) error message to the user.BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF
PREVIOUS
Prior to #26767
sqlparsewas able to "parse" the SQL statement—missing anANDafter thekey = 'foobar'clause—which was then executed by the Trino engine (given that the user had access to thedatainfra.one_rowtable) resulting in the following Trino error,which detected the error was around the
1token.CURRENT
After #26767
SQLGlotwas (rightfully) unable to parse the SQL statement and thus could not extract the underlying tables which halted the execution flow. The resulting error message isn't overly helpful as it doesn't inform the user where the actual syntax error is.PROPOSED
Leveraging
SQLGlot'sParseErroret al. exceptions to provide a more meaningful error message,which correctly detected the error was around the
1token. The messaging is loosely based on how MySQL presents errors.TESTING INSTRUCTIONS
CI.
ADDITIONAL INFORMATION