-
Notifications
You must be signed in to change notification settings - Fork 862
Type-checked data tables #7381
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type-checked data tables #7381
Conversation
This reverts commit 6e1588d.
|
/ai review this PR. no need for praises just find the potential issues |
|
Claude finished @diegoimbert's task in 1m 39s —— View job PR Review: Potential Issues Found
1. SQL Comment Removal - Doesn't Handle String Literals CorrectlyFile: The #[test]
fn test_remove_comments_comment_in_string() {
let sql = "SELECT '-- not a comment' FROM table;";
let result = remove_comments(sql);
assert_eq!(result, "SELECT '-- not a comment' FROM table;");
}The test passes, but this is coincidental - the function doesn't actually track string state. If a string contains 2. SQL Query Statement Count Heuristic - Escape Sequence BugFile: The
3. Worker Cache Memory Leak RiskFile: The this._sqlQueriesByFile = new Map()
this._fileVersions = new Map()
this._transformedCodeCache = new Map()If users edit many files over a long session, these maps could grow unbounded. Consider implementing an LRU cache or cleaning up when files are closed. 4. Potential Race Condition in SQL Query UpdatesFile: The const model = editor.getModel(uri)
if (!model) {
console.warn(`[SqlTypeService] No Monaco model found for URI: ${uriString}`)
return
}
// ... model could be disposed here before revalidateModel is called
if (typeof worker.updateSqlQueries === 'function') {
await worker.updateSqlQueries(uriString, queries)
revalidateModel(model) // model might be disposed
}5. Silent Error Swallowing in Asset ParserFile: SQL parsing errors are silently ignored: match windmill_parser_sql::parse_assets(&sql_with_attach) {
Ok(mut sql_assets) => { ... }
_ => {} // Errors silently swallowed
}Consider at least logging these errors for debugging purposes. 6. Type Injection XSS-like Vulnerability (Low Risk)File: Column names from SQL queries are directly interpolated into TypeScript code without escaping: let middlePart =
'<{ ' +
Object.entries(query?.prepared?.columns ?? {})
.map(([key, type]) => `"${key}": ${type}`) // key not escaped
.join('; ') +
' }>'If a column name contains 7. Hardcoded Worker Initialization TimeoutFile: const WORKER_INIT_TIMEOUT = 10000
const MAX_RETRIES = 10
const RETRY_DELAY = 300These values are hardcoded and might be insufficient on slower machines or under heavy load. Consider making them configurable or increasing the defaults. 8. Missing Null Check in MapResourceFile: for (const key of Object.keys(toFetch)) {
let value = fetchedData[key]
obj[key] = value // value could be undefined if fetcher didn't return all keys
}If the fetcher fails to return data for some keys, 9. Type Assertion Without ValidationFile: let res = (await JobService.runScriptPreviewAndWaitResult({
...
})) as { error?: string; columns?: { name: string; type: string }[] }[]The response is cast without validation. If the API response format changes, this will fail silently or cause runtime errors. 10. Potential Issue with DuckDB Query Type ParametersFile: The export function ducklake(name: string = "main"): SqlTemplateFunction {
return sqlProviderImpl("ducklake", { name }); // No schema parsing
}But SummaryThe most critical issues are:
The other issues are lower priority but worth addressing for robustness. |
|
Claude finished @diegoimbert's task in 2m 44s —— View job PR Review: Type-checked Data Tables
SummaryThis PR implements type-checked data tables by:
The implementation is well-structured with good test coverage for the core functionality. I've added inline comments for specific issues found during the review. Key Issues FoundHigh Priority:
Medium Priority: Low Priority: Positive Observations
Testing InstructionsTo verify the type-checked data tables feature: Navigate to the Script Editor and create a new TypeScript (Bun) script. Import windmill and use the |
| this._sqlQueriesByFile = new Map() | ||
| // Map of file URI -> version number (incremented when SQL queries change) | ||
| this._fileVersions = new Map() | ||
| // Cache of transformed code and offset maps per file version | ||
| // Structure: fileUri -> {version, originalText, transformed, offsetMap, offsetMapEntries} | ||
| this._transformedCodeCache = new Map() | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential memory leak: These three Map instances (_sqlQueriesByFile, _fileVersions, _transformedCodeCache) have no cache eviction mechanism. If users edit many files over a long session, these maps could grow unbounded. Consider implementing an LRU cache or cleaning up entries when files are closed.
| this._sqlQueriesByFile = new Map() | |
| // Map of file URI -> version number (incremented when SQL queries change) | |
| this._fileVersions = new Map() | |
| // Cache of transformed code and offset maps per file version | |
| // Structure: fileUri -> {version, originalText, transformed, offsetMap, offsetMapEntries} | |
| this._transformedCodeCache = new Map() | |
| } | |
| // Map of file URI -> SQL query details | |
| this._sqlQueriesByFile = new Map() | |
| // Map of file URI -> version number (incremented when SQL queries change) | |
| this._fileVersions = new Map() | |
| // Cache of transformed code and offset maps per file version | |
| // Structure: fileUri -> {version, originalText, transformed, offsetMap, offsetMapEntries} | |
| // TODO: Consider implementing LRU eviction to prevent memory growth | |
| this._transformedCodeCache = new Map() |
|
|
||
| // Handle string literals (single or double quotes) | ||
| if !in_line_comment && !in_block_comment { | ||
| if (c == '\'' || c == '"') && prev_char != '\\' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SQL escape sequence handling: The backslash escape check (prev_char != '\\') is not correct for standard SQL. In standard SQL (and PostgreSQL without E'' strings), quotes are escaped by doubling them (''), not with backslash.
While this won't cause issues in most cases since strings containing \' are rare, it could cause incorrect behavior with escape strings like E'path\\data' in PostgreSQL where the backslash has special meaning.
| // This method is added by our sqlTypePlugin.worker.js | ||
| if (typeof worker.updateSqlQueries === 'function') { | ||
| await worker.updateSqlQueries(uriString, queries) | ||
| revalidateModel(model) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential race condition: The model could be disposed between the null check on line 100 and the call to revalidateModel here. While revalidateModel does check for disposal internally, the model could theoretically be disposed and recreated (with a different URI) in between async operations. Consider adding defensive programming:
| revalidateModel(model) | |
| await worker.updateSqlQueries(uriString, queries) | |
| // Re-fetch the model in case it was recreated during async operation | |
| const currentModel = editor.getModel(uri) | |
| if (currentModel) { | |
| revalidateModel(currentModel) | |
| } |
| } | ||
| _ => {} | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silent error swallowing: SQL parsing errors are silently ignored here. While this may be intentional to avoid breaking the parser for malformed SQL, it makes debugging harder. Consider at least logging the error:
| } | |
| _ => {} | |
| } | |
| Err(e) => { | |
| // Log SQL parsing error for debugging, but don't fail the asset parsing | |
| tracing::debug!("Failed to parse SQL query: {:?}", e); | |
| } |
| for (const key of Object.keys(toFetch)) { | ||
| let value = fetchedData[key] | ||
| obj[key] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing value check: If the fetcher fails to return data for some keys, undefined values will be stored in the result object. This could cause unexpected behavior downstream. Consider filtering out undefined values or logging a warning:
| for (const key of Object.keys(toFetch)) { | |
| let value = fetchedData[key] | |
| obj[key] = value | |
| for (const key of Object.keys(toFetch)) { | |
| let value = fetchedData[key] | |
| if (value !== undefined) { | |
| obj[key] = value | |
| } | |
| } |
| let res = (await JobService.runScriptPreviewAndWaitResult({ | ||
| workspace: getWorkspace()!, | ||
| requestBody: { | ||
| language: 'postgresql', | ||
| content: queryContent, | ||
| args: { database: `datatable://${chunk[0][1]?.source_name}` } | ||
| } | ||
| })) as { error?: string; columns?: { name: string; type: string }[] }[] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type assertion without validation: The response from JobService.runScriptPreviewAndWaitResult is cast without validation. If the API response format changes or returns an unexpected structure, this will cause runtime errors. Consider adding validation or defensive checks:
| let res = (await JobService.runScriptPreviewAndWaitResult({ | |
| workspace: getWorkspace()!, | |
| requestBody: { | |
| language: 'postgresql', | |
| content: queryContent, | |
| args: { database: `datatable://${chunk[0][1]?.source_name}` } | |
| } | |
| })) as { error?: string; columns?: { name: string; type: string }[] }[] | |
| })) as unknown | |
| // Validate response structure | |
| const res = Array.isArray(rawRes) | |
| ? (rawRes as { error?: string; columns?: { name: string; type: string }[] }[]) | |
| : [] |
| const WORKER_INIT_TIMEOUT = 10000 | ||
| const MAX_RETRIES = 10 | ||
| const RETRY_DELAY = 300 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hardcoded timeout values: These timeout values are hardcoded and might be insufficient on slower machines or under heavy load. Consider making these configurable or documenting the expected initialization time:
| const WORKER_INIT_TIMEOUT = 10000 | |
| const MAX_RETRIES = 10 | |
| const RETRY_DELAY = 300 | |
| // Worker initialization constants - adjust if users report timeout issues | |
| const WORKER_INIT_TIMEOUT = 10000 | |
| const MAX_RETRIES = 10 | |
| const RETRY_DELAY = 300 |
No description provided.