-
Notifications
You must be signed in to change notification settings - Fork 5k
enh: interp supports using non-null prev/next values to fill #34209
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
… for improved parameter management
Summary of ChangesHello @Tony2h, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly improves the Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request enhances the INTERP function to support using non-null previous/next values for fill operations. The key change is that when ignore_null_values parameter is set to 1 in the INTERP function and fill modes PREV/NEXT/NEAR/LINEAR are used, the system will skip NULL values and search for the nearest non-NULL values instead.
Key Changes:
- Implemented a notification mechanism for table scan operators to signal when prev/next scan steps are complete
- Modified timeslice operator to skip NULL values when filling based on the
ignore_null_valuesparameter - Added comprehensive test coverage including stream processing scenarios
Reviewed changes
Copilot reviewed 36 out of 37 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| test/cases/09-DataQuerying/12-Interp/test_query_interp_fill.py | Added comprehensive test cases for new ignore-null fill behavior |
| test/cases/09-DataQuerying/12-Interp/in/interp_fill_ignore_null.in | Test input queries covering various scenarios |
| test/cases/09-DataQuerying/12-Interp/ans/interp_fill_ignore_null.csv | Expected results for ignore-null fill tests |
| source/libs/executor/src/timesliceoperator.c | Core logic for interp with notification mechanism and fill handling |
| source/libs/executor/src/scanoperator.c | Table scan preparation with notification support |
| source/libs/executor/src/executor.c | Added notifyTableScanTask function |
| source/libs/executor/src/exchangeoperator.c | Exchange operator notification handling |
| source/libs/qworker/src/qworker.c | Worker-side notification processing |
| source/dnode/vnode/src/tsdb/tsdbRead2.c | Reader step-done mechanism implementation |
| source/common/src/msg/tmsg.c | Serialization for new operator parameter types |
| docs/* | Documentation updates explaining new behavior in Chinese and English |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (req.pOpParam != NULL && req.pOpParam->value != NULL && | ||
| req.pOpParam->opType == QUERY_NODE_PHYSICAL_PLAN_TABLE_SCAN && | ||
| req.pOpParam->opType == nodeType(sStreamReaderCalcInfo->calcAst->pNode)) { |
Copilot
AI
Jan 7, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential logic error: Line 3826 has a duplicate check req.pOpParam->opType == QUERY_NODE_PHYSICAL_PLAN_TABLE_SCAN && req.pOpParam->opType == nodeType(...) which will always evaluate to false. The second condition should likely check something else, perhaps req.pOpParam->value != NULL or compare against a different type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a significant enhancement to the INTERP function, allowing it to ignore NULL values when filling data using PREV, NEXT, or NEAR. This is achieved through a new notification mechanism between operators, which allows the INTERP operator to signal downstream SCAN or EXCHANGE operators to stop fetching data once a non-NULL value is found. The implementation spans documentation, messaging, executor, and query worker layers. The changes are extensive and well-structured. I've identified a few minor issues in the documentation for clarity and consistency, and a potential robustness improvement in the C code.
| */ | ||
| ST_SET_EVENT_RECEIVED(sStreamReaderCalcInfo, ST_EVENT_NOTIFY); | ||
| atomic_store_64(&sStreamReaderCalcInfo->notifyTs, pScanParam->notifyTs); | ||
| freeOperatorParam(req.pOpParam, OP_GET_PARAM); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling freeOperatorParam with OP_GET_PARAM for a parameter of type NOTIFY_TYPE_SCAN_PARAM is potentially unsafe. The function freeTableScanGetOperatorParam, which will be called internally, is not designed to handle notification parameters and assumes a DYN_TYPE_SCAN_PARAM structure. This currently works because calloc is used for allocation (which zeroes out the memory), but this approach is fragile. It would be more robust to handle different parameter types within the free function to avoid potential issues if the allocation method changes.
| - `RANGE('2023-10-01T00:00:00.000', '2023-10-01T23:59:59.999', 1h)` | ||
| - EVERY: Time interval range, with every_val being a positive value and precision options of 1n, 1u, 1a, 1s, 1m, 1h, 1d, and 1w, such as EVERY (1s). | ||
| - FILL: Types can be selected as NONE (unfilled), VALUE (filled with specified value), PREV (previous non NULL value), NEXT (next non NULL value), NEAR (nearest non NULL value before and after). | ||
| - FILL: Types can be selected as NONE (unfilled), VALUE (filled with specified value), PREV (previous valid value), NEXT (next valid value), NEAR (nearest valid value before and after), LINEAR (linear interpolation). Note: whether a NULL value is considered valid data depends on the ignore_null_values parameter of the interp function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There appears to be a typo involving a zero-width space character (U+200B) after ignore_null_values. This can be difficult to see and may cause rendering issues. A similar issue is present on line 316. Please remove it for correctness.
| - FILL: Types can be selected as NONE (unfilled), VALUE (filled with specified value), PREV (previous valid value), NEXT (next valid value), NEAR (nearest valid value before and after), LINEAR (linear interpolation). Note: whether a NULL value is considered valid data depends on the ignore_null_values parameter of the interp function. | |
| - FILL: Types can be selected as NONE (unfilled), VALUE (filled with specified value), PREV (previous valid value), NEXT (next valid value), NEAR (nearest valid value before and after), LINEAR (linear interpolation). Note: whether a NULL value is considered valid data depends on the ignore_null_values parameter of the interp function. |
|
|
||
| 以上填充模式中,除了 NONE 模式默认不填充值之外,其他模式在查询的整个时间范围内如果没有数据 FILL 子句将被忽略,即不产生填充数据,查询结果为空。这种行为在部分模式(PREV、NEXT、LINEAR)下具有合理性,因为在这些模式下没有数据意味着无法产生填充数值。 | ||
|
|
||
| “有效值”的定义在 INTERVAL 子句和 INTERP 子句中有所不同:在 INTERVAL 子句中,扫描出的数据均为有效数据,例如 FILL(PREV) 即使用前一条数据填充;在 INTERP 子句中,从 v3.4.0.0 开始,NULL 值是否有效取决于 INTERP 函数的 ignore_null_values 参数,例如 FILL(PREV) 且 NULL 值无效,则略过 NULL,继续寻找 non-NULL 数据。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with the rest of the Chinese documentation, it would be better to use "非 NULL" instead of the English "non-NULL".
| “有效值”的定义在 INTERVAL 子句和 INTERP 子句中有所不同:在 INTERVAL 子句中,扫描出的数据均为有效数据,例如 FILL(PREV) 即使用前一条数据填充;在 INTERP 子句中,从 v3.4.0.0 开始,NULL 值是否有效取决于 INTERP 函数的 ignore_null_values 参数,例如 FILL(PREV) 且 NULL 值无效,则略过 NULL,继续寻找 non-NULL 数据。 | |
| “有效值”的定义在 INTERVAL 子句和 INTERP 子句中有所不同:在 INTERVAL 子句中,扫描出的数据均为有效数据,例如 FILL(PREV) 即使用前一条数据填充;在 INTERP 子句中,从 v3.4.0.0 开始,NULL 值是否有效取决于 INTERP 函数的 ignore_null_values 参数,例如 FILL(PREV) 且 NULL 值无效,则略过 NULL,继续寻找非 NULL 数据。 |
| 以上填充模式中,除了 NONE 模式默认不填充值之外,其他模式在查询的整个时间范围内如果没有数据 FILL 子句将被忽略,即不产生填充数据,查询结果为空。这种行为在部分模式(PREV、NEXT、LINEAR)下具有合理性,因为在这些模式下没有数据意味着无法产生填充数值。而对另外一些模式(NULL、VALUE)来说,理论上是可以产生填充数值的,至于需不需要输出填充数值,取决于应用的需求。所以为了满足这类需要强制填充数据或 NULL 的应用的需求,同时不破坏现有填充模式的行为兼容性,从 v3.0.3.0 开始,增加了两种新的填充模式: | ||
| 以上填充模式中,除了 NONE 模式默认不填充值之外,其他模式在查询的整个时间范围内如果没有数据 FILL 子句将被忽略,即不产生填充数据,查询结果为空。这种行为在部分模式(PREV、NEXT、LINEAR)下具有合理性,因为在这些模式下没有数据意味着无法产生填充数值。 | ||
|
|
||
| “有效值”的定义在 INTERVAL 子句和 INTERP 子句中有所不同:在 INTERVAL 子句中,扫描出的数据均为有效数据,例如 FILL(PREV) 即使用前一条数据填充;在 INTERP 子句中,从 v3.4.0.0 开始,NULL 值是否有效取决于 INTERP 函数的 ignore_null_values 参数,例如 FILL(PREV) 且 NULL 值无效,则略过 NULL,继续寻找 non-NULL 数据。 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency within the Chinese documentation, it's better to use "非 NULL" instead of "non-NULL".
| “有效值”的定义在 INTERVAL 子句和 INTERP 子句中有所不同:在 INTERVAL 子句中,扫描出的数据均为有效数据,例如 FILL(PREV) 即使用前一条数据填充;在 INTERP 子句中,从 v3.4.0.0 开始,NULL 值是否有效取决于 INTERP 函数的 ignore_null_values 参数,例如 FILL(PREV) 且 NULL 值无效,则略过 NULL,继续寻找 non-NULL 数据。 | |
| “有效值”的定义在 INTERVAL 子句和 INTERP 子句中有所不同:在 INTERVAL 子句中,扫描出的数据均为有效数据,例如 FILL(PREV) 即使用前一条数据填充;在 INTERP 子句中,从 v3.4.0.0 开始,NULL 值是否有效取决于 INTERP 函数的 ignore_null_values 参数,例如 FILL(PREV) 且 NULL 值无效,则略过 NULL,继续寻找非 NULL 数据。 |
| resetAllDataBlockScanInfo(pReader->innerReader[1]->status.pTableMap, | ||
| pReader->info.window.ekey, step); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency with the logic in the EXTERNAL_ROWS_PREV case, it would be clearer to use pReader->status.pTableMap here instead of pReader->innerReader[1]->status.pTableMap. Although they point to the same shared map, using the outer reader's map directly improves code readability and reduces potential confusion.
resetAllDataBlockScanInfo(pReader->status.pTableMap,
pReader->info.window.ekey, step);
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.