Skip to content

⚡ Bolt: Vectorize NA replacement in combine_consumption#191

Open
Moohan wants to merge 4 commits intomainfrom
bolt-vectorize-na-replacement-16670411976422962245
Open

⚡ Bolt: Vectorize NA replacement in combine_consumption#191
Moohan wants to merge 4 commits intomainfrom
bolt-vectorize-na-replacement-16670411976422962245

Conversation

@Moohan
Copy link
Owner

@Moohan Moohan commented Mar 21, 2026

This PR implements a performance optimization in combine_consumption by replacing the vectorized ifelse() function with logical indexing for NA replacement. In R, ifelse() is a non-primitive function that evaluates all arguments and performs multiple vector copies, making it significantly slower for large datasets. Logical indexing (x[is.na(x)] <- 0) operates more directly on the memory.

Benchmarks using bench::mark() on a vector of 1 million elements showed:

  • Execution time: ~163ms (ifelse) vs ~15ms (indexing) -> ~10x speedup.
  • Memory allocation: ~53MB (ifelse) vs ~15.6MB (indexing) -> ~70% reduction.

Additionally, a scalar ifelse() in get_meter_details() was replaced with a standard if/else block to avoid unnecessary overhead and ensure type consistency by using NA_character_ instead of the default logical NA.

Verification:

  • testthat::test_local(): All tests passed.
  • rcmdcheck::rcmdcheck(): 0 errors, 0 warnings, 0 notes.

PR created automatically by Jules for task 16670411976422962245 started by @Moohan

Summary by Sourcery

Optimize meter consumption handling and NA replacement for better performance and correctness.

Enhancements:

  • Replace scalar ifelse() in get_meter_details() with a standard if/else block using NA_character_ for type-safe GSP assignment.
  • Vectorize NA replacement in combine_consumption() by using logical indexing to construct import and export consumption columns more efficiently.

Documentation:

  • Document the performance rationale for using logical indexing over ifelse() in combine_consumption()'s NA handling.

Summary by CodeRabbit

  • Refactor

    • Improved electricity meter details retrieval for enhanced reliability.
    • Optimised missing-data handling in consumption calculations for better performance and reduced memory use.
  • Documentation

    • Added a note documenting benchmarked performance of the recommended NA-replacement approach.
  • Chores

    • Updated build ignore patterns to exclude a top-level project directory from package build artifacts.

- Replaced `ifelse(is.na(x), 0, x)` with logical indexing `x[is.na(x)] <- 0` in `combine_consumption`.
- Replaced scalar `ifelse` with `if/else` block in `get_meter_details`.
- Added documentation of performance impact (~10-20x speedup, ~70% memory reduction).
- Added `.jules/` to `.Rbuildignore`.

Co-authored-by: Moohan <5982260+Moohan@users.noreply.github.com>
@google-labs-jules
Copy link
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai
Copy link

sourcery-ai bot commented Mar 21, 2026

Reviewer's guide (collapsed on small PRs)

Reviewer's Guide

Optimizes NA handling in combine_consumption using logical indexing instead of vectorized ifelse, and simplifies scalar GSP assignment in get_meter_details for better performance and type correctness.

Flow diagram for NA replacement logic in combine_consumption

flowchart TD
  A[result$consumption_import] --> B[Copy to result$import_consumption]
  B --> C{Is element NA?}
  C -->|Yes| D[Set element to 0 in result$import_consumption]
  C -->|No| E[Keep original value]
  D --> F[Final result$import_consumption]
  E --> F[Final result$import_consumption]

  G[result$consumption_export] --> H[Copy to result$export_consumption]
  H --> I{Is element NA?}
  I -->|Yes| J[Set element to 0 in result$export_consumption]
  I -->|No| K[Keep original value]
  J --> L[Final result$export_consumption]
  K --> L[Final result$export_consumption]
Loading

File-Level Changes

Change Details Files
Optimize NA replacement in combine_consumption using logical indexing instead of vectorized ifelse and document the performance rationale.
  • Add roxygen @note explaining benchmarked performance benefits of logical indexing over ifelse for NA replacement in this function
  • Replace import_consumption assignment from ifelse-based NA handling to copying consumption_import then zeroing NAs via logical indexing
  • Replace export_consumption assignment from ifelse-based NA handling to copying consumption_export then zeroing NAs via logical indexing
R/meter_details.R
Ensure scalar GSP assignment in get_meter_details uses a standard if/else and character NA for type consistency.
  • Replace scalar ifelse call choosing between get_meter_gsp result and NA with a standard if/else block
  • Use NA_character_ instead of untyped NA to guarantee gsp is a character vector when meter_type is not electricity
R/meter_details.R

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 21, 2026

Walkthrough

Added .jules to .Rbuildignore; refactored R/meter_details.R to use explicit if/else for GSP selection and logical-index NA replacement in combine_consumption(); added a roxygen @note and updated man/combine_consumption.Rd with the NA-replacement performance note.

Changes

Cohort / File(s) Summary
Build configuration
/.Rbuildignore
Added ^\.jules$ to exclude the top-level .jules directory from R package builds.
Meter details & consumption logic
R/meter_details.R
Replaced ifelse() with explicit if/else for gsp in get_meter_details(); in combine_consumption() switched NA-imputation from ifelse(is.na(x), 0, x) to assignment plus x[is.na(x)] <- 0; added roxygen @note.
Documentation
man/combine_consumption.Rd
Added \note{} explaining that logical-index NA replacement (e.g. x[is.na(x)] <- 0) outperforms ifelse(is.na(x), 0, x) per bench::mark() results.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title references the main change (vectorizing NA replacement in combine_consumption) but includes an emoji prefix that adds noise and obscures the core message.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch bolt-vectorize-na-replacement-16670411976422962245

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've left some high level feedback:

  • The NA-replacement logic for import_consumption and export_consumption is duplicated; consider extracting a small helper or using a single inline function to reduce repetition and keep the intent clearer.
  • The detailed benchmark explanation is now duplicated both in the roxygen @note and in the inline comment in combine_consumption(); consider keeping the detailed rationale in just one place and making the other reference more concise to avoid drift over time.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- The NA-replacement logic for `import_consumption` and `export_consumption` is duplicated; consider extracting a small helper or using a single inline function to reduce repetition and keep the intent clearer.
- The detailed benchmark explanation is now duplicated both in the roxygen `@note` and in the inline comment in `combine_consumption()`; consider keeping the detailed rationale in just one place and making the other reference more concise to avoid drift over time.

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@R/meter_details.R`:
- Around line 210-213: The roxygen `@note` in meter_details.R currently exposes
implementation/benchmark details to users; remove or shorten that line in the
exported documentation for the meter_details help page (replace with a brief
user-facing note such as "NA values are replaced with 0 for missing periods")
and move the detailed benchmarking comment into an internal code comment near
the implementation that performs the replacement (e.g., the function
implementing meter_details or the vector-replacement block where x[is.na(x)] <-
0 is used) so maintainers keep the benchmark rationale without exposing
internals in the user docs.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7de5d2af-0353-4d1b-8ef8-38e843020cb5

📥 Commits

Reviewing files that changed from the base of the PR and between 74f7003 and a105901.

📒 Files selected for processing (3)
  • .Rbuildignore
  • .jules/bolt.md
  • R/meter_details.R

Comment on lines +210 to +213
#' @note Replacing NAs in vectors using logical indexing (e.g., x[is.na(x)] <- 0)
#' provides a significant performance boost (~9x speedup and ~70% memory
#' reduction) compared to ifelse(is.na(x), 0, x) in this context,
#' as verified by bench::mark().
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider moving implementation detail out of exported documentation.

The @note explaining benchmarks is useful context for maintainers, but it exposes internal implementation details in the user-facing help page. Consider moving this to a code comment instead, or keeping the user-facing note brief (e.g., "NA values are replaced with 0 for missing periods").

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/meter_details.R` around lines 210 - 213, The roxygen `@note` in
meter_details.R currently exposes implementation/benchmark details to users;
remove or shorten that line in the exported documentation for the meter_details
help page (replace with a brief user-facing note such as "NA values are replaced
with 0 for missing periods") and move the detailed benchmarking comment into an
internal code comment near the implementation that performs the replacement
(e.g., the function implementing meter_details or the vector-replacement block
where x[is.na(x)] <- 0 is used) so maintainers keep the benchmark rationale
without exposing internals in the user docs.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@man/combine_consumption.Rd`:
- Line 52: Update the performance claim in the man page text (the line
containing "provides a significant performance boost (~9x speedup and ~70%
memory" in man/combine_consumption.Rd or the documentation for
combine_consumption) to reflect the actual benchmark reported in the PR (≈10×,
e.g. "~10x speedup" or "~11x" as preferred) and keep the memory claim unchanged
unless you want to adjust it; ensure the phrasing matches the PR summary "≈163
ms vs ≈15 ms (≈10× speedup)" for consistency.
- Around line 50-55: In the \note{} section of combine_consumption.Rd replace
the malformed Rd construct `x\link{is.na(x)} <- 0` with a proper code block
using \code{...} (e.g. \code{x[is.na(x)] <- 0}) and also wrap the
`ifelse(is.na(x), 0, x)` expression in \code{...} so both examples render as
code; update the text inside the \note{} command accordingly and keep the
surrounding explanatory sentence intact.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ef1ac974-f25a-4b84-bae3-6d7327e187d5

📥 Commits

Reviewing files that changed from the base of the PR and between a105901 and 1b15885.

📒 Files selected for processing (1)
  • man/combine_consumption.Rd

}
\note{
Replacing NAs in vectors using logical indexing (e.g., x\link{is.na(x)} <- 0)
provides a significant performance boost (~9x speedup and ~70\% memory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor discrepancy in performance claim.

The documentation states "~9x speedup", but the PR summary reports "~163 ms vs ~15 ms (≈10× speedup)". Consider updating the documentation to match the benchmark results more precisely (≈10× or ≈11×).

📊 Suggested alignment with benchmark results
-provides a significant performance boost (~9x speedup and ~70\% memory
+provides a significant performance boost (~10x speedup and ~70\% memory
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
provides a significant performance boost (~9x speedup and ~70\% memory
provides a significant performance boost (~10x speedup and ~70\% memory
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@man/combine_consumption.Rd` at line 52, Update the performance claim in the
man page text (the line containing "provides a significant performance boost
(~9x speedup and ~70% memory" in man/combine_consumption.Rd or the documentation
for combine_consumption) to reflect the actual benchmark reported in the PR
(≈10×, e.g. "~10x speedup" or "~11x" as preferred) and keep the memory claim
unchanged unless you want to adjust it; ensure the phrasing matches the PR
summary "≈163 ms vs ≈15 ms (≈10× speedup)" for consistency.

google-labs-jules bot and others added 2 commits March 21, 2026 12:25
- Replaced `ifelse(is.na(x), 0, x)` with logical indexing `x[is.na(x)] <- 0` in `combine_consumption`.
- Replaced scalar `ifelse` with `if/else` block in `get_meter_details`.
- Fixed ROxygen line length to comply with 80-character limit.
- Added documentation of performance impact (~10-20x speedup, ~70% memory reduction).
- Added `.jules/` to `.Rbuildignore`.

Co-authored-by: Moohan <5982260+Moohan@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
R/meter_details.R (1)

210-213: 🧹 Nitpick | 🔵 Trivial

Keep benchmark internals out of user-facing help text.

At Lines 210-213, the @note includes implementation-specific benchmark figures that are likely to age quickly. Prefer a brief behavioural note in docs and keep detailed benchmark rationale in nearby code comments or development notes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/meter_details.R` around lines 210 - 213, Update the roxygen `@note` in
meter_details.R to remove the specific benchmark figures and keep a concise
behavioral note (e.g., state that replacing NAs with logical indexing like
x[is.na(x)] <- 0 is preferable for performance) so user-facing docs do not
include implementation-specific numbers; move the detailed benchmark rationale
and exact timings into a nearby code comment or a development note file for
maintainers. Ensure you edit the `@note` block (the roxygen comment containing the
benchmark sentence) and add the detailed benchmark text as a non-exported code
comment adjacent to the implementation that motivated the note.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@R/meter_details.R`:
- Around line 210-213: Update the roxygen `@note` in meter_details.R to remove the
specific benchmark figures and keep a concise behavioral note (e.g., state that
replacing NAs with logical indexing like x[is.na(x)] <- 0 is preferable for
performance) so user-facing docs do not include implementation-specific numbers;
move the detailed benchmark rationale and exact timings into a nearby code
comment or a development note file for maintainers. Ensure you edit the `@note`
block (the roxygen comment containing the benchmark sentence) and add the
detailed benchmark text as a non-exported code comment adjacent to the
implementation that motivated the note.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6ee63c87-6fad-4cea-b1be-162115fbfdaf

📥 Commits

Reviewing files that changed from the base of the PR and between 1b15885 and 95991c9.

📒 Files selected for processing (2)
  • R/meter_details.R
  • man/combine_consumption.Rd

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant