⚡ Bolt: optimize NA replacement and eliminate redundant API calls#179

Open
Moohan wants to merge 2 commits into main from
bolt-optimization-na-replacement-and-api-calls-11554196582224710021
Conversation

@Moohan
Owner

@Moohan Moohan commented Mar 8, 2026

This PR implements two key performance optimizations in the octopusR package:

  1. Efficient NA Replacement: In combine_consumption(), we now use logical indexing (x[is.na(x)] <- 0) instead of ifelse(is.na(x), 0, x). Benchmarking on 1M elements showed a ~3x speedup and significant reduction in memory allocation.
  2. Redundant API Call Elimination: The get_meter_details() function was automatically calling get_meter_gsp() (an API call) even when the GSP was not needed. We introduced an include_gsp parameter (default TRUE for backward compatibility) and updated get_consumption() to set it to FALSE. This eliminates one redundant API call every time get_consumption() is called without explicit MPAN/Serial.
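The NA replacement change in item 1 can be illustrated with a minimal sketch. The helper names `replace_na_ifelse()` and `replace_na_index()` are hypothetical and exist only for this example; the actual body of `combine_consumption()` is not shown on this page.

```r
# Hypothetical helpers for illustration; not the package's actual code.

# Previous pattern: ifelse() builds a new full-length result vector
# and works on whole-length `yes` and `no` vectors.
replace_na_ifelse <- function(x) {
  ifelse(is.na(x), 0, x)
}

# New pattern: overwrite only the positions that are NA.
replace_na_index <- function(x) {
  x[is.na(x)] <- 0
  x
}

# Made-up consumption values with gaps.
consumption <- c(1.2, NA, 0.4, NA, 3.1)
cleaned <- replace_na_index(consumption)
print(cleaned)
```

Both helpers return the same values for a numeric vector, so the change should only affect speed and allocation, not results.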

Additionally, we improved type stability by using NA_character_ for the gsp field and updated the package documentation.
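As a rough illustration of why `NA_character_` helps type stability: a bare `NA` is logical, so a field that is sometimes `"J"` and sometimes `NA` changes type between calls. The `make_meter()` constructor below is a hypothetical stand-in, not the package's meter-point object.

```r
# A bare NA is logical; NA_character_ is a missing value of character type.
typeof(NA)             # "logical"
typeof(NA_character_)  # "character"

# Hypothetical constructor for illustration only.
make_meter <- function(mpan_mprn, gsp = NA_character_) {
  list(mpan_mprn = mpan_mprn, gsp = gsp)
}

meters <- list(
  make_meter("0000000000001", gsp = "J"),
  make_meter("0000000000002")  # gsp defaults to NA_character_
)

# Type-checked extraction works because every gsp is character.
gsp_values <- vapply(meters, function(m) m$gsp, character(1))
print(gsp_values)

# With a logical NA in the field, the same extraction is rejected.
bad_meters <- list(make_meter("0000000000003", gsp = NA))
bad_result <- try(
  vapply(bad_meters, function(m) m$gsp, character(1)),
  silent = TRUE
)
```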

All tests pass and R CMD check is clean.


PR created automatically by Jules for task 11554196582224710021 started by @Moohan

Summary by Sourcery

Optimize meter consumption processing and reduce unnecessary external lookups for improved performance.

New Features:

  • Add an include_gsp flag to meter detail retrieval to allow callers to skip GSP lookup when not needed.

Enhancements:

  • Replace NA handling in combined consumption data with more efficient logical indexing to improve performance.
  • Ensure type-stable handling of the GSP field using character NA values.
  • Record performance learnings and decisions in an internal Bolt note for future reference.

Summary by CodeRabbit

  • Documentation

    • Added guidance on performance optimisation approaches for data-heavy operations.
  • Refactor

    • Optimised meter details retrieval to avoid unnecessary computations when GSP data is not required.
    • Simplified consumption value handling to reduce redundant operations.

💡 What:
1. Replaced `ifelse()` with logical indexing for NA replacement in `combine_consumption()`.
2. Added `include_gsp` parameter to `get_meter_details()` to skip redundant GSP lookup API calls.
3. Updated `get_consumption()` to use `include_gsp = FALSE`.
4. Switched to `NA_character_` for type consistency in `gsp` field.

🎯 Why:
`ifelse()` is inefficient for large vectors as it evaluates all branches and allocates more memory. `get_consumption()` was triggering a redundant API call to fetch GSP data which is not used by the function.

📊 Impact:
- NA replacement is ~3x faster and uses ~3x less memory.
- Redundant GSP lookup API call eliminated in `get_consumption()`, reducing network overhead.

🔬 Measurement:
`bench::mark()` showed:
- ifelse: 32.2ms, 53.4MB
- logical_indexing: 11.8ms, 15.6MB

Verified API call reduction with manual mocking of `octopus_api`.

Co-authored-by: Moohan <5982260+Moohan@users.noreply.github.com>
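A measurement like the one in the commit message above could be set up roughly as follows. This is a sketch under assumptions: the 10% NA rate and the seed are invented for the example, and the timings will differ by machine. `bench::mark()` is used when the bench package is installed, with base `system.time()` as a fallback.

```r
set.seed(42)

# 1M elements, as in the PR description; the 10% NA share is an assumption.
x <- runif(1e6)
x[sample.int(length(x), 1e5)] <- NA

use_ifelse <- function(x) ifelse(is.na(x), 0, x)

use_indexing <- function(x) {
  x[is.na(x)] <- 0
  x
}

if (requireNamespace("bench", quietly = TRUE)) {
  # bench::mark() also checks that both expressions return equal results.
  results <- bench::mark(
    ifelse = use_ifelse(x),
    logical_indexing = use_indexing(x)
  )
  print(results)
} else {
  # Coarser fallback when bench is not available.
  print(system.time(use_ifelse(x)))
  print(system.time(use_indexing(x)))
}
```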
@google-labs-jules
Contributor

👋 Jules, reporting for duty! I'm here to lend a hand with this pull request.

When you start a review, I'll add a 👀 emoji to each comment to let you know I've read it. I'll focus on feedback directed at me and will do my best to stay out of conversations between you and other bots or reviewers to keep the noise down.

I'll push a commit with your requested changes shortly after. Please note there might be a delay between these steps, but rest assured I'm on the job!

For more direct control, you can switch me to Reactive Mode. When this mode is on, I will only act on comments where you specifically mention me with @jules. You can find this option in the Pull Request section of your global Jules UI settings. You can always switch back!

New to Jules? Learn more at jules.google/docs.


For security, I will only act on instructions from the user who triggered this task.

@sourcery-ai

sourcery-ai Bot commented Mar 8, 2026

Reviewer's Guide

Optimizes NA handling in consumption aggregation and avoids unnecessary GSP API lookups by parameterizing meter detail retrieval, while keeping type stability and documentation in sync.

Sequence diagram for get_consumption avoiding redundant GSP API call

sequenceDiagram
  actor RCaller
  participant get_consumption
  participant get_meter_details
  participant get_meter_gsp

  RCaller->>get_consumption: call with optional mpan_mprn, serial_number
  alt mpan_mprn or serial_number is NULL
    get_consumption->>get_meter_details: meter_type, direction, include_gsp = FALSE
    get_meter_details->>get_meter_details: validate meter_type, direction
    get_meter_details->>get_meter_details: read mpan_mprn, serial_number from env
    alt meter_type is electricity and include_gsp is TRUE
      get_meter_details->>get_meter_gsp: mpan
      get_meter_gsp-->>get_meter_details: gsp
      get_meter_details-->>get_consumption: meter_details with gsp
    else include_gsp is FALSE or meter_type is gas
      get_meter_details-->>get_consumption: meter_details with gsp = NA_character_
    end
    get_consumption->>get_consumption: fill missing mpan_mprn, serial_number
  else mpan_mprn and serial_number provided
    get_consumption->>get_consumption: skip get_meter_details call
  end
  get_consumption-->>RCaller: consumption data
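The flow in the diagram above can be approximated with a small self-contained sketch. Everything here is a stand-in: `sketch_meter_details()`, `fake_get_meter_gsp()`, the `gsp_lookup` argument, and the placeholder MPAN are hypothetical, and the real `get_meter_details()` reads its values from the environment and calls the Octopus API.

```r
# Counter used to observe how many GSP lookups happen.
counter <- new.env()
counter$n <- 0L

# Stand-in for get_meter_gsp(), which in the package performs an API request.
fake_get_meter_gsp <- function(mpan) {
  counter$n <- counter$n + 1L
  "J"
}

# Hypothetical simplification of get_meter_details() with the new flag.
sketch_meter_details <- function(meter_type = c("electricity", "gas"),
                                 include_gsp = TRUE,
                                 gsp_lookup = fake_get_meter_gsp) {
  meter_type <- match.arg(meter_type)
  mpan_mprn <- "0000000000000"  # placeholder, not a real MPAN

  # Only electricity meters with include_gsp = TRUE trigger the lookup.
  gsp <- if (meter_type == "electricity" && isTRUE(include_gsp)) {
    gsp_lookup(mpan_mprn)
  } else {
    NA_character_
  }

  list(type = meter_type, mpan_mprn = mpan_mprn, gsp = gsp)
}

with_lookup <- sketch_meter_details("electricity")
without_lookup <- sketch_meter_details("electricity", include_gsp = FALSE)
gas_meter <- sketch_meter_details("gas")

print(counter$n)
```

Passing the lookup function as an argument is only a convenience for counting calls in this sketch; the PR itself verified the reduction by mocking `octopus_api`.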

Class diagram for updated meter details and helper functions

classDiagram
  class OctopusMeterPoint {
    +character type
    +character mpan_mprn
    +character serial_number
    +character direction
    +character gsp
  }

  class MeterDetailsService {
    +get_meter_details(meter_type, direction, include_gsp) OctopusMeterPoint
  }

  class GspService {
    +get_meter_gsp(mpan) character
  }

  class ConsumptionService {
    +combine_consumption(result) data_frame
    +get_consumption(meter_type, direction, mpan_mprn, serial_number) data_frame
  }

  MeterDetailsService --> OctopusMeterPoint : constructs
  MeterDetailsService --> GspService : optional uses
  ConsumptionService --> MeterDetailsService : uses
  ConsumptionService --> OctopusMeterPoint : reads fields

File-Level Changes

| Change | Details | Files |
| --- | --- | --- |
| Parameterize meter detail retrieval to avoid redundant GSP API calls and make GSP inclusion optional and type-stable. | • Add include_gsp parameter (default TRUE) to meter_details helper and thread it into the function signature and call sites<br>• Compute meter_gsp only for electricity meters when include_gsp is TRUE; otherwise set gsp to NA_character_<br>• Update get_consumption to call get_meter_details with include_gsp = FALSE when it only needs MPAN/serial, removing an unnecessary API request | R/meter_details.R<br>R/get_consumption.R |
| Optimize NA replacement in combined consumption data for better performance and memory usage. | • Replace ifelse-based NA replacement with direct logical indexing on import and export consumption vectors<br>• Preserve existing column semantics while reducing allocations and avoiding per-element branching | R/meter_details.R |
| Capture optimization learnings and practices in project documentation. | • Add a Bolt note documenting the inefficiency of ifelse for large vectors and the new include_gsp flag pattern<br>• Clarify guidance on using NA_character_ for metadata type consistency | .jules/bolt.md |

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 8, 2026

Warning

Rate limit exceeded

@Moohan has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 21 minutes and 49 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 30b786db-9297-4490-af23-5c0ebadd4d13

📥 Commits

Reviewing files that changed from the base of the PR and between 1a9395f and 403db88.

📒 Files selected for processing (2)
  • .Rbuildignore
  • R/get_consumption.R

Walkthrough

Introduces an include_gsp parameter to get_meter_details to conditionally skip expensive GSP calculations. Refactors NA handling in combine_consumption using vectorized indexing instead of ifelse. Updates internal data retrieval paths to use the new flag, with supporting documentation on performance optimisation strategies.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Performance documentation<br>.jules/bolt.md | Adds guidance on performance-focused refactoring for data-heavy paths, highlighting ifelse evaluation inefficiencies and recommending vectorised indexing and conditional computation patterns. |
| Conditional GSP computation<br>R/meter_details.R, R/get_consumption.R | Introduces include_gsp parameter (default TRUE) to get_meter_details to conditionally skip expensive GSP lookups; refactors combine_consumption to use vectorised indexing for NA handling; updates get_consumption to pass include_gsp = FALSE when retrieving missing meter details. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main performance optimizations in the changeset: NA replacement efficiency and eliminating redundant API calls. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch bolt-optimization-na-replacement-and-api-calls-11554196582224710021

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.


@sourcery-ai sourcery-ai Bot left a comment

Hey - I've reviewed your changes and they look great!


Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.jules/bolt.md:
- Line 1: The markdown heading on the first line ("2025-01-24 - Avoiding
redundant API calls and inefficient vector operations") is level-2; change it to
an H1 by replacing the leading "##" with a single "#" so the file begins with an
H1 to satisfy markdownlint.

In `@R/get_consumption.R`:
- Line 78: The single-line call to get_meter_details is exceeding the
80-character limit; break the call across multiple lines so it stays under 80
chars (e.g., put the function name and opening paren on one line and place
meter_type, direction, and include_gsp = FALSE on subsequent indented lines)
when assigning to meter_details to clear the lint warning.

In `@R/meter_details.R`:
- Around line 80-84: The testing branch of get_meter_details() ignores the new
include_gsp flag because it calls testing_meter() without passing it and
testing_meter() always derives an electricity gsp; update get_meter_details() to
forward the include_gsp argument to testing_meter(include_gsp = include_gsp) and
modify testing_meter() to accept an include_gsp parameter and only derive/attach
a gsp when include_gsp is TRUE (preserve existing behaviour when TRUE). Ensure
function signatures for get_meter_details() and testing_meter() are updated
consistently and any internal logic that unconditionally derives electricity gsp
is gated by include_gsp.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 02f13490-e019-4715-b5f6-5e59d46a673f

📥 Commits

Reviewing files that changed from the base of the PR and between 74f7003 and 1a9395f.

📒 Files selected for processing (3)
  • .jules/bolt.md
  • R/get_consumption.R
  • R/meter_details.R

Comment thread .jules/bolt.md
@@ -0,0 +1,5 @@
## 2025-01-24 - Avoiding redundant API calls and inefficient vector operations
Contributor

⚠️ Potential issue | 🟡 Minor

Use an H1 on the first line to satisfy markdownlint.

This file currently starts with a level-2 heading, which is what CI is complaining about.

Suggested fix
-## 2025-01-24 - Avoiding redundant API calls and inefficient vector operations
+# 2025-01-24 - Avoiding redundant API calls and inefficient vector operations
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-## 2025-01-24 - Avoiding redundant API calls and inefficient vector operations
+# 2025-01-24 - Avoiding redundant API calls and inefficient vector operations
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.jules/bolt.md at line 1, The markdown heading on the first line
("2025-01-24 - Avoiding redundant API calls and inefficient vector operations")
is level-2; change it to an H1 by replacing the leading "##" with a single "#"
so the file begins with an H1 to satisfy markdownlint.

Comment thread R/get_consumption.R Outdated
Comment thread R/meter_details.R
Comment on lines +80 to +84
  function(
    meter_type = c("electricity", "gas"),
    direction = NULL,
    include_gsp = TRUE
  ) {
Contributor

⚠️ Potential issue | 🟡 Minor

Honour include_gsp in the testing branch as well.

get_meter_details() now exposes include_gsp, but the is_testing() path still delegates to testing_meter() without that flag, and testing_meter() always derives electricity gsp. That leaves the redundant lookup in place for test/dev runs and makes the new parameter behave differently depending on the code path.

Suggested fix
 get_meter_details <-
   function(
     meter_type = c("electricity", "gas"),
     direction = NULL,
     include_gsp = TRUE
   ) {
@@
-    if (is_testing()) {
-      testing_meter(meter_type)
+    if (is_testing()) {
+      testing_meter(meter_type, include_gsp = include_gsp)
     } else {
@@
-testing_meter <- function(meter_type = c("electricity", "gas")) {
+testing_meter <- function(
+  meter_type = c("electricity", "gas"),
+  include_gsp = TRUE
+) {
@@
-    meter_gsp <- if (identical(mpan, "sk_test_mpan")) {
+    meter_gsp <- if (!isTRUE(include_gsp)) {
+      NA_character_
+    } else if (identical(mpan, "sk_test_mpan")) {
       "J"
     } else {
       get_meter_gsp(mpan = mpan)
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@R/meter_details.R` around lines 80 - 84, The testing branch of
get_meter_details() ignores the new include_gsp flag because it calls
testing_meter() without passing it and testing_meter() always derives an
electricity gsp; update get_meter_details() to forward the include_gsp argument
to testing_meter(include_gsp = include_gsp) and modify testing_meter() to accept
an include_gsp parameter and only derive/attach a gsp when include_gsp is TRUE
(preserve existing behaviour when TRUE). Ensure function signatures for
get_meter_details() and testing_meter() are updated consistently and any
internal logic that unconditionally derives electricity gsp is gated by
include_gsp.
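The gating requested for the testing branch might look roughly like the sketch below. It follows the shape of the suggested diff, but `sketch_testing_meter()`, its `mpan` and `gsp_lookup` arguments, and the gas handling are assumptions made so the example runs on its own; the real `testing_meter()` is not shown in full on this page.

```r
# Hypothetical, simplified version of testing_meter() with include_gsp gating.
sketch_testing_meter <- function(
  meter_type = c("electricity", "gas"),
  include_gsp = TRUE,
  mpan = "sk_test_mpan",
  # Default lookup fails loudly so an unexpected call is easy to spot.
  gsp_lookup = function(mpan) stop("unexpected GSP lookup for ", mpan)
) {
  meter_type <- match.arg(meter_type)

  meter_gsp <- if (meter_type != "electricity" || !isTRUE(include_gsp)) {
    # Assumption: gas meters and include_gsp = FALSE both skip the lookup.
    NA_character_
  } else if (identical(mpan, "sk_test_mpan")) {
    "J"
  } else {
    gsp_lookup(mpan)
  }

  list(type = meter_type, mpan_mprn = mpan, gsp = meter_gsp)
}

default_meter <- sketch_testing_meter()

# A non-test MPAN would normally trigger the lookup; the flag prevents it.
skipped <- sketch_testing_meter(mpan = "0000000000000", include_gsp = FALSE)

print(default_meter$gsp)
print(skipped$gsp)
```

With the flag forwarded, `include_gsp = FALSE` behaves the same way in the testing path as in the production path, which is the consistency the review comment asks for.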

💡 What:
1. Replaced `ifelse()` with logical indexing for NA replacement in `combine_consumption()`.
2. Added `include_gsp` parameter to `get_meter_details()` to skip redundant GSP lookup API calls.
3. Updated `get_consumption()` to use `include_gsp = FALSE`.
4. Switched to `NA_character_` for type consistency in `gsp` field.
5. Fixed linter errors and updated `.Rbuildignore` to ignore `.jules`.

🎯 Why:
`ifelse()` is inefficient for large vectors as it evaluates all branches and allocates more memory. `get_consumption()` was triggering a redundant API call to fetch GSP data which is not used by the function.

📊 Impact:
- NA replacement is ~3x faster and uses ~3x less memory.
- Redundant GSP lookup API call eliminated in `get_consumption()`, reducing network overhead.

🔬 Measurement:
`bench::mark()` showed:
- ifelse: 32.2ms, 53.4MB
- logical_indexing: 11.8ms, 15.6MB

Verified API call reduction with manual mocking of `octopus_api`.

Co-authored-by: Moohan <5982260+Moohan@users.noreply.github.com>