fix: query optimization with where #1090
base: release/v2.2
Conversation
Optimize query performance by conditionally removing WHERE ledger = ? clauses when a bucket contains only one ledger. The optimization is automatically detected and applied at runtime, providing a 5-15% query performance improvement for single-ledger deployments with no impact on multi-ledger scenarios.
Changes:
- Add single-ledger detection cache in Store with thread-safe access
- Implement applyLedgerFilter() and getLedgerFilterSQL() helpers
- Refactor 38+ query locations across the ledger storage package
- Auto-detect single-ledger state on store creation and ledger open
- Add CountLedgersInBucket() to the system store for bucket analysis
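For orientation, a minimal sketch of what the two helpers described above could look like, assuming a cached boolean single-ledger flag on the Store. Only the helper names, the bun query type, and the "ledger = ?" fragment come from this PR; the surrounding scaffolding is illustrative.

package ledgerstore

import (
	"fmt"

	"github.com/uptrace/bun"
)

// Minimal stand-in for the real Store; the actual struct carries the
// thread-safe detection cache described in this PR.
type Store struct {
	ledger       struct{ Name string }
	singleLedger bool // cached result of "bucket contains exactly one ledger"
}

func (s *Store) isSingleLedger() bool { return s.singleLedger }

// applyLedgerFilter adds "<alias>.ledger = ?" unless the bucket holds a single ledger,
// in which case the clause would match every row and can be skipped.
func (s *Store) applyLedgerFilter(query *bun.SelectQuery, tableAlias string) *bun.SelectQuery {
	if s.isSingleLedger() {
		return query
	}
	return query.Where(fmt.Sprintf("%s.ledger = ?", tableAlias), s.ledger.Name)
}

// getLedgerFilterSQL returns the bare fragment and its arguments for call sites
// that assemble WHERE groups manually; an empty string means "no filter needed".
func (s *Store) getLedgerFilterSQL() (string, []any) {
	if s.isSingleLedger() {
		return "", nil
	}
	return "ledger = ?", []any{s.ledger.Name}
}

Call sites then either pass their select through applyLedgerFilter or expand getLedgerFilterSQL inside a WhereGroup, as the review comments below discuss.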
Walkthrough
This pull request introduces context propagation throughout the storage layer and implements a single-ledger optimization. All resource handler methods now accept a context.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Handler as Resource Handler
participant Store as Ledger Store
participant SysStore as System Store
participant DB as Database
Handler->>Store: buildDataset(ctx, ...)
alt Single Ledger Optimization
Store->>SysStore: CountLedgersInBucket(ctx, bucket)
SysStore->>DB: SELECT COUNT(*) WHERE bucket = ?
DB-->>SysStore: 1
SysStore-->>Store: 1
Store->>Store: isSingleLedger() → true
Store->>DB: Query (no ledger filter applied)
else Multiple Ledgers
Store->>SysStore: CountLedgersInBucket(ctx, bucket)
SysStore->>DB: SELECT COUNT(*) WHERE bucket = ?
DB-->>SysStore: N (N > 1)
SysStore-->>Store: N
Store->>Store: isSingleLedger() → false
Store->>DB: Query + applyLedgerFilter(ledger = ?)
end
DB-->>Store: Results
Store-->>Handler: Query with optional ledger filter
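A hedged sketch of the system-store counter shown in the diagram. The method name matches the PR; the table name and bun wiring are assumptions for illustration.

package systemstore

import (
	"context"

	"github.com/uptrace/bun"
)

// Store wraps the shared bun handle; only CountLedgersInBucket is sketched here.
type Store struct {
	db bun.IDB
}

// CountLedgersInBucket returns how many ledgers are registered for the given bucket,
// mirroring the SELECT COUNT(*) ... WHERE bucket = ? exchange in the diagram above.
func (s *Store) CountLedgersInBucket(ctx context.Context, bucket string) (int, error) {
	return s.db.NewSelect().
		TableExpr("ledgers").
		Where("bucket = ?", bucket).
		Count(ctx)
}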
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
Areas requiring extra attention:
Actionable comments posted: 6
🧹 Nitpick comments (2)
internal/storage/ledger/resource_volumes.go (1)
69-70: Centralized ledger filtering improves consistency.
Good migration to store.applyLedgerFilter across all relevant selects and lateral subqueries.
For sustained gains, ensure covering indexes exist on:
- accounts_volumes(ledger, accounts_address, asset)
- accounts(ledger, address)
- accounts_metadata(ledger, accounts_address, date)
- moves(ledger, accounts_address, effective_date, insertion_date)
Also applies to: 76-77, 104-105, 123-125, 137-138
internal/storage/ledger/resource_aggregated_balances.go (1)
55-57: Consistent ledger scoping across PIT/non-PIT paths - nice.
Centralizing the filters reduces risk of cross-ledger leakage and eases future tweaks.
Ensure supporting indexes:
- moves(ledger, accounts_address, effective_date, seq) and (ledger, accounts_address, insertion_date, seq)
- accounts(ledger, address), accounts_metadata(ledger, accounts_address, date)
- accounts_volumes(ledger, accounts_address, asset)
Also applies to: 79-81, 94-95, 106-108, 114-115
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (11)
- internal/storage/driver/driver.go (2 hunks)
- internal/storage/ledger/accounts.go (2 hunks)
- internal/storage/ledger/logs.go (1 hunks)
- internal/storage/ledger/resource_accounts.go (6 hunks)
- internal/storage/ledger/resource_aggregated_balances.go (4 hunks)
- internal/storage/ledger/resource_logs.go (1 hunks)
- internal/storage/ledger/resource_transactions.go (3 hunks)
- internal/storage/ledger/resource_volumes.go (4 hunks)
- internal/storage/ledger/store.go (3 hunks)
- internal/storage/ledger/transactions.go (4 hunks)
- internal/storage/system/store.go (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/storage/system/store.go (1)
- internal/ledger.go (1): Ledger (18-26)
internal/storage/ledger/store.go (2)
- internal/ledger.go (1): Ledger (18-26)
- internal/storage/system/store.go (1): Store (20-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Dirty
- GitHub Check: Tests
🔇 Additional comments (14)
internal/storage/driver/driver.go (1)
87-91: Initialize optimization state post-create/open - good placement.
Non-blocking update with debug-only logging is fine. Please confirm UpdateSingleLedgerState is idempotent and thread-safe, as it may be called frequently on OpenLedger.
Also applies to: 104-108
internal/storage/ledger/logs.go (1)
122-129: Clean refactoring to centralized ledger filtering.
The query construction is properly separated and the ledger filter is correctly applied via store.applyLedgerFilter() before execution. This maintains the same logical behavior while enabling the single-ledger optimization.
internal/storage/ledger/resource_logs.go (1)
25-29: LGTM - Consistent pattern.
The centralized ledger filtering is correctly applied to the logs dataset query.
internal/storage/ledger/store.go (3)
25-28: Well-designed optimization state structure.
The use of sync.RWMutex for protecting the optimization flag is appropriate for this read-heavy use case.
147-156: Thread-safe read implementation.
The nil check and RLock usage are correct. Returning false when the cache is nil provides a safe default that disables the optimization.
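Expanding the earlier sketch with the thread safety these comments describe: a minimal read/write pair around the cached flag. Apart from isSingleLedger and UpdateSingleLedgerState, the field and type names are assumptions.

package ledgerstore

import (
	"context"
	"sync"
)

type optimizationState struct {
	mu             sync.RWMutex
	isSingleLedger bool
}

type Store struct {
	ledger       struct{ Name, Bucket string }
	countLedgers func(ctx context.Context, bucket string) (int, error)
	optimization *optimizationState
}

// Read path: cheap RLock, with a safe default (filter stays on) when no cache is configured.
func (s *Store) isSingleLedger() bool {
	if s.optimization == nil {
		return false
	}
	s.optimization.mu.RLock()
	defer s.optimization.mu.RUnlock()
	return s.optimization.isSingleLedger
}

// Write path: idempotent and thread-safe, so calling it on every CreateLedger/OpenLedger
// is harmless; on error the previous (conservative) value is kept.
func (s *Store) UpdateSingleLedgerState(ctx context.Context) {
	if s.optimization == nil || s.countLedgers == nil {
		return
	}
	count, err := s.countLedgers(ctx, s.ledger.Bucket)
	if err != nil {
		return
	}
	s.optimization.mu.Lock()
	s.optimization.isSingleLedger = count == 1
	s.optimization.mu.Unlock()
}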
176-194: No optimization refresh issue detected: UpdateSingleLedgerState is already invoked during ledger creation.
The optimization state is correctly refreshed when CreateLedger is called (line 88 in driver.go), immediately after creating the store. Additionally, stores are instantiated fresh on each ledger access via OpenLedger or GetLedgerController, ensuring state is recalculated if the bucket's ledger count changes. The returned store from CreateLedger is discarded by the API layer, so there is no stale cache to worry about.
internal/storage/ledger/resource_transactions.go (3)
86-87: Ledger filter correctly applied to transactions dataset.
The centralized filtering is properly positioned after column selection and uses the correct table alias "transactions".
98-104: Metadata history subquery properly filtered.
The ledger filter is correctly applied to the metadata history subquery using the "transactions_metadata" alias before the left join.
166-189: Well-structured effective volumes expansion.
The refactoring to extract movesSubquery improves readability and correctly applies ledger filtering before aggregation. The precomputed subquery is properly reused in the final dataset construction.
internal/storage/ledger/resource_accounts.go (4)
57-58: Accounts dataset correctly filtered.
The ledger filter is properly applied using the "accounts" table alias.
65-71: Metadata history subquery properly scoped.
The ledger filter is correctly applied to the accounts metadata history subquery before the join.
94-112: Balance filtering correctly handles both PIT and current cases.
The ledger filters are properly applied to both the moves table (for PIT queries) and the accounts_volumes table (for current balances), ensuring correct scoping in the correlated subqueries.
152-175: Volumes expansion properly filtered for both PIT and current cases.
The ledger filters are correctly applied in both branches of the expand logic, maintaining consistency with the balance filtering pattern.
internal/storage/ledger/transactions.go (1)
179-186: Clean query construction pattern.
Using an anonymous function to encapsulate the query building and ledger filter application is a good approach for inline query construction within UnionAll.
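A small illustration of the pattern being praised here: the second branch is built and filtered inside an immediately invoked anonymous function before being handed to UnionAll. The Store type, the placeholder applyLedgerFilter, and the column names are assumptions standing in for the real code.

package ledgerstore

import "github.com/uptrace/bun"

type Store struct{ db bun.IDB }

// applyLedgerFilter is a placeholder for the helper introduced in this PR.
func (s *Store) applyLedgerFilter(q *bun.SelectQuery, alias string) *bun.SelectQuery { return q }

// unionWithMetadata builds the second branch inline, applies the ledger filter,
// and feeds the resulting query straight into UnionAll.
func (s *Store) unionWithMetadata(base *bun.SelectQuery) *bun.SelectQuery {
	return base.UnionAll(func() *bun.SelectQuery {
		q := s.db.NewSelect().
			TableExpr("transactions").
			Column("id", "updated_at")
		return s.applyLedgerFilter(q, "transactions")
	}())
}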
Address CodeRabbit review comments:
- Remove hardcoded "and " prefix from getLedgerFilterSQL() return value
- Replace all fragile [4:] string slicing with clean filter application
- Regenerate mocks to include CountLedgersInBucket method
Changes:
- store.go: getLedgerFilterSQL() now returns "ledger = ?" without prefix
- accounts.go: Remove [4:] slicing in WhereGroup
- transactions.go: Remove [4:] slicing in 3 WhereGroup calls
- *_generated_test.go: Regenerated mocks with uber-go/mock
This makes the code more robust and prevents potential SQL corruption if the filter format ever changes.
Actionable comments posted: 0
🧹 Nitpick comments (1)
internal/storage/ledger/transactions.go (1)
210-216: Consider extracting the duplicate WhereGroup pattern.
The same 7-line WhereGroup pattern appears identically in RevertTransaction, UpdateTransactionMetadata, and DeleteTransactionMetadata. While the current implementation is correct, extracting this to a helper method (e.g., applyLedgerFilterToUpdate) could reduce duplication and improve maintainability.
Example helper:

func (store *Store) applyLedgerFilterToUpdate(q *bun.UpdateQuery) *bun.UpdateQuery {
	return q.WhereGroup(" AND ", func(q *bun.UpdateQuery) *bun.UpdateQuery {
		ledgerFilter, ledgerArgs := store.getLedgerFilterSQL()
		if ledgerFilter != "" {
			return q.Where(ledgerFilter, ledgerArgs...)
		}
		return q
	})
}

Then simplify call sites:

-query = query.WhereGroup(" AND ", func(q *bun.UpdateQuery) *bun.UpdateQuery {
-	ledgerFilter, ledgerArgs := store.getLedgerFilterSQL()
-	if ledgerFilter != "" {
-		return q.Where(ledgerFilter, ledgerArgs...)
-	}
-	return q
-})
+query = store.applyLedgerFilterToUpdate(query)

Also applies to: 249-255, 284-290
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (6)
- internal/storage/driver/buckets_generated_test.go (4 hunks)
- internal/storage/driver/ledger_generated_test.go (2 hunks)
- internal/storage/driver/system_generated_test.go (3 hunks)
- internal/storage/ledger/accounts.go (2 hunks)
- internal/storage/ledger/store.go (3 hunks)
- internal/storage/ledger/transactions.go (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- internal/storage/ledger/store.go
- internal/storage/ledger/accounts.go
🧰 Additional context used
🧬 Code graph analysis (2)
internal/storage/driver/system_generated_test.go (2)
- internal/storage/ledger/legacy/main_test.go (1): T (40-44)
- pkg/testserver/server.go (1): T (29-34)
internal/storage/driver/buckets_generated_test.go (1)
- internal/storage/bucket/bucket_generated_test.go (2): MockBucket (19-22), MockBucketMockRecorder (25-27)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Tests
🔇 Additional comments (6)
internal/storage/driver/ledger_generated_test.go (1)
6-7: LGTM! Standard gomock generation updates.
The added blank lines and isgomock field are consistent with gomock v0.5.0 code generation patterns. These changes require no manual intervention.
Also applies to: 23-23
internal/storage/driver/buckets_generated_test.go (1)
6-7: LGTM! Standard gomock generation updates.
The additions of isgomock fields, blank lines, and parameter name changes (ledger → arg2) are all standard gomock v0.5.0 code generation artifacts with no functional impact.
Also applies to: 25-25, 46-48, 54-56, 157-157
internal/storage/driver/system_generated_test.go (2)
6-7: LGTM! Standard gomock generation updates.
The added blank lines and isgomock field are consistent with gomock v0.5.0 code generation patterns.
Also applies to: 26-26
46-59: LGTM! New mock method for single-ledger optimization.
The CountLedgersInBucket mock method and its recorder have been correctly generated to support the new interface method, which enables the PR's single-ledger query optimization feature.
internal/storage/ledger/transactions.go (2)
179-186: LGTM! Centralized ledger filtering improves maintainability.
The refactor to use store.applyLedgerFilter properly centralizes the ledger filtering logic for the union branch, making the optimization transparent to this code path.
210-216: Fragile [4:] prefix handling has been fixed.
The past review comments flagged these three locations for fragile [4:] slicing. The current implementation correctly checks if ledgerFilter != "" and calls q.Where(ledgerFilter, ledgerArgs...) directly without any string slicing, resolving the previous concerns.
Also applies to: 249-255, 284-290
Not a big fan: the same query (which checks the number of ledgers for the bucket) will be called multiple times across queries.
		&ledger.Ledger{},
	)
}),
fx.Provide(func(tracerProvider trace.TracerProvider) systemstore.StoreFactory {
You don't need to recreate the system store.
The system store already exists in the *Driver service.
Justification:
- The SystemStoreFactory must be injected into the ledgerstore.Factory to provide the CountLedgersInBucket function
- The Create(params.DB) call in the closure does not really "recreate" a heavy store - it is just a wrapper struct around the DB
- The context needed for CountLedgersInBucket is only available at runtime, so the closure is necessary
- Handle errors properly in isSingleLedger by returning (bool, error)
- Record errors in trace span instead of silently ignoring them
- Add explanatory comment for SystemStoreFactory provider
- Apply conservative behavior (use filter) when counting fails
Addresses review comments from PR #1090
Actionable comments posted: 2
♻️ Duplicate comments (1)
internal/storage/ledger/store.go (1)
175-180: Same error visibility concern as applyLedgerFilter.
Similar to applyLedgerFilter, errors are only recorded in the trace span. Consider adding structured logging at WARN or INFO level for better observability.
🧹 Nitpick comments (1)
internal/storage/ledger/store.go (1)
159-164: Consider logging optimization failures at a higher visibility level.
Errors from isSingleLedger are only recorded in the trace span (line 163), which may not be visible in standard application logs. Database connection issues or other serious problems could go unnoticed, making it difficult to diagnose why the optimization isn't being applied.
Consider adding structured logging at WARN or INFO level to surface these errors:

singleLedger, err := store.isSingleLedger(ctx)
if err != nil {
-	// Log error but continue with conservative behavior (apply filter)
	trace.SpanFromContext(ctx).RecordError(err)
+	// Log at higher visibility since optimization failures may indicate issues
+	// Use structured logging if available
+	// logger.Warn("Failed to check single-ledger optimization, applying filter", "error", err)
	return query.Where(fmt.Sprintf("%s.ledger = ?", tableAlias), store.ledger.Name)
}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
internal/storage/driver/module.go(2 hunks)internal/storage/ledger/store.go(4 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
internal/storage/ledger/store.go (1)
- internal/storage/system/store.go (2): Store (20-32), Option (168-168)
internal/storage/driver/module.go (4)
- internal/storage/system/factory.go (2): StoreFactory (7-9), NewStoreFactory (21-23)
- internal/storage/ledger/store.go (3): WithTracer (351-355), WithCountLedgersInBucketFunc (357-361), New (239-326)
- internal/storage/driver/driver.go (3): WithTracer (321-325), Driver (27-36), New (288-305)
- internal/storage/system/store.go (2): WithTracer (170-174), New (156-166)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Tests
🔇 Additional comments (3)
internal/storage/ledger/store.go (1)
357-361: LGTM!
The option follows the established pattern in the codebase and correctly injects the counting function dependency.
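For reference, a sketch of how such a functional option typically looks; the option name and the injected function signature come from the code-graph listing above, while the surrounding types are assumptions.

package ledgerstore

import "context"

type Store struct {
	countLedgersInBucket func(ctx context.Context, bucket string) (int, error)
}

type Option func(*Store)

// WithCountLedgersInBucketFunc injects the counting dependency used by the
// single-ledger detection; stores created without it keep the filter enabled.
func WithCountLedgersInBucketFunc(fn func(ctx context.Context, bucket string) (int, error)) Option {
	return func(s *Store) {
		s.countLedgersInBucket = fn
	}
}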
internal/storage/driver/module.go (2)
44-47: LGTM!
The dependency injection is clean. Making observability providers optional improves flexibility for testing and environments where full observability isn't required.
67-67: LGTM!
The systemStoreFactory parameter is correctly wired into the Driver constructor, consistent with the updated constructor signature.
Also applies to: 74-74
options = append(options, ledgerstore.WithCountLedgersInBucketFunc(
	func(ctx context.Context, bucketName string) (int, error) {
		return params.SystemStoreFactory.Create(params.DB).CountLedgersInBucket(ctx, bucketName)
	},
))
Critical performance issue: Store creation on every query negates optimization benefits.
The closure creates a new Store instance via params.SystemStoreFactory.Create(params.DB) on every invocation. Since this function is called by isSingleLedger() for every query operation that uses applyLedgerFilter or getLedgerFilterSQL, this introduces significant overhead:
- Store instantiation cost: Creating a new store may initialize metrics, histograms, and other resources
- Database query overhead: Queries the ledgers table on every application query
- Defeats optimization purpose: The overhead likely exceeds the 5-15% performance gain from skipping the WHERE clause
Impact: In high-throughput scenarios, this implementation may actually degrade performance rather than improve it.
Recommendation: Cache the ledger count result or compute it once at initialization:
// Option 1: Cache with expiry (if ledgers can be added/removed dynamically)
var (
cachedCount int
cachedBucket string
cacheExpiry time.Time
cacheMutex sync.RWMutex
cacheTTL = 5 * time.Minute
)
options = append(options, ledgerstore.WithCountLedgersInBucketFunc(
func(ctx context.Context, bucketName string) (int, error) {
cacheMutex.RLock()
if cachedBucket == bucketName && time.Now().Before(cacheExpiry) {
count := cachedCount
cacheMutex.RUnlock()
return count, nil
}
cacheMutex.RUnlock()
systemStore := params.SystemStoreFactory.Create(params.DB)
count, err := systemStore.CountLedgersInBucket(ctx, bucketName)
if err != nil {
return 0, err
}
cacheMutex.Lock()
cachedCount = count
cachedBucket = bucketName
cacheExpiry = time.Now().Add(cacheTTL)
cacheMutex.Unlock()
return count, nil
},
))

// Option 2: Compute once per bucket at ledger store creation (if ledgers are static)
// This requires architectural changes to compute and store the flag in the Store struct itself

🤖 Prompt for AI Agents
// This requires architectural changes to compute and store the flag in the Store struct itself🤖 Prompt for AI Agents
internal/storage/driver/module.go lines 56-60: the closure passed to
ledgerstore.WithCountLedgersInBucketFunc currently calls
params.SystemStoreFactory.Create(params.DB) on every invocation, causing
expensive store creation and DB hits on each query; change this to compute the
ledger count once and reuse it by either (a) caching the count per bucket with a
small TTL and mutex-protected reads/writes so the factory/create call and DB
query only happen on cache miss/expiry, or (b) compute the count once at store
initialization and store the value on the Store instance if ledgers are
effectively static; implement option (a) if ledgers change over time (use
sync.RWMutex, cachedBucket, cachedCount, cacheExpiry and a reasonable TTL) or
option (b) for static data, and ensure the closure reads the cached value
instead of creating a new SystemStore each call and only calls
SystemStoreFactory.Create when necessary.
func (store *Store) isSingleLedger(ctx context.Context) (bool, error) {
	if store.countLedgersInBucket == nil {
		return false, nil
	}
	count, err := store.countLedgersInBucket(ctx, store.ledger.Bucket)
	if err != nil {
		return false, fmt.Errorf("failed to count ledgers in bucket: %w", err)
	}
	return count == 1, nil
}
Cache the single-ledger check to avoid querying on every operation.
The method queries the database on every call to determine if the bucket contains a single ledger. Since applyLedgerFilter and getLedgerFilterSQL call this method for every query operation, this introduces database overhead that may negate the optimization benefit.
The ledger count in a bucket is unlikely to change frequently during normal operation. Consider:
- Caching the result with a TTL or cache invalidation strategy
- Computing the value once at Store initialization
- Using a sync.Once pattern for lazy initialization
Without caching, the optimization may actually degrade performance in high-throughput scenarios.
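A minimal sketch of the sync.Once variant suggested above, resolving the count lazily on first use. The countLedgersInBucket field matches the injected function from this PR; everything else is illustrative.

package ledgerstore

import (
	"context"
	"sync"
)

type Store struct {
	ledger               struct{ Name, Bucket string }
	countLedgersInBucket func(ctx context.Context, bucket string) (int, error)

	singleLedgerOnce sync.Once
	singleLedger     bool
}

// isSingleLedger resolves the bucket's ledger count at most once per Store.
// If the lookup fails (or no counter is injected), the flag stays false and the
// ledger filter keeps being applied - the conservative behavior the PR already chose.
func (s *Store) isSingleLedger(ctx context.Context) bool {
	s.singleLedgerOnce.Do(func() {
		if s.countLedgersInBucket == nil {
			return
		}
		count, err := s.countLedgersInBucket(ctx, s.ledger.Bucket)
		if err != nil {
			return
		}
		s.singleLedger = count == 1
	})
	return s.singleLedger
}

One trade-off of sync.Once is that a transient error is never retried for the lifetime of the Store; the TTL cache shown earlier avoids that at the cost of periodic re-queries.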