Enhance --breakout with multiple values and regex patterns (fixes #110)#154

Merged
atheurer merged 8 commits into master from multiple-breakout-values on Feb 23, 2026

Conversation

@atheurer
Contributor

Summary

This PR enhances the --breakout option in get-metric-data.js with support for multiple values and regex pattern matching, addressing issue #110.

Features Implemented

1. Multiple Literal Values

Syntax: --breakout hostname=a,b,c
Result: Returns separate metrics for each value
Use Case: Compare specific hosts or components

2. Regex Pattern Matching

Syntax:

  • --breakout hostname=r/pattern/ → Separate metrics per match
  • --breakout hostname=R/pattern/ → Aggregated metric for all matches

Features:

  • Custom delimiters: Use r|pattern| when pattern contains /
  • Regex alternation for specific values: R/worker-1|worker-2|worker-3/
  • Consistent lowercase/uppercase convention
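The delimiter and case rules above can be sketched as a small parser. This is only an illustration, not the code in get-metric-data.js: `parseRegexSpec` is a hypothetical name, and the rule that the delimiter must be non-alphanumeric is an assumption made here so that literal values such as `rack1` are not mistaken for patterns.

```javascript
// Hypothetical helper (not from the PR): parse a breakout value such as
// "r/^worker-.*/" or "R|/dev/sd.*|". Returns null for literal values.
function parseRegexSpec(value) {
  // Need at least: mode letter, delimiter, one pattern character, delimiter.
  if (value.length < 4) return null;
  const mode = value[0];
  if (mode !== "r" && mode !== "R") return null;
  const delimiter = value[1];
  // Assumption: delimiters are non-alphanumeric, so "rack1" stays a literal.
  if (/[A-Za-z0-9_]/.test(delimiter)) return null;
  // The same delimiter must close the pattern.
  if (value[value.length - 1] !== delimiter) return null;
  return {
    pattern: value.slice(2, -1), // text between the two delimiters
    aggregate: mode === "R", // uppercase R means one combined metric
  };
}

console.log(parseRegexSpec("r|/dev/sd.*|")); // pattern "/dev/sd.*", aggregate false
console.log(parseRegexSpec("R/worker-1|worker-2/")); // aggregate true
```

Because only the first delimiter character is fixed, a `|` inside an `R/.../` pattern is still treated as regex alternation rather than as a delimiter.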

3. Enhanced Error Handling

Feature: Clear error messages when filters match nothing
Benefit: Actionable feedback instead of cryptic errors

Complete Syntax Matrix

| Syntax | Type | Result | Use Case |
|--------|------|--------|----------|
| hostname=a | Single literal | 1 metric for 'a' | Query specific host |
| hostname=a,b,c | Multiple literals | 3 separate metrics | Compare specific hosts |
| hostname=r/pattern/ | Regex separate | N separate metrics | Compare all matching |
| hostname=R/pattern/ | Regex aggregated | 1 combined metric | Total of all matching |
| hostname=R/a\|b\|c/ | Regex alternation | 1 combined metric | Aggregate specific values |
| hostname | No filter | All values (separate) | Explore all options |

Design Philosophy

The implementation uses a consistent, intuitive syntax pattern:

  • Lowercase r = Separate/individual metrics
  • Uppercase R = Aggregated/combined metrics

This provides maximum flexibility while maintaining a clean, learnable interface.

Example Usage

# Multiple literal values - separate metrics
node ./get-metric-data.js --period <UUID> --source sar-net --type L2-Gbps \
  --breakout csid=1,2,cstype=worker

# Regex with separate metrics per match
node ./get-metric-data.js --period <UUID> --source mpstat --type Busy-CPU \
  --breakout hostname=r/^worker-.*/

# Regex with aggregated metric for all matches
node ./get-metric-data.js --period <UUID> --source sar-net --type L2-Gbps \
  --breakout hostname=R/^client-.*/

# Aggregate specific values using regex alternation
node ./get-metric-data.js --period <UUID> --source mpstat --type Busy-CPU \
  --breakout hostname=R/worker-1|worker-2|worker-3/

# Custom delimiter when pattern contains slashes
node ./get-metric-data.js --period <UUID> --source iostat --type kB-sec \
  --breakout dev=r|/dev/sd.*|

# Mix different filter types
node ./get-metric-data.js --period <UUID> --source mpstat --type Busy-CPU \
  --breakout hostname=r/worker-[0-9]+/,cstype=physical

Use Cases

Separate Metrics (comma-separated values or r/pattern/):

  • Compare performance across specific hosts
  • Identify outliers or problem nodes
  • Generate per-host charts and reports

Aggregated Metrics (R/):

  • Measure total throughput across a cluster: R/^worker-.*/
  • Aggregate specific hosts: R/worker-1|worker-2|worker-3/
  • Calculate combined CPU usage of worker nodes
  • Aggregate network traffic: R/eth0|eth1/

Technical Implementation

  • Parsing: Smart detection of separators in list() function
  • Query: Uses OpenSearch terms filter for multiple values, regexp for patterns
  • Aggregation: Controlled by excluding fields from nested aggregation structure
  • Error Handling: Detects empty results and provides actionable feedback
  • Documentation: Comprehensive guide with examples and feature matrix
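As a rough illustration of the Query bullet, the sketch below maps one parsed filter to an OpenSearch filter clause. The `term`, `terms`, and `regexp` clause shapes are standard OpenSearch query DSL. The function name `buildFilterClause`, the filter object shape, and the field path `metric_desc.names.hostname` are assumptions for this example and may not match the actual code.

```javascript
// Hypothetical sketch: convert one parsed breakout filter into an
// OpenSearch filter clause. `fieldPath` is an assumed document path.
function buildFilterClause(fieldPath, filter) {
  if (filter.pattern !== undefined) {
    // r/pattern/ or R/pattern/: match field values with a regexp query
    return { regexp: { [fieldPath]: filter.pattern } };
  }
  if (filter.values.length === 1) {
    // hostname=a: exact match on a single value
    return { term: { [fieldPath]: filter.values[0] } };
  }
  // hostname=a,b,c: match any of the listed values
  return { terms: { [fieldPath]: filter.values } };
}

const clause = buildFilterClause("metric_desc.names.hostname", {
  values: ["worker-1", "worker-2"],
});
console.log(JSON.stringify(clause));
```

Whether the matches come back as separate metrics or one combined metric is decided elsewhere, by the aggregation structure, so the same clause serves both the lowercase and uppercase regex forms.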

Backward Compatibility

✅ Fully backward compatible - all existing usage patterns work unchanged

Testing

  • Code follows existing patterns
  • Comprehensive documentation with examples
  • Error handling tested with non-matching filters
  • Feature matrix documents all syntax options
  • Maintains backward compatibility
  • Regex alternation tested for aggregating specific values
  • R/pattern/ filtering tested and verified

Commits

  1. c879aea - Add support for multiple values in --breakout option
  2. 54b29df - Add regex pattern support for breakout filters
  3. 217d77e - Add helpful error message when regex filter matches nothing
  4. 054ae2e - Revert aggregated literal values (use regex alternation instead)
  5. c8ac61c - Update documentation for regex alternation approach
  6. 0f8c0c9 - Fix R/pattern/ to correctly filter aggregated results
  7. d721b0f - prettier

Closes #110

🤖 Generated with Claude Code

atheurer and others added 8 commits February 19, 2026 11:34
Enhanced the --breakout option in get-metric-data.js to support
comma-separated values (e.g., --breakout hostname=a,b,c) which returns
separate metrics for each specified value. This addresses issue #110.

Changes:
- Modified list() parser to distinguish between field separators and value lists
- Updated OpenSearch query builder to use "terms" query for multiple values
- Added documentation with examples and usage guidelines
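
One way to picture the field-separator versus value-list distinction is the sketch below. It is not the actual list() parser: `splitBreakout` is a hypothetical name, and the rules are assumptions (a token containing `=` starts a new field; a bare token after a filtered field is another value for that field; regex values containing `,` or `=` are not handled).

```javascript
// Hypothetical sketch of splitting a --breakout argument into fields.
function splitBreakout(spec) {
  const fields = [];
  for (const token of spec.split(",")) {
    const eq = token.indexOf("=");
    const last = fields[fields.length - 1];
    if (eq !== -1) {
      // "csid=1" starts a new field with its first value
      fields.push({ field: token.slice(0, eq), values: [token.slice(eq + 1)] });
    } else if (last && last.values.length > 0) {
      // bare token after a filtered field: another value for that field
      last.values.push(token);
    } else {
      // bare token otherwise: a field name with no filter
      fields.push({ field: token, values: [] });
    }
  }
  return fields;
}

console.log(JSON.stringify(splitBreakout("csid=1,2,cstype=worker")));
```

Under these assumed rules, `csid=1,2,cstype=worker` yields the field `csid` with values `1` and `2`, followed by the field `cstype` with the value `worker`.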

The implementation maintains backward compatibility and is designed to
support future aggregation syntax (e.g., hostname=a+b).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implemented regex pattern matching in breakout filters with two modes:
- r/pattern/ (lowercase): Returns separate metrics for each matching value
- R/pattern/ (uppercase): Returns single aggregated metric for all matches

Features:
- Custom delimiter support: use any character after r/R as delimiter
  (e.g., r/pattern/, r|pattern|, r#pattern#)
- Consistent syntax with literal values (r vs R parallels , vs +)
- OpenSearch regexp query integration for efficient pattern matching

Examples:
- --breakout hostname=r/^worker-.*/ (separate metrics per worker)
- --breakout hostname=R/^client-.*/ (aggregated metric for all clients)
- --breakout dev=r|/dev/sd.*| (custom delimiter for patterns with /)

Implementation:
- Modified getBreakoutAggregation() to exclude fields with R/pattern/
- Updated getMetricGroupsFromBreakouts() to detect and apply regexp filters
- Added comprehensive documentation with examples and use cases
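
The exclusion idea can be sketched as follows. This is a simplified illustration and not the body of getBreakoutAggregation(): `buildNestedAggs`, `isAggregatedRegex`, the `{ field, value }` input shape, and the `metric_desc.names.` path prefix are all assumptions. Fields filtered with R/pattern/ are left out of the nested terms aggregation, so their values are not split into separate buckets.

```javascript
// Hypothetical check: does this value use the uppercase (aggregated) form?
function isAggregatedRegex(value) {
  return (
    typeof value === "string" &&
    value.length >= 4 &&
    value[0] === "R" &&
    value[value.length - 1] === value[1]
  );
}

// Hypothetical sketch: nest one terms aggregation per remaining field.
function buildNestedAggs(breakouts, pathPrefix = "metric_desc.names.") {
  const kept = breakouts.filter((b) => !isAggregatedRegex(b.value));
  let aggs; // built from the innermost level outwards
  for (const b of kept.reverse()) {
    const level = { terms: { field: pathPrefix + b.field } };
    if (aggs) level.aggs = aggs;
    aggs = { [b.field]: level };
  }
  return aggs || {};
}

console.log(
  JSON.stringify(
    buildNestedAggs([
      { field: "hostname", value: "R/^client-.*/" },
      { field: "cstype", value: "worker" },
    ])
  )
);
```

In this example only `cstype` appears in the aggregation, so all matching hostnames fall into the same bucket.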

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When a regex breakout filter (r/pattern/ or R/pattern/) doesn't match
any metric values, the query previously failed with a cryptic error:
"number of generated data sets (0) does not match the number of
metric query sets (1)"

This commit adds detection for empty result sets caused by regex filters
and returns a clear, actionable error message explaining:
- Which source/type was queried
- Which regex filter(s) didn't match
- Suggestions for troubleshooting

Example error output:
  No metrics found matching the specified filter(s) for source=mpstat, type=Busy-CPU
    Regex filter hostname=r/^nonexistent-.*/ did not match any values.
  Please verify:
    1. The regex pattern is correct
    2. Metrics exist for this source/type with the specified field
    3. The field values match the pattern
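
A message in this format could be assembled roughly as shown below. The function name `buildNoMatchMessage` and its parameters are hypothetical; only the wording is taken from the example output above.

```javascript
// Hypothetical sketch: build the "no match" message for regex filters.
function buildNoMatchMessage(source, type, regexFilters) {
  const lines = [
    `No metrics found matching the specified filter(s) for source=${source}, type=${type}`,
  ];
  // One line per regex filter that was part of the query
  for (const { field, value } of regexFilters) {
    lines.push(`  Regex filter ${field}=${value} did not match any values.`);
  }
  lines.push(
    "Please verify:",
    "  1. The regex pattern is correct",
    "  2. Metrics exist for this source/type with the specified field",
    "  3. The field values match the pattern"
  );
  return lines.join("\n");
}

console.log(
  buildNoMatchMessage("mpstat", "Busy-CPU", [
    { field: "hostname", value: "r/^nonexistent-.*/" },
  ])
);
```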

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implemented aggregation of multiple literal values using the plus (+)
separator, completing the unified syntax design for breakout filters.

Syntax:
- hostname=a,b,c (comma): Returns 3 separate metrics
- hostname=a+b+c (plus): Returns 1 aggregated metric combining a, b, and c

Features:
- Consistent with regex syntax (r vs R parallels , vs +)
- Uses same OpenSearch "terms" filter for both , and +
- Aggregation controlled by getBreakoutAggregation() (excludes field)
- Enhanced error messages for both comma and plus separated filters

Implementation:
- Modified getBreakoutAggregation() to detect + and exclude from aggregation
- Updated query builder to split on + and create terms filter
- Extended error handling to cover literal value filters
- Added comprehensive documentation with examples and feature matrix

Examples:
- --breakout hostname=worker-1+worker-2+worker-3 (aggregated)
- --breakout hostname=worker-1,worker-2,worker-3 (separate)
- --breakout cstype=worker+master (combined metric for both types)

Complete Feature Matrix:
| Syntax | Result |
|--------|--------|
| hostname=a | 1 metric for 'a' |
| hostname=a,b,c | 3 separate metrics |
| hostname=a+b+c | 1 aggregated metric |
| hostname=r/pattern/ | N separate metrics |
| hostname=R/pattern/ | 1 aggregated metric |

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…c values

Instead of the reverted a+b syntax, document the use of regex alternation
with uppercase R to aggregate specific literal values:

- hostname=R/worker-1|worker-2|worker-3/ aggregates those 3 specific hosts
- This approach works correctly with the existing regex implementation
- Provides the same functionality without additional code complexity

Updated examples and removed references to future a+b+c syntax.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed critical bug where R/pattern/ (aggregated regex) was including ALL
values instead of only those matching the pattern.

Root Cause:
When using R/pattern/, the field is excluded from the aggregation structure
(correct - this causes aggregation). However, the regexp filter was only
applied to the initial aggregation query, not when querying for metric IDs.
This meant the metric ID query had no regexp filter, resulting in ALL
metric IDs being included in the aggregated result.

Solution:
1. Extract regexp filters for aggregated fields (R/pattern/) after aggregation
2. Pass these filters to mgetMetricIdsFromTerms via termsSets
3. Apply the regexp filters when building metric ID queries

Example:
--breakout hostname=R/worker-1|worker-2/
Before: Aggregated ALL hostnames (worker-1, worker-2, worker-3)
After:  Aggregates ONLY worker-1 and worker-2 (correct)

Technical Details:
- Modified getMetricGroupsFromBreakouts to extract and preserve R/ filters
- Modified mgetMetricIdsFromTerms to apply preserved regexp filters
- Filters are added to the query.bool.filter array for metric ID lookups
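
The last step can be sketched as appending the preserved regexp clauses to a metric ID query. This is an illustration under assumptions, not the code in mgetMetricIdsFromTerms: `withRegexpFilters`, the `{ fieldPath, pattern }` shape, and the field paths are hypothetical. Only the `query.bool.filter` location comes from the description above.

```javascript
// Hypothetical sketch: add preserved R/pattern/ filters to a metric ID query.
function withRegexpFilters(queryBody, regexpFilters) {
  // Deep-copy so the caller's query object is left untouched
  const body = JSON.parse(JSON.stringify(queryBody));
  body.query = body.query || {};
  body.query.bool = body.query.bool || {};
  body.query.bool.filter = body.query.bool.filter || [];
  for (const { fieldPath, pattern } of regexpFilters) {
    body.query.bool.filter.push({ regexp: { [fieldPath]: pattern } });
  }
  return body;
}

const base = {
  query: { bool: { filter: [{ term: { "metric_desc.source": "mpstat" } }] } },
};
const filtered = withRegexpFilters(base, [
  { fieldPath: "metric_desc.names.hostname", pattern: "worker-1|worker-2" },
]);
console.log(JSON.stringify(filtered.query.bool.filter));
```

Without the extra clause, the lookup would be constrained only by the existing filters, which matches the "Before" behavior described in the example.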

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Run prettier --write on modified files to fix CI formatting checks

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@project-crucible-tracking project-crucible-tracking bot moved this from Queued to In Progress in Crucible Tracking Feb 23, 2026
@atheurer atheurer merged commit 5bf6086 into master Feb 23, 2026
48 of 147 checks passed
@github-project-automation github-project-automation bot moved this from In Progress to Done in Crucible Tracking Feb 23, 2026
@k-rister k-rister deleted the multiple-breakout-values branch February 23, 2026 19:33

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

Enhance "--breakout hostname=" to support a list i.e hostname=A,B,C

2 participants