Rutefig/reg 574 fix prevent capture groups inside of private patterns #109
…re, preserves special groups
…oups inside of public patterns into non capture
Walkthrough
Refactors control flow in two IR modules by splitting combined conditional guards into nested if blocks without changing behavior. Introduces a private helper that converts bare capturing groups to non-capturing ones, applies it when composing regex parts, and adds unit and end-to-end tests validating regex conversion and capture counts.
@coderabbitai review
Actions performed: review triggered.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- compiler/src/ir/graph.rs (1 hunk)
- compiler/src/ir/intermediate.rs (2 hunks)
- compiler/src/utils.rs (3 hunks)
…group conversion
The convert_capturing_to_non_capturing function was naively converting
all '(' to '(?:' without considering:
- Escaped parentheses like \( which match literal parens
- Parentheses inside character classes like [()] which are literals
This caused regex semantics to break. For example:
- \(foo\) would become \(?:foo\) (wrong)
- [()] would become [(?:)] (matches extra chars)
Fixed by tracking two states during string scan:
- in_char_class: are we inside [...]?
- escaped: was previous char a backslash?
Only convert ( to (?: when it's not escaped, not in a char class,
and not already followed by ?.
Added 5 new tests covering escaped parens, character classes, and
mixed cases. All 23 tests pass.
Addresses PR review feedback: #109 (comment)
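The two-state scan described in this commit can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: the function name `convert_bare_groups` is hypothetical, and the sketch ignores edge cases such as a literal `]` placed first in a character class.

```rust
/// Hypothetical sketch of the scan described above (not the PR's code).
/// Rewrites bare capturing groups "(" as "(?:" while leaving escaped
/// parentheses, parentheses inside [...], and "(?..." groups untouched.
fn convert_bare_groups(pattern: &str) -> String {
    let mut out = String::with_capacity(pattern.len() + 8);
    let mut in_char_class = false; // are we inside [...]?
    let mut escaped = false; // was the previous char an unescaped backslash?
    let mut chars = pattern.chars().peekable();

    while let Some(c) = chars.next() {
        if escaped {
            // The character after a backslash is always copied verbatim.
            out.push(c);
            escaped = false;
            continue;
        }
        match c {
            '\\' => {
                escaped = true;
                out.push(c);
            }
            '[' if !in_char_class => {
                in_char_class = true;
                out.push(c);
            }
            ']' if in_char_class => {
                in_char_class = false;
                out.push(c);
            }
            // Convert only when outside a class and not already "(?".
            '(' if !in_char_class && chars.peek() != Some(&'?') => out.push_str("(?:"),
            _ => out.push(c),
        }
    }
    out
}
```

Under these assumptions, `\(foo\)` and `[()]` pass through unchanged, while `(a)` becomes `(?:a)`.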
@coderabbitai review
Actions performed: review triggered.
Extended convert_capturing_to_non_capturing to handle both PCRE-style
(?<name>...) and Rust-style (?P<name>...) named capture groups, not just
bare capturing groups. This prevents named captures in private patterns
from interfering with public capture group numbering.
Key changes:
- Detect and convert both named capture syntaxes to (?:...)
- Distinguish lookbehind assertions (?<=, ?<!) from named captures
- Updated test_preserve_named_groups to test_convert_pcre_named_groups
  with correct expected behavior
- Added 7 new tests covering both named capture styles and edge cases
Fixes code review finding that named captures were being preserved
instead of converted, which would cause capture group numbering
mismatches in generated circuits.
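One way to picture the named-group handling in this commit is the sketch below. It is an assumption-laden illustration rather than the PR's implementation: `convert_groups` and `named_prefix_len` are hypothetical names, and character-class parsing is simplified (the first `]` closes the class).

```rust
/// Hypothetical helper: given the chars after "(", returns how many of them
/// form a named-group prefix ("?<name>" or "?P<name>"), or None otherwise.
fn named_prefix_len(rest: &[char]) -> Option<usize> {
    // Index of '<' within `rest`: 1 for "?<", 2 for "?P<".
    let lt = match rest {
        ['?', '<', ..] => 1,
        ['?', 'P', '<', ..] => 2,
        _ => return None,
    };
    // "(?<=" and "(?<!" are lookbehind assertions, not named groups.
    let after_lt = rest.get(2).copied();
    if lt == 1 && matches!(after_lt, Some('=') | Some('!')) {
        return None;
    }
    // Consume everything up to and including the '>' that ends the name.
    rest.iter().position(|&ch| ch == '>').map(|gt| gt + 1)
}

/// Hypothetical sketch: converts bare and named capturing groups to "(?:".
fn convert_groups(pattern: &str) -> String {
    let chars: Vec<char> = pattern.chars().collect();
    let mut out = String::with_capacity(pattern.len() + 8);
    let mut in_char_class = false;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if c == '\\' {
            // Copy the backslash and the escaped character verbatim.
            out.push(c);
            if let Some(&next) = chars.get(i + 1) {
                out.push(next);
            }
            i += 2;
            continue;
        }
        if in_char_class {
            in_char_class = c != ']';
        } else if c == '[' {
            in_char_class = true;
        } else if c == '(' {
            let rest = &chars[i + 1..];
            if let Some(len) = named_prefix_len(rest) {
                out.push_str("(?:");
                i += 1 + len; // skip "(" plus the "?<name>" / "?P<name>" prefix
                continue;
            }
            if rest.first() != Some(&'?') {
                out.push_str("(?:");
                i += 1;
                continue;
            }
        }
        out.push(c);
        i += 1;
    }
    out
}
```

In this sketch, `(?<user>[a-z]+)` and `(?P<user>[a-z]+)` both become `(?:[a-z]+)`, while `(?<=a)b` and `(?<!a)b` are left as written.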
@coderabbitai review
Actions performed: review triggered.
Problem
When users create DecomposedRegexConfig with Pattern parts (non-public patterns), they can accidentally use capturing groups (...) which breaks the circuit's expectation of capture group numbering.
Example:
The PublicPattern should be capture group 1 for the circuit, but the capturing group in the Pattern part shifts it to group 2, causing circuit verification failures.
This PR fixes the issue by converting capturing groups inside private patterns into non-capturing groups, while preserving special groups and any non-capturing groups the user wrote intentionally.
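The numbering shift can be illustrated with a small, self-contained sketch. Everything here is hypothetical: the `Part` enum is a stand-in rather than the library's real type, the sketch assumes the public part is wrapped in one capturing group when the parts are concatenated, and it counts only bare groups (named groups are ignored for brevity).

```rust
/// Hypothetical stand-in for the parts of a decomposed regex.
#[allow(dead_code)]
enum Part {
    Pattern(&'static str),
    PublicPattern(&'static str),
}

/// Counts unescaped "(" outside character classes that are not followed
/// by "?", i.e. bare capturing groups only (a simplification).
fn count_bare_groups(pattern: &str) -> usize {
    let chars: Vec<char> = pattern.chars().collect();
    let (mut count, mut in_class, mut i) = (0usize, false, 0usize);
    while i < chars.len() {
        match chars[i] {
            '\\' => i += 1, // skip the escaped character
            '[' if !in_class => in_class = true,
            ']' if in_class => in_class = false,
            '(' if !in_class && chars.get(i + 1) != Some(&'?') => count += 1,
            _ => {}
        }
        i += 1;
    }
    count
}

/// Returns the 1-based capture index the first public part would get,
/// assuming it is wrapped in a single capturing group after the private parts.
fn public_group_index(parts: &[Part]) -> Option<usize> {
    let mut preceding = 0;
    for part in parts {
        match part {
            Part::Pattern(p) => preceding += count_bare_groups(p),
            Part::PublicPattern(_) => return Some(preceding + 1),
        }
    }
    None
}
```

With a private part such as `(a|b)+:` the public part lands at index 2; once the private part is rewritten as `(?:a|b)+:` it is back at index 1, which is what the circuit expects according to the description above.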