Skip to content

Add pr review skill#584

Draft
st0012 wants to merge 2 commits intomainfrom
add-pr-review-skill
Draft

Add pr review skill#584
st0012 wants to merge 2 commits intomainfrom
add-pr-review-skill

Conversation

@st0012
Copy link
Member

@st0012 st0012 commented Feb 11, 2026

I don't really plan to introduce such skill to this project. I'm mostly testing a skill I wrote that generates skills.

Test using small PR (#573)

With this skill

Details
Summary: Replaces Option<NameId> on MethodDefinition.receiver with a Receiver enum (SelfReceiver(DefinitionId) | ConstantReceiver(NameId)), making
 def self.foo vs def Foo.bar distinguishable at the type level. SelfReceiver enables direct definition_id_to_declaration_id lookup instead of name
 resolution. Adds UnresolvedMethodReceiver diagnostic for unresolvable receiver cases.                                                            
                                                                                                                                                  
Critical Issues (must fix):                                                                                                                       

- resolution.rs (diff, ivar handling for SelfReceiver) — .expect() will panic when the enclosing class is unresolvable. In the instance variable
resolution path, the SelfReceiver branch uses:
Receiver::SelfReceiver(def_id) => *self
    .graph
    .definition_id_to_declaration_id(*def_id)
    .expect("SelfReceiver definition should have a declaration"),
- Consider class Foo::Bar; def self.baz; @x = 1; end; end where Foo is undefined:
  a. Foo::Bar fails resolution → no declaration created
  b. Method def self.baz has SelfReceiver(class_def_id) → method resolution correctly emits UnresolvedMethodReceiver diagnostic and continues
  c. Instance variable @x has lexical_nesting_id pointing to the method → method.receiver() returns Some(SelfReceiver(class_def_id)) →
definition_id_to_declaration_id(class_def_id) returns None → panic

The method resolution path handles this gracefully; the ivar path should do the same:
Receiver::SelfReceiver(def_id) => {
    let Some(&decl_id) = self.graph.definition_id_to_declaration_id(*def_id) else {
        // Enclosing class unresolvable — skip ivar
        continue;
    };
    decl_id
}

Recommendations (should fix):

- graph.rs:282 — add_definition_diagnostic is pub but only used within the crate. Should be pub(crate) per project conventions:
pub(crate) fn add_definition_diagnostic(...)

Observations:

- No memory regression: Receiver is 16 bytes. Option<Receiver> is also 16 bytes via niche optimization on the enum discriminant — same as the
original Option<NameId> (16 bytes). MethodDefinition stays at 112 bytes. Verified by assert_mem_size! macros.
- MethodDefinition::id() hash changes: SelfReceiver now uses def_id.to_string() instead of the old NameId.to_string(). This changes the hash
computation for def self.foo style methods. Safe since the graph is rebuilt from scratch during resolution (clear_declarations()), but
serialized/cached IDs (if any) would be invalidated.
- Reviewer feedback addressed: OrphanInstanceVariable diagnostic removed per vinistock's feedback (root cause is always an unresolved constant,
not the orphaned definition). Clone/Copy removed from Receiver derives per vinistock. assert_mem_size! for Receiver added per Morriar. Offset
clone explained by Stan (Diagnostic::new takes Offset by value).
- Type-level correctness: The Receiver enum is a clean design that eliminates the previous ambiguity where def self.foo and def Foo.bar were both
represented as NameId and indistinguishable during resolution. This is necessary groundwork for PR #515 (anonymous Class/Module support), where
anonymous classes have no names.

Test Coverage:

- diagnostic_for_self_receiver_in_unresolvable_class — covers class Foo::Bar; def self.baz; end
- diagnostic_for_singleton_method_on_undefined_constant — covers def Foo.bar; end
- assert_method_has_receiver! macro updated to handle both Receiver variants
- Two Class.new tests updated to assert SelfReceiver variant specifically
- Missing: Test for ivar inside a singleton method in an unresolvable class (would catch the .expect() panic above)

Nits:

- resolution.rs (diff, SelfReceiver diagnostic message) — "Singleton method in unresolved namespace" could be more specific: "Singleton method
receiver self in unresolved namespace" to distinguish from the ConstantReceiver case at a glance.

Files Reviewed:

- rust/rubydex/src/diagnostic.rs — 1 finding (no issues, clean addition)
- rust/rubydex/src/indexing/ruby_indexer.rs — No issues found (receiver mapping and test macro updates are correct)
- rust/rubydex/src/model/definitions.rs — No issues found (Receiver enum, size assertions, id computation all correct)
- rust/rubydex/src/model/graph.rs — 1 finding (pub visibility)
- rust/rubydex/src/resolution.rs — 2 findings (1 critical .expect() panic, 1 nit on message)

With the existing review_changes skill (after checking out locally)

Details
Review: Introduce Receiver enum for singleton method definitions                                                                                  
                                                                                                                                                  
Summary                                               
                                                       
The branch replaces Option<NameId> with Option<Receiver> on MethodDefinition, distinguishing between def self.foo (SelfReceiver(DefinitionId)) and
 def Foo.bar (ConstantReceiver(NameId)). It also adds diagnostics for unresolvable method receivers during resolution. The design is sound and the
 overall direction is good.

Bug: Panic in instance variable resolution for SelfReceiver

resolution.rs:377-380 — The instance variable path uses .expect() for SelfReceiver:

Receiver::SelfReceiver(def_id) => *self
    .graph
    .definition_id_to_declaration_id(*def_id)
    .expect("SelfReceiver definition should have a declaration"),

This will panic when the enclosing class/module couldn't be resolved. For example:

class Foo::Bar     # Foo is undefined, Foo::Bar can't resolve
  def self.baz
    @x = 1         # instance variable triggers the panic path
  end
end

The method resolution branch at resolution.rs:280 handles this exact scenario gracefully with let Some(&owner_decl_id) = ... else { continue; }.
The instance variable path should use the same pattern (emit a diagnostic or silently continue), not .expect().

Correctness

- The indexer correctly distinguishes self receivers (line 1428: SelfReceiver(DefinitionId)) from constant receivers (line 1430-1432:
ConstantReceiver(NameId)).
- The SelfReceiver path in method resolution (line 280) correctly uses definition_id_to_declaration_id for a direct lookup instead of resolving
through the name system — this is a nice optimization that avoids the intermediate name resolution step.
- The name generation in ruby_indexer.rs:1042-1053 correctly reconstructs the name for SelfReceiver by looking up the definition's name_id,
preserving the existing behavior.
- Singleton class method definitions (line 1459-1462) correctly propagate SelfReceiver with the current nesting's definition ID.

Code Quality

- Receiver enum design: Clean 2-variant enum with clear doc comments. Good use of assert_mem_size! at 16 bytes.
- MethodDefinition size unchanged at 112 bytes — Option<Receiver> (16 bytes via niche optimization) is the same size as the previous
Option<NameId> (16 bytes), so no memory regression.
- add_definition_diagnostic on Graph (graph.rs:282-286): Clean helper. The silent no-op when the document doesn't exist is acceptable as a
defensive guard, but worth noting — it could mask bugs if a uri_id is somehow wrong.

Test Coverage

- The two new resolution tests (diagnostic_for_self_receiver_in_unresolvable_class and diagnostic_for_singleton_method_on_undefined_constant)
cover the happy paths for both diagnostic scenarios.
- The updated assert_method_has_receiver! macro correctly handles both Receiver variants.
- Missing test: Instance variable inside a singleton method in an unresolvable namespace (the bug above). This would catch the panic:
class Foo::Bar
  def self.baz
    @x = 1
  end
end

Performance

No concerns. The SelfReceiver path is actually an improvement — it does a direct definition_id_to_declaration_id lookup instead of going through
the name resolution system (resolve_lexical_owner → name lookup → declaration lookup).

Documentation / AGENTS.md

No changes needed. The Receiver enum is self-documenting and the refactor is internal to the resolution/indexing logic.

Test using big PR (#515)

With the skill

Details
PR Review: Capture DSL definitions and support Class|Module.new                                                                                   
                                                       
Summary: This PR replaces the inline Class.new/Module.new handling (which created ClassDefinition/ModuleDefinition during indexing) with a        
generalized DSL framework that defers processing to resolution. The architecture is well-designed — a DslDefinition captures DSL calls during   
indexing, DslProcessor registry enables future extensibility, and Unit::Dsl integrates DSL processing into the resolution loop. Resolution is 68%
faster due to compute_all_name_depths memoization, but memory increased by 22% (+625 MB) which needs investigation.

Critical Issues (must fix):

- model/graph.rs:~323-352 — O(n) linear scan in nesting_name_id() and definition_to_declaration_id() for Definition::Dsl variants. Both iterate
ALL definitions to find the corresponding DynamicClass/DynamicModule. The deterministic ID pattern (DynamicClassDefinition::id_for_dsl()) is
already used in dsl_processors.rs for O(1) lookup — use it here too:
// In nesting_name_id(), replace the for-loop with:
if let Definition::Dsl(_) = definition {
    let dc_id = DynamicClassDefinition::id_for_dsl(definition_id);
    if let Some(Definition::DynamicClass(d)) = self.definitions().get(&dc_id) {
        return d.name_id().copied();
    }
    let dm_id = DynamicModuleDefinition::id_for_dsl(definition_id);
    if let Some(Definition::DynamicModule(d)) = self.definitions().get(&dm_id) {
        return d.name_id().copied();
    }
}
- Same fix for the Definition::Dsl arm in definition_to_declaration_id(). These functions are called per-definition during resolution — the linear
 scan makes them O(n) per call, potentially O(n^2) overall.
- resolution.rs:~634-660 — resolve_lexical_owner returns None for unprocessed DSLs, causing silent declaration drops. When a DSL fails to process
(receiver never resolves), resolve_lexical_owner returns None via get_dsl_namespace_declaration(id). All callers use let Some(owner_id) = ... else
 { continue; }, silently dropping method/attr declarations inside the DSL block. Instead, when the DSL has no declaration, fall back to the DSL's
own lexical_nesting_id:
if matches!(definition, Definition::Dsl(_)) {
    if let Some(decl_id) = self.get_dsl_namespace_declaration(id) {
        return Some(decl_id);
    }
    // Fallback: walk up through the DSL's own lexical nesting
    let dsl_nesting = *definition.lexical_nesting_id();
    return self.resolve_lexical_owner(dsl_nesting);
}

Recommendations (should fix):

- model/dsl.rs:~56-78 — DslArgument contains heap-allocated String values, NOT interned. Splat(String), DoubleSplat(String), BlockArg(String),
KeywordArg { key: String, value: String } all allocate heap strings. At Rails scale (469 DynamicClass + many more DSL captures), these contribute
to the +625 MB memory increase. Use StringId for argument strings, or at minimum Box<str> (16 bytes vs 24 bytes for String).
- model/definitions.rs — Missing assert_mem_size! for DslDefinition, DynamicClassDefinition, DynamicModuleDefinition. Every other definition type
has size assertions. Given the +22% memory regression, knowing exact struct sizes is critical:
assert_mem_size!(DslDefinition, /* measure */);
assert_mem_size!(DynamicClassDefinition, /* measure */);
assert_mem_size!(DynamicModuleDefinition, /* measure */);
- resolution.rs:handle_dsl_unit — String allocation on every DSL processing attempt. s.as_str().to_string() allocates a new String for the method
name on every call, including retries. Borrow from the string pool instead:
let method_name: &str = self.graph.strings().get(&method_name_str_id)
    .map(|s| s.as_str()).unwrap_or("");
- model/dsl.rs:~95-104 — positional_args() returns Vec<&DslValue>, allocating on every call. Return an iterator instead:
pub fn positional_args(&self) -> impl Iterator<Item = &DslValue> {
    self.arguments.iter().filter_map(|arg| match arg {
        DslArgument::Positional(v) => Some(v),
        _ => None,
    })
}
- model/dsl_processors.rs:extract_dsl_context — Clones mixins and comments vectors from the DSL definition. Since the DSL is consumed during
processing, consider using std::mem::take() via mutable accessors to move data instead of cloning, reducing allocation pressure.
- resolution.rs:~207-209 — Outcome::Unresolved(None) now retries instead of being terminal. This is safe (the made_progress flag prevents infinite
 loops), but legitimately unresolvable constants now stay in the queue every pass instead of being dropped immediately. After the resolution loop
terminates, consider emitting diagnostics for remaining Unresolved items in the queue.
- resolution.rs:prepare_units — Constants linked to DSLs are excluded via dsl_assigned_constants. When a DSL's WillNotMatch path returns
assigned_to(), the constant ID is re-enqueued as Unit::Definition. Verify with a test that Foo = SomeUnknownClass.new (non-Class/Module receiver)
correctly creates a ConstantDeclaration for Foo.

Observations (trade-offs, architecture):

- The 68% resolution speedup primarily comes from the compute_all_name_depths memoization (pre-computing all name depths upfront instead of
recursive per-call). This optimization is independent of DSL support and could be a separate commit.
- The DSL processor registry pattern (DslProcessor with matches + handle function pointers) is extensible — Struct.new, Data.define can be added
as new processors. The existing TODO comments about Struct.new hacks (5 occurrences in resolution.rs, 1 in graph.rs) should be updated to
reference only Struct.new since Class.new/Module.new are now properly handled.
- The Receiver enum change (Option<NameId> -> Option<Receiver>) with SelfReceiver(DefinitionId) | ConstantReceiver(NameId) is bundled here but was
 previously explored as a separate refactor. The niche optimization keeps Option<Receiver> at 16 bytes (same as Option<NameId>), so
MethodDefinition stays at 88 bytes. This is clean.
- The +625 MB memory regression (+22%) is significant for a tool targeting "hyper scale Ruby projects". Main contributors: DslArgumentList with
uninterned strings, DynamicClassDefinition/DynamicModuleDefinition duplicating data from DslDefinition (comments, mixins, offsets), and DSL
definitions persisting until remove_orphan_dsl_definitions() runs.
- The benchmark shows -3,689 total declarations, which is expected — previously Class.new/Module.new created class/module declarations during
indexing even for anonymous usages. Now anonymous DSLs correctly produce no declarations.

Test Coverage:
- Extensive new indexer tests: named/anonymous Class.new/Module.new, nested DSLs, singleton classes, constant scoping, chained assignments, ||=
assignments, path assignments (Foo::Bar = Class.new).
- Extensive new resolution tests: inheritance via Class.new, deeply nested (3-level) DSLs, instance variables in DSL blocks, class variables,
constants in DSL blocks (lexical scope), non-matching DSLs, singleton class methods, alias_method.
- Missing: test for remove_orphan_dsl_definitions cleanup behavior, test for DSL without block (Class.new with no do...end), test for Foo =
SomeRandomClass.new where the receiver resolves but doesn't match any processor.

Nits:
- resolution.rs:prepare_units — .collect::<VecDeque<_>>() before .extend() is unnecessary; VecDeque::extend accepts any IntoIterator directly.
- PR description body has a typo: "DyanmicClassDefinition" → "DynamicClassDefinition".
- DslDefinition::new is marked const fn but initializes members: Vec::new() and mixins: Vec::new(). This works because Vec::new() is const fn, but
 is fragile if default values are added later.

Files Reviewed:
- rust/rubydex-sys/src/definition_api.rs — FFI enum extension, no issues
- rust/rubydex-sys/src/graph_api.rs — Test signature update, no issues
- rust/rubydex/src/indexing.rs — DSL method names threading, no issues
- rust/rubydex/src/indexing/ruby_indexer.rs — 3 findings (Nesting::Dsl, constant_name_location extraction, DSL call handling)
- rust/rubydex/src/model.rs — Module declarations, no issues
- rust/rubydex/src/model/definitions.rs — 2 findings (missing assert_mem_size!, Receiver enum)
- rust/rubydex/src/model/dsl.rs — 2 findings (uninterned strings, positional_args allocation)
- rust/rubydex/src/model/dsl_processors.rs — 1 finding (extract_dsl_context cloning)
- rust/rubydex/src/model/graph.rs — 2 findings (O(n) linear scans, nesting_name_id)
- rust/rubydex/src/resolution.rs — 4 findings (resolve_lexical_owner fallback, Unresolved retry, handle_dsl_unit allocation, prepare_units
constant exclusion)
- rust/rubydex/src/test_utils/graph_test.rs — Signature update, no issues
- rust/rubydex/src/test_utils/local_graph_test.rs — Signature update, no issues

With the existing review_changes skill (after checking out locally)

Details
PR #515 Review: DSL Support for Class.new and Module.new

Overview

This PR introduces a two-phase DSL processing framework: indexing captures DslDefinition placeholders, and resolution converts them into
DynamicClassDefinition/DynamicModuleDefinition via pluggable DslProcessor handlers. The implementation is well-structured and handles most Ruby
semantics correctly.

---
Correctness

class << self inside Class.new -- Fixed. The PR correctly changes class << self resolution from current_lexical_scope_name_id() to
current_owner_name_id() (diff line 675-676), and introduces parent_for_singleton_class() (diff line 614) which properly recognizes Nesting::Dsl
entries with assigned constants. Tests index_singleton_class_inside_class_new and singleton_class_inside_anonymous_class verify this.

extend self / include self inside Class.new -- Still Incorrect. The mixin handler at ruby_indexer.rs:929 still uses
current_lexical_scope_name_id() to resolve self. Inside a Class.new block, this skips Nesting::Dsl and resolves to the enclosing lexical scope
instead of the anonymous class. In Ruby, self inside Class.new IS the new class, so extend self should extend that class. This is a pre-existing
issue but worth noting since class << self was fixed in this PR -- the same fix should be applied here (use current_owner_name_id() instead).

resolve_lexical_owner returning Option -- Sound. The change from DeclarationId to Option<DeclarationId> is handled correctly. Callers in
handle_remaining_definitions properly propagate None (skip creating the declaration), which is correct when a DSL wasn't successfully processed.

maybe_handle_constant_inside_dsl -- Correct. The function properly checks that the nesting is a DSL, the name has no explicit parent scope, and
the DSL's declaration exists. The Outcome::Retry(None) fallback when the DSL hasn't been resolved yet ensures correctness.

Resolution loop termination -- Preserved. The fixed-point loop correctly tracks progress via made_progress. DSL units that can't be processed are
pushed back without setting progress, ensuring termination when no forward progress is possible. The dsl_nesting_depth sorting optimizes the
common case by processing outer DSLs first.

WillNotMatch branch returning assigned_to -- Subtle but correct. When no processor matches, the assigned constant definition ID is returned so it
can be processed as a regular Unit::Definition in the resolution loop. The filter ensuring it's a Constant or ConstantAlias is appropriate.

---
Performance

O(n) linear scan in nesting_name_id() -- Significant concern. The function at graph.rs (diff line 2813) iterates all definitions to find a
DynamicClass/DynamicModule matching a DSL definition ID:

for def in self.definitions().values() {
    match def {
        Definition::DynamicClass(d) if d.dsl_definition_id() == definition_id => { ... }
        Definition::DynamicModule(d) if d.dsl_definition_id() == definition_id => { ... }
        _ => {}
    }
}

This is called from prepare_units for every method/attr/alias/etc. that has a DSL as its lexical_nesting_id. For a large codebase with 100K+
definitions, this is expensive.

Recommendation: Maintain a reverse index HashMap<DefinitionId, DefinitionId> mapping DSL definition IDs to their resolved
DynamicClass/DynamicModule definition IDs. Populate it when handle_class_new/handle_module_new creates the new definition. This changes O(n)
lookups to O(1).

Alternatively, since DynamicClassDefinition::id_for_dsl() likely generates a deterministic ID from the DSL definition ID, you could use that
directly for lookup without maintaining a separate index.

handle_dsl_unit clones method name string on every attempt (diff line ~3970):
let method_name = self.graph.strings().get(&method_name_str_id)
    .map(|s| s.as_str().to_string())
    .unwrap_or_default();
This creates a heap allocation per DSL processing attempt, including retries. Consider passing StringId to find_matching_processor and comparing
against interned strings instead.

compute_all_name_depths -- Good optimization. Replacing the recursive name_depth with memoized computation is strictly better. One note: there's
no cycle protection. If names somehow form a cycle, the recursive compute_depth would stack overflow. Consider inserting a sentinel value before
recursing (e.g., depths.insert(name_id, u32::MAX) as a "computing" marker) to detect cycles.

remove_orphan_dsl_definitions -- Acceptable. Called once after the resolution loop (diff line 2997). The two-pass approach (collect linked IDs,
then scan for orphans) is idiomatic Rust and O(n + m).

---
Code Quality

Nesting::Dsl replacing Nesting::Owner -- Clean rename. The new name better communicates the purpose. All call sites properly updated.

parent_for_singleton_class() -- Well designed. Correctly handles DSL entries by checking assigned_to().is_some(), ensuring only named DSLs create
singleton class nesting. Anonymous Class.new without assignment correctly returns None.

DslArgument uses String not StringId -- The DslArgument enum variants use owned String (24 bytes + heap) instead of StringId (8 bytes). For the
KeywordArg { key, value } variant, this means 48+ bytes per argument. Since DSL definitions are temporary (consumed during resolution, then
orphans cleaned up), the allocation churn is bounded but could be reduced by using StringId. The plan documents note this was a conscious choice
for simplicity -- worth documenting in a code comment.

DslValue with Reference(ReferenceId) -- Excellent. Capturing the ReferenceId during indexing avoids the need to scan all constant references
during resolution. This is a much better approach than what the plan document suggested.

positional_args() returns Vec<&DslValue> (diff line 2262) -- Allocates on every call. Consider returning an iterator instead, or providing
first_positional_reference() (which already exists) for the common case. The Vec allocation is not on a hot path currently, so this is minor.

Field duplication between DslDefinition and DynamicClassDefinition -- Consistent with the existing pattern where ClassDefinition,
ModuleDefinition, and SingletonClassDefinition all duplicate similar fields. Not a concern for this PR.

DslProcessor using function pointers -- Appropriate. No per-processor state, direct calls, 8 bytes per pointer. Good choice over trait objects.

---
Test Coverage

Tests are comprehensive and cover:
- Non-class-new DSL fallback
- Class.new with inheritance
- Constants inside Class.new (correct lexical scoping)
- Nested Class.new
- Instance variables in DSL blocks
- Class variables in anonymous classes
- Deeply nested DSLs
- Singleton classes inside Class.new
- Class methods (via def self.foo)
- alias_method inside DSL blocks
- Anonymous (unassigned) Class.new
- Module.new patterns

Gaps:
- No test for extend self / include self inside Class.new -- would expose the current_lexical_scope_name_id() bug noted above
- No test for SomeConstant::Class.new (namespaced receiver) -- should verify it's handled or gracefully ignored
- No test for DSL retry exhaustion (all DSLs fail to process in a given pass)

---
Documentation

The inline documentation is thorough. maybe_handle_constant_inside_dsl, handle_dsl_unit, dsl_nesting_depth, and compute_all_name_depths all have
clear doc comments explaining their purpose. The Nesting enum doc comment at diff line 302 correctly notes that DSL blocks are skipped for lexical
 scope resolution.

---
FFI Changes (rubydex-sys)

The addition of DefinitionKind::Dsl = 14, DynamicClass = 15, DynamicModule = 16 to the FFI enum is straightforward. The corresponding match arms
in map_definition_to_kind are correct. The test updates adding vec![] for dsl_method_names parameter are minimal and correct.

---
Summary of Actionable Items
┌────────────────────────────────────────────────────────────────────────┬──────────┬───────────────┐
│                                  Item                                  │ Severity │     Type      │
├────────────────────────────────────────────────────────────────────────┼──────────┼───────────────┤
│ extend self / include self inside Class.new uses wrong self resolution │ Medium   │ Bug           │
├────────────────────────────────────────────────────────────────────────┼──────────┼───────────────┤
│ O(n) linear scan in nesting_name_id() for DSL→Dynamic lookup           │ Medium   │ Performance   │
├────────────────────────────────────────────────────────────────────────┼──────────┼───────────────┤
│ handle_dsl_unit clones method name string per attempt                  │ Low      │ Performance   │
├────────────────────────────────────────────────────────────────────────┼──────────┼───────────────┤
│ compute_all_name_depths has no cycle protection                        │ Low      │ Robustness    │
├────────────────────────────────────────────────────────────────────────┼──────────┼───────────────┤
│ positional_args() allocates Vec unnecessarily                          │ Low      │ Performance   │
├────────────────────────────────────────────────────────────────────────┼──────────┼───────────────┤
│ Missing test for extend self inside Class.new                          │ Low      │ Test coverage │
└────────────────────────────────────────────────────────────────────────┴──────────┴───────────────┘

@st0012
Copy link
Member Author

st0012 commented Feb 17, 2026

cc @Morriar since you were interested in review skills

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant