Optimize OpenAPI routing #462

Merged
ahx merged 5 commits into ahx:main from moberegger:moberegger/optimize-find_path_item
Apr 7, 2026

Contributor

@moberegger moberegger commented Apr 1, 2026

Introduces a few optimizations for OpenapiFirst::Router#match.

The first one is for OpenapiFirst::Router#find_path_item. I replaced the filter_map (which builds an intermediate array of all matching routes) + min_by (which iterates over that array again) with a single-pass each_value.reduce that tracks the best match as it goes. This eliminates the intermediate array allocation and runs a bit faster. Here is a simple benchmark comparing the old and new approaches directly:

  === IPS Benchmark ===                                                                                        
  ruby 4.0.2 (2026-03-17 revision d3da9fec82) +YJIT +PRISM [arm64-darwin25]
  Warming up --------------------------------------                                                            
  old (filter_map + min_by)
                           5.431k i/100ms                                                                      
          new (reduce)     6.003k i/100ms                   
  Calculating -------------------------------------                                                            
  old (filter_map + min_by)                                 
                           52.084k (± 2.6%) i/s   (19.20 μs/i) -    260.688k in   5.008974s
          new (reduce)     59.802k (± 1.9%) i/s   (16.72 μs/i) -    300.150k in   5.020979s                    
                                                                                                               
  Comparison:                                                                                                  
  old (filter_map + min_by):    52084.5 i/s                                                                    
          new (reduce):    59802.5 i/s - 1.15x  faster      

                                                                                                               
  === Memory Benchmark ===
  Calculating -------------------------------------                                                            
  old (filter_map + min_by)                                 
                           5.104k memsize (     0.000  retained)
                         106.000  objects (     0.000  retained)
                           3.000  strings (     0.000  retained)
          new (reduce)     2.544k memsize (     0.000  retained)                                               
                          30.000  objects (     0.000  retained)
                           3.000  strings (     0.000  retained)                                               
                                                            
  Comparison:                                                                                                  
          new (reduce):       2544 allocated
  old (filter_map + min_by):       5104 allocated - 2.01x more 
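
For readers who want to see the shape of that change, here is a minimal sketch. It is not the gem's actual code: the Route struct, the ROUTES table, and the "fewest path parameters wins" ranking are all assumptions made up for illustration. It only shows how a filter_map + min_by pair can collapse into a single reduce.

```ruby
# Hypothetical stand-in for a compiled path template; not the gem's real class.
Route = Struct.new(:template, :pattern) do
  # Returns a Hash of path parameters, or nil when the path does not match.
  def match(path)
    pattern.match(path)&.named_captures
  end
end

# Hypothetical route table keyed by template string.
ROUTES = {
  '/pets/{id}' => Route.new('/pets/{id}', %r{\A/pets/(?<id>[^/?#]+)\z}),
  '/{kind}/{id}' => Route.new('/{kind}/{id}', %r{\A/(?<kind>[^/?#]+)/(?<id>[^/?#]+)\z})
}.freeze

# Old shape: collect every match into an array, then scan that array again.
def find_with_filter_map(routes, path)
  routes.each_value
        .filter_map { |route| (params = route.match(path)) && [route, params] }
        .min_by { |_route, params| params.size } # assumed ranking: fewest params wins
end

# New shape: one pass that remembers only the best candidate seen so far.
def find_with_reduce(routes, path)
  routes.each_value.reduce(nil) do |best, route|
    params = route.match(path)
    next best unless params
    next best if best && best[1].size <= params.size # keep the earlier route on ties

    [route, params]
  end
end
```

Both helpers return a [route, params] pair, or nil when nothing matches, so they can be swapped without touching the caller.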

The next one was to PathTemplate#match. I changed build_pattern to emit named capture groups (e.g. (?<id>[^/?#]+)) instead of anonymous ones. This lets match call matches.named_captures to get a Hash directly from the regex engine, instead of @names.zip(values).to_h, which allocates an intermediate array.

Since @names appeared to no longer be used for anything meaningful, I also removed it, along with the now unused TEMPLATE_EXPRESSION_NAME and ALLOWED_PARAMETER_CHARACTERS constants.

A simple benchmark comparing the two approaches:

     === IPS Benchmark ===
     ruby 4.0.2 (2026-03-17 revision d3da9fec82) +YJIT +PRISM [arm64-darwin25]
     Warming up --------------------------------------
     original (zip + to_h)
                             43.626k i/100ms
           named_captures    50.474k i/100ms
     Calculating -------------------------------------
     original (zip + to_h)
                             450.669k (± 3.4%) i/s    (2.22 μs/i) -      2.269M in   5.040555s
           named_captures    522.692k (± 1.8%) i/s    (1.91 μs/i) -      2.625M in   5.023137s

     Comparison:
     original (zip + to_h):   450669.5 i/s
           named_captures:   522692.1 i/s - 1.16x  faster


     === Memory Benchmark ===
     Calculating -------------------------------------
     original (zip + to_h)
                              2.352k memsize (     0.000  retained)
                             34.000  objects (     0.000  retained)
                              9.000  strings (     0.000  retained)
           named_captures     1.832k memsize (     0.000  retained)
                             21.000  objects (     0.000  retained)
                              9.000  strings (     0.000  retained)

     Comparison:
           named_captures:       1832 allocated
     original (zip + to_h):       2352 allocated - 1.28x more

The last one was to FindContent.call. I replaced content_type.split(';')[0] and type.split('/')[0] with index + slice (content_type[0, semi], type[0, slash]). Originally, each split allocated an array just to grab the first element. Now index/slice creates only the substring. I had to use || type.length as a fallback for when there's no / in the content type (e.g. there was a test for an 'unknown' content type). A quick benchmark:

     === IPS Benchmark ===
     ruby 4.0.2 (2026-03-17 revision d3da9fec82) +YJIT +PRISM [arm64-darwin25]
     Warming up --------------------------------------
         original (split)   115.021k i/100ms
     optimized (index/slice)
                            158.262k i/100ms
     Calculating -------------------------------------
         original (split)      1.261M (± 1.6%) i/s  (793.22 ns/i) -      6.326M in   5.019280s
     optimized (index/slice)
                               1.668M (± 1.5%) i/s  (599.59 ns/i) -      8.388M in   5.030398s

     Comparison:
         original (split):  1260689.6 i/s
     optimized (index/slice):  1667814.1 i/s - 1.32x  faster


     === Memory Benchmark ===
     Calculating -------------------------------------
         original (split)   400.000  memsize (     0.000  retained)
                              9.000  objects (     0.000  retained)
                              6.000  strings (     0.000  retained)
     optimized (index/slice)
                            160.000  memsize (     0.000  retained)
                              3.000  objects (     0.000  retained)
                              3.000  strings (     0.000  retained)

     Comparison:
     optimized (index/slice):        160 allocated
         original (split):        400 allocated - 2.50x more
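
A rough sketch of the two styles, for comparison. The method names are hypothetical and this is not the body of FindContent.call. It also assumes the same length fallback is applied to the ';' lookup, which the description above only states for '/'.

```ruby
# Old shape: split allocates an Array (plus a String per piece) to read one element.
def media_type_with_split(content_type)
  content_type.split(';')[0]
end

# New shape: locate the delimiter, then slice out only the substring that is needed.
def media_type_with_slice(content_type)
  semi = content_type.index(';') || content_type.length # assumed fallback: no parameters
  content_type[0, semi]
end

# Same idea for the top-level type, with the fallback for values that have no '/'.
def top_level_type_with_slice(type)
  slash = type.index('/') || type.length
  type[0, slash]
end
```

One difference to keep in mind: for an empty string the split version returns nil, while the slice version returns an empty string.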

Putting it all together, you see both an IPS and a memory improvement in Router#find_path_item:

=== IPS Benchmark ===
ruby 4.0.2 (2026-03-17 revision d3da9fec82) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
                main     5.226k i/100ms
                 new     6.200k i/100ms
Calculating -------------------------------------
                main     54.576k (± 2.0%) i/s   (18.32 μs/i) -    276.978k in   5.077218s
                 new     61.077k (± 2.4%) i/s   (16.37 μs/i) -    310.000k in   5.078605s

Comparison:
                main:    54575.8 i/s
                 new:    61076.8 i/s - 1.12x  faster

=== Memory Benchmark ===
Calculating -------------------------------------
                main     5.104k memsize (     0.000  retained)
                       106.000  objects (     0.000  retained)
                         3.000  strings (     0.000  retained)
Calculating -------------------------------------
                 new     2.224k memsize (     0.000  retained)
                        22.000  objects (     0.000  retained)
                         5.000  strings (     0.000  retained)

Comparison:
                 new:       2224 allocated
                main:       5104 allocated - 2.29x more

For Router#match, the IPS gain gets diluted, but there is still a noticeable improvement in memory allocation:

=== IPS Benchmark ===
ruby 4.0.2 (2026-03-17 revision d3da9fec82) +YJIT +PRISM [arm64-darwin25]
Warming up --------------------------------------
                main     4.760k i/100ms
                 new     5.139k i/100ms
Calculating -------------------------------------
                main     46.806k (± 5.7%) i/s   (21.36 μs/i) -    233.240k in   5.003741s
                 new     50.032k (± 2.4%) i/s   (19.99 μs/i) -    251.811k in   5.036026s

Comparison:
                main:    46805.8 i/s
                 new:    50032.2 i/s - same-ish: difference falls within error

=== Memory Benchmark ===
Calculating -------------------------------------
                main     7.864k memsize (     0.000  retained)
                       131.000  objects (     0.000  retained)
                         6.000  strings (     0.000  retained)
                 new     4.904k memsize (     0.000  retained)
                        45.000  objects (     0.000  retained)
                         7.000  strings (     0.000  retained)

Comparison:
                 new:       4904 allocated
                main:       7864 allocated - 1.60x more

@moberegger moberegger requested a review from ahx as a code owner April 1, 2026 19:59
  parts = template.split(TEMPLATE_EXPRESSION).map! do |part|
-   part.start_with?('{') ? ALLOWED_PARAMETER_CHARACTERS : Regexp.escape(part)
+   if part.start_with?('{')
+     "(?<#{part[1..-2]}>[^/?#]+)"
Owner

I need a minute to understand this. ⌛

Contributor Author

If it helps at all: https://docs.ruby-lang.org/en/master/MatchData.html#method-i-named_captures

It's to allow the usage of named captures.

"/hello/123".match(%r{^/hello/(?<id>[^/?#]+)$}).named_captures
# => {"id" => "123"}

# vs

m = "/hello/123".match(%r{^/hello/([^/?#]+)$})
m.captures  # => ["123"]
["id"].zip(m.captures).to_h  # => {"id" => "123"}

So really just allows us to save on the extra work.

Contributor Author

If there is any concern over this, I can pull this change out (and perhaps file it in a separate PR for us to assess more thoroughly). Regexes are never fun 😆.

Owner

I like the change to use named_captures, but I don't understand the change from
%r{([^/?#]+)}
to
"(?<#{part[1..-2]}>[^/?#]+)"

I think I have to re-understand my own code here, but I don't have time right now. Can you explain that part?

Contributor Author

Given a template string like

"/users/{userId}/posts/{postId}/comments/{commentId}"

The old way would

# First split the template into...
["/users/", "{userId}", "/posts/", "{postId}", "/comments/", "{commentId}"]
# Iterate and replace dynamic path segments...
["/users/", /([^\/?#]+)/, "/posts/", /([^\/?#]+)/, "/comments/", /([^\/?#]+)/]
# Then finally join...
/^\/users\/(?-mix:([^\/?#]+))\/posts\/(?-mix:([^\/?#]+))\/comments\/(?-mix:([^\/?#]+))$/

(No idea what that -mix thing is about... something that Ruby is doing). Each matched segment is anonymous, so when you match it against "/users/123/posts/456/comments/789", you just get

["123", "456", "789"]

Which required you to zip it with the names of the dynamic path segments.
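
On the -mix question: it appears to come from Regexp#to_s, which Ruby uses when a Regexp is joined or interpolated into another pattern. The short sketch below uses made-up patterns (not the gem's) to show where the wrapper comes from and why it exists.

```ruby
digits = /\d+/

# Regexp#to_s wraps the source in a group that records the option flags.
# Here m, i and x are all switched off, hence "-mix".
digits.to_s # => "(?-mix:\\d+)"

# Interpolating a Regexp into another pattern goes through to_s,
# so the wrapper group comes along.
combined = /^id-#{digits}$/
combined.source # => "^id-(?-mix:\\d+)$"

# The wrapper is what keeps an embedded pattern's own flags intact.
inner = /a/i
insensitive = /^#{inner}$/
insensitive.source      # => "^(?i-mx:a)$"
insensitive.match?('A') # => true, although the outer pattern is case-sensitive
```

Building the pattern from plain strings, as the named-group version does, never goes through Regexp#to_s, which is why the wrapper disappears from the final regex.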

The new way pretty much does the same thing, it just names each capture group. So

# First split the template into...
["/users/", "{userId}", "/posts/", "{postId}", "/comments/", "{commentId}"]
# Iterate and replace dynamic path segments with named groups...
["/users/", "(?<userId>[^/?#]+)", "/posts/", "(?<postId>[^/?#]+)", "/comments/", "(?<commentId>[^/?#]+)"]
# Then finally join...
/^\/users\/(?<userId>[^\/?#]+)\/posts\/(?<postId>[^\/?#]+)\/comments\/(?<commentId>[^\/?#]+)$/

In this case the match gives you

{"userId" => "123", "postId" => "456", "commentId" => "789"}

The resulting regex is the same, just with the capture groups named. The "(?<#{part[1..-2]}>[^/?#]+)" is building the same regex, but part[1..-2] is just grabbing the name between {..}, so {id} becomes id.

That said, I think there is an even better way to do this that doesn't require us to split the template and iterate over each part. We basically just need to find and replace {...} segments in the template with the appropriate capture group. I'm gonna try something out to see if it works.

Contributor Author

@moberegger moberegger Apr 2, 2026

Ok, so there is another way to build the same regex by substituting dynamic path segments with gsub. Best I could come up with for now is something like

# Matches dynamic segments like {id} (captured in group 1) or static segments like /hello
PATH_SEGMENT = /\{([^{}]+)\}|[^{}]+/

def build_pattern(template)
  pattern = template.gsub(PATH_SEGMENT) do
    param_name = Regexp.last_match(1)
    if param_name
      # Dynamic segment like {id} — build a named capture group
      "(?<#{param_name}>[^/?#]+)"
    else
      # Static segment like /hello — escape for use in regex
      Regexp.escape(Regexp.last_match(0))
    end
  end
  /^#{pattern}$/
end

I am not sure if that is easier to understand, though. The last_match stuff isn't super intuitive.

There is also something like the following, which only attempts to match the dynamic segments and replace them with named capture groups. I didn't like the delete('\\\\') being there, but couldn't figure out another way to do it.

ESCAPED_TEMPLATE_EXPRESSION = /\\\{([^{}]+)\\\}/

def build_pattern(template)
  pattern = Regexp.escape(template).gsub(ESCAPED_TEMPLATE_EXPRESSION) do
    "(?<#{Regexp.last_match(1).delete('\\\\')}>[^/?#]+)"
  end
  /^#{pattern}$/
end

These all end up producing the same regular expression in the end.
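
A quick way to check that claim is to build a pattern with each variant and compare the results. In the sketch below the method names are hypothetical, and TEMPLATE_EXPRESSION is an assumed definition (the thread never shows the real constant); it is chosen so that split keeps the {name} pieces, as in the walkthrough earlier in this thread.

```ruby
TEMPLATE_EXPRESSION = /(\{[^{}]+\})/ # assumed; the capture group makes split keep {name}
PATH_SEGMENT = /\{([^{}]+)\}|[^{}]+/
ESCAPED_TEMPLATE_EXPRESSION = /\\\{([^{}]+)\\\}/

# Variant 1: split the template and map each part.
def build_with_split(template)
  parts = template.split(TEMPLATE_EXPRESSION).map! do |part|
    part.start_with?('{') ? "(?<#{part[1..-2]}>[^/?#]+)" : Regexp.escape(part)
  end
  /^#{parts.join}$/
end

# Variant 2: one gsub over both dynamic and static segments.
def build_with_gsub(template)
  pattern = template.gsub(PATH_SEGMENT) do
    name = Regexp.last_match(1)
    name ? "(?<#{name}>[^/?#]+)" : Regexp.escape(Regexp.last_match(0))
  end
  /^#{pattern}$/
end

# Variant 3: escape the whole template first, then replace the escaped {name} parts.
def build_with_escape(template)
  pattern = Regexp.escape(template).gsub(ESCAPED_TEMPLATE_EXPRESSION) do
    "(?<#{Regexp.last_match(1).delete('\\\\')}>[^/?#]+)"
  end
  /^#{pattern}$/
end

template = '/users/{userId}/posts/{postId}'
patterns = [build_with_split(template), build_with_gsub(template), build_with_escape(template)]
same_source = patterns.map(&:source).uniq.size == 1
captures = patterns.first.match('/users/123/posts/456')&.named_captures
```

For this template, same_source should be true and captures should map userId and postId to the matched values. Templates with unusual parameter names were not checked here.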

The performance win comes from the resulting regular expression, which enables named_captures. build_pattern is only run when the OpenAPI document loads, so how the pattern is built matters less for performance. We should pick whatever you think is the most readable and easiest to understand.

Owner

Got it! Thanks a lot for the explanation and the attention to detail. I don’t have the capacity to pick one right now. I understand that all options result in the same performance and I have no security concerns about this change. So I would like you to pick one and ping me when you think this is ready to merge. Thank you! 👍🏻👍🏻

Contributor Author

@moberegger moberegger Apr 5, 2026

@ahx

I'd pick the solution that is currently in the pull request. It's the closest to the original implementation, so perhaps it is the easiest one to understand.

Even though performance isn't critical here, I benchmarked the three approaches anyways. Of the three options, the one proposed in this PR is both the fastest and allocates the least amount of memory. If we're looking for a tie breaker, might as well pick the one that performs the best.

@ahx
Owner

ahx commented Apr 5, 2026

Thanks. I will merge this by Tuesday. Greetings from Mecklenburg.

@ahx ahx force-pushed the moberegger/optimize-find_path_item branch from 6a0e3da to c1ba08b Compare April 7, 2026 07:34
@ahx ahx merged commit b6de667 into ahx:main Apr 7, 2026
22 checks passed
@moberegger
Contributor Author

Thank you from Toronto, Ontario, Canada! 😄
