Merged
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -3,7 +3,8 @@
## Unreleased

- OpenapiFirst will now cache the contents of files that have been loaded. If you need to reload your OpenAPI definition for tests or server hot reloading, you can call `OpenapiFirst.clear_cache!`.

- Optimized `OpenapiFirst::Router#match` for faster path matching and reduced memory allocation.
-
## 3.2.1

- Don't raise `UnknownQueryParameterError` if request is ignored in tests. Fixes [#441](https://github.com/ahx/openapi_first/issues/441).
12 changes: 6 additions & 6 deletions lib/openapi_first/router.rb
@@ -97,15 +97,15 @@ def find_path_item(request_path)
       found = @static[request_path]
       return [found, {}] if found

-      matches = @dynamic.filter_map do |_path, path_item|
+      @dynamic.each_value.reduce(nil) do |best, path_item|
         params = path_item[:template].match(request_path)
-        next unless params
+        next best unless params

-        [path_item, params]
-      end
-      return matches.first if matches.length == 1
+        candidate = [path_item, params]
+        next candidate unless best

-      matches&.min_by { |match| match[1].values.sum(&:length) }
+        params.values.sum(&:length) < best[1].values.sum(&:length) ? candidate : best
+      end
     end
   end
 end
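
For anyone skimming the hunk above: the `reduce` keeps a single running "best" candidate instead of collecting every match into an array first, and it still prefers the template whose captured parameter values are shortest in total. A minimal standalone sketch, assuming a hypothetical `template_class` stand-in for `PathTemplate` and a made-up `dynamic` hash (neither is the gem's real data):

```ruby
# Hypothetical stand-in for PathTemplate: #match returns a Hash of params or nil.
template_class = Struct.new(:pattern) do
  def match(path)
    pattern.match(path)&.named_captures
  end
end

# Made-up routing table for illustration only.
dynamic = {
  '/pets/{id}' => { template: template_class.new(%r{\A/pets/(?<id>[^/?#]+)\z}) },
  '/{kind}/{id}' => { template: template_class.new(%r{\A/(?<kind>[^/?#]+)/(?<id>[^/?#]+)\z}) }
}

def find_dynamic(dynamic, request_path)
  dynamic.each_value.reduce(nil) do |best, path_item|
    params = path_item[:template].match(request_path)
    next best unless params # no match: keep whatever we had

    candidate = [path_item, params]
    next candidate unless best # first match becomes the baseline

    # Shorter total captured length means more of the path matched statically.
    params.values.sum(&:length) < best[1].values.sum(&:length) ? candidate : best
  end
end

_item, params = find_dynamic(dynamic, '/pets/42')
params # => {"id"=>"42"}
```

Because the comparison is strict, the earlier entry wins on a tie, which appears to be what `min_by` did before.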
6 changes: 4 additions & 2 deletions lib/openapi_first/router/find_content.rb
@@ -8,8 +8,10 @@ def self.call(contents, content_type)
         return contents[nil] if content_type.nil? || content_type.empty?

         contents.fetch(content_type) do
-          type = content_type.split(';')[0]
-          contents[type] || contents["#{type.split('/')[0]}/*"] || contents['*/*'] || contents[nil]
+          semi = content_type.index(';')
+          type = semi ? content_type[0, semi] : content_type
+          slash = type.index('/') || type.length
+          contents[type] || contents["#{type[0, slash]}/*"] || contents['*/*'] || contents[nil]
         end
       end
     end
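
The index-based version above appears to perform the same lookup as the `split` version without building intermediate arrays. A standalone sketch of the fallback order (exact type, then `type/*`, then `*/*`, then the `nil` key); the method name `find_content` and the sample `contents` hash are made up for illustration:

```ruby
# Standalone copy of the lookup for experimentation; the name is hypothetical.
def find_content(contents, content_type)
  return contents[nil] if content_type.nil? || content_type.empty?

  contents.fetch(content_type) do
    # Strip media type parameters such as "; charset=utf-8".
    semi = content_type.index(';')
    type = semi ? content_type[0, semi] : content_type
    # Everything before the slash is the top-level type ("text" in "text/html").
    slash = type.index('/') || type.length
    contents[type] || contents["#{type[0, slash]}/*"] || contents['*/*'] || contents[nil]
  end
end

contents = { 'application/json' => :json, 'text/*' => :any_text, '*/*' => :anything }

find_content(contents, 'application/json; charset=utf-8') # => :json
find_content(contents, 'text/html')                       # => :any_text
find_content(contents, 'image/png')                       # => :anything
```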
13 changes: 6 additions & 7 deletions lib/openapi_first/router/path_template.rb
@@ -6,16 +6,13 @@ class Router
     class PathTemplate
       # See also https://spec.openapis.org/oas/v3.1.0#path-templating
       TEMPLATE_EXPRESSION = /(\{[^{}]+\})/
-      TEMPLATE_EXPRESSION_NAME = /\{([^{}]+)\}/
-      ALLOWED_PARAMETER_CHARACTERS = %r{([^/?#]+)}

       def self.template?(string)
         string.include?('{')
       end

       def initialize(template)
         @template = template
-        @names = template.scan(TEMPLATE_EXPRESSION_NAME).flatten
         @pattern = build_pattern(template)
       end

@@ -25,20 +22,22 @@ def to_s

       def match(path)
         return {} if path == @template
-        return if @names.empty?

         matches = path.match(@pattern)
         return unless matches

-        values = matches.captures
-        @names.zip(values).to_h
+        matches.named_captures
       end

       private

       def build_pattern(template)
         parts = template.split(TEMPLATE_EXPRESSION).map! do |part|
-          part.start_with?('{') ? ALLOWED_PARAMETER_CHARACTERS : Regexp.escape(part)
+          if part.start_with?('{')
+            "(?<#{part[1..-2]}>[^/?#]+)"
Owner

I need a minute to understand this. ⌛

Contributor Author

If it at all helps: https://docs.ruby-lang.org/en/master/MatchData.html#method-i-named_captures

It's to allow the usage of named captures.

"/hello/123".match(/^\/hello\/(?<id>[^\/?#]+)$/).named_captures
# => {"id" => "123"}

# vs

m = "/hello/123".match(/^\/hello\/([^\/?#]+)$/)
m.captures  # => ["123"]
["id"].zip(m.captures).to_h  # => {"id" => "123"}

So really just allows us to save on the extra work.

Contributor Author

If there is any concern over this, I can pull this change out (and perhaps file into a separate PR for us to assess more thoroughly). Regexes are never fun 😆.

Owner

I like the change to use named_captures, but I don't understand the change from
%r{([^/?#]+)}
to
"(?<#{part[1..-2]}>[^/?#]+)"

I think I have to re-understand my own code here, but I don't have time right now. Can you explain that part?

Contributor Author

Given a template string like

"/users/{userId}/posts/{postId}/comments/{commentId}"

The old way would

# First split the template into...
["/users/", "{userId}", "/posts/", "{postId}", "/comments/", "{commentId}"]
# Iterate and replace dynamic path segments...
["/users/", /([^\/?#]+)/, "/posts/", /([^\/?#]+)/, "/comments/", /([^\/?#]+)/]
# Then finally join...
/^\/users\/(?-mix:([^\/?#]+))\/posts\/(?-mix:([^\/?#]+))\/comments\/(?-mix:([^\/?#]+))$/

(No idea what that -mix thing is about... something that Ruby is doing). Each matched segment is anonymous, so when you match it against "/users/123/posts/456/comments/789", you just get

["123", "456", "789"]

Which required you to zip it with the names of the dynamic path segments.
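
On the `-mix` question: that prefix comes from `Regexp#to_s`, which Ruby calls when a Regexp is interpolated into another one. It wraps the source in an option group so the embedded pattern keeps its own flags (`m`, `i`, `x`; anything listed after the `-` is switched off). A small illustration with a made-up pattern:

```ruby
segment = /(\d+)/

segment.source # => "(\\d+)"
segment.to_s   # => "(?-mix:(\\d+))"

# Interpolating a Regexp uses #to_s, so the option group comes along.
combined = /^id=#{segment}$/
combined.source # => "^id=(?-mix:(\\d+))$"

# A pattern with flags turned on shows them before the dash.
/(\d+)/i.to_s # => "(?i-mx:(\\d+))"
```

Interpolating a plain String instead of a Regexp, as the new code does, skips that wrapper entirely.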

The new way pretty much does the same thing, it just names each capture group. So

# First split the template into...
["/users/", "{userId}", "/posts/", "{postId}", "/comments/", "{commentId}"]
# Iterate and replace dynamic path segments with named groups...
["/users/", "(?<userId>[^/?#]+)", "/posts/", "(?<postId>[^/?#]+)", "/comments/", "(?<commentId>[^/?#]+)"]
# Then finally join...
/^\/users\/(?<userId>[^\/?#]+)\/posts\/(?<postId>[^\/?#]+)\/comments\/(?<commentId>[^\/?#]+)$/

In this case the match gives you

{"userId" => "123", "postId" => "456", "commentId" => "789"}

The resulting regex is the same, just with the capture groups named. The "(?<#{part[1..-2]}>[^/?#]+)" is building the same regex, but part[1..-2] is just grabbing the name between {..}, so {id} becomes id.
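
To try the named-capture version in isolation, here is a self-contained sketch that mirrors the `build_pattern` in this pull request (the template expression is inlined here, and the sample paths are made up):

```ruby
def build_pattern(template)
  # Splitting on a capturing group keeps the {name} parts in the result.
  parts = template.split(/(\{[^{}]+\})/).map! do |part|
    if part.start_with?('{')
      "(?<#{part[1..-2]}>[^/?#]+)" # {userId} becomes a group named userId
    else
      Regexp.escape(part)
    end
  end

  /^#{parts.join}$/
end

pattern = build_pattern('/users/{userId}/posts/{postId}')

pattern.match('/users/123/posts/456').named_captures
# => {"userId"=>"123", "postId"=>"456"}

pattern.match('/users/123') # => nil
```

One assumption worth stating: Ruby restricts group names to word characters, so this sketch expects parameter names like `userId`. A name containing `-` would likely need separate handling.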

That said, I think there is an even better way to do this that doesn't require us to split the template and iterate over each part. We basically just need to find and replace {...} segments in the template with the appropriate capture group. I'm gonna try something out to see if it works.

Contributor Author

@moberegger Apr 2, 2026

Ok, so there is another way to build the same regex by substituting dynamic path segments with gsub. Best I could come up with for now is something like

# Matches dynamic segments like {id} (captured in group 1) or static segments like /hello
PATH_SEGMENT = /\{([^{}]+)\}|[^{}]+/

def build_pattern(template)
  pattern = template.gsub(PATH_SEGMENT) do
    param_name = Regexp.last_match(1)
    if param_name
      # Dynamic segment like {id} — build a named capture group
      "(?<#{param_name}>[^/?#]+)"
    else
      # Static segment like /hello — escape for use in regex
      Regexp.escape(Regexp.last_match(0))
    end
  end
  /^#{pattern}$/
end

I am not sure if that is easier to understand, though. The last_match stuff isn't super intuitive.

There is also something like the following, which only matches the dynamic segments and replaces them with named capture groups. I didn't like the delete('\\\\') being there, but couldn't figure out another way to do it.

ESCAPED_TEMPLATE_EXPRESSION = /\\\{([^{}]+)\\\}/

def build_pattern(template)
  pattern = Regexp.escape(template).gsub(ESCAPED_TEMPLATE_EXPRESSION) do
    "(?<#{Regexp.last_match(1).delete('\\\\')}>[^/?#]+)"
  end
  /^#{pattern}$/
end

These all end up producing the same regular expression in the end.

The performance win comes from the resulting regular expression, which makes named_captures possible. build_pattern only runs when the OpenAPI document loads, so how the pattern is built matters much less for performance. We should pick whatever you think is the most readable and easiest to understand.
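
The "same regular expression" claim can be checked directly. The sketch below puts the three variants side by side, with the constants inlined and hypothetical method names (`build_with_split`, `build_with_gsub`, `build_with_escape`), and compares their output for a made-up template:

```ruby
# Variant from the pull request: split, then map each part.
def build_with_split(template)
  parts = template.split(/(\{[^{}]+\})/).map! do |part|
    part.start_with?('{') ? "(?<#{part[1..-2]}>[^/?#]+)" : Regexp.escape(part)
  end
  /^#{parts.join}$/
end

# Variant using gsub over both static and dynamic segments.
def build_with_gsub(template)
  pattern = template.gsub(/\{([^{}]+)\}|[^{}]+/) do
    name = Regexp.last_match(1)
    name ? "(?<#{name}>[^/?#]+)" : Regexp.escape(Regexp.last_match(0))
  end
  /^#{pattern}$/
end

# Variant that escapes the whole template first, then swaps in the groups.
def build_with_escape(template)
  pattern = Regexp.escape(template).gsub(/\\\{([^{}]+)\\\}/) do
    "(?<#{Regexp.last_match(1).delete('\\\\')}>[^/?#]+)"
  end
  /^#{pattern}$/
end

template = '/users/{userId}/files/{name}.json'
patterns = [build_with_split(template), build_with_gsub(template), build_with_escape(template)]

patterns.uniq.length # => 1
patterns.first.match('/users/1/files/report.json').named_captures
# => {"userId"=>"1", "name"=>"report"}
```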

Owner

Got it! Thanks a lot for the explanation and the attention to detail. I don’t have the capacity to pick one right now. I understand that all options result in the same performance and I have no security concerns about this change. So I would like you to pick one and ping me when you think this is ready to merge. Thank you! 👍🏻👍🏻

Contributor Author

@moberegger Apr 5, 2026

@ahx

I'd pick the solution that is currently in the pull request. It's the closest to the original implementation, so perhaps it is the easiest one to understand.

Even though performance isn't critical here, I benchmarked the three approaches anyway. Of the three options, the one proposed in this PR is both the fastest and allocates the least memory. If we're looking for a tie breaker, might as well pick the one that performs best.
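
The benchmark numbers are not included in the thread. For anyone who wants to reproduce this kind of comparison without extra gems, a rough sketch follows. The `measure` helper is hypothetical, allocated object count is only a proxy for memory use, and results will vary by Ruby version and machine:

```ruby
# Hypothetical helper: runs the block `iterations` times and reports elapsed
# seconds plus the number of objects allocated along the way.
def measure(iterations = 10_000)
  objects_before = GC.stat(:total_allocated_objects)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { yield }
  {
    seconds: Process.clock_gettime(Process::CLOCK_MONOTONIC) - started,
    objects: GC.stat(:total_allocated_objects) - objects_before
  }
end

template = '/users/{userId}/posts/{postId}'

# Split-and-map variant, as in this pull request.
split_build = measure do
  parts = template.split(/(\{[^{}]+\})/).map! do |part|
    part.start_with?('{') ? "(?<#{part[1..-2]}>[^/?#]+)" : Regexp.escape(part)
  end
  /^#{parts.join}$/
end

# gsub variant from the earlier comment.
gsub_build = measure do
  pattern = template.gsub(/\{([^{}]+)\}|[^{}]+/) do
    name = Regexp.last_match(1)
    name ? "(?<#{name}>[^/?#]+)" : Regexp.escape(Regexp.last_match(0))
  end
  /^#{pattern}$/
end

puts "split: #{split_build}"
puts "gsub:  #{gsub_build}"
```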

+          else
+            Regexp.escape(part)
+          end
         end

         /^#{parts.join}$/