
fix(multimodal): fall back to config model type for aliased vision models #755

Open
YouNeedCryDear wants to merge 3 commits into lightseekorg:main from
YouNeedCryDear:fix/add-fallback-mm

Conversation

@YouNeedCryDear (Contributor) commented Mar 13, 2026

Fixes #754

Description

Problem

Multimodal family detection in the gRPC chat preparation path was keyed off the request-facing model name. When a worker exposed a supported vision model under an alias such as custom-model, tokenizer and config loading could succeed but multimodal spec selection and image processor lookup still failed before dispatch.

Solution

Use the loaded model config as a fallback signal during multimodal family detection. The multimodal registry now checks config.model_type when model_id does not match, and the image processor registry does the same while preserving the existing fast path on model_id.

Changes

  • add ModelMetadata::config_model_type() and matches_model_type_hints() helpers for shared family matching logic
  • update built-in multimodal specs to fall back from model_id to config.model_type
  • collapse image processor lookup into find(model_id, model_type) with model-type fallback
  • use the loaded model_type in model_gateway multimodal preprocessing before image processor selection
  • add regression tests for aliased model IDs and fast-path preservation across multimodal specs and image processor lookup
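The fallback described above can be sketched as follows. This is an illustrative sketch only: free functions stand in for the PR's ModelMetadata helpers, and the exact signatures in traits.rs may differ.

```rust
// Case-insensitive check that a candidate string contains every hint.
fn matches_all_hints(candidate: &str, hints: &[&str]) -> bool {
    let candidate = candidate.to_ascii_lowercase();
    hints
        .iter()
        .all(|hint| candidate.contains(&hint.to_ascii_lowercase()))
}

// Sketch of the shared family-matching logic: try the request-facing
// model ID first, then fall back to the config-derived model_type.
fn matches_model_type_hints(
    model_id: &str,
    config_model_type: Option<&str>,
    hints: &[&str],
) -> bool {
    // Fast path: the model ID already names the family.
    if matches_all_hints(model_id, hints) {
        return true;
    }
    // Fallback: an aliased ID such as "custom-model" misses, so consult
    // the model_type loaded from the real config.json instead.
    config_model_type
        .map(|model_type| matches_all_hints(model_type, hints))
        .unwrap_or(false)
}
```

With this shape, a spec keyed on hints like ["qwen", "vl"] still matches "Qwen2-VL-7B-Instruct" on the fast path, and an alias like "custom-model" matches once the loaded config reports a model_type of "qwen2_vl".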

Test Plan

  1. Configure a gRPC worker to serve a supported multimodal model under an aliased external model ID such as custom-model.
  2. Keep tokenizer_path / model_path pointing at the real model artifacts containing the correct config.json and preprocessor_config.json.
  3. Send a chat completion request with image content using the aliased model name.
  4. Verify that SMG now resolves the correct multimodal spec and image processor instead of failing during preparation.
  5. Run local verification:
    • make fmt
    • make check
    • make test
Checklist
  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

  • Refactor

    • Model matching now relies on explicit model-type hints for more accurate spec resolution instead of identifier substring checks.
    • Image processor lookup now accepts an optional model type and falls back to model-type-based selection when direct model-id resolution fails.
  • Tests

    • Added unit tests verifying alias resolution via model-type hints and image-processor fallback and fast-path behavior.

@github-actions bot added the labels grpc (gRPC client and router changes) and model-gateway (Model gateway crate changes) on Mar 13, 2026

coderabbitai bot commented Mar 13, 2026

📝 Walkthrough

Walkthrough

Replace substring-based model family detection with model_type-hint matching: ModelMetadata gains helpers; vision specs use them instead of inspecting model_id; ImageProcessorRegistry.find accepts an optional model_type fallback; model_gateway passes config.model_type into the lookup. (47 words)

Changes

Cohort / File(s) Summary
Vision Model Specs
crates/multimodal/src/registry/llama4.rs, crates/multimodal/src/registry/llava.rs, crates/multimodal/src/registry/phi3_v.rs, crates/multimodal/src/registry/qwen3_vl.rs, crates/multimodal/src/registry/qwen_vl.rs
Replaced model_id substring heuristics with metadata.matches_model_type_hints([...]) and added unit tests that verify matching via config.model_type aliases.
Registry Traits
crates/multimodal/src/registry/traits.rs
Added pub fn config_model_type(&self) -> Option<&str>, pub fn matches_model_type_hints(&self, hints: &[&str]) -> bool, and private fn matches_all_hints(...) to support case-insensitive model_type-based matching.
Image Processor Registry
crates/multimodal/src/vision/image_processor.rs
Changed find(model_id) → find(model_id, model_type: Option<&str>); added find_in_candidate() helper and a two-step lookup: fast path by model_id, then fallback by model_type when provided. Also registered the phi3_v processor and updated tests.
gRPC Multimodal Handler
model_gateway/src/routers/grpc/multimodal.rs
Extracts model_type from loaded model config and passes it into image_processor_registry.find(model_id, model_type), enabling config-based fallback for processor selection.
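The two-step lookup the handler drives can be sketched as below. This is a minimal model, not the registry code: string pairs stand in for the pattern-to-processor map, and the longest-match tie-break reflects the review discussion rather than the merged implementation.

```rust
// Sketch of the two-step lookup: fast path on the request-facing model_id,
// then fallback on the config-derived model_type.
fn find<'a>(
    processors: &'a [(&'a str, &'a str)], // (pattern, processor) stand-ins
    model_id: &str,
    model_type: Option<&str>,
) -> Option<&'a str> {
    let find_in_candidate = |candidate: &str| {
        let candidate = candidate.to_ascii_lowercase();
        processors
            .iter()
            .filter(|(pattern, _)| candidate.contains(&pattern.to_ascii_lowercase()))
            // Prefer the longest pattern so "llava-next" beats "llava".
            .max_by_key(|(pattern, _)| pattern.len())
            .map(|&(_, processor)| processor)
    };
    find_in_candidate(model_id)
        .or_else(|| model_type.and_then(find_in_candidate))
}
```

An aliased ID such as "custom-model" misses the fast path but resolves via the model_type from config.json, while a descriptive ID still short-circuits without touching the fallback.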

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Gateway as ModelGateway
    participant Registry as ImageProcessorRegistry
    participant Metadata as ModelMetadata

    Client->>Gateway: request with model_id
    Gateway->>Metadata: load model config -> config.model_type
    Gateway->>Registry: find(model_id, Some(model_type))
    Registry->>Registry: find_in_candidate(model_id) (fast path)
    alt found by model_id
        Registry-->>Gateway: processor
    else not found
        Registry->>Registry: iterate processors by pattern using model_type
        Registry->>Metadata: matches_model_type_hints(hints)
        alt matches via model_type
            Registry-->>Gateway: processor
        else no match
            Registry-->>Gateway: None
        end
    end
    Gateway-->>Client: proceed or error

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~30 minutes

Suggested labels

multimodal, tests

Suggested reviewers

  • CatherineSue
  • key4ng
  • slin1237

Poem

🐰 I hopped through configs, sniffed the hints,
No more relying on model name glints.
When aliases hide and IDs astray,
model_type guides the processor's way.
Hooray — every model finds home today!

🚥 Pre-merge checks: 5 passed
  • Description check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title check ✅ Passed: the title clearly summarizes the main change, adding a fallback to the config model type for aliased vision models, which directly addresses the PR's core objective.
  • Linked Issues check ✅ Passed: changes comprehensively address #754: adds config_model_type() and matches_model_type_hints() helpers [traits.rs], updates all multimodal specs (llama4, llava, phi3_v, qwen3_vl, qwen_vl) to use the config fallback [registry/*.rs], implements the model_type fallback in the image processor registry [image_processor.rs], and integrates the fallback in gRPC multimodal preprocessing [multimodal.rs].
  • Out of Scope Changes check ✅ Passed: all changes are scoped to the model_type fallback for aliased vision models (trait helpers, registry spec updates, image processor registry modifications, gRPC integration); no unrelated changes detected.
  • Docstring Coverage ✅ Passed: docstring coverage is 80.95%, above the required 80.00% threshold.


@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves an issue where multimodal family detection and image processor lookup failed when vision models were exposed under aliased names, as the system previously relied solely on the request-facing model ID. The solution introduces a robust fallback mechanism that utilizes the model_type found in the model's configuration, ensuring that aliased models are correctly identified and processed. This enhancement significantly improves the reliability of multimodal chat preparation within the gRPC path.

Highlights

  • Multimodal Model Identification: Added ModelMetadata::config_model_type() and matches_model_type_hints() helper methods to enable shared logic for identifying multimodal models based on their configuration type.
  • Fallback for Aliased Models: Updated built-in multimodal specifications (Llama4, Llava, Phi3-V, Qwen3-VL, Qwen-VL) to use config.model_type as a fallback when the model_id does not directly match, improving support for aliased vision models.
  • Image Processor Lookup Enhancement: Refactored the image processor lookup to find(model_id, model_type), allowing it to fall back to the model_type from the model configuration for better identification.
  • Multimodal Preprocessing Integration: Integrated the loaded model_type into the model_gateway's multimodal preprocessing logic, ensuring it is used during image processor selection.
  • Regression Tests: Introduced new regression tests to verify correct behavior for aliased model IDs and to ensure fast-path preservation across multimodal specifications and image processor lookups.
Changelog
  • crates/multimodal/src/registry/llama4.rs
    • Updated the matches method to use the new matches_model_type_hints helper for model identification.
    • Added a new test case llama4_matches_alias_via_model_type to verify correct matching with aliased model IDs.
  • crates/multimodal/src/registry/llava.rs
    • Updated the matches method to use the new matches_model_type_hints helper for model identification.
    • Added a new test case llava_matches_alias_via_model_type to verify correct matching with aliased model IDs.
  • crates/multimodal/src/registry/phi3_v.rs
    • Updated the matches method to use the new matches_model_type_hints helper for model identification.
    • Added a new test case phi3_matches_alias_via_model_type to verify correct matching with aliased model IDs.
  • crates/multimodal/src/registry/qwen3_vl.rs
    • Updated the matches method to use the new matches_model_type_hints helper for model identification.
    • Added a new test case qwen3_vl_matches_alias_via_model_type to verify correct matching with aliased model IDs.
  • crates/multimodal/src/registry/qwen_vl.rs
    • Updated the matches method to use the new matches_model_type_hints helper for model identification.
    • Added a new test case qwen_vl_matches_alias_via_model_type to verify correct matching with aliased model IDs.
  • crates/multimodal/src/registry/traits.rs
    • Added config_model_type method to ModelMetadata to retrieve the model type from the configuration.
    • Introduced matches_model_type_hints method to ModelMetadata for flexible model type matching using hints.
    • Implemented matches_all_hints as a private helper for case-insensitive substring matching against candidate strings.
  • crates/multimodal/src/vision/image_processor.rs
    • Modified the find method to accept an optional model_type for fallback lookup, enhancing model identification.
    • Updated has_processor to use the new find method signature.
    • Introduced a private find_in_candidate helper method to encapsulate pattern matching logic.
    • Adjusted existing test calls to find to include the new model_type parameter as None.
    • Added test_registry_find_falls_back_to_model_type to test fallback behavior for aliased models.
    • Added test_registry_find_preserves_fast_path to ensure direct model ID matches are prioritized.
  • model_gateway/src/routers/grpc/multimodal.rs
    • Extracted the model_type from the loaded model configuration for use in subsequent steps.
    • Passed the extracted model_type to the image_processor_registry.find method for improved image processor selection.
    • Removed a significant amount of unused code related to token expansion and multimodal data assembly, simplifying the file.
Activity
  • No specific human activity (comments, reviews, progress updates) has been recorded for this pull request yet.


mergify bot commented Mar 13, 2026

Hi @YouNeedCryDear, the DCO sign-off check has failed. All commits must include a Signed-off-by line.

To fix existing commits:

# Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-lease

To sign off future commits automatically:

  • Use git commit -s every time, or
  • VSCode: enable Git: Always Sign Off in Settings
  • PyCharm: enable Sign-off commit in the Commit tool window


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request correctly implements a fallback mechanism to use the model_type from the configuration when the model_id does not match a known multimodal model family. This is achieved by introducing a new helper matches_model_type_hints and updating the matches methods in the model processor specs. The image processor lookup is also updated to use this fallback. The changes are well-tested with new regression tests. However, there appears to be a significant accidental deletion of code in model_gateway/src/routers/grpc/multimodal.rs, which is a critical issue that needs to be addressed.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ddf2bb065d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".


@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
model_gateway/src/routers/grpc/multimodal.rs (1)

376-384: ⚠️ Potential issue | 🔴 Critical

This PR has critical compilation errors that must be fixed before merging.

The code cannot compile. Two blocking issues:

  1. The expand_tokens function in multimodal.rs (line 370–383) is incomplete—it ends with an unclosed brace and missing function body closure.

  2. The assemble_multimodal_data function is called in request_building.rs (line 81, imported line 13) but does not exist in multimodal.rs or anywhere else in the codebase. This will produce an undefined reference error.

Confirm that the intended final state includes:

  • A complete, properly closed expand_tokens function implementation
  • Either restoration of assemble_multimodal_data or removal of its call site in request_building.rs
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/multimodal.rs` around lines 376 - 384, The
expand_tokens function is unterminated and missing its final return/closing
brace: finish its implementation in multimodal.rs (complete the logic after
resolving placeholder_token_id and ensure it returns an ExpandedTokens struct
with token_ids, placeholders, and patch_offsets) and add the missing closing
brace to properly end the function; additionally resolve the undefined
assemble_multimodal_data symbol by either restoring/implementing
assemble_multimodal_data in multimodal.rs with the same public signature
expected by request_building.rs (so request_building.rs can call it), or
remove/replace the call in request_building.rs (where assemble_multimodal_data
is imported and invoked) to use the new expand_tokens API — make sure to
reference the expand_tokens function and assemble_multimodal_data symbol when
editing so compilation errors are eliminated.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 50cdc872-a864-45a4-b66c-b3e3ebff2d42

📥 Commits

Reviewing files that changed from the base of the PR and between f687596 and ddf2bb0.

📒 Files selected for processing (8)
  • crates/multimodal/src/registry/llama4.rs
  • crates/multimodal/src/registry/llava.rs
  • crates/multimodal/src/registry/phi3_v.rs
  • crates/multimodal/src/registry/qwen3_vl.rs
  • crates/multimodal/src/registry/qwen_vl.rs
  • crates/multimodal/src/registry/traits.rs
  • crates/multimodal/src/vision/image_processor.rs
  • model_gateway/src/routers/grpc/multimodal.rs

Signed-off-by: Arthur Cheng <arthur.cheng@oracle.com>
Signed-off-by: Arthur Cheng <arthur.cheng@oracle.com>

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 268c1fed01



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/multimodal/src/vision/image_processor.rs`:
- Around line 358-364: The current find_in_candidate method returns the first
substring match from self.processors (a HashMap), which is non-deterministic and
can pick shorter/overlapping patterns like "llava" before "llava-next"; change
it to collect all processors whose pattern (case-insensitive) is contained in
the candidate, then choose the most specific match deterministically by sorting
matches by descending pattern length and then by a stable tie-breaker (e.g.,
lexicographic order) and returning the first entry. Update find_in_candidate to
(1) lowercase candidate once, (2) iterate through self.processors to collect
matching (pattern, processor) pairs, (3) sort the collected matches by
(-pattern.len(), pattern) and (4) return the processor reference of the top
match or None if empty, ensuring case-insensitive comparisons and deterministic
selection.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: fd498390-1608-4021-a037-50d21ca8a048

📥 Commits

Reviewing files that changed from the base of the PR and between ddf2bb0 and 268c1fe.

📒 Files selected for processing (8)
  • crates/multimodal/src/registry/llama4.rs
  • crates/multimodal/src/registry/llava.rs
  • crates/multimodal/src/registry/phi3_v.rs
  • crates/multimodal/src/registry/qwen3_vl.rs
  • crates/multimodal/src/registry/qwen_vl.rs
  • crates/multimodal/src/registry/traits.rs
  • crates/multimodal/src/vision/image_processor.rs
  • model_gateway/src/routers/grpc/multimodal.rs

Comment on lines +358 to +364
fn find_in_candidate(&self, candidate: &str) -> Option<&dyn ImagePreProcessor> {
    let candidate = candidate.to_ascii_lowercase();
    for (pattern, processor) in &self.processors {
        if candidate.contains(&pattern.to_ascii_lowercase()) {
            return Some(processor.as_ref());
        }
    }

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

FILE="$(fd -p 'image_processor.rs' | head -n1)"
echo "Inspecting: $FILE"

echo
echo "[1] Confirm HashMap + first-match iteration:"
rg -n -C2 'processors:\s*HashMap|for \(pattern, processor\) in &self\.processors|return Some\(processor\.as_ref\(\)\)' "$FILE"

echo
echo "[2] List registered patterns and detect substring overlaps:"
python - <<'PY'
import re
from pathlib import Path

p = next(Path('.').rglob('image_processor.rs'))
text = p.read_text()

patterns = re.findall(r'registry\.register\(\s*"([^"]+)"', text)
print("patterns:", patterns)

overlaps = []
for a in patterns:
    for b in patterns:
        if a != b and b in a:
            overlaps.append((a, b))

print("overlaps:")
for a, b in sorted(set(overlaps)):
    print(f"  '{a}' contains '{b}'")
PY

Repository: lightseekorg/smg

Length of output: 1156


Deterministic processor selection is not guaranteed.

Line 360 returns the first substring match from a HashMap, but iteration order is non-deterministic. With overlapping patterns (for example llava-next and llava), this can select the wrong processor unpredictably. The patterns registered include both llava-next and llava, where one is a substring of the other, making the issue concrete.

🔧 Proposed fix (choose most specific match deterministically)
 fn find_in_candidate(&self, candidate: &str) -> Option<&dyn ImagePreProcessor> {
     let candidate = candidate.to_ascii_lowercase();
-    for (pattern, processor) in &self.processors {
-        if candidate.contains(&pattern.to_ascii_lowercase()) {
-            return Some(processor.as_ref());
-        }
-    }
-    None
+    self.processors
+        .iter()
+        .filter(|(pattern, _)| candidate.contains(&pattern.to_ascii_lowercase()))
+        .max_by(|(a, _), (b, _)| {
+            a.len()
+                .cmp(&b.len())
+                .then_with(|| a.as_str().cmp(b.as_str()))
+        })
+        .map(|(_, processor)| processor.as_ref())
 }

Signed-off-by: Arthur Cheng <arthur.cheng@oracle.com>

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
crates/multimodal/src/vision/image_processor.rs (1)

358-366: ⚠️ Potential issue | 🟠 Major

Non-deterministic processor selection for overlapping patterns.

This helper returns the first substring match from a HashMap, but iteration order is non-deterministic. With patterns like "llava-next" and "llava" both registered, a candidate like "llava-next-7b" matches both patterns, and either processor could be returned depending on HashMap layout.

The fix from the previous review (selecting the longest matching pattern) was not applied.

🔧 Proposed fix (select longest matching pattern deterministically)
 fn find_in_candidate(&self, candidate: &str) -> Option<&dyn ImagePreProcessor> {
     let candidate = candidate.to_ascii_lowercase();
-    for (pattern, processor) in &self.processors {
-        if candidate.contains(&pattern.to_ascii_lowercase()) {
-            return Some(processor.as_ref());
-        }
-    }
-    None
+    self.processors
+        .iter()
+        .filter(|(pattern, _)| candidate.contains(&pattern.to_ascii_lowercase()))
+        .max_by(|(a, _), (b, _)| {
+            a.len()
+                .cmp(&b.len())
+                .then_with(|| a.as_str().cmp(b.as_str()))
+        })
+        .map(|(_, processor)| processor.as_ref())
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/multimodal/src/vision/image_processor.rs` around lines 358 - 366,
find_in_candidate currently returns the first matching processor by iterating a
HashMap (non-deterministic) which causes wrong selection for overlapping
patterns; change it to scan all entries in self.processors, compare lowercase
pattern against the already-lowercased candidate, track the longest-matching
pattern (e.g., keep best_len and best_processor) and return the processor for
the longest match at the end; reference the find_in_candidate function, the
processors field, and the pattern/processor variables when making this change.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6c85fbea-d2c6-4497-af8b-ae038a774d4e

📥 Commits

Reviewing files that changed from the base of the PR and between 268c1fe and d5a6490.

📒 Files selected for processing (1)
  • crates/multimodal/src/vision/image_processor.rs


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d5a64908f5


Comment on lines +344 to +345
self.find_in_candidate(model_id)
    .or_else(|| model_type.and_then(|model_type| self.find_in_candidate(model_type)))


P2: Normalize model_type separators before fallback lookup

ImageProcessorRegistry::find now falls back to model_type, but it feeds the raw string into substring matching without normalizing separators. Because built-in keys include hyphenated patterns like llava-next/llava-v1.6, an underscore-form config value such as llava_next will miss the intended specific match and can fall through to the generic llava processor, causing aliased LLaVA-NeXT requests to be preprocessed with the wrong pipeline. Please normalize model_type (e.g., map _ to -) or add equivalent aliases before matching.
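A hedged sketch of the normalization this comment asks for; the function name and its placement are hypothetical, not code from the PR:

```rust
// Hypothetical helper: lowercase and fold '_' into '-' so an
// underscore-form config model_type such as "llava_next" can hit the
// hyphenated registry pattern "llava-next" instead of the generic "llava".
fn normalize_model_type(model_type: &str) -> String {
    model_type.to_ascii_lowercase().replace('_', "-")
}
```

The fallback branch would then match against normalize_model_type(model_type) rather than the raw config string.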



Labels

grpc gRPC client and router changes model-gateway Model gateway crate changes


Development

Successfully merging this pull request may close these issues.

[Bug]: Multimodal family detection fails for aliased/custom model IDs even when tokenizer/config point to the correct vision model

1 participant