
feat(gateway): add DIFFUSION model capability detection #736

Open
Kangyan-Zhou wants to merge 2 commits into lightseekorg:main from Kangyan-Zhou:add_capability
Conversation

@Kangyan-Zhou
Contributor

@Kangyan-Zhou Kangyan-Zhou commented Mar 11, 2026

Summary

  • Add DIFFUSION bitflag (1 << 12) to ModelType for detecting diffusion models (Stable Diffusion, Flux, SDXL, SD3, etc.)
  • External model discovery: pattern matching on model IDs (stable-diffusion, flux*, sd-*, sdxl*, *diffusion*)
  • Local worker discovery: detect via model_type="diffusion" label from SGLang multimodal_gen server's /model_info response
  • Includes DIFFUSION_MODEL composite, supports_diffusion() / is_diffusion_model() helpers, serde + JSON schema support
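
The flag layout described above can be sketched roughly as follows. This is a hand-rolled stand-in, not the PR's code: only the `1 << 12` value and the names `DIFFUSION`, `DIFFUSION_MODEL`, and `supports_diffusion()` come from the summary. The `u32` backing type, the `EMBEDDINGS` bit position, the `bits()` / `contains()` / `union()` helpers, and the assumption that the composite carries only the diffusion bit are illustrative guesses. `is_diffusion_model()` is left out because its exact semantics are not stated here.

```rust
/// Hypothetical stand-in for the real `ModelType` bitflags type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ModelType(u32);

impl ModelType {
    /// Assumed bit position for an unrelated capability (illustration only).
    pub const EMBEDDINGS: ModelType = ModelType(1 << 3);
    /// The new capability bit; the value is taken from the PR summary.
    pub const DIFFUSION: ModelType = ModelType(1 << 12);
    /// Composite for a diffusion model. Assumption: it carries only the
    /// DIFFUSION bit; the real composite may include other flags.
    pub const DIFFUSION_MODEL: ModelType = ModelType::DIFFUSION;

    /// Raw bit pattern.
    pub const fn bits(self) -> u32 {
        self.0
    }

    /// True when every bit of `other` is also set in `self`.
    pub const fn contains(self, other: ModelType) -> bool {
        self.0 & other.0 == other.0
    }

    /// Bitwise OR of two flag sets.
    pub const fn union(self, other: ModelType) -> ModelType {
        ModelType(self.0 | other.0)
    }

    /// Capability query named in the PR summary.
    pub const fn supports_diffusion(self) -> bool {
        self.contains(Self::DIFFUSION)
    }
}
```

Because each capability is a single bit, adding `DIFFUSION` to an existing flag set with `union` leaves the other capabilities untouched.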

Test plan

  • 6 unit tests in model_type.rs (flag basics, composite, serde roundtrip, display)
  • 2 unit tests in discover_models.rs (diffusion ID patterns, non-diffusion negative cases)
  • 4 unit tests in create_worker.rs (label detection, case insensitivity, precedence over embedding, negative case)
  • cargo clippy clean
  • pre-commit run --all-files passes

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added support for recognizing and handling diffusion models, including detection of popular variants like Stable Diffusion, SDXL, and Flux.
    • Improved model classification and capability tracking for diffusion-based models.
  • Tests

    • Added comprehensive test coverage for diffusion model detection and classification.

Add a new DIFFUSION bitflag (1 << 12) to ModelType for detecting
diffusion models (Stable Diffusion, Flux, SDXL, etc.) served by
SGLang's multimodal_gen server. Detection works via:
- External: model ID pattern matching (stable-diffusion, flux, sd-*, etc.)
- Local: model_type="diffusion" label from /model_info response

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@coderabbitai

coderabbitai bot commented Mar 11, 2026

📝 Walkthrough

Walkthrough

This pull request introduces a new diffusion model capability to the system. It adds DIFFUSION bitflags to ModelType, implements detection logic for diffusion models based on model identifiers, and updates model card creation to properly handle and enable the diffusion capability when appropriate.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Model Type Core: crates/protocols/src/model_type.rs | Added DIFFUSION and DIFFUSION_MODEL bitflags, capability name mapping for "diffusion", and public helper methods (supports_diffusion(), is_diffusion_model()) to query diffusion support. Includes comprehensive unit tests for flag behavior and serialization. |
| Diffusion Model Detection: model_gateway/src/core/steps/worker/external/discover_models.rs | Added detection logic in infer_model_type_from_id to identify diffusion models from ID patterns (stable-diffusion, sd-, sdxl, flux, etc.). Includes test cases validating correct classification of diffusion vs. non-diffusion models. |
| Model Card Creation: model_gateway/src/core/steps/worker/local/create_worker.rs | Updated build_model_card logic to handle diffusion model types: enables DIFFUSION flag for user-provided models and applies precedence rules (diffusion > embedding > vision) in the non-user-provided path. Includes tests for diffusion label interpretation and precedence behavior. |
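
The label check and the precedence rule (diffusion > embedding > vision) in the non-user-provided path might look roughly like the sketch below. All names here (`InferredKind`, `is_diffusion_label`, `infer_kind`, `labels_from`) are hypothetical; the real `build_model_card` works on `ModelType` flags and its signature is not shown in this thread. Only the `model_type` label key, the case-insensitive `"diffusion"` value, and the precedence order come from the PR.

```rust
use std::collections::HashMap;

/// Hypothetical result type; the real code sets `ModelType` flags instead.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InferredKind {
    Diffusion,
    Embedding,
    Vision,
    Llm,
}

/// Case-insensitive check for `model_type="diffusion"` in worker labels.
pub fn is_diffusion_label(labels: &HashMap<String, String>) -> bool {
    labels
        .get("model_type")
        .is_some_and(|v| v.eq_ignore_ascii_case("diffusion"))
}

/// Applies the stated precedence: diffusion, then embedding, then vision.
/// `is_embedding` and `has_vision` stand in for whatever signals the real
/// code derives from the worker's metadata.
pub fn infer_kind(
    labels: &HashMap<String, String>,
    is_embedding: bool,
    has_vision: bool,
) -> InferredKind {
    if is_diffusion_label(labels) {
        InferredKind::Diffusion
    } else if is_embedding {
        InferredKind::Embedding
    } else if has_vision {
        InferredKind::Vision
    } else {
        InferredKind::Llm
    }
}

/// Small helper for building label maps in examples and tests.
pub fn labels_from(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    pairs
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}
```

With labels `{"model_type": "Diffusion"}` the sketch returns `InferredKind::Diffusion` even when the embedding signal is also set, which mirrors the "precedence over embedding" test case listed in the test plan.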

Sequence Diagram

sequenceDiagram
    participant ModelDiscovery as Model Discovery
    participant TypeInference as Type Inference
    participant ModelCard as Model Card Builder

    ModelDiscovery->>ModelDiscovery: Parse model identifier
    ModelDiscovery->>TypeInference: Detect diffusion pattern<br/>(stable-diffusion, sd-, flux, etc.)
    TypeInference->>TypeInference: Infer model_type<br/>as DIFFUSION_MODEL
    TypeInference->>ModelCard: Pass detected type
    ModelCard->>ModelCard: Check if diffusion<br/>from model_type label
    alt User-Provided Model
        ModelCard->>ModelCard: Enable DIFFUSION flag<br/>if not supported
    else Non-User-Provided Model
        ModelCard->>ModelCard: Set DIFFUSION_MODEL<br/>with precedence over embedding
    end
    ModelCard->>ModelCard: Return configured<br/>model_card

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

protocols

Suggested reviewers

  • CatherineSue
  • key4ng

Poem

🐰 Hop hop, diffusion flows,
Through the model garden it goes,
Flags are set, detection's bright,
Cards are built with diffusion's might! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding DIFFUSION model capability detection to the gateway, which is accurately reflected across all three modified files. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 91.67% which is sufficient. The required threshold is 80.00%. |

@github-actions github-actions bot added the model-gateway label Mar 11, 2026
@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces comprehensive support for diffusion models within the gateway. It establishes a new ModelType for diffusion, enabling the system to correctly identify, categorize, and manage these models, whether discovered externally through naming conventions or locally via worker labels. This enhancement allows for better routing and handling of image generation and other diffusion-based tasks.

Highlights

  • New ModelType Bitflag: Added a DIFFUSION bitflag (1 << 12) to ModelType for identifying diffusion models (e.g., Stable Diffusion, Flux, SDXL).
  • External Model Discovery: Implemented external model discovery by pattern matching on model IDs, recognizing common diffusion model naming conventions like stable-diffusion, flux*, sd-*, sdxl*, and *diffusion*.
  • Local Worker Discovery: Enabled local worker discovery to detect diffusion models via the model_type="diffusion" label from SGLang multimodal_gen server's /model_info response.
  • API and Schema Support: Included a DIFFUSION_MODEL composite, supports_diffusion() and is_diffusion_model() helper functions, and extended serde and JSON schema support for the new diffusion model type.

Changelog

  • crates/protocols/src/model_type.rs
    • Added DIFFUSION bitflag.
    • Added DIFFUSION_MODEL composite.
    • Added diffusion to CAPABILITY_NAMES.
    • Added supports_diffusion and is_diffusion_model methods.
    • Updated JsonSchema for ModelType to include diffusion.
    • Added new unit tests for diffusion model type functionality.
  • model_gateway/src/core/steps/worker/external/discover_models.rs
    • Modified infer_model_type_from_id to detect diffusion models based on ID patterns.
    • Added unit tests for diffusion model ID inference.
  • model_gateway/src/core/steps/worker/local/create_worker.rs
    • Updated build_model_card to detect model_type="diffusion" labels.
    • Adjusted model type inference logic to prioritize diffusion model detection.
    • Added unit tests for diffusion model label detection.

Activity

  • 6 unit tests were added in model_type.rs covering flag basics, composite, serde roundtrip, and display.
  • 2 unit tests were added in discover_models.rs for diffusion ID patterns and negative cases.
  • 4 unit tests were added in create_worker.rs for label detection, case insensitivity, precedence, and negative cases.
  • cargo clippy ran clean.
  • pre-commit run --all-files passed.
  • The PR was generated with Claude Code.
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces support for diffusion models by adding a DIFFUSION capability to ModelType. The changes include updating model discovery logic for both external and local workers to correctly identify these models based on their IDs or labels. The implementation is well-structured and accompanied by a comprehensive set of unit tests. I have one suggestion to simplify the model ID matching logic for better readability and maintainability.

Comment on lines +118 to +127
    if id_lower.contains("stable-diffusion")
        || id_lower.contains("stable_diffusion")
        || id_lower.starts_with("sd-")
        || id_lower.starts_with("sd3")
        || id_lower.starts_with("sdxl")
        || id_lower.starts_with("flux")
        || id_lower.contains("diffusion")
    {
        return ModelType::DIFFUSION_MODEL;
    }

medium

The checks for stable-diffusion and stable_diffusion are redundant because the more general diffusion check at the end of the condition will catch them anyway. Removing these specific checks simplifies the code without changing the logic. Reordering to check starts_with before contains can also be slightly more performant.

    if id_lower.starts_with("sd-")
        || id_lower.starts_with("sd3")
        || id_lower.starts_with("sdxl")
        || id_lower.starts_with("flux")
        || id_lower.contains("diffusion")
    {
        return ModelType::DIFFUSION_MODEL;
    }
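
The redundancy claim can be spot-checked with a small sketch that puts both predicates side by side. The function names and the sample model IDs below are hypothetical, chosen only to exercise each branch; the two conditions themselves are copied from the original and the suggested code.

```rust
/// The condition as written in the PR.
pub fn is_diffusion_original(id: &str) -> bool {
    let id_lower = id.to_lowercase();
    id_lower.contains("stable-diffusion")
        || id_lower.contains("stable_diffusion")
        || id_lower.starts_with("sd-")
        || id_lower.starts_with("sd3")
        || id_lower.starts_with("sdxl")
        || id_lower.starts_with("flux")
        || id_lower.contains("diffusion")
}

/// The simplified condition from the review suggestion.
pub fn is_diffusion_simplified(id: &str) -> bool {
    let id_lower = id.to_lowercase();
    id_lower.starts_with("sd-")
        || id_lower.starts_with("sd3")
        || id_lower.starts_with("sdxl")
        || id_lower.starts_with("flux")
        || id_lower.contains("diffusion")
}

/// Illustrative IDs only; not taken from the PR's test suite.
pub const SAMPLE_IDS: [&str; 8] = [
    "stable-diffusion-xl-base-1.0",
    "Stable_Diffusion_v2",
    "sd-turbo",
    "sd3-medium",
    "sdxl-lightning",
    "FLUX.1-dev",
    "gpt-4o",
    "text-embedding-3-small",
];

/// True when both predicates give the same answer for every ID.
pub fn predicates_agree(ids: &[&str]) -> bool {
    ids.iter()
        .all(|id| is_diffusion_original(id) == is_diffusion_simplified(id))
}
```

Any string containing `stable-diffusion` or `stable_diffusion` also contains `diffusion`, so the two predicates agree on every input, not only on these samples.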

@Kangyan-Zhou Kangyan-Zhou marked this pull request as ready for review March 17, 2026 22:38
@mergify

mergify bot commented Mar 17, 2026

Hi @Kangyan-Zhou, the DCO sign-off check has failed. All commits must include a Signed-off-by line.

To fix existing commits:

# Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-lease

To sign off future commits automatically:

  • Use git commit -s every time, or
  • VSCode: enable Git: Always Sign Off in Settings
  • PyCharm: enable Sign-off commit in the Commit tool window

@coderabbitai coderabbitai bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
crates/protocols/src/model_type.rs (1)

313-331: 🧹 Nitpick | 🔵 Trivial

Reduce schema drift risk by deriving enum values from CAPABILITY_NAMES.

The list at Line 317–Line 331 duplicates capability names already maintained in CAPABILITY_NAMES; this can silently diverge in future flag additions.

♻️ Proposed refactor
     fn json_schema(_gen: &mut SchemaGenerator) -> Schema {
         use schemars::schema::*;
         let items = SchemaObject {
             instance_type: Some(InstanceType::String.into()),
-            enum_values: Some(vec![
-                "chat".into(),
-                "completions".into(),
-                "responses".into(),
-                "embeddings".into(),
-                "rerank".into(),
-                "generate".into(),
-                "vision".into(),
-                "tools".into(),
-                "reasoning".into(),
-                "image_gen".into(),
-                "audio".into(),
-                "moderation".into(),
-                "diffusion".into(),
-            ]),
+            enum_values: Some(
+                CAPABILITY_NAMES
+                    .iter()
+                    .map(|(_, name)| (*name).into())
+                    .collect(),
+            ),
             ..Default::default()
         };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/protocols/src/model_type.rs` around lines 313 - 331, The hard-coded
enum_values array in the json_schema implementation for ModelType should be
replaced with values derived from the existing CAPABILITY_NAMES constant to
avoid duplication and drift: locate the json_schema function and replace the
literal vec![...strings...] with an expression that iterates over the
(flag, name) pairs in CAPABILITY_NAMES (e.g.,
CAPABILITY_NAMES.iter().map(|(_, name)| (*name).into()).collect()) so the
SchemaObject.enum_values is built from CAPABILITY_NAMES; ensure the produced
collection matches the type expected by enum_values (a Vec of
serde_json::Value in schemars 0.8) and import or convert types as needed
so compilation continues to succeed.
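
A dependency-free sketch of the same idea, using `Vec<String>` in place of the schema's value type. The shape of `CAPABILITY_NAMES` as `(bits, name)` pairs is inferred from the `|(_, name)|` closure in the proposed refactor; the bit positions other than `1 << 12` for `diffusion` are assumptions, and `schema_enum_values` is a hypothetical helper name.

```rust
/// Stand-in for the real table. The names and their order follow the
/// hard-coded list in the diff; the bit values are assumed, except
/// `1 << 12` for "diffusion", which the PR states.
pub const CAPABILITY_NAMES: &[(u32, &str)] = &[
    (1 << 0, "chat"),
    (1 << 1, "completions"),
    (1 << 2, "responses"),
    (1 << 3, "embeddings"),
    (1 << 4, "rerank"),
    (1 << 5, "generate"),
    (1 << 6, "vision"),
    (1 << 7, "tools"),
    (1 << 8, "reasoning"),
    (1 << 9, "image_gen"),
    (1 << 10, "audio"),
    (1 << 11, "moderation"),
    (1 << 12, "diffusion"),
];

/// Builds the enum value list from the single source of truth, so a new
/// capability only has to be added to `CAPABILITY_NAMES`.
pub fn schema_enum_values() -> Vec<String> {
    CAPABILITY_NAMES
        .iter()
        .map(|(_, name)| (*name).to_string())
        .collect()
}
```

In the real `json_schema` implementation the `.to_string()` call would be replaced by a conversion into the schema's value type, as in the proposed refactor.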
model_gateway/src/core/steps/worker/external/discover_models.rs (1)

109-127: ⚠️ Potential issue | 🟠 Major

Diffusion classification is shadowed by the earlier generic image heuristic.

At Line 112, any ID containing "image" returns IMAGE_MODEL before the new diffusion checks run. That means IDs like flux-image-* or stable-diffusion-image-* will not be classified as diffusion.

🐛 Proposed fix (check diffusion before generic image matching)
-    // Image generation models
-    if id_lower.starts_with("dall-e")
-        || id_lower.starts_with("sora")
-        || (id_lower.contains("image") && !id_lower.contains("vision"))
-    {
-        return ModelType::IMAGE_MODEL;
-    }
-
     // Diffusion models (Stable Diffusion, Flux, SDXL, etc.)
     if id_lower.contains("stable-diffusion")
         || id_lower.contains("stable_diffusion")
         || id_lower.starts_with("sd-")
         || id_lower.starts_with("sd3")
         || id_lower.starts_with("sdxl")
         || id_lower.starts_with("flux")
         || id_lower.contains("diffusion")
     {
         return ModelType::DIFFUSION_MODEL;
     }
+
+    // Image generation models
+    if id_lower.starts_with("dall-e")
+        || id_lower.starts_with("sora")
+        || (id_lower.contains("image") && !id_lower.contains("vision"))
+    {
+        return ModelType::IMAGE_MODEL;
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/core/steps/worker/external/discover_models.rs` around lines
109 - 127, The image-vs-diffusion heuristic currently checks the generic image
condition on id_lower before diffusion checks, causing IDs like
"stable-diffusion-image-*" to be classified as IMAGE_MODEL; reorder or refine
the checks so diffusion detection runs first: move the block that returns
ModelType::DIFFUSION_MODEL (matching stable-diffusion, sd-, sdxl, flux,
diffusion) ahead of the generic image heuristic that returns
ModelType::IMAGE_MODEL, or alternatively narrow the image condition
(id_lower.contains("image") && !id_lower.contains("vision")) to explicitly
exclude diffusion keywords; update code paths that reference id_lower and the
ModelType::DIFFUSION_MODEL / ModelType::IMAGE_MODEL returns accordingly.
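
The shadowing can be reproduced with a minimal sketch that runs the two heuristics in both orders. `Kind`, the function names, and the sample ID `flux-image-pro` are hypothetical; the two conditions are copied from the diff above.

```rust
/// Simplified stand-in for the relevant `ModelType` outcomes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Kind {
    Image,
    Diffusion,
    Other,
}

/// The generic image heuristic from the diff.
pub fn is_image(id_lower: &str) -> bool {
    id_lower.starts_with("dall-e")
        || id_lower.starts_with("sora")
        || (id_lower.contains("image") && !id_lower.contains("vision"))
}

/// The diffusion heuristic from the diff.
pub fn is_diffusion(id_lower: &str) -> bool {
    id_lower.contains("stable-diffusion")
        || id_lower.contains("stable_diffusion")
        || id_lower.starts_with("sd-")
        || id_lower.starts_with("sd3")
        || id_lower.starts_with("sdxl")
        || id_lower.starts_with("flux")
        || id_lower.contains("diffusion")
}

/// Order before the proposed fix: image check first.
pub fn classify_image_first(id: &str) -> Kind {
    let id_lower = id.to_lowercase();
    if is_image(&id_lower) {
        Kind::Image
    } else if is_diffusion(&id_lower) {
        Kind::Diffusion
    } else {
        Kind::Other
    }
}

/// Order after the proposed fix: diffusion check first.
pub fn classify_diffusion_first(id: &str) -> Kind {
    let id_lower = id.to_lowercase();
    if is_diffusion(&id_lower) {
        Kind::Diffusion
    } else if is_image(&id_lower) {
        Kind::Image
    } else {
        Kind::Other
    }
}
```

For an ID such as `flux-image-pro`, the image-first order yields `Kind::Image` while the diffusion-first order yields `Kind::Diffusion`. IDs like `dall-e-3` are classified as images in both orders, so the reorder only affects IDs that match both heuristics.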
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 43b5b55d-8f04-4741-b363-f0fa06175195

📥 Commits

Reviewing files that changed from the base of the PR and between 7f54c64 and cbde84f.

📒 Files selected for processing (3)
  • crates/protocols/src/model_type.rs
  • model_gateway/src/core/steps/worker/external/discover_models.rs
  • model_gateway/src/core/steps/worker/local/create_worker.rs


    if !user_provided {
        let is_diffusion = labels
            .get("model_type")
Collaborator

I think this is only supported for SGLang atm.

Labels

model-gateway Model gateway crate changes

2 participants