feat(gateway): add message_utils and MessagePreparationStage for Messages API#741
Conversation
Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request extends the gateway's capabilities by laying the groundwork for the Anthropic Messages API. It introduces a new pipeline stage and a set of utility functions to convert Messages API requests into the internal chat template format, enabling consistent processing while reusing existing infrastructure. This is a foundational step in a series of changes to fully support the Messages API.
📝 Walkthrough

Adds a Messages API preparation pipeline: a new MessagePreparationStage, message transformation utilities converting Anthropic CreateMessageRequest into the internal chat-template flow, minor visibility and lint tweaks across gRPC context and utils, and wiring of the new messages submodule.
Sequence Diagram

```mermaid
sequenceDiagram
    participant GRPC as GRPC Pipeline
    participant MPS as MessagePreparationStage
    participant CTX as RequestContext
    participant TKZ as Tokenizer
    participant TOOL as Tool Extractor
    participant CONS as Constraint Generator
    participant STATE as Response State
    GRPC->>MPS: execute(request, ctx)
    MPS->>CTX: resolve_tokenizer()
    CTX-->>MPS: tokenizer
    MPS->>TOOL: extract_chat_tools(request.tools)
    TOOL-->>MPS: chat_tools
    MPS->>MPS: process_messages(request, tokenizer, chat_tools)
    MPS->>TKZ: tokenize(formatted_text)
    TKZ-->>MPS: token_ids
    alt tools present
        MPS->>CONS: generate_tool_constraints(chat_tools)
        CONS-->>MPS: constraints
    end
    MPS->>MPS: create_stop_decoder(stop_sequences)
    MPS->>CTX: store_preparation_output(prep_output)
    MPS->>STATE: store_stop_decoder(decoder)
    MPS-->>GRPC: return success
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Hi @slin1237, the DCO sign-off check has failed. All commits must include a `Signed-off-by` line.

To fix existing commits:

```shell
# Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-lease
```

To sign off future commits automatically:
Code Review
This pull request introduces the message_utils and MessagePreparationStage to support the Anthropic Messages API, aligning it with the existing chat pipeline. The changes include conversion functions for Messages API types to the internal chat template format, and a new preparation stage that leverages shared utilities. The visibility of process_tool_call_arguments in chat_utils.rs has been updated to pub(crate) to facilitate reuse. Overall, the changes are well-structured and follow the established architecture for handling different API types.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/grpc/utils/message_utils.rs`:
- Around line 223-242: The fold currently overwrites reasoning for
InputContentBlock::Thinking, losing prior blocks; change the accumulated
reasoning from Option<String> to Vec<String> (e.g., in the initial tuple passed
to blocks.iter().fold and the fold closure), push each t.thinking.clone() on
InputContentBlock::Thinking, and after the fold join the Vec<String> with the
desired separator (or keep the Vec if callers can handle it) so that the
produced (text_parts, tool_calls, reasoning) preserves all thinking blocks
instead of only the last one.
📒 Files selected for processing (8)

- model_gateway/src/routers/grpc/context.rs
- model_gateway/src/routers/grpc/regular/stages/messages/mod.rs
- model_gateway/src/routers/grpc/regular/stages/messages/preparation.rs
- model_gateway/src/routers/grpc/regular/stages/mod.rs
- model_gateway/src/routers/grpc/regular/stages/preparation.rs
- model_gateway/src/routers/grpc/utils/chat_utils.rs
- model_gateway/src/routers/grpc/utils/message_utils.rs
- model_gateway/src/routers/grpc/utils/mod.rs

💤 Files with no reviewable changes (1)

- model_gateway/src/routers/grpc/context.rs
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: dee954d7bc
```rust
let tool_call_constraint = if filtered_tools.is_empty() {
    None
```
Reject required tool_choice when no matching tools remain
This branch skips tool constraints whenever filtered_tools is empty, which lets requests continue even when tool_choice requires tool use (any or a specific tool) but filtering/adaptation left no usable custom tools. In that scenario the request silently degrades to unconstrained text generation rather than returning a client error, violating the caller’s explicit tool-choice contract.
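A minimal sketch of the guard this comment asks for, using stand-in types (the `ToolChoice` variants and error strings here are assumptions, not the gateway's real API):

```rust
// Illustrative stand-in for the Messages API tool_choice variants.
enum ToolChoice {
    Auto,
    Any,
    Tool(String),
}

// If filtering left no usable tools but tool_choice *requires* tool use,
// return a client error instead of silently generating unconstrained text.
fn check_tool_constraint(
    filtered_tools_empty: bool,
    tool_choice: &ToolChoice,
) -> Result<(), String> {
    if filtered_tools_empty {
        match tool_choice {
            ToolChoice::Any => {
                Err("tool_choice requires tool use but no supported tools remain".into())
            }
            ToolChoice::Tool(name) => {
                Err(format!("tool_choice names '{name}' but it is not a usable tool"))
            }
            ToolChoice::Auto => Ok(()), // auto may legitimately proceed without tools
        }
    } else {
        Ok(())
    }
}
```

The real stage would map the `Err` into a 400-style client error rather than a plain `String`.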
…ages API
Add the first message-specific pipeline stage (Stage 1: Preparation) and
the utility functions it needs to convert Anthropic Messages API types
into the internal chat template format.
What changed:
- New message_utils.rs with conversion functions:
- process_messages(): top-level orchestrator parallel to process_chat_messages()
- process_message_content_format(): converts InputMessage to Vec<Value> JSON
- convert_user_message(): handles user messages, splits ToolResult into
separate "tool" role messages
- convert_assistant_message(): extracts text, tool_calls, reasoning_content
- extract_chat_tools(): filters Custom tools and converts to chat::Tool
- convert_message_tool_choice(): maps Messages ToolChoice to chat ToolChoice
- extract_tool_result_text(): helper for ToolResult content extraction
- 7 unit tests covering all major conversion paths
- New MessagePreparationStage (parallel to ChatPreparationStage):
- Same structure as ChatPreparationStage (impl method pattern)
- Resolves tokenizer, converts/filters tools, processes messages,
tokenizes, builds tool constraints, creates stop decoder
- Multimodal processing postponed (marked with async for future .await)
- Made process_tool_call_arguments pub(crate) in chat_utils.rs for reuse
- Updated delegating PreparationStage to use Display-based error messages
- Removed stale #[expect(dead_code)] from messages_request_arc (now used)
Why:
This is PR 2 in the Messages API gRPC pipeline series. PR 1 (#739) added
type scaffolding. This PR adds the preparation stage that converts
Messages API requests into the shared internal format, enabling the
existing request building and response processing stages to work with
Messages API requests in follow-up PRs.
How:
Follows the same architecture as chat: reuses shared utilities
(resolve_tokenizer, filter_tools_by_tool_choice, generate_tool_constraints,
create_stop_decoder, process_tool_call_arguments) and only replaces the
message-specific conversion layer (process_content_format → process_message_content_format).
Refs: #738
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
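The user-message splitting described in the commit message can be sketched with plain tuples standing in for the real JSON values (all types, field names, and the tool-message format below are simplified illustrations, not the gateway's actual code):

```rust
// Simplified stand-ins for the Messages API content blocks (illustrative).
enum Block {
    Text(String),
    ToolResult { tool_use_id: String, text: String },
}

/// Convert one user turn into chat-template messages, splitting each
/// ToolResult block out into a separate "tool" role message while
/// preserving the original block order.
fn convert_user_message(blocks: &[Block]) -> Vec<(String, String)> {
    let mut result = Vec::new();
    let mut user_parts: Vec<String> = Vec::new();
    for block in blocks {
        match block {
            Block::Text(t) => user_parts.push(t.clone()),
            Block::ToolResult { tool_use_id, text } => {
                // Flush buffered user text first so ordering is preserved.
                if !user_parts.is_empty() {
                    result.push(("user".into(), user_parts.join(" ")));
                    user_parts.clear();
                }
                result.push(("tool".into(), format!("[{tool_use_id}] {text}")));
            }
        }
    }
    if !user_parts.is_empty() {
        result.push(("user".into(), user_parts.join(" ")));
    }
    result
}
```

A turn like `[text, tool_result, text]` thus yields `user -> tool -> user` messages in sequence.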
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/grpc/utils/message_utils.rs`:
- Around line 366-374: The current extract_chat_tools function silently drops
non-Custom messages::Tool variants which changes request semantics; change
extract_chat_tools to return a Result<Vec<ChatTool>, RpcError> (or your
project's equivalent error type) and validate each tool: if a messages::Tool is
Custom convert it via custom_tool_to_chat_tool, but if it is Bash, WebSearch,
TextEditor, McpToolset or any other non-Custom variant return an explicit client
error (400) with a clear message listing the unsupported tool(s). Update callers
(e.g., the MessagePreparationStage path) to propagate/handle this Result so
requests with only unsupported tools fail fast instead of being treated as
no-tools.
- Around line 160-191: The current fold over InputContent::Blocks collects all
user_parts and tool_msgs then appends user content first, which reorders mixed
sequences (e.g., [text, tool_result, text]); modify the logic in the
InputContent::Blocks handling (around the fold) to iterate with a for loop over
blocks, accumulating user_parts and whenever you encounter
InputContentBlock::ToolResult flush the accumulated user_parts by calling
format_content_parts (using the same content_format) and push a user json into
result, then push the tool message (use extract_tool_result_text and
tr.tool_use_id) immediately to preserve original ordering, and continue
accumulating subsequent user_parts; ensure any remaining user_parts are flushed
after the loop.
- Around line 266-280: The current format_content_parts (match arm
ChatTemplateContentFormat::String) collapses non-text-only parts into an empty
string; change it to detect when no text parts were extracted and in that case
return the original parts as a Value::Array (preserving image/document
placeholders) instead of Value::String(""), mirroring the behavior of
transform_content_field; locate format_content_parts and update the
ChatTemplateContentFormat::String branch to conditionally return the joined text
when present or the original parts array when text is absent.
```rust
pub(crate) fn extract_chat_tools(tools: &[messages::Tool]) -> Vec<ChatTool> {
    tools
        .iter()
        .filter_map(|t| match t {
            messages::Tool::Custom(custom) => Some(custom_tool_to_chat_tool(custom)),
            _ => None,
        })
        .collect()
}
```
Reject unsupported Messages tool types instead of silently dropping them.
Filtering out non-Custom variants here changes request semantics. A request containing only Bash/WebSearch/TextEditor/McpToolset tools reaches MessagePreparationStage as if it had no tools at all, which also bypasses tool_choice enforcement for any or named-tool requests. This should fail fast with a 400 rather than degrade into an unconstrained generation request.
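One way the fail-fast variant could look, sketched with simplified stand-ins for `messages::Tool` and `chat::Tool` (the variant set and the `String` error type are assumptions):

```rust
// Illustrative stand-ins for the gateway's tool types.
enum Tool {
    Custom(String),
    Bash,
    WebSearch,
}

#[derive(Debug)]
struct ChatTool {
    name: String,
}

// Return an error naming every unsupported tool instead of silently
// dropping non-Custom variants and degrading the request to "no tools".
fn extract_chat_tools(tools: &[Tool]) -> Result<Vec<ChatTool>, String> {
    let mut out = Vec::new();
    let mut unsupported = Vec::new();
    for t in tools {
        match t {
            Tool::Custom(name) => out.push(ChatTool { name: name.clone() }),
            Tool::Bash => unsupported.push("bash"),
            Tool::WebSearch => unsupported.push("web_search"),
        }
    }
    if unsupported.is_empty() {
        Ok(out)
    } else {
        // The caller (MessagePreparationStage) would map this to a 400.
        Err(format!("unsupported tool types: {}", unsupported.join(", ")))
    }
}
```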
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d826daa9ef
```rust
if !user_parts.is_empty() {
    let content = format_content_parts(user_parts, content_format);
    result.push(json!({"role": "user", "content": content}));
}
result.extend(tool_msgs);
```
Preserve tool-result block order when splitting user content
convert_user_message() accumulates all user text/media parts and all tool-result parts separately, then always appends the synthesized user message before appending tool messages. For mixed user content where a tool_result block appears before text (or is interleaved), this reorders the turn and can make the model consume follow-up user text before the tool output it depends on, producing incorrect tool-loop behavior.
```rust
Some(ToolResultContent::Blocks(blocks)) => blocks
    .iter()
    .filter_map(|b| match b {
        messages::ToolResultContentBlock::Text(t) => Some(t.text.as_str()),
        _ => None,
```
Preserve non-text tool results instead of dropping them
extract_tool_result_text() only keeps ToolResultContentBlock::Text and discards other valid block types (Image, Document, SearchResult), so non-text tool results are silently converted to empty/partial tool messages. When tools return non-text output, the prompt loses the actual result content and the model receives an incorrect conversation state.
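One possible direction, sketched with stand-in types: keep a textual placeholder for non-text blocks so the templated conversation still records that the tool returned output (the placeholder strings are an assumption; the review leaves the exact representation open):

```rust
// Simplified stand-in for messages::ToolResultContentBlock (illustrative).
enum ToolResultContentBlock {
    Text(String),
    Image,
    Document,
}

/// Keep a placeholder for non-text blocks instead of silently dropping them,
/// so the prompt still reflects that the tool produced non-text output.
fn extract_tool_result_text(blocks: &[ToolResultContentBlock]) -> String {
    blocks
        .iter()
        .map(|b| match b {
            ToolResultContentBlock::Text(t) => t.clone(),
            ToolResultContentBlock::Image => "[image]".to_string(),
            ToolResultContentBlock::Document => "[document]".to_string(),
        })
        .collect::<Vec<_>>()
        .join(" ")
}
```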
Actionable comments posted: 1
♻️ Duplicate comments (3)
model_gateway/src/routers/grpc/utils/message_utils.rs (3)
273-289: ⚠️ Potential issue | 🟠 Major — Keep non-text-only parts instead of returning `""` in `String` mode.

When a user turn contains only image/document placeholders, this returns an empty string and silently drops the content. `transform_content_field()` in `model_gateway/src/routers/grpc/utils/chat_utils.rs` preserves the original array when no text parts exist, so the Messages path currently diverges from the Chat path.

Suggested fix:

```diff
 fn format_content_parts(parts: Vec<Value>, content_format: ChatTemplateContentFormat) -> Value {
     match content_format {
         ChatTemplateContentFormat::String => {
-            // Extract text parts and join
-            let text: String = parts
+            let text_parts: Vec<String> = parts
                 .iter()
                 .filter_map(|p| {
                     p.as_object()
                         .and_then(|obj| obj.get("type")?.as_str().filter(|&t| t == "text"))
                         .and_then(|_| p.as_object()?.get("text")?.as_str())
                         .map(String::from)
                 })
-                .collect::<Vec<_>>()
-                .join(" ");
-            Value::String(text)
+                .collect();
+
+            if text_parts.is_empty() {
+                Value::Array(parts)
+            } else {
+                Value::String(text_parts.join(" "))
+            }
         }
         ChatTemplateContentFormat::OpenAI => Value::Array(parts),
     }
 }
```
368-381: ⚠️ Potential issue | 🟠 Major — Unsupported Messages tools should fail fast, not vanish.

Filtering out every non-`Custom` variant changes request semantics. A request containing only `Bash`/`WebSearch`/`TextEditor`/`McpToolset` tools reaches `MessagePreparationStage` as if it had no tools, which also bypasses named/required `tool_choice` handling.

Return a `Result` here and let `MessagePreparationStage` convert unsupported tool variants into a 400 with a clear error message instead of silently degrading the request.
160-191: ⚠️ Potential issue | 🟠 Major — Preserve mixed user/tool-result ordering.

This fold buffers every user part and appends every synthesized `tool` message afterward. A source turn like `[text, tool_result, text]` becomes `user -> tool` instead of `user -> tool -> user`, which changes the message sequence before templating.

Suggested direction:

```diff
 fn convert_user_message(
     content: &InputContent,
     content_format: ChatTemplateContentFormat,
     result: &mut Vec<Value>,
 ) {
     match content {
         InputContent::String(text) => {
             result.push(json!({"role": "user", "content": text}));
         }
         InputContent::Blocks(blocks) => {
-            let (user_parts, tool_msgs) = blocks.iter().fold(
-                (Vec::new(), Vec::new()),
-                |(mut user_parts, mut tool_msgs), block| {
-                    match block {
-                        InputContentBlock::Text(t) => {
-                            user_parts.push(json!({"type": "text", "text": t.text}));
-                        }
-                        InputContentBlock::Image(_) => {
-                            user_parts.push(json!({"type": "image"}));
-                        }
-                        InputContentBlock::Document(_) => {
-                            user_parts.push(json!({"type": "document"}));
-                        }
-                        InputContentBlock::ToolResult(tr) => {
-                            tool_msgs.push(json!({
-                                "role": "tool",
-                                "tool_call_id": tr.tool_use_id,
-                                "content": extract_tool_result_text(tr)
-                            }));
-                        }
-                        _ => {}
-                    }
-                    (user_parts, tool_msgs)
-                },
-            );
-
-            if !user_parts.is_empty() {
-                let content = format_content_parts(user_parts, content_format);
-                result.push(json!({"role": "user", "content": content}));
-            }
-            result.extend(tool_msgs);
+            let mut user_parts = Vec::new();
+            for block in blocks {
+                match block {
+                    InputContentBlock::Text(t) => {
+                        user_parts.push(json!({"type": "text", "text": t.text}));
+                    }
+                    InputContentBlock::Image(_) => {
+                        user_parts.push(json!({"type": "image"}));
+                    }
+                    InputContentBlock::Document(_) => {
+                        user_parts.push(json!({"type": "document"}));
+                    }
+                    InputContentBlock::ToolResult(tr) => {
+                        if !user_parts.is_empty() {
+                            let content =
+                                format_content_parts(std::mem::take(&mut user_parts), content_format);
+                            result.push(json!({"role": "user", "content": content}));
+                        }
+                        result.push(json!({
+                            "role": "tool",
+                            "tool_call_id": tr.tool_use_id,
+                            "content": extract_tool_result_text(tr)
+                        }));
+                    }
+                    _ => {}
+                }
+            }
+
+            if !user_parts.is_empty() {
+                let content = format_content_parts(user_parts, content_format);
+                result.push(json!({"role": "user", "content": content}));
+            }
         }
     }
 }
```
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@model_gateway/src/routers/grpc/regular/stages/preparation.rs`:
- Around line 43-55: The match in PreparationStage::execute currently handles
RequestType::Chat and RequestType::Generate but misses RequestType::Messages,
causing Messages requests to hit the catch-all; add a branch for
RequestType::Messages(_) that delegates to the MessagePreparationStage (e.g.,
call self.message_stage.execute(ctx).await) before the fallback, ensuring the
new MessagePreparationStage is invoked like chat_stage and generate_stage are.
```diff
 match &ctx.input.request_type {
     RequestType::Chat(_) => self.chat_stage.execute(ctx).await,
     RequestType::Generate(_) => self.generate_stage.execute(ctx).await,
-    other => {
-        let type_name = match other {
-            RequestType::Embedding(_) => "Embedding",
-            RequestType::Classify(_) => "Classify",
-            RequestType::Responses(_) => "Responses",
-            _ => "Unknown",
-        };
+    request_type => {
         error!(
             function = "PreparationStage::execute",
-            "RequestType::{type_name} reached regular preparation stage"
+            request_type = %request_type,
+            "{request_type} request type reached regular preparation stage"
         );
         Err(grpc_error::internal_error(
             "wrong_pipeline",
-            format!("RequestType::{type_name} should use its dedicated pipeline"),
+            format!("{request_type} should use its dedicated pipeline"),
         ))
```
🧩 Analysis chain

🏁 Script executed:

```shell
#!/bin/bash
set -euo pipefail
printf 'RequestType variants:\n'
rg -n -C2 '\benum\s+RequestType\b|RequestType::Messages|Messages\(' model_gateway/src/routers/grpc
printf '\nPreparation-stage dispatch:\n'
rg -n -C2 'MessagePreparationStage|RequestType::Chat|RequestType::Generate|RequestType::Messages' \
  model_gateway/src/routers/grpc/regular/stages
```

Repository: lightseekorg/smg

Length of output: 15470
Delegate RequestType::Messages before falling back to wrong_pipeline.
The preparation stage match only routes Chat and Generate. The new MessagePreparationStage is therefore unreachable—a Messages request will fall through to the catch-all and return the internal wrong_pipeline error instead of being prepared. Other stages in the regular pipeline (request_building and response_processing) already explicitly handle Messages, so this dispatch is inconsistent.
Verification
RequestType::Messages is defined in context.rs:56, and MessagePreparationStage is properly exported. However, the preparation stage dispatcher (lines 43–46) only has branches for Chat and Generate, while request_building.rs and response_processing.rs both have explicit (RequestType::Responses(_) | RequestType::Messages(_)) branches.
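The missing branch can be illustrated with a reduced dispatch function (stand-in types; the real stages take a `RequestContext` and are async):

```rust
// Illustrative stand-in for the gateway's RequestType enum.
enum RequestType {
    Chat,
    Generate,
    Messages,
    Embedding,
}

// Reduced sketch of PreparationStage::execute dispatch: the key point is
// that Messages must be routed to its own stage before the catch-all.
fn dispatch(request_type: &RequestType) -> Result<&'static str, String> {
    match request_type {
        RequestType::Chat => Ok("chat_stage"),
        RequestType::Generate => Ok("generate_stage"),
        // The missing branch: delegate Messages to MessagePreparationStage
        // instead of letting it fall through to the wrong_pipeline error.
        RequestType::Messages => Ok("message_stage"),
        RequestType::Embedding => {
            Err("Embedding should use its dedicated pipeline".into())
        }
    }
}
```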
…arams for Messages API
Add Stage 4 (request building) for the Messages API gRPC pipeline,
converting PreparationOutput + CreateMessageRequest sampling parameters
into backend-specific proto GenerateRequest.
What changed:
- model_gateway/src/routers/grpc/regular/stages/messages/request_building.rs:
New MessageRequestBuildingStage (copied from chat, adapted for Messages).
Uses msg_{uuid} request ID prefix, calls build_messages_request(),
skips multimodal (postponed), no filtered_request pattern.
- model_gateway/src/routers/grpc/client.rs:
Add build_messages_request() dispatcher on GrpcClient enum, dispatching
to each backend's build_generate_request_from_messages().
- crates/grpc_client/src/sglang_scheduler.rs:
Add build_generate_request_from_messages() and
build_grpc_sampling_params_from_messages(). Maps CreateMessageRequest
fields (max_tokens, temperature, top_p, top_k, stop_sequences) to
sglang proto SamplingParams with sensible defaults for missing fields.
- crates/grpc_client/src/vllm_engine.rs:
Same pattern for vLLM backend. Handles vLLM-specific differences
(top_k=0 for disabled, Option<f32> temperature).
- crates/grpc_client/src/trtllm_service.rs:
Same pattern for TRT-LLM backend using SamplingConfig, OutputConfig,
and GuidedDecodingParams proto types.
- model_gateway/src/routers/grpc/regular/stages/messages/mod.rs:
Wire request_building module and re-export MessageRequestBuildingStage.
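The field mapping described above can be sketched with simplified stand-ins for the request and proto types (the struct shapes and default values below are assumptions, not the gateway's actual definitions):

```rust
// Illustrative stand-in for the Anthropic CreateMessageRequest sampling fields.
struct CreateMessageRequest {
    max_tokens: u32,
    temperature: Option<f32>,
    top_p: Option<f32>,
    top_k: Option<i32>,
    stop_sequences: Vec<String>,
}

// Illustrative stand-in for a backend proto SamplingParams message.
struct SamplingParams {
    max_new_tokens: u32,
    temperature: f32,
    top_p: f32,
    top_k: i32,
    stop: Vec<String>,
}

// Map Messages API fields onto backend sampling params, filling defaults
// for missing fields (the defaults here are assumed, and backends differ:
// e.g., vLLM uses top_k = 0 for "disabled" where this sketch uses -1).
fn build_sampling_params(req: &CreateMessageRequest) -> SamplingParams {
    SamplingParams {
        max_new_tokens: req.max_tokens,
        temperature: req.temperature.unwrap_or(1.0),
        top_p: req.top_p.unwrap_or(1.0),
        top_k: req.top_k.unwrap_or(-1),
        stop: req.stop_sequences.clone(),
    }
}
```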
Why: This is PR 3 in the Messages API gRPC series. Stage 4 bridges
the gap between preparation (Stage 1, PR #741) and response processing
(Stage 7, future PR), enabling the pipeline to build backend-specific
proto requests from Messages API parameters.
Refs: #739, #741
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
…on-streaming)

Add Stage 7 (response processing) for the Messages API gRPC pipeline. This converts backend ProtoGenerateComplete responses into Anthropic Message format with proper ContentBlock construction and StopReason mapping. Non-streaming only; streaming deferred to a follow-up PR.

What changed:
- processor.rs: add process_non_streaming_messages_response() to ResponseProcessor — full pipeline: token decoding, reasoning parsing, tool call parsing, content block construction (Thinking → Text → ToolUse), StopReason mapping (EndTurn/MaxTokens/StopSequence/ToolUse), and messages::Usage building
- messages/response_processing.rs: new MessageResponseProcessingStage that extracts execution result, dispatch metadata, tokenizer, and stop decoder from RequestContext, delegates to ResponseProcessor, and stores FinalResponse::Messages
- message_utils.rs: add get_history_tool_calls_count_messages() for counting tool use blocks in Messages API request history (needed for KimiK2-style tool call ID generation)
- messages/mod.rs: wire response_processing module with unused_imports expect (wired in pipeline factory PR)

Why: This is the fourth PR in the Messages API gRPC support series. With preparation (PR #741), request building (PR #744), and now response processing, three of the four endpoint-specific pipeline stages are complete. The shared stages (worker selection, client acquisition, dispatch, execution) are reused from the existing pipeline.
How: Follows the same architecture as chat's response processing but adapted for Anthropic Message types:
- Reuses existing convert_message_tool_choice() from message_utils to bridge Messages ToolChoice → Chat ToolChoice for parse_json_schema_response
- Reuses ResponseProcessor's parse_tool_calls() for the model-predicted path
- Content blocks ordered per Anthropic convention: Thinking first, then Text, then ToolUse blocks
- Tool calls parsed as OpenAI ToolCall (via existing parsers) then converted to ContentBlock::ToolUse with JSON input
- Messages always n=1, no logprobs
- ThinkingConfig::Enabled check replaces separate_reasoning bool

Refs: #739, #741, #744

Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
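The StopReason mapping listed in the commit message can be sketched with stand-in enums (the tool-call precedence shown here is an assumption about the stage's rules, not confirmed by the commit text):

```rust
// Illustrative stand-ins for the backend finish reason and the Anthropic
// stop_reason values named in the commit message.
enum FinishReason {
    Stop,
    Length,
    StopSequence,
}

#[derive(Debug, PartialEq)]
enum StopReason {
    EndTurn,
    MaxTokens,
    StopSequence,
    ToolUse,
}

// Map a backend finish reason to an Anthropic stop_reason. This sketch
// assumes tool calls take precedence over the raw finish reason.
fn map_stop_reason(finish: &FinishReason, has_tool_calls: bool) -> StopReason {
    if has_tool_calls {
        return StopReason::ToolUse;
    }
    match finish {
        FinishReason::Stop => StopReason::EndTurn,
        FinishReason::Length => StopReason::MaxTokens,
        FinishReason::StopSequence => StopReason::StopSequence,
    }
}
```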
Summary
- `message_utils.rs` with conversion functions for Anthropic Messages API types → internal chat template format
- `MessagePreparationStage` (Stage 1) parallel to `ChatPreparationStage`
- `process_tool_call_arguments` made `pub(crate)` for reuse across message and chat paths

PR 2 in the Messages API gRPC pipeline series. PR 1 was #739 (type scaffolding).
What changed
New: `message_utils.rs`

Conversion utilities parallel to `chat_utils.rs` but for `CreateMessageRequest`/`InputMessage`:

- `process_messages()` — top-level orchestrator (parallel to `process_chat_messages()`)
- `process_message_content_format()` — converts `InputMessage[]` to `Vec<Value>` for the chat template
- `convert_user_message()` — user messages with ToolResult splitting into separate "tool" role messages
- `convert_assistant_message()` — extracts text, tool_calls, reasoning_content
- `extract_chat_tools()` / `convert_message_tool_choice()` — type adapters from Messages API to chat types
- `extract_tool_result_text()` — helper for ToolResult content extraction

New: `MessagePreparationStage`

Created via git-cp from `ChatPreparationStage` to preserve file history. Key differences:

- `messages_request_arc()` instead of `chat_request_arc()`
- `message_utils::process_messages()` instead of `utils::process_chat_messages()`
- `extract_chat_tools()` + `convert_message_tool_choice()` adapters
- `request.stop_sequences` (Messages API) instead of `request.stop` (Chat API)
- Multimodal processing postponed (marked `async` for a future `.await`)
- No `filtered_request` / `Cow<ChatCompletionRequest>` pattern

Modified

- `chat_utils.rs`: `process_tool_call_arguments` visibility → `pub(crate)`
- `stages/preparation.rs`: delegating stage uses `Display`-based error messages
- `context.rs`: removed stale `#[expect(dead_code)]` from `messages_request_arc`

How

Follows the same architecture as chat — reuses shared utilities (`resolve_tokenizer`, `filter_tools_by_tool_choice`, `generate_tool_constraints`, `create_stop_decoder`, `process_tool_call_arguments`) and only replaces the message-specific conversion layer (`process_content_format` → `process_message_content_format`).

Test plan
- `cargo clippy -p smg --all-targets --all-features -- -D warnings` — clean
- `cargo fmt -p smg -- --check` — clean
- `cargo test -p smg --lib -- message_utils` — 7/7 pass
- `cargo test -p smg -- message` — all pass

Refs: #738