
feat(gateway): add message_utils and MessagePreparationStage for Messages API#741

Merged
slin1237 merged 1 commit into main from slin/msg-2
Mar 12, 2026
Conversation


@slin1237 slin1237 commented Mar 12, 2026

Summary

  • Add message_utils.rs with conversion functions for Anthropic Messages API types → internal chat template format
  • Add MessagePreparationStage (Stage 1) parallel to ChatPreparationStage
  • Make process_tool_call_arguments pub(crate) for reuse across message and chat paths

PR 2 in the Messages API gRPC pipeline series. PR 1 was #739 (type scaffolding).

What changed

New: message_utils.rs

Conversion utilities parallel to chat_utils.rs but for CreateMessageRequest / InputMessage:

  • process_messages() — top-level orchestrator (parallel to process_chat_messages())
  • process_message_content_format() — converts InputMessage[] to Vec<Value> for chat template
  • convert_user_message() — user messages with ToolResult splitting into separate "tool" role messages
  • convert_assistant_message() — extracts text, tool_calls, reasoning_content
  • extract_chat_tools() / convert_message_tool_choice() — type adapters from Messages API to chat types
  • extract_tool_result_text() — helper for ToolResult content extraction
  • 7 unit tests covering all major conversion paths
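The ToolResult-splitting behavior of convert_user_message() can be sketched roughly as follows. This is an illustrative reconstruction with simplified placeholder types, not the actual gateway types (which operate on serde_json values and the real Anthropic block enums):

```rust
// Simplified sketch of splitting ToolResult blocks out of a user turn into
// separate "tool" role messages. `Block` and the (role, content) output are
// illustrative stand-ins for the real InputContentBlock / Value types.

#[derive(Debug, PartialEq)]
enum Block {
    Text(String),
    ToolResult { tool_use_id: String, text: String },
}

/// Convert one user turn into chat-template messages, emitting each
/// ToolResult block as its own "tool" role message in source order.
fn convert_user_message(blocks: &[Block]) -> Vec<(String, String)> {
    let mut out = Vec::new();
    let mut user_text: Vec<String> = Vec::new();
    for block in blocks {
        match block {
            Block::Text(t) => user_text.push(t.clone()),
            Block::ToolResult { tool_use_id, text } => {
                // Flush accumulated user text first, then emit the tool message
                if !user_text.is_empty() {
                    out.push(("user".to_string(), user_text.join(" ")));
                    user_text.clear();
                }
                out.push(("tool".to_string(), format!("{tool_use_id}: {text}")));
            }
        }
    }
    if !user_text.is_empty() {
        out.push(("user".to_string(), user_text.join(" ")));
    }
    out
}

fn main() {
    let turn = vec![
        Block::Text("here is the result".into()),
        Block::ToolResult { tool_use_id: "tu_1".into(), text: "42".into() },
        Block::Text("what next?".into()),
    ];
    for (role, content) in convert_user_message(&turn) {
        println!("{role}: {content}");
    }
}
```

Flushing in place as each ToolResult is encountered keeps the relative order of user text and tool output intact, which matters for interleaved turns.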

New: MessagePreparationStage

Created via git-cp from ChatPreparationStage to preserve file history. Key differences:

  • Uses messages_request_arc() instead of chat_request_arc()
  • Calls message_utils::process_messages() instead of utils::process_chat_messages()
  • Converts tools via extract_chat_tools() + convert_message_tool_choice() adapters
  • Uses request.stop_sequences (Messages API) instead of request.stop (Chat API)
  • No multimodal processing (postponed, async preserved for future .await)
  • No filtered_request / Cow<ChatCompletionRequest> pattern

Modified

  • chat_utils.rs: process_tool_call_arguments visibility → pub(crate)
  • stages/preparation.rs: delegating stage uses Display-based error messages
  • context.rs: removed stale #[expect(dead_code)] from messages_request_arc

How

Follows the same architecture as chat — reuses shared utilities (resolve_tokenizer, filter_tools_by_tool_choice, generate_tool_constraints, create_stop_decoder, process_tool_call_arguments) and only replaces the message-specific conversion layer.

| Shared utility | Reused as-is? |
| --- | --- |
| resolve_tokenizer() | Yes |
| process_tool_call_arguments() | Yes (made pub(crate)) |
| generate_tool_constraints() | Yes (after adapter) |
| filter_tools_by_tool_choice() | Yes (after adapter) |
| create_stop_decoder() | Yes (small conversion from Vec<String>) |
| process_content_format() | No — replaced by process_message_content_format() |
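The "after adapter" rows describe a convert-once pattern: Messages API tool types are adapted into chat types up front, after which the shared chat-path utilities run unchanged. A hedged sketch with simplified placeholder types (the real enums and signatures live in the gateway crates):

```rust
// Illustrative adapter pattern: Messages API tool/tool_choice types are
// converted to chat types once, then shared chat utilities are reused.
// All types and function bodies here are simplified placeholders.

#[derive(Clone, Debug, PartialEq)]
struct ChatTool { name: String }

enum MessagesTool { Custom { name: String }, WebSearch }

enum MessagesToolChoice { Auto, Any, Tool { name: String } }

#[derive(Debug, PartialEq)]
enum ChatToolChoice { Auto, Required, Function { name: String } }

fn extract_chat_tools(tools: &[MessagesTool]) -> Vec<ChatTool> {
    tools.iter().filter_map(|t| match t {
        MessagesTool::Custom { name } => Some(ChatTool { name: name.clone() }),
        _ => None, // non-Custom variants are not convertible (see review notes below)
    }).collect()
}

fn convert_message_tool_choice(tc: &MessagesToolChoice) -> ChatToolChoice {
    match tc {
        MessagesToolChoice::Auto => ChatToolChoice::Auto,
        MessagesToolChoice::Any => ChatToolChoice::Required,
        MessagesToolChoice::Tool { name } => ChatToolChoice::Function { name: name.clone() },
    }
}

// Stand-in for the shared chat-path utility that both pipelines reuse.
fn filter_tools_by_tool_choice(tools: Vec<ChatTool>, choice: &ChatToolChoice) -> Vec<ChatTool> {
    match choice {
        ChatToolChoice::Function { name } => {
            tools.into_iter().filter(|t| &t.name == name).collect()
        }
        _ => tools,
    }
}

fn main() {
    let tools = vec![
        MessagesTool::Custom { name: "get_weather".into() },
        MessagesTool::WebSearch,
    ];
    let choice = convert_message_tool_choice(&MessagesToolChoice::Tool { name: "get_weather".into() });
    let filtered = filter_tools_by_tool_choice(extract_chat_tools(&tools), &choice);
    assert_eq!(filtered.len(), 1);
    println!("{filtered:?}");
}
```

Keeping the adaptation at the boundary is what lets generate_tool_constraints() and filter_tools_by_tool_choice() stay untouched.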

Test plan

  • cargo clippy -p smg --all-targets --all-features -- -D warnings — clean
  • cargo fmt -p smg -- --check — clean
  • cargo test -p smg --lib -- message_utils — 7/7 pass
  • cargo test -p smg -- message — all pass

Refs: #738

Summary by CodeRabbit

  • New Features

    • Messages API pipeline with a preparation stage for CreateMessage requests.
    • End-to-end message → chat-template processing, including tool handling, tokenization, stop-sequences, and template application.
    • Tool constraint generation and stop-decoder stored for downstream stages.
  • Refactor

    • Simplified unsupported-request-type error reporting.
    • Adjusted visibility and added utility modules for message/tool processing and reuse.

@github-actions github-actions Bot added grpc gRPC client and router changes model-gateway Model gateway crate changes labels Mar 12, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly extends the gateway's capabilities by laying the groundwork for the Anthropic Messages API. It introduces a new pipeline stage and a set of utility functions to seamlessly convert Messages API requests into the internal chat template format, enabling consistent processing while reusing existing infrastructure. This is a foundational step in a series of changes to fully support the Messages API.

Highlights

  • Messages API Integration: Introduced core components for integrating the Anthropic Messages API, including a new utility module and a dedicated preparation stage.
  • New Utility Module: Added message_utils.rs to handle conversions from Anthropic Messages API types to the internal chat template format, mirroring existing chat utilities.
  • Dedicated Preparation Stage: Implemented MessagePreparationStage to preprocess Messages API requests, managing tool conversion, message processing, tokenization, and tool constraint building.
  • Visibility Change: Modified the visibility of process_tool_call_arguments in chat_utils.rs to pub(crate) to allow reuse across both chat and messages API paths.
Changelog
  • model_gateway/src/routers/grpc/context.rs
    • Removed a dead_code attribute from messages_request_arc.
  • model_gateway/src/routers/grpc/regular/stages/messages/mod.rs
    • Added a new module to encapsulate Messages API-specific pipeline stages.
    • Exported MessagePreparationStage for use in the pipeline factory.
  • model_gateway/src/routers/grpc/regular/stages/messages/preparation.rs
    • Implemented the MessagePreparationStage to handle preprocessing of Anthropic Messages API requests.
    • Included logic for tokenizer resolution, tool conversion and filtering, message processing, tokenization, and tool constraint generation.
    • Stored processed data and a stop decoder in the request context.
  • model_gateway/src/routers/grpc/regular/stages/mod.rs
    • Added the new messages module to the list of regular stages.
  • model_gateway/src/routers/grpc/regular/stages/preparation.rs
    • Updated error logging and message formatting to use Display for RequestType variants.
  • model_gateway/src/routers/grpc/utils/chat_utils.rs
    • Changed the visibility of process_tool_call_arguments to pub(crate) for broader access within the crate.
  • model_gateway/src/routers/grpc/utils/message_utils.rs
    • Added comprehensive utility functions for converting Anthropic Messages API types (CreateMessageRequest, InputMessage) to the internal chat template format.
    • Included functions for processing messages, converting user and assistant message content, extracting tool results, and adapting Messages API tool choices and custom tools to Chat API types.
    • Provided unit tests for various conversion scenarios.
  • model_gateway/src/routers/grpc/utils/mod.rs
    • Exported the new message_utils module.
Activity
  • No human activity has occurred on this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


coderabbitai Bot commented Mar 12, 2026

📝 Walkthrough

Walkthrough

Adds a Messages API preparation pipeline: new MessagePreparationStage, message transformation utilities converting Anthropic CreateMessageRequest into the internal chat-template flow, minor visibility and lint tweaks across gRPC context and utils, and wiring of the new messages submodule.

Changes

Cohort / File(s) Summary
Stages wiring
model_gateway/src/routers/grpc/regular/stages/mod.rs, model_gateway/src/routers/grpc/regular/stages/messages/mod.rs
Adds messages submodule and re-exports MessagePreparationStage.
Message preparation stage
model_gateway/src/routers/grpc/regular/stages/messages/preparation.rs
New MessagePreparationStage implementing pipeline prep for CreateMessageRequest: resolves tokenizer, processes messages, tokenizes text, builds tool constraints and stop decoder, and saves preparation output to request/response state.
Message utilities
model_gateway/src/routers/grpc/utils/message_utils.rs, model_gateway/src/routers/grpc/utils/mod.rs
New module with process_messages(...) and helpers to convert Anthropic Messages API types to internal chat-template format; includes unit tests and extraction/formatting helpers.
Context & helpers
model_gateway/src/routers/grpc/context.rs, model_gateway/src/routers/grpc/utils/chat_utils.rs, model_gateway/src/routers/grpc/regular/stages/preparation.rs
Removed #[expect(dead_code)] from messages_request_arc accessor, made process_tool_call_arguments pub(crate), and simplified unsupported-request-type error logging/formatting.

Sequence Diagram

sequenceDiagram
    participant GRPC as GRPC Pipeline
    participant MPS as MessagePreparationStage
    participant CTX as RequestContext
    participant TKZ as Tokenizer
    participant TOOL as Tool Extractor
    participant CONS as Constraint Generator
    participant STATE as Response State

    GRPC->>MPS: execute(request, ctx)
    MPS->>CTX: resolve_tokenizer()
    CTX-->>MPS: tokenizer
    MPS->>TOOL: extract_chat_tools(request.tools)
    TOOL-->>MPS: chat_tools
    MPS->>MPS: process_messages(request, tokenizer, chat_tools)
    MPS->>TKZ: tokenize(formatted_text)
    TKZ-->>MPS: token_ids
    alt tools present
        MPS->>CONS: generate_tool_constraints(chat_tools)
        CONS-->>MPS: constraints
    end
    MPS->>MPS: create_stop_decoder(stop_sequences)
    MPS->>CTX: store_preparation_output(prep_output)
    MPS->>STATE: store_stop_decoder(decoder)
    MPS-->>GRPC: return success

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Suggested labels

anthropic, tests

Suggested reviewers

  • CatherineSue
  • key4ng

Poem

🐰 I hopped through pipelines, trimmed lint and thread,
Messages turned to tokens, tools politely led.
Stops tucked in pockets, templates snug and neat,
Rabbity scissors snipped—now requests are fleet.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title accurately summarizes the main changes: adding message_utils and MessagePreparationStage to support the Messages API, which aligns with the substantive additions across 6 files and removal of the stale dead_code annotation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

mergify Bot commented Mar 12, 2026

Hi @slin1237, the DCO sign-off check has failed. All commits must include a Signed-off-by line.

To fix existing commits:

# Sign off the last N commits (replace N with the number of unsigned commits)
git rebase HEAD~N --signoff
git push --force-with-lease

To sign off future commits automatically:

  • Use git commit -s every time, or
  • VSCode: enable Git: Always Sign Off in Settings
  • PyCharm: enable Sign-off commit in the Commit tool window


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces the message_utils and MessagePreparationStage to support the Anthropic Messages API, aligning it with the existing chat pipeline. The changes include conversion functions for Messages API types to the internal chat template format, and a new preparation stage that leverages shared utilities. The visibility of process_tool_call_arguments in chat_utils.rs has been updated to pub(crate) to facilitate reuse. Overall, the changes are well-structured and follow the established architecture for handling different API types.

Comment thread model_gateway/src/routers/grpc/utils/message_utils.rs

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/grpc/utils/message_utils.rs`:
- Around line 223-242: The fold currently overwrites reasoning for
InputContentBlock::Thinking, losing prior blocks; change the accumulated
reasoning from Option<String> to Vec<String> (e.g., in the initial tuple passed
to blocks.iter().fold and the fold closure), push each t.thinking.clone() on
InputContentBlock::Thinking, and after the fold join the Vec<String> with the
desired separator (or keep the Vec if callers can handle it) so that the
produced (text_parts, tool_calls, reasoning) preserves all thinking blocks
instead of only the last one.
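The fix described above, accumulating thinking blocks in a Vec<String> rather than an Option<String>, can be sketched with simplified placeholder types (the real code folds over the full InputContentBlock enum and also collects tool calls):

```rust
// Sketch of the suggested fix: collect every Thinking block in a Vec<String>
// so earlier blocks are not overwritten, then join them after the fold.
// Types are simplified placeholders for the gateway's real enums.

enum InputContentBlock {
    Text(String),
    Thinking(String),
}

fn split_assistant_blocks(blocks: &[InputContentBlock]) -> (Vec<String>, Option<String>) {
    let (text_parts, reasoning_parts) = blocks.iter().fold(
        (Vec::new(), Vec::new()),
        |(mut texts, mut reasoning), block| {
            match block {
                InputContentBlock::Text(t) => texts.push(t.clone()),
                // push instead of overwrite: all thinking blocks survive
                InputContentBlock::Thinking(t) => reasoning.push(t.clone()),
            }
            (texts, reasoning)
        },
    );
    let reasoning = if reasoning_parts.is_empty() {
        None
    } else {
        Some(reasoning_parts.join("\n"))
    };
    (text_parts, reasoning)
}

fn main() {
    let blocks = vec![
        InputContentBlock::Thinking("step 1".into()),
        InputContentBlock::Text("answer".into()),
        InputContentBlock::Thinking("step 2".into()),
    ];
    let (_, reasoning) = split_assistant_blocks(&blocks);
    assert_eq!(reasoning.as_deref(), Some("step 1\nstep 2"));
}
```

The separator used when joining (here "\n") is a choice left to the implementation; callers that can consume a Vec<String> directly could skip the join entirely.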

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 2c8607ce-0dea-40b6-b525-bff4a7920308

📥 Commits

Reviewing files that changed from the base of the PR and between e1d2183 and dee954d.

📒 Files selected for processing (8)
  • model_gateway/src/routers/grpc/context.rs
  • model_gateway/src/routers/grpc/regular/stages/messages/mod.rs
  • model_gateway/src/routers/grpc/regular/stages/messages/preparation.rs
  • model_gateway/src/routers/grpc/regular/stages/mod.rs
  • model_gateway/src/routers/grpc/regular/stages/preparation.rs
  • model_gateway/src/routers/grpc/utils/chat_utils.rs
  • model_gateway/src/routers/grpc/utils/message_utils.rs
  • model_gateway/src/routers/grpc/utils/mod.rs
💤 Files with no reviewable changes (1)
  • model_gateway/src/routers/grpc/context.rs

Comment thread model_gateway/src/routers/grpc/utils/message_utils.rs Outdated

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dee954d7bc

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread model_gateway/src/routers/grpc/utils/message_utils.rs
Comment on lines +108 to +109
let tool_call_constraint = if filtered_tools.is_empty() {
None

P2: Reject required tool_choice when no matching tools remain

This branch skips tool constraints whenever filtered_tools is empty, which lets requests continue even when tool_choice requires tool use (any or a specific tool) but filtering/adaptation left no usable custom tools. In that scenario the request silently degrades to unconstrained text generation rather than returning a client error, violating the caller’s explicit tool-choice contract.


feat(gateway): add message_utils and MessagePreparationStage for Messages API

Add the first message-specific pipeline stage (Stage 1: Preparation) and
the utility functions it needs to convert Anthropic Messages API types
into the internal chat template format.

What changed:
- New message_utils.rs with conversion functions:
  - process_messages(): top-level orchestrator parallel to process_chat_messages()
  - process_message_content_format(): converts InputMessage to Vec<Value> JSON
  - convert_user_message(): handles user messages, splits ToolResult into
    separate "tool" role messages
  - convert_assistant_message(): extracts text, tool_calls, reasoning_content
  - extract_chat_tools(): filters Custom tools and converts to chat::Tool
  - convert_message_tool_choice(): maps Messages ToolChoice to chat ToolChoice
  - extract_tool_result_text(): helper for ToolResult content extraction
  - 7 unit tests covering all major conversion paths
- New MessagePreparationStage (parallel to ChatPreparationStage):
  - Same structure as ChatPreparationStage (impl method pattern)
  - Resolves tokenizer, converts/filters tools, processes messages,
    tokenizes, builds tool constraints, creates stop decoder
  - Multimodal processing postponed (marked with async for future .await)
- Made process_tool_call_arguments pub(crate) in chat_utils.rs for reuse
- Updated delegating PreparationStage to use Display-based error messages
- Removed stale #[expect(dead_code)] from messages_request_arc (now used)

Why:
This is PR 2 in the Messages API gRPC pipeline series. PR 1 (#739) added
type scaffolding. This PR adds the preparation stage that converts
Messages API requests into the shared internal format, enabling the
existing request building and response processing stages to work with
Messages API requests in follow-up PRs.

How:
Follows the same architecture as chat: reuses shared utilities
(resolve_tokenizer, filter_tools_by_tool_choice, generate_tool_constraints,
create_stop_decoder, process_tool_call_arguments) and only replaces the
message-specific conversion layer (process_content_format → process_message_content_format).

Refs: #738
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/grpc/utils/message_utils.rs`:
- Around line 366-374: The current extract_chat_tools function silently drops
non-Custom messages::Tool variants which changes request semantics; change
extract_chat_tools to return a Result<Vec<ChatTool>, RpcError> (or your
project's equivalent error type) and validate each tool: if a messages::Tool is
Custom convert it via custom_tool_to_chat_tool, but if it is Bash, WebSearch,
TextEditor, McpToolset or any other non-Custom variant return an explicit client
error (400) with a clear message listing the unsupported tool(s). Update callers
(e.g., the MessagePreparationStage path) to propagate/handle this Result so
requests with only unsupported tools fail fast instead of being treated as
no-tools.
- Around line 160-191: The current fold over InputContent::Blocks collects all
user_parts and tool_msgs then appends user content first, which reorders mixed
sequences (e.g., [text, tool_result, text]); modify the logic in the
InputContent::Blocks handling (around the fold) to iterate with a for loop over
blocks, accumulating user_parts and whenever you encounter
InputContentBlock::ToolResult flush the accumulated user_parts by calling
format_content_parts (using the same content_format) and push a user json into
result, then push the tool message (use extract_tool_result_text and
tr.tool_use_id) immediately to preserve original ordering, and continue
accumulating subsequent user_parts; ensure any remaining user_parts are flushed
after the loop.
- Around line 266-280: The current format_content_parts (match arm
ChatTemplateContentFormat::String) collapses non-text-only parts into an empty
string; change it to detect when no text parts were extracted and in that case
return the original parts as a Value::Array (preserving image/document
placeholders) instead of Value::String(""), mirroring the behavior of
transform_content_field; locate format_content_parts and update the
ChatTemplateContentFormat::String branch to conditionally return the joined text
when present or the original parts array when text is absent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 46453291-4bcd-410a-88ce-261e231359a0

📥 Commits

Reviewing files that changed from the base of the PR and between dee954d and 6219384.

📒 Files selected for processing (8)
  • model_gateway/src/routers/grpc/context.rs
  • model_gateway/src/routers/grpc/regular/stages/messages/mod.rs
  • model_gateway/src/routers/grpc/regular/stages/messages/preparation.rs
  • model_gateway/src/routers/grpc/regular/stages/mod.rs
  • model_gateway/src/routers/grpc/regular/stages/preparation.rs
  • model_gateway/src/routers/grpc/utils/chat_utils.rs
  • model_gateway/src/routers/grpc/utils/message_utils.rs
  • model_gateway/src/routers/grpc/utils/mod.rs
💤 Files with no reviewable changes (1)
  • model_gateway/src/routers/grpc/context.rs

Comment thread model_gateway/src/routers/grpc/utils/message_utils.rs
Comment thread model_gateway/src/routers/grpc/utils/message_utils.rs
Comment on lines +366 to +374
pub(crate) fn extract_chat_tools(tools: &[messages::Tool]) -> Vec<ChatTool> {
tools
.iter()
.filter_map(|t| match t {
messages::Tool::Custom(custom) => Some(custom_tool_to_chat_tool(custom)),
_ => None,
})
.collect()
}

⚠️ Potential issue | 🟠 Major

Reject unsupported Messages tool types instead of silently dropping them.

Filtering out non-Custom variants here changes request semantics. A request containing only Bash/WebSearch/TextEditor/McpToolset tools reaches MessagePreparationStage as if it had no tools at all, which also bypasses tool_choice enforcement for any or named-tool requests. This should fail fast with a 400 rather than degrade into an unconstrained generation request.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/utils/message_utils.rs` around lines 366 -
374, The current extract_chat_tools function silently drops non-Custom
messages::Tool variants which changes request semantics; change
extract_chat_tools to return a Result<Vec<ChatTool>, RpcError> (or your
project's equivalent error type) and validate each tool: if a messages::Tool is
Custom convert it via custom_tool_to_chat_tool, but if it is Bash, WebSearch,
TextEditor, McpToolset or any other non-Custom variant return an explicit client
error (400) with a clear message listing the unsupported tool(s). Update callers
(e.g., the MessagePreparationStage path) to propagate/handle this Result so
requests with only unsupported tools fail fast instead of being treated as
no-tools.
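The direction the reviewers suggest, returning a Result so unsupported variants become an explicit client error, might look like the following sketch. The enum variants and String error type are illustrative placeholders; the real code would use the project's error type and map it to a 400:

```rust
// Hedged sketch of fail-fast tool extraction: unsupported Messages tool
// variants produce an error instead of being silently dropped.
// Types and the String error are simplified placeholders.

#[derive(Debug, PartialEq)]
struct ChatTool { name: String }

enum MessagesTool { Custom { name: String }, Bash, WebSearch }

impl MessagesTool {
    fn variant_name(&self) -> &'static str {
        match self {
            MessagesTool::Custom { .. } => "custom",
            MessagesTool::Bash => "bash",
            MessagesTool::WebSearch => "web_search",
        }
    }
}

fn extract_chat_tools(tools: &[MessagesTool]) -> Result<Vec<ChatTool>, String> {
    let mut out = Vec::new();
    let mut unsupported = Vec::new();
    for t in tools {
        match t {
            MessagesTool::Custom { name } => out.push(ChatTool { name: name.clone() }),
            other => unsupported.push(other.variant_name()),
        }
    }
    if unsupported.is_empty() {
        Ok(out)
    } else {
        // The caller (the preparation stage) would map this to a 400 response
        Err(format!("unsupported tool types: {}", unsupported.join(", ")))
    }
}

fn main() {
    let tools = vec![MessagesTool::Custom { name: "f".into() }, MessagesTool::Bash];
    assert!(extract_chat_tools(&tools).is_err());
    assert_eq!(
        extract_chat_tools(&[MessagesTool::Custom { name: "f".into() }]).unwrap().len(),
        1
    );
}
```

Collecting every unsupported variant before erroring lets the 400 message list all offending tools at once rather than failing on the first.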


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: d826daa9ef

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +187 to +191
if !user_parts.is_empty() {
let content = format_content_parts(user_parts, content_format);
result.push(json!({"role": "user", "content": content}));
}
result.extend(tool_msgs);

P1: Preserve tool-result block order when splitting user content

convert_user_message() accumulates all user text/media parts and all tool-result parts separately, then always appends the synthesized user message before appending tool messages. For mixed user content where a tool_result block appears before text (or is interleaved), this reorders the turn and can make the model consume follow-up user text before the tool output it depends on, producing incorrect tool-loop behavior.


Comment on lines +200 to +204
Some(ToolResultContent::Blocks(blocks)) => blocks
.iter()
.filter_map(|b| match b {
messages::ToolResultContentBlock::Text(t) => Some(t.text.as_str()),
_ => None,

P1: Preserve non-text tool results instead of dropping them

extract_tool_result_text() only keeps ToolResultContentBlock::Text and discards other valid block types (Image, Document, SearchResult), so non-text tool results are silently converted to empty/partial tool messages. When tools return non-text output, the prompt loses the actual result content and the model receives an incorrect conversation state.
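One way this could be addressed, sketched with simplified placeholder types: keep a typed placeholder for non-text blocks so the conversation state still records that a result existed. This is an illustrative option, not the actual fix adopted in the PR:

```rust
// Sketch: instead of dropping Image/Document tool-result blocks, emit a
// marker so the templated conversation retains evidence of the result.
// Block variants are simplified placeholders for the real enum.

enum ToolResultContentBlock {
    Text(String),
    Image { media_type: String },
    Document { title: String },
}

fn extract_tool_result_text(blocks: &[ToolResultContentBlock]) -> String {
    blocks
        .iter()
        .map(|b| match b {
            ToolResultContentBlock::Text(t) => t.clone(),
            // Preserve a marker rather than silently losing the block
            ToolResultContentBlock::Image { media_type } => format!("[image: {media_type}]"),
            ToolResultContentBlock::Document { title } => format!("[document: {title}]"),
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let blocks = vec![
        ToolResultContentBlock::Text("ok".into()),
        ToolResultContentBlock::Image { media_type: "image/png".into() },
    ];
    assert_eq!(extract_tool_result_text(&blocks), "ok\n[image: image/png]");
}
```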



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (3)
model_gateway/src/routers/grpc/utils/message_utils.rs (3)

273-289: ⚠️ Potential issue | 🟠 Major

Keep non-text-only parts instead of returning "" in String mode.

When a user turn contains only image/document placeholders, this returns an empty string and silently drops the content. transform_content_field() in model_gateway/src/routers/grpc/utils/chat_utils.rs preserves the original array when no text parts exist, so the Messages path currently diverges from the Chat path.

Suggested fix
 fn format_content_parts(parts: Vec<Value>, content_format: ChatTemplateContentFormat) -> Value {
     match content_format {
         ChatTemplateContentFormat::String => {
-            // Extract text parts and join
-            let text: String = parts
+            let text_parts: Vec<String> = parts
                 .iter()
                 .filter_map(|p| {
                     p.as_object()
                         .and_then(|obj| obj.get("type")?.as_str().filter(|&t| t == "text"))
                         .and_then(|_| p.as_object()?.get("text")?.as_str())
                         .map(String::from)
                 })
-                .collect::<Vec<_>>()
-                .join(" ");
-            Value::String(text)
+                .collect();
+
+            if text_parts.is_empty() {
+                Value::Array(parts)
+            } else {
+                Value::String(text_parts.join(" "))
+            }
         }
         ChatTemplateContentFormat::OpenAI => Value::Array(parts),
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/utils/message_utils.rs` around lines 273 -
289, The current format_content_parts function drops non-text-only content by
returning an empty string when no text parts are found; change the
ChatTemplateContentFormat::String branch to collect text parts as before but, if
the collected text Vec is empty, return the original parts as
Value::Array(parts) instead of Value::String(""), otherwise join and return
Value::String(joined_text); this keeps behavior consistent with
transform_content_field and preserves image/document placeholders.

368-381: ⚠️ Potential issue | 🟠 Major

Unsupported Messages tools should fail fast, not vanish.

Filtering out every non-Custom variant changes request semantics. A request containing only Bash/WebSearch/TextEditor/McpToolset tools reaches MessagePreparationStage as if it had no tools, which also bypasses named/required tool_choice handling.

Return a Result here and let MessagePreparationStage convert unsupported tool variants into a 400 with a clear error message instead of silently degrading the request.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/utils/message_utils.rs` around lines 368 -
381, The current extract_chat_tools function silently drops non-Custom Tool
variants which changes request semantics; change extract_chat_tools to return a
Result<Vec<ChatTool>, ErrorType> (or a domain error) and scan tools: map Custom
-> custom_tool_to_chat_tool, and for any other variant return an Err containing
a clear validation message naming the unsupported variant(s) so
MessagePreparationStage can convert that into a 400; update callers (e.g.,
MessagePreparationStage) to handle the Result and propagate the error as a
bad-request response.

160-191: ⚠️ Potential issue | 🟠 Major

Preserve mixed user/tool-result ordering.

This fold buffers every user part and appends every synthesized tool message afterward. A source turn like [text, tool_result, text] becomes user -> tool instead of user -> tool -> user, which changes the message sequence before templating.

Suggested direction
 fn convert_user_message(
     content: &InputContent,
     content_format: ChatTemplateContentFormat,
     result: &mut Vec<Value>,
 ) {
     match content {
         InputContent::String(text) => {
             result.push(json!({"role": "user", "content": text}));
         }
         InputContent::Blocks(blocks) => {
-            let (user_parts, tool_msgs) = blocks.iter().fold(
-                (Vec::new(), Vec::new()),
-                |(mut user_parts, mut tool_msgs), block| {
-                    match block {
-                        InputContentBlock::Text(t) => {
-                            user_parts.push(json!({"type": "text", "text": t.text}));
-                        }
-                        InputContentBlock::Image(_) => {
-                            user_parts.push(json!({"type": "image"}));
-                        }
-                        InputContentBlock::Document(_) => {
-                            user_parts.push(json!({"type": "document"}));
-                        }
-                        InputContentBlock::ToolResult(tr) => {
-                            tool_msgs.push(json!({
-                                "role": "tool",
-                                "tool_call_id": tr.tool_use_id,
-                                "content": extract_tool_result_text(tr)
-                            }));
-                        }
-                        _ => {}
-                    }
-                    (user_parts, tool_msgs)
-                },
-            );
-
-            if !user_parts.is_empty() {
-                let content = format_content_parts(user_parts, content_format);
-                result.push(json!({"role": "user", "content": content}));
-            }
-            result.extend(tool_msgs);
+            let mut user_parts = Vec::new();
+            for block in blocks {
+                match block {
+                    InputContentBlock::Text(t) => {
+                        user_parts.push(json!({"type": "text", "text": t.text}));
+                    }
+                    InputContentBlock::Image(_) => {
+                        user_parts.push(json!({"type": "image"}));
+                    }
+                    InputContentBlock::Document(_) => {
+                        user_parts.push(json!({"type": "document"}));
+                    }
+                    InputContentBlock::ToolResult(tr) => {
+                        if !user_parts.is_empty() {
+                            let content =
+                                format_content_parts(std::mem::take(&mut user_parts), content_format);
+                            result.push(json!({"role": "user", "content": content}));
+                        }
+                        result.push(json!({
+                            "role": "tool",
+                            "tool_call_id": tr.tool_use_id,
+                            "content": extract_tool_result_text(tr)
+                        }));
+                    }
+                    _ => {}
+                }
+            }
+
+            if !user_parts.is_empty() {
+                let content = format_content_parts(user_parts, content_format);
+                result.push(json!({"role": "user", "content": content}));
+            }
         }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/utils/message_utils.rs` around lines 160 -
191, The current InputContent::Blocks handling in the fold collects all
user_parts and tool_msgs separately, losing original ordering (e.g.,
InputContentBlock::Text and InputContentBlock::ToolResult interleaving) and
causing wrong sequencing before templating; change the accumulation to emit
entries in-order by folding into a single Vec of enum-like items (or serde
Values) that preserves each block as either a user-part (json produced by
format_content_parts for contiguous user segments) or a tool message (json
produced by extract_tool_result_text and tool metadata), flushing buffered
user_parts whenever a ToolResult is encountered; update the code paths around
format_content_parts, extract_tool_result_text, InputContentBlock handling, and
where result is pushed so result receives entries in the original sequence
rather than grouping all user parts first.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/src/routers/grpc/regular/stages/preparation.rs`:
- Around line 43-55: The match in PreparationStage::execute currently handles
RequestType::Chat and RequestType::Generate but misses RequestType::Messages,
causing Messages requests to hit the catch-all; add a branch for
RequestType::Messages(_) that delegates to the MessagePreparationStage (e.g.,
call self.message_stage.execute(ctx).await) before the fallback, ensuring the
new MessagePreparationStage is invoked like chat_stage and generate_stage are.

---

Duplicate comments:
In `@model_gateway/src/routers/grpc/utils/message_utils.rs`:
- Around line 273-289: The current format_content_parts function drops
non-text-only content by returning an empty string when no text parts are found;
change the ChatTemplateContentFormat::String branch to collect text parts as
before but, if the collected text Vec is empty, return the original parts as
Value::Array(parts) instead of Value::String(""), otherwise join and return
Value::String(joined_text); this keeps behavior consistent with
transform_content_field and preserves image/document placeholders.
- Around line 368-381: The current extract_chat_tools function silently drops
non-Custom Tool variants which changes request semantics; change
extract_chat_tools to return a Result<Vec<ChatTool>, ErrorType> (or a domain
error) and scan tools: map Custom -> custom_tool_to_chat_tool, and for any other
variant return an Err containing a clear validation message naming the
unsupported variant(s) so MessagePreparationStage can convert that into a 400;
update callers (e.g., MessagePreparationStage) to handle the Result and
propagate the error as a bad-request response.
- Around line 160-191: The current InputContent::Blocks handling in the fold
collects all user_parts and tool_msgs separately, losing original ordering
(e.g., InputContentBlock::Text and InputContentBlock::ToolResult interleaving)
and causing wrong sequencing before templating; change the accumulation to emit
entries in-order by folding into a single Vec of enum-like items (or serde
Values) that preserves each block as either a user-part (json produced by
format_content_parts for contiguous user segments) or a tool message (json
produced by extract_tool_result_text and tool metadata), flushing buffered
user_parts whenever a ToolResult is encountered; update the code paths around
format_content_parts, extract_tool_result_text, InputContentBlock handling, and
where result is pushed so result receives entries in the original sequence
rather than grouping all user parts first.
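The `format_content_parts` fallback described in the duplicate comment above can be sketched with a simplified local `Value`/`Part` model in place of `serde_json` (names and shapes here are illustrative, not the gateway's actual types):

```rust
// Sketch of the String-format branch: join text parts when any exist,
// otherwise fall back to the original array so image/document
// placeholders are preserved instead of collapsing to "".
#[derive(Debug, Clone, PartialEq)]
enum Part {
    Text(String),
    Image,
}

#[derive(Debug, PartialEq)]
enum Value {
    String(String),
    Array(Vec<Part>),
}

fn format_content_parts_as_string(parts: Vec<Part>) -> Value {
    let texts: Vec<&str> = parts
        .iter()
        .filter_map(|p| match p {
            Part::Text(t) => Some(t.as_str()),
            _ => None,
        })
        .collect();
    if texts.is_empty() {
        // No text parts: keep the array rather than returning an empty string.
        Value::Array(parts)
    } else {
        Value::String(texts.join("\n"))
    }
}

fn main() {
    let text_only = format_content_parts_as_string(vec![Part::Text("hi".into())]);
    assert_eq!(text_only, Value::String("hi".into()));

    let image_only = format_content_parts_as_string(vec![Part::Image]);
    assert_eq!(image_only, Value::Array(vec![Part::Image]));
    println!("ok");
}
```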

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 075ff4c9-696a-46df-9bc6-e594e6905db5

📥 Commits

Reviewing files that changed from the base of the PR and between 6219384 and d826daa.

📒 Files selected for processing (8)
  • model_gateway/src/routers/grpc/context.rs
  • model_gateway/src/routers/grpc/regular/stages/messages/mod.rs
  • model_gateway/src/routers/grpc/regular/stages/messages/preparation.rs
  • model_gateway/src/routers/grpc/regular/stages/mod.rs
  • model_gateway/src/routers/grpc/regular/stages/preparation.rs
  • model_gateway/src/routers/grpc/utils/chat_utils.rs
  • model_gateway/src/routers/grpc/utils/message_utils.rs
  • model_gateway/src/routers/grpc/utils/mod.rs
💤 Files with no reviewable changes (1)
  • model_gateway/src/routers/grpc/context.rs

Comment on lines 43 to 55
 match &ctx.input.request_type {
     RequestType::Chat(_) => self.chat_stage.execute(ctx).await,
     RequestType::Generate(_) => self.generate_stage.execute(ctx).await,
-    other => {
-        let type_name = match other {
-            RequestType::Embedding(_) => "Embedding",
-            RequestType::Classify(_) => "Classify",
-            RequestType::Responses(_) => "Responses",
-            _ => "Unknown",
-        };
+    request_type => {
         error!(
             function = "PreparationStage::execute",
-            "RequestType::{type_name} reached regular preparation stage"
+            request_type = %request_type,
+            "{request_type} request type reached regular preparation stage"
         );
         Err(grpc_error::internal_error(
             "wrong_pipeline",
-            format!("RequestType::{type_name} should use its dedicated pipeline"),
+            format!("{request_type} should use its dedicated pipeline"),
         ))
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

printf 'RequestType variants:\n'
rg -n -C2 '\benum\s+RequestType\b|RequestType::Messages|Messages\(' model_gateway/src/routers/grpc

printf '\nPreparation-stage dispatch:\n'
rg -n -C2 'MessagePreparationStage|RequestType::Chat|RequestType::Generate|RequestType::Messages' \
  model_gateway/src/routers/grpc/regular/stages

Repository: lightseekorg/smg

Length of output: 15470


Delegate RequestType::Messages before falling back to wrong_pipeline.

The preparation stage match only routes Chat and Generate. The new MessagePreparationStage is therefore unreachable—a Messages request will fall through to the catch-all and return the internal wrong_pipeline error instead of being prepared. Other stages in the regular pipeline (request_building and response_processing) already explicitly handle Messages, so this dispatch is inconsistent.

Verification

RequestType::Messages is defined in context.rs:56, and MessagePreparationStage is properly exported. However, the preparation stage dispatcher (lines 43–46) only has branches for Chat and Generate, while request_building.rs and response_processing.rs both have explicit (RequestType::Responses(_) | RequestType::Messages(_)) branches.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@model_gateway/src/routers/grpc/regular/stages/preparation.rs` around lines 43
- 55, The match in PreparationStage::execute currently handles RequestType::Chat
and RequestType::Generate but misses RequestType::Messages, causing Messages
requests to hit the catch-all; add a branch for RequestType::Messages(_) that
delegates to the MessagePreparationStage (e.g., call
self.message_stage.execute(ctx).await) before the fallback, ensuring the new
MessagePreparationStage is invoked like chat_stage and generate_stage are.

@slin1237 slin1237 merged commit 57a9db1 into main Mar 12, 2026
35 checks passed
@slin1237 slin1237 deleted the slin/msg-2 branch March 12, 2026 17:31
slin1237 added a commit that referenced this pull request Mar 12, 2026
…arams for Messages API

Add Stage 4 (request building) for the Messages API gRPC pipeline,
converting PreparationOutput + CreateMessageRequest sampling parameters
into backend-specific proto GenerateRequest.

What changed:
- model_gateway/src/routers/grpc/regular/stages/messages/request_building.rs:
  New MessageRequestBuildingStage (copied from chat, adapted for Messages).
  Uses msg_{uuid} request ID prefix, calls build_messages_request(),
  skips multimodal (postponed), no filtered_request pattern.
- model_gateway/src/routers/grpc/client.rs:
  Add build_messages_request() dispatcher on GrpcClient enum, dispatching
  to each backend's build_generate_request_from_messages().
- crates/grpc_client/src/sglang_scheduler.rs:
  Add build_generate_request_from_messages() and
  build_grpc_sampling_params_from_messages(). Maps CreateMessageRequest
  fields (max_tokens, temperature, top_p, top_k, stop_sequences) to
  sglang proto SamplingParams with sensible defaults for missing fields.
- crates/grpc_client/src/vllm_engine.rs:
  Same pattern for vLLM backend. Handles vLLM-specific differences
  (top_k=0 for disabled, Optional<f32> temperature).
- crates/grpc_client/src/trtllm_service.rs:
  Same pattern for TRT-LLM backend using SamplingConfig, OutputConfig,
  and GuidedDecodingParams proto types.
- model_gateway/src/routers/grpc/regular/stages/messages/mod.rs:
  Wire request_building module and re-export MessageRequestBuildingStage.

Why: This is PR 3 in the Messages API gRPC series. Stage 4 bridges
the gap between preparation (Stage 1, PR #741) and response processing
(Stage 7, future PR), enabling the pipeline to build backend-specific
proto requests from Messages API parameters.

Refs: #739, #741
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
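The sampling-parameter mapping this commit describes could look roughly like the following; field names and default values here are assumptions for illustration, not the actual proto definitions:

```rust
// Sketch: map CreateMessageRequest sampling fields onto a backend
// SamplingParams struct, filling defaults for omitted fields.
struct CreateMessageRequest {
    max_tokens: u32,
    temperature: Option<f32>,
    top_p: Option<f32>,
    top_k: Option<i32>,
    stop_sequences: Vec<String>,
}

#[derive(Debug, PartialEq)]
struct SamplingParams {
    max_new_tokens: u32,
    temperature: f32,
    top_p: f32,
    top_k: i32,
    stop: Vec<String>,
}

fn build_sampling_params(req: &CreateMessageRequest) -> SamplingParams {
    SamplingParams {
        max_new_tokens: req.max_tokens,
        // Sensible defaults when the Messages request omits a field.
        temperature: req.temperature.unwrap_or(1.0),
        top_p: req.top_p.unwrap_or(1.0),
        top_k: req.top_k.unwrap_or(-1), // -1 commonly means "disabled"
        stop: req.stop_sequences.clone(),
    }
}

fn main() {
    let req = CreateMessageRequest {
        max_tokens: 256,
        temperature: None,
        top_p: Some(0.9),
        top_k: None,
        stop_sequences: vec!["\n\nHuman:".to_string()],
    };
    let params = build_sampling_params(&req);
    assert_eq!(params.max_new_tokens, 256);
    assert_eq!(params.temperature, 1.0);
    assert_eq!(params.top_p, 0.9);
    assert_eq!(params.top_k, -1);
    println!("ok");
}
```

Note the commit mentions backend-specific differences (e.g. vLLM uses `top_k = 0` for disabled and an `Option<f32>` temperature), so each backend would apply its own defaults rather than sharing this exact mapping.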
slin1237 added a commit that referenced this pull request Mar 12, 2026
…on-streaming)

Add Stage 7 (response processing) for the Messages API gRPC pipeline.
This converts backend ProtoGenerateComplete responses into Anthropic
Message format with proper ContentBlock construction and StopReason
mapping. Non-streaming only; streaming deferred to follow-up PR.

What changed:
- processor.rs: add process_non_streaming_messages_response() to
  ResponseProcessor — full pipeline: token decoding, reasoning parsing,
  tool call parsing, content block construction (Thinking → Text →
  ToolUse), StopReason mapping (EndTurn/MaxTokens/StopSequence/ToolUse),
  and messages::Usage building
- messages/response_processing.rs: new MessageResponseProcessingStage
  that extracts execution result, dispatch metadata, tokenizer, and
  stop decoder from RequestContext, delegates to ResponseProcessor,
  and stores FinalResponse::Messages
- message_utils.rs: add get_history_tool_calls_count_messages() for
  counting tool use blocks in Messages API request history (needed for
  KimiK2-style tool call ID generation)
- messages/mod.rs: wire response_processing module with unused_imports
  expect (wired in pipeline factory PR)

Why:
This is the fourth PR in the Messages API gRPC support series. With
preparation (PR #741), request building (PR #744), and now response
processing, three of the four endpoint-specific pipeline stages are
complete. The shared stages (worker selection, client acquisition,
dispatch, execution) are reused from the existing pipeline.

How:
Follows the same architecture as chat's response processing but adapted
for Anthropic Message types:
- Reuses existing convert_message_tool_choice() from message_utils to
  bridge Messages ToolChoice → Chat ToolChoice for parse_json_schema_response
- Reuses ResponseProcessor's parse_tool_calls() for model-predicted path
- Content blocks ordered per Anthropic convention: Thinking first, Text,
  then ToolUse blocks
- Tool calls parsed as OpenAI ToolCall (via existing parsers) then
  converted to ContentBlock::ToolUse with JSON input
- Messages always n=1, no logprobs
- ThinkingConfig::Enabled check replaces separate_reasoning bool

Refs: #739, #741, #744
Signed-off-by: Simo Lin <linsimo.mark@gmail.com>
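The StopReason mapping this commit describes might be sketched as follows; the incoming finish-reason strings are assumptions based on common backend conventions, not the actual proto values:

```rust
// Sketch: map a backend finish reason plus parsed tool calls onto the
// Anthropic-style StopReason enumeration named in the commit message.
#[derive(Debug, PartialEq)]
enum StopReason {
    EndTurn,
    MaxTokens,
    StopSequence,
    ToolUse,
}

fn map_stop_reason(finish_reason: &str, has_tool_calls: bool) -> StopReason {
    // Tool use takes precedence: if the parsers found tool calls,
    // report ToolUse regardless of how generation terminated.
    if has_tool_calls {
        return StopReason::ToolUse;
    }
    match finish_reason {
        "length" => StopReason::MaxTokens,
        "stop_sequence" => StopReason::StopSequence,
        _ => StopReason::EndTurn,
    }
}

fn main() {
    assert_eq!(map_stop_reason("length", false), StopReason::MaxTokens);
    assert_eq!(map_stop_reason("length", true), StopReason::ToolUse);
    assert_eq!(map_stop_reason("eos", false), StopReason::EndTurn);
    println!("ok");
}
```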

Labels

grpc (gRPC client and router changes), model-gateway (Model gateway crate changes)
