
feat(protocol): validate /v1/completions request contract #761

Closed
vschandramourya wants to merge 8 commits into main from feat/completions-protocol-validation

Conversation

Collaborator

vschandramourya commented Mar 14, 2026

Description

Problem

This PR is the first step in introducing native support for the
OpenAI-compatible /v1/completions endpoint.

The /v1/completions endpoint accepted requests without protocol-level
validation, so invalid completion payloads could travel deeper into the
gateway before failing inside router-specific execution paths.

Solution

Add protocol-level validation to CompletionRequest and switch the
HTTP /v1/completions entrypoint to ValidatedJson, matching the
validated request flow used by other OpenAI-compatible endpoints.

This gives the endpoint a clear request contract before the native gRPC
completion pipeline lands in the stacked follow-up PRs.

Changes

  • Add validator::Validate support to CompletionRequest
  • Validate prompt shape, numeric ranges, stop sequences, and structured-output conflicts
  • Validate cross-field rules like stream_options and best_of >= n
  • Implement Normalizable for CompletionRequest
  • Change /v1/completions to use ValidatedJson<CompletionRequest>
  • Add completion protocol/spec tests
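
The cross-field rules in this list can be sketched as one standalone check. This is an illustrative reconstruction, not the PR's code: field names mirror CompletionRequest, but the error type is simplified to String (the real validator returns validator::ValidationError), and StreamOptions is reduced to a unit placeholder.

```rust
// Illustrative sketch of the cross-field rules above. The real
// implementation lives in validate_completion_cross_parameters and
// uses validator::ValidationError; this version is self-contained.
#[derive(Default)]
struct CompletionRequest {
    stream: Option<bool>,
    stream_options: Option<()>, // stand-in for StreamOptions
    min_tokens: Option<u32>,
    max_tokens: Option<u32>,
    best_of: Option<u32>,
    n: Option<u32>,
    regex: Option<String>,
    ebnf: Option<String>,
    json_schema: Option<String>,
}

fn validate_cross_parameters(req: &CompletionRequest) -> Result<(), String> {
    // stream_options only makes sense when streaming is enabled.
    if req.stream_options.is_some() && !req.stream.unwrap_or(false) {
        return Err("stream_options requires stream to be true".into());
    }
    // min_tokens must not exceed max_tokens.
    if let (Some(min), Some(max)) = (req.min_tokens, req.max_tokens) {
        if min > max {
            return Err("min_tokens must not exceed max_tokens".into());
        }
    }
    // best_of must cover n.
    if let (Some(best_of), Some(n)) = (req.best_of, req.n) {
        if best_of < n {
            return Err("best_of must be greater than or equal to n".into());
        }
    }
    // At most one structured-output constraint may be active.
    let constraints = [req.regex.is_some(), req.ebnf.is_some(), req.json_schema.is_some()];
    if constraints.iter().filter(|&&c| c).count() > 1 {
        return Err("only one of regex, ebnf, json_schema may be set".into());
    }
    Ok(())
}
```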

Test Plan

  • cargo test -p openai-protocol completion --quiet
  • cargo test -p smg completion --quiet
Checklist
  • cargo +nightly fmt passes
  • cargo clippy --all-targets --all-features -- -D warnings passes
  • (Optional) Documentation updated
  • (Optional) Please join us on Slack #sig-smg to discuss, review, and merge PRs

Summary by CodeRabbit

  • New Features
    • Completion API enforces comprehensive request validation: prompt format, sampling/penalty/token ranges, mutually exclusive options, and cross-field compatibility rules.
  • Bug Fixes
    • Earlier and clearer validation/error reporting for invalid completion requests so malformed payloads are rejected before processing.

Add protocol-level validation for CompletionRequest and switch the
/v1/completions HTTP handler to ValidatedJson so invalid requests fail
before entering router-specific execution paths.

Signed-off-by: VS Chandra Mourya <msrinivasa@together.ai>

coderabbitai Bot commented Mar 14, 2026

📝 Walkthrough


Added comprehensive validation to CompletionRequest (field-level and cross-field) via validator::Validate, implemented Normalizable for it, and switched the server extractor to ValidatedJson<CompletionRequest> so validation runs at the HTTP layer before handler logic.

Changes

Cohort / File(s) — Summary

Request Validation — crates/protocols/src/completion.rs
Added the validator::Validate derive and #[validate(schema(function = "validate_completion_cross_parameters"))]. Introduced field-level #[validate(...)] rules for prompt, numeric ranges, stop, top_p, top_k, and others. Added private validators (validate_completion_prompt, validate_completion_cross_parameters) and impl Normalizable for CompletionRequest {}. Some fields were reordered; validation attributes were added.

Server extractor update — model_gateway/src/server.rs
Replaced the handler extractor Json<CompletionRequest> with ValidatedJson<CompletionRequest> in the v1_completions signature so requests are validated during extraction.
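
The extract-then-validate pattern behind ValidatedJson can be reduced to a small sketch. Everything here is illustrative: the real extractor is an Axum extractor that also deserializes the JSON body via serde, and the single n >= 1 rule merely stands in for the full rule set.

```rust
// Minimal stand-in for the ValidatedJson pattern: run validation
// during extraction so the handler only ever sees a valid request.
trait Validate {
    fn validate(&self) -> Result<(), String>;
}

struct CompletionRequest {
    n: Option<u32>,
}

impl Validate for CompletionRequest {
    fn validate(&self) -> Result<(), String> {
        // One representative field rule: n must be at least 1 when set.
        match self.n {
            Some(0) => Err("n must be >= 1".into()),
            _ => Ok(()),
        }
    }
}

// The "extractor": reject invalid requests before handler logic runs,
// mirroring how ValidatedJson turns failures into a 4xx response.
fn extract_validated<T: Validate>(req: T) -> Result<T, String> {
    req.validate()?;
    Ok(req)
}
```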

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant HTTP as "HTTP Server\n(Axum)"
    participant Validator as "ValidatedJson\n(validator)"
    participant Handler as "v1_completions\n(handler)"
    participant Model as "Model/Service"

    Client->>HTTP: POST /v1/completions (body)
    HTTP->>Validator: extract & validate body
    alt validation fails
        Validator-->>HTTP: 4xx Validation Error
        HTTP-->>Client: 4xx Response
    else validation succeeds
        Validator-->>Handler: Validated<CompletionRequest>
        Handler->>Model: forward request for completion
        Model-->>Handler: completion response
        Handler-->>Client: 200 Response with result
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

protocols

Suggested reviewers

  • key4ng
  • CatherineSue
  • slin1237

Poem

🐰 I nibbled bytes and checked each field,
Bounded numbers and prompts revealed.
Validator ready, extractor too—hip hop!
Requests now cozy, errors stop. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Docstring Coverage — ⚠️ Warning: docstring coverage is 50.00%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

Description Check — ✅ Passed (check skipped: CodeRabbit's high-level summary is enabled).
Title Check — ✅ Passed: the PR title accurately summarizes the main change, adding validation to the /v1/completions request contract through the validator::Validate and Normalizable traits.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


github-actions Bot added the "tests" (Test changes) and "model-gateway" (Model gateway crate changes) labels Mar 14, 2026
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces robust protocol-level validation for the /v1/completions endpoint. By integrating the validator crate and defining custom validation logic within the CompletionRequest struct, it ensures that incoming requests adhere to the expected contract before processing. This enhancement improves the reliability and security of the API, serving as a crucial first step towards native support for the OpenAI-compatible completions endpoint.

Highlights

  • Protocol Validation for CompletionRequest: Added validator::Validate support to the CompletionRequest struct, enabling declarative validation of its fields.
  • Comprehensive Field Validation: Implemented validation for prompt shape, numeric ranges of parameters (e.g., max_tokens, temperature, logprobs), and stop sequences.
  • Cross-Parameter Rule Enforcement: Introduced custom validation logic to enforce cross-field rules, such as stream_options requiring stream to be enabled, min_tokens not exceeding max_tokens, and best_of being greater than or equal to n.
  • Structured Output Constraint Validation: Added validation to ensure only one structured output constraint (regex, ebnf, or json_schema) is active at a time.
  • Normalizable Trait Implementation: Implemented the Normalizable trait for CompletionRequest, which can be used for pre-validation data normalization.
  • API Endpoint Integration: Updated the /v1/completions HTTP entrypoint to utilize ValidatedJson<CompletionRequest>, ensuring all incoming requests are automatically validated.
  • New Protocol Specification Tests: Added new unit tests specifically for the completion protocol, covering various validation scenarios for CompletionRequest.
Changelog
  • crates/protocols/src/completion.rs
    • Added validator::Validate derive macro to CompletionRequest.
    • Introduced #[validate] attributes for various fields, including prompt, max_tokens, temperature, top_p, n, logprobs, presence_penalty, frequency_penalty, best_of, top_k, min_p, min_tokens, and repetition_penalty.
    • Implemented custom validation functions validate_completion_prompt and validate_completion_cross_parameters to enforce complex rules.
    • Implemented the Normalizable trait for CompletionRequest.
  • model_gateway/src/server.rs
    • Modified the v1_completions async function to accept ValidatedJson<CompletionRequest> instead of Json<CompletionRequest>, enabling automatic request validation.
  • model_gateway/tests/spec/completion.rs
    • Added a new test file containing unit tests for CompletionRequest validation.
    • Included tests covering scenarios like empty prompts, empty prompt array items, stream_options requiring stream, and best_of needing to cover n.
Activity
  • The author vschandramourya initiated this pull request to add protocol validation for the /v1/completions endpoint.
  • The author has provided a detailed description outlining the problem, solution, and specific changes.
  • A test plan is included, indicating that cargo test -p openai-protocol completion --quiet and cargo test -p smg completion --quiet have been run and passed.
  • The cargo +nightly fmt check has passed, but cargo clippy is still pending.
  • The author has invited discussion on Slack.

Contributor

gemini-code-assist Bot left a comment


Code Review

This pull request introduces protocol-level validation for the /v1/completions endpoint by leveraging the validator crate. The CompletionRequest struct is now equipped with validation rules for its fields and cross-parameter constraints, ensuring that invalid payloads are rejected early. The HTTP endpoint has been updated to use the ValidatedJson extractor, and new tests have been added to confirm the validation logic. The changes are well-structured and improve the robustness of the API. My review includes a minor suggestion to enhance code clarity in one of the validation functions.

Comment thread crates/protocols/src/completion.rs Outdated
Comment on lines +193 to +200
let constraint_count = [
req.regex.is_some(),
req.ebnf.is_some(),
req.json_schema.is_some(),
]
.iter()
.filter(|&&enabled| enabled)
.count();
Contributor


Severity: medium

For improved readability and conciseness, you can simplify the logic for counting active constraints. Casting the boolean results of is_some() to an integer and summing them up is a more direct and idiomatic way to achieve this in Rust, avoiding the creation of an intermediate array and iterator.

Suggested change
let constraint_count = [
req.regex.is_some(),
req.ebnf.is_some(),
req.json_schema.is_some(),
]
.iter()
.filter(|&&enabled| enabled)
.count();
let constraint_count = req.regex.is_some() as u8 + req.ebnf.is_some() as u8 + req.json_schema.is_some() as u8;
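
Both counting styles are behaviorally equivalent; here is a quick standalone check (the Req struct is a reconstruction for illustration, not the crate's type):

```rust
// The two ways of counting active structured-output constraints
// produce the same result; field names mirror the PR's request type.
struct Req {
    regex: Option<String>,
    ebnf: Option<String>,
    json_schema: Option<String>,
}

// Original style: build an array of flags, filter, and count.
fn count_with_array(req: &Req) -> usize {
    [req.regex.is_some(), req.ebnf.is_some(), req.json_schema.is_some()]
        .iter()
        .filter(|&&enabled| enabled)
        .count()
}

// Suggested style: cast each bool to an integer and sum.
fn count_with_casts(req: &Req) -> usize {
    (req.regex.is_some() as usize)
        + (req.ebnf.is_some() as usize)
        + (req.json_schema.is_some() as usize)
}
```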

Member


I feel the suggestion is cleaner

vschandramourya marked this pull request as ready for review March 15, 2026 22:14

chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f83dc56d72

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread model_gateway/tests/spec/completion.rs Outdated
@@ -0,0 +1,89 @@
use openai_protocol::{


P2: Register completion spec tests in the spec test module

This new test file is never executed because model_gateway/tests/spec/mod.rs does not declare mod completion;, so none of the added completion validation tests are compiled into spec_test. In practice this means regressions in the new /v1/completions validation path will slip through CI despite this commit claiming completion protocol/spec coverage.

Useful? React with 👍 / 👎.

Member


We can remove the test file and use e2e test_validations later.


coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@model_gateway/tests/spec/completion.rs`:
- Around line 72-89: Add unit tests that cover the remaining cross-parameter
validation rules: (1) create a request (using base_request()) where min_tokens
is set greater than max_tokens, call req.normalize() and assert
req.validate().is_err(); and (2) create requests that set more than one
structured-output constraint at once (e.g., set regex and ebnf, and separately
regex and json_schema, etc.), call req.normalize() and assert
req.validate().is_err() for each. Reference the existing test patterns and
functions (base_request, StreamOptions, req.normalize(), req.validate()) and add
new tests similar to test_completion_stream_options_require_stream and
test_completion_best_of_must_cover_n to ensure these validation branches are
covered.
- Around line 48-70: Add a unit test to cover the empty-array prompt case:
create a test similar to the others (e.g., name it
test_completion_prompt_array_cannot_be_empty) that sets req.prompt =
StringOrArray::Array(vec![]) on a base_request(), calls req.normalize(), and
asserts req.validate().is_err(); this will exercise the arr.is_empty() branch in
the completion validation logic and ensure empty prompt arrays fail validation.
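
The prompt-shape branches these suggested tests exercise can be sketched in isolation. StringOrArray here is a stand-in for the protocol crate's type, and the error strings are illustrative:

```rust
// Standalone sketch of the prompt checks the suggested tests cover:
// reject an empty string, an empty array, and empty items in an array.
enum StringOrArray {
    String(String),
    Array(Vec<String>),
}

fn validate_completion_prompt(prompt: &StringOrArray) -> Result<(), String> {
    match prompt {
        StringOrArray::String(s) if s.is_empty() => {
            Err("prompt must not be empty".into())
        }
        StringOrArray::Array(arr) if arr.is_empty() => {
            Err("prompt array must not be empty".into())
        }
        StringOrArray::Array(arr) if arr.iter().any(|p| p.is_empty()) => {
            Err("prompt array items must not be empty".into())
        }
        _ => Ok(()),
    }
}
```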

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 949d5a45-596b-448c-96b4-876c19d652a9

📥 Commits

Reviewing files that changed from the base of the PR and between feb6f06 and f83dc56.

📒 Files selected for processing (3)
  • crates/protocols/src/completion.rs
  • model_gateway/src/server.rs
  • model_gateway/tests/spec/completion.rs

Comment thread model_gateway/tests/spec/completion.rs Outdated
Member

CatherineSue left a comment


Overall LGTM. Marking "request changes" since the PR already has 2 approvals. A couple of issues need to be addressed.

  1. Some of the parameter ranges are inconsistent with the official OpenAI API.

  2. Can we group OpenAI official params and engine-specific ones? Something like https://github.com/lightseekorg/smg/blob/main/crates/protocols/src/chat.rs#L250-L255

The OpenAI parameters should also follow the same order as in their API doc; the link is posted in the comment.

Comment thread crates/protocols/src/completion.rs Outdated
pub stream_options: Option<StreamOptions>,

/// Include the log probabilities on the logprobs most likely tokens
#[validate(range(min = 0, max = 20))]
Member


Comment thread crates/protocols/src/completion.rs Outdated
pub top_p: Option<f32>,

/// How many completions to generate for each prompt
#[validate(range(min = 1, max = 10))]
Member


[image attachment]

Comment thread crates/protocols/src/completion.rs Outdated

/// Generates best_of completions server-side and returns the "best"
#[validate(range(min = 1))]
pub best_of: Option<u32>,
Member


[image attachment]

The range should be 0 to 20.

Comment thread crates/protocols/src/completion.rs Outdated
Comment on lines +193 to +200
let constraint_count = [
req.regex.is_some(),
req.ebnf.is_some(),
req.json_schema.is_some(),
]
.iter()
.filter(|&&enabled| enabled)
.count();
Member


I feel the suggestion is cleaner


chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 20f18ebbc6


Comment on lines 43 to 45
/// How many completions to generate for each prompt
#[validate(range(min = 1, max = 10))]
pub n: Option<u32>,


P2: Allow the documented n range for completions

Because /v1/completions now goes through ValidatedJson, this cap turns requests with n=11..128 into gateway-side 400s even though the OpenAI completions contract allows up to 128 choices. The HTTP routers simply proxy CompletionRequest onward, so this is a new compatibility break for any backend or client that already relied on larger n values.


Comment on lines 31 to 33
/// The maximum number of tokens to generate
#[validate(range(min = 1))]
pub max_tokens: Option<u32>,


P2: Permit max_tokens = 0 on completions

The completion API allows max_tokens: 0, which is useful for echo-only or prompt-logprob style calls, but this validator now rejects those requests before they reach the upstream. Since this endpoint was switched from Json to ValidatedJson in the same change, existing zero-token completion calls will regress immediately.


Comment thread crates/protocols/src/completion.rs Outdated
Comment on lines 54 to 56
/// Include the log probabilities on the logprobs most likely tokens
#[validate(range(min = 0, max = 20))]
pub logprobs: Option<u32>,


P2: Cap completion logprobs at the API maximum

The /v1/completions contract only allows logprobs up to 5, so accepting values up to 20 means the new validator still lets invalid payloads through. With protocol validation now happening at the gateway, requests like logprobs: 10 will be proxied downstream and fail later or inconsistently across backends instead of being rejected up front.


Comment thread crates/protocols/src/completion.rs Outdated
Comment on lines +210 to +213
if let (Some(best_of), Some(n)) = (req.best_of, req.n) {
if best_of < n {
let mut error = validator::ValidationError::new("best_of_less_than_n");
error.message = Some("best_of must be greater than or equal to n".into());


P2: Require best_of to be strictly greater than n

The completion spec requires best_of > n, but this check only rejects the < case. Requests such as {"best_of": 4, "n": 4} will now pass the new validator even though they are still invalid, which undermines the goal of moving /v1/completions onto protocol-level validation.

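
Under this review's reading of the contract, the fix would tighten the comparison from < to <=. A standalone sketch of that suggestion (error type simplified to String; the PR's version builds a validator::ValidationError):

```rust
// Sketch of the strict comparison the review suggests: reject
// best_of <= n rather than only best_of < n. Whether equality should
// be rejected follows the review's reading of the completions spec.
fn check_best_of(best_of: Option<u32>, n: Option<u32>) -> Result<(), String> {
    if let (Some(best_of), Some(n)) = (best_of, n) {
        if best_of <= n {
            return Err("best_of must be greater than n".into());
        }
    }
    Ok(())
}
```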

Signed-off-by: VS Chandra Mourya <msrinivasa@together.ai>
Signed-off-by: VS Chandra Mourya <msrinivasa@together.ai>
github-actions Bot removed the "tests" (Test changes) label Mar 19, 2026

chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a426df106c


Comment on lines +204 to +209
if let (Some(best_of), Some(n)) = (req.best_of, req.n) {
if best_of < n {
let mut error = validator::ValidationError::new("best_of_less_than_n");
error.message = Some("best_of must be greater than or equal to n".into());
return Err(error);
}


P2: Reject streamed completions that set best_of

Because /v1/completions now goes through ValidatedJson, this schema function is the new gatekeeper for contract checks. It still accepts requests like {"best_of":2,"stream":true}, even though the OpenAI completions contract marks best_of as non-streamable, so those calls will now fail later or behave inconsistently per backend instead of getting a clear 400 at the gateway.



coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/protocols/src/completion.rs`:
- Around line 40-41: Add validation for the logit_bias maps used in Completion
and Chat by implementing a helper like validate_logit_bias(&HashMap<String,f32>)
-> Result<(), validator::ValidationError> and calling it from the existing
validation logic for the Completion struct (logit_bias field) and the Chat
message/request validation (corresponding logit_bias field). The validator
should iterate each (token_id, bias) pair, ensure bias is within -100.0..=100.0
(return a ValidationError with a clear "logit_bias_range" message on failure)
and ensure token_id.parse::<u32>() (or usize) succeeds (return a ValidationError
with a "logit_bias_token_id" message if parsing fails) so malformed keys or
out-of-range values fail fast at the protocol layer.
- Around line 148-167: The validate_completion_prompt function currently rejects
empty strings for both StringOrArray::String and items in StringOrArray::Array;
remove the checks that create and return a ValidationError for empty string
values (the branches that construct "prompt_empty" and "prompt_item_empty" for
empty s and for arr.iter().any(|prompt| prompt.is_empty())), but keep the
existing empty-array validation that returns an error when arr.is_empty();
ensure other behavior and error types (validator::ValidationError usage) remains
unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: eeb70bc0-b792-4c05-a1ea-185080ea04ae

📥 Commits

Reviewing files that changed from the base of the PR and between f83dc56 and a426df1.

📒 Files selected for processing (1)
  • crates/protocols/src/completion.rs

Comment on lines +40 to +41
/// Modify the likelihood of specified tokens appearing in the completion
pub logit_bias: Option<HashMap<String, f32>>,


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Scripts executed against lightseekorg/smg:

# First, examine the completion.rs file around line 41 to see the current state
cat -n crates/protocols/src/completion.rs | sed -n '30,60p'

# Check for any existing logit_bias validators or patterns in the codebase
rg -n "logit_bias" --type rust

# Look for validator patterns used in the completion.rs file
cat -n crates/protocols/src/completion.rs | sed -n '1,100p'

# Check if there's any OpenAI API spec reference or documentation in the repo
find . -type f -name "*.md" -o -name "*.rs" | xargs grep -l "logit_bias" | head -5

# Check the validator crate version and how custom validators are used
rg -n "validate\(custom" --type rust | head -10

# Search for any existing logit_bias validator implementations
rg -n "validate_logit_bias" --type rust

# Check the sampling_params.rs file to see the pattern of custom validators
cat -n crates/protocols/src/sampling_params.rs

# Look for OpenAI API spec references or documentation
rg -n "logit_bias" --type rust -B 2 -A 2 | head -40

# Check if there's any documentation about logit_bias constraints
find . -type f \( -name "*.md" -o -name "*.rs" \) -exec grep -l "logit" {} \; | head -10

# Search for common OpenAI references or links in the codebase
rg -i "openai.*api\|api.*openai\|developers\.openai" --type rust --type md | head -10

🌐 Web query: OpenAI API logit_bias parameter specification range constraints

💡 Result:

The OpenAI API logit_bias parameter is an optional object that maps token IDs (integers) to bias values. Each bias value must be an integer in the range -100 to 100, inclusive. Negative values decrease the likelihood of the token being generated (e.g., -100 typically bans it), while positive values increase it (e.g., 100 makes it highly likely). Values between -1 and 1 have subtler effects. This applies to both the Completions API (/v1/completions) and Chat Completions API (/v1/chat/completions), though support varies by model (e.g., not supported on some reasoning models like o1/o3). Token IDs must be obtained from the model's tokenizer (e.g., cl100k_base for gpt-4/gpt-3.5-turbo). Values outside -100 to 100 cause a 400 error: "Logit bias value is invalid or outside of the acceptable range." Example: {"50256": -100} bans the end-of-text token.

Add validation for logit_bias parameters in both completion and chat APIs.

Line 41 in completion.rs (and the corresponding field in chat.rs line 167) leaves logit_bias without validation, allowing malformed token-bias payloads to bypass the protocol layer. The OpenAI API spec requires bias values to be in the range -100 to 100 (inclusive); values outside this range cause a 400 error. Additionally, token IDs must be valid integers encoded as strings. Add validation here to fail fast at the protocol level rather than deferring errors downstream.

🧩 Suggested change
-    pub logit_bias: Option<HashMap<String, f32>>,
+    #[validate(custom(function = "validate_logit_bias"))]
+    pub logit_bias: Option<HashMap<String, f32>>,
fn validate_logit_bias(
    logit_bias: &HashMap<String, f32>,
) -> Result<(), validator::ValidationError> {
    for (token_id, bias) in logit_bias {
        if !(-100.0..=100.0).contains(bias) {
            let mut error = validator::ValidationError::new("logit_bias_range");
            error.message = Some("logit_bias values must be between -100 and 100".into());
            return Err(error);
        }

        if token_id.parse::<u32>().is_err() {
            let mut error = validator::ValidationError::new("logit_bias_token_id");
            error.message = Some("logit_bias keys must be token IDs encoded as strings".into());
            return Err(error);
        }
    }

    Ok(())
}

Comment thread: crates/protocols/src/completion.rs (Outdated)
Signed-off-by: VS Chandra Mourya <msrinivasa@together.ai>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: a402f0f5b3

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

/// The suffix that comes after a completion of inserted text
pub suffix: Option<String>,
/// Generates `best_of` completions server-side and returns the "best"
#[validate(range(min = 0, max = 20))]


P2 Badge Reject zero best_of before forwarding completions

Because /v1/completions now goes through ValidatedJson, this min = 0 range still accepts payloads like {"best_of":0} even though best_of is the number of server-side candidates and the API contract treats it as a positive count (defaulting to 1). That means the gateway still lets an invalid request through to the worker instead of returning the 400 this change is meant to provide.

Comment on lines +40 to +41
/// Modify the likelihood of specified tokens appearing in the completion
pub logit_bias: Option<HashMap<String, f32>>,


P2 Badge Validate logit_bias entries on completion requests

Now that /v1/completions is gated by protocol validation, leaving logit_bias completely unchecked means out-of-contract payloads such as {"logit_bias":{"42":500}} still pass the gateway and only fail later or behave backend-specifically. This field needs validation here as well, otherwise the new extractor does not actually enforce the documented completion request contract.

Signed-off-by: VS Chandra Mourya <msrinivasa@together.ai>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6e4beb63b6


Comment on lines +89 to 90
#[validate(custom(function = "validate_top_k_value"))]
pub top_k: Option<i32>,


P2 Badge Allow top_k = 0 on completion requests

Because model_gateway/src/server.rs now sends /v1/completions through ValidatedJson, this validator becomes a hard gateway gate instead of a backend-specific check. The new validate_top_k_value path only allows -1 or >= 1, but model_gateway/src/routers/http/router.rs::route_completion forwards completion payloads unchanged to OpenAI-compatible HTTP workers, and at least vLLM accepts top_k: 0 as a valid “disabled” value. Any existing client that explicitly sends {"top_k":0} will now get a 400 from the gateway even though the worker would have handled it.
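If the gateway should stay permissive here, one option is to widen the custom check so that 0 is treated as the conventional "disabled" value. The body of `validate_top_k_value` is not shown in this diff, so the following is a hedged standalone sketch of the relaxed rule, not the actual implementation:

```rust
// Hypothetical relaxed check: accept -1 (unset), 0 (disabled, as vLLM
// interprets it), or any positive value; reject other negative values.
fn validate_top_k_relaxed(top_k: i32) -> Result<(), String> {
    if top_k >= -1 {
        Ok(())
    } else {
        Err("top_k must be -1, 0, or a positive integer".into())
    }
}

fn main() {
    for k in [-2, -1, 0, 1, 50] {
        println!("top_k={k}: {}", validate_top_k_relaxed(k).is_ok());
    }
}
```

With this rule, only values below -1 are rejected, so existing clients sending `{"top_k":0}` would keep working through the gateway.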

Comment on lines +97 to 98
#[validate(range(min = 1))]
pub min_tokens: Option<u32>,


P2 Badge Permit min_tokens = 0 for proxied completions

This new lower bound also turns a previously supported backend setting into a gateway-side 400. /v1/completions is still just proxied by model_gateway/src/routers/http/router.rs::route_completion, and OpenAI-compatible completion servers such as vLLM accept min_tokens: 0 as the normal “no minimum tokens” setting. With ValidatedJson enabled, requests that explicitly set min_tokens to zero now fail before they ever reach the worker.


@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (1)
crates/protocols/src/completion.rs (1)

28-30: ⚠️ Potential issue | 🟡 Minor

best_of minimum should be 1, not 0.

A best_of value of 0 is semantically invalid—it would mean "generate zero candidate completions." When n is unspecified, the cross-parameter check won't catch this. OpenAI defaults best_of to 1 when not provided, implying the minimum valid explicit value is 1.

🔧 Suggested fix
     /// Generates `best_of` completions server-side and returns the "best"
-    #[validate(range(min = 0, max = 20))]
+    #[validate(range(min = 1, max = 20))]
     pub best_of: Option<u32>,
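The PR description also lists a cross-field rule, `best_of >= n`. That check pairs naturally with the corrected 1-20 range; the sketch below assumes both fields are `Option<u32>` as in the struct shown in the diff, and is an illustration rather than the gateway's actual validation code:

```rust
// Hedged sketch of the combined rules: an explicit best_of must be in
// 1..=20, and when both fields are set, best_of must be at least n.
fn check_best_of_vs_n(best_of: Option<u32>, n: Option<u32>) -> Result<(), String> {
    match (best_of, n) {
        (Some(b), _) if !(1..=20).contains(&b) => {
            Err("best_of must be between 1 and 20".into())
        }
        (Some(b), Some(n)) if b < n => {
            Err("best_of must be greater than or equal to n".into())
        }
        _ => Ok(()),
    }
}

fn main() {
    println!("{}", check_best_of_vs_n(Some(0), None).is_err()); // true: zero rejected
    println!("{}", check_best_of_vs_n(Some(2), Some(3)).is_err()); // true: best_of < n
    println!("{}", check_best_of_vs_n(Some(3), Some(2)).is_ok()); // true: valid pair
}
```

Because `n` defaults when unset, the range bound is what catches `{"best_of":0}` on its own, which is exactly the gap the duplicate comment describes.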
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/protocols/src/completion.rs` around lines 28 - 30, The validation on
the Completion `best_of` field is incorrect: change the range validator on the
`best_of` Option<u32> from min = 0 to min = 1 so explicit values cannot be zero
(i.e., update the attribute on the `pub best_of: Option<u32>` field in the
completion struct), and run/update any unit tests or docs that assert the
previous 0-20 range to reflect the new 1-20 constraint.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: e5a5a800-b85d-4059-ac64-522f04d8e5ec

📥 Commits

Reviewing files that changed from the base of the PR and between a402f0f and 6e4beb6.

📒 Files selected for processing (1)
  • crates/protocols/src/completion.rs

@lightseek-bot lightseek-bot deleted the feat/completions-protocol-validation branch March 23, 2026 23:36

Labels

model-gateway Model gateway crate changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants