feat(completions): add native gRPC pipeline typing for /v1/completions#768
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: ASSERTIVE Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request integrates the Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request effectively introduces CompletionRequest as a native type within the gRPC pipeline, which is a great step towards cleanly supporting the /v1/completions endpoint. The addition of comprehensive validation for CompletionRequest at the protocol level is a significant improvement for robustness. The new pipeline entrypoint execute_completion and the necessary updates across the context and stages are well-implemented and consistent with the existing architecture. I have a couple of minor suggestions to improve code clarity and style, as detailed in the comments.
| let constraint_count = [ | ||
| req.regex.is_some(), | ||
| req.ebnf.is_some(), | ||
| req.json_schema.is_some(), | ||
| ] | ||
| .iter() | ||
| .filter(|&&enabled| enabled) | ||
| .count(); |
There was a problem hiding this comment.
This is a bit verbose for counting boolean flags. You can make this more concise and arguably more idiomatic by casting the booleans to integers and summing them up. This aligns with prioritizing code simplicity and clarity.
| let constraint_count = [ | |
| req.regex.is_some(), | |
| req.ebnf.is_some(), | |
| req.json_schema.is_some(), | |
| ] | |
| .iter() | |
| .filter(|&&enabled| enabled) | |
| .count(); | |
| let constraint_count = req.regex.is_some() as u8 | |
| + req.ebnf.is_some() as u8 | |
| + req.json_schema.is_some() as u8; |
References
- Prioritize code simplicity and clarity over micro-optimizations, especially when the performance gain is negligible for typical use cases (e.g., iterating over a small number of items).
d98a8bc to
5ebfde8
Compare
Introduce CompletionRequest as a first-class gRPC pipeline request type and extend shared pipeline orchestration to carry completion requests and completion responses natively. Signed-off-by: VS Chandra Mourya <msrinivasa@together.ai>
5ebfde8 to
dfb3c78
Compare
0de7b05 to
dfb3c78
Compare
CatherineSue
left a comment
There was a problem hiding this comment.
Overall LGTM. Just a few issues regarding the request_type and final_response format.
| | RequestType::Embedding(_) | ||
| | RequestType::Classify(_) | ||
| | RequestType::Messages(_)) => { | ||
| RequestType::Generate(_) => { |
There was a problem hiding this comment.
The original code already supports request_type. We don't need to boilerplate here anymore.
You can simply do | RequestType::Completions
| | RequestType::Embedding(_) | ||
| | RequestType::Classify(_) | ||
| | RequestType::Messages(_)) => { | ||
| RequestType::Generate(_) => { |
| ); | ||
| axum::Json(response).into_response() | ||
| } | ||
| Some(FinalResponse::Chat(_)) |
There was a problem hiding this comment.
We might need to do a Fmt impl for FinalResponse like what we did for RequestType. Then we would simply do response_type @ here.
CatherineSue
left a comment
There was a problem hiding this comment.
Please don't open a stack PR. Merge commit to main instead.
|
hi @vschandramourya , the PR looks great. Only couple of issues need to be addressed. Posted above. I do recommend to close this PR and open another one to merge into main. |
Description
Problem
This PR continues the
/v1/completionsrollout by making completions afirst-class native request type in the gRPC pipeline.
Even with a validated request contract, the gRPC pipeline did not yet treat
completions as a native request type, which made
/v1/completionsimpossible to support cleanly through the same pipeline contract used by
chat, generate, embeddings, and other endpoints.
Solution
Introduce
CompletionRequestas a first-class gRPC pipeline request type andextend shared pipeline orchestration to carry completion requests and
completion responses natively.
This PR adds the native completion pipeline type, metadata flow, and
completion entrypoint, while leaving the actual completion stage wiring to the
later stacked PRs.
Changes
RequestType::CompletionFinalResponse::Completionexecute_completion()to the gRPC request pipelinedelegators buildable until native completion stages land in the stacked follow-up PRs
Test Plan
cargo test -p smg completion --quietcargo clippy -p smg --all-targets --all-features -- -D warningsChecklist
cargo +nightly fmtpassescargo clippy --all-targets --all-features -- -D warningspasses