feat: add OpenAI-compatible streaming server with WebSocket support #1751
## Summary
Adds an OpenAI-compatible TTS server exposing both HTTP and WebSocket endpoints for real-time streaming applications.
## Endpoints
- `POST /v1/audio/speech` - HTTP streaming (OpenAI API compatible)
- `WS /v1/audio/speech/stream` - Bidirectional WebSocket streaming

## WebSocket Protocol

1. Connect to `ws://host:port/v1/audio/speech/stream`
2. Send a configuration message: `{"voice": "speaker.wav", "speed": 1.0}`
3. Send one or more text messages: `{"text": "Hello"}`
4. Send `{"event": "end"}` to finish the stream
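The protocol above can be sketched as a minimal client. This uses the third-party `websockets` package (`pip install websockets`); the message shapes follow the steps listed here, but the framing of the audio sent back by the server (binary frames, below) is an assumption — check `openai_server.py` for the exact response format.

```python
import asyncio
import json

def protocol_messages(text_chunks, voice="speaker.wav", speed=1.0):
    """Yield the client-side JSON messages of the protocol, in order."""
    yield json.dumps({"voice": voice, "speed": speed})   # 1. configuration
    for chunk in text_chunks:
        yield json.dumps({"text": chunk})                # 2. text messages
    yield json.dumps({"event": "end"})                   # 3. end marker

async def synthesize(uri, text_chunks):
    import websockets  # third-party dependency
    async with websockets.connect(uri) as ws:
        for msg in protocol_messages(text_chunks):
            await ws.send(msg)
        audio = bytearray()
        async for frame in ws:            # server streams results back
            if isinstance(frame, bytes):  # assumed: binary frames carry audio
                audio.extend(frame)
        return bytes(audio)

# With the server running:
# audio = asyncio.run(synthesize(
#     "ws://0.0.0.0:50000/v1/audio/speech/stream", ["Hello ", "world."]))
```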
## Files Added/Modified

- `openai_server.py`: FastAPI server with HTTP + WebSocket TTS endpoints
- `run_openai_server.sh`: Launch script with venv activation
- `cosyvoice/llm/llm.py`: Add `inference_bistream` to the CosyVoice3LM class for streaming support
## Usage

```bash
bash run_openai_server.sh  # Server runs on http://0.0.0.0:50000
```

## Use Case
This enables CosyVoice to be used as a drop-in TTS backend for OpenAI-compatible clients and real-time streaming applications.
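As a drop-in example, the HTTP endpoint can be called with only the standard library. The request fields below (`input`, `voice`, `speed`) are assumed to mirror OpenAI's `/v1/audio/speech` schema — check `openai_server.py` for the fields this server actually accepts.

```python
import json
import urllib.request

def build_speech_request(base_url, text, voice, speed=1.0):
    """Build a POST request for the /v1/audio/speech endpoint."""
    payload = {"input": text, "voice": voice, "speed": speed}
    return urllib.request.Request(
        f"{base_url}/v1/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_speech_request("http://0.0.0.0:50000", "Hello, world.", "speaker.wav")

# With the server running, stream the audio response to disk in chunks:
# with urllib.request.urlopen(req) as resp, open("out.wav", "wb") as f:
#     while chunk := resp.read(4096):
#         f.write(chunk)
```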
Tested with the CosyVoice3-0.5B model.