This document maps provider IDs, aliases, and credential environment variables.
Last verified: February 21, 2026.
Runtime credential resolution order for zerobuild providers is:
- Explicit credential from config/CLI
- Provider-specific env var(s)
- Generic fallback env vars: `ZEROBUILD_API_KEY`, then `API_KEY`
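The order above can be sketched as a small lookup function. This is an illustrative helper, not zerobuild's actual internal API; the function name and signature are assumptions.

```python
import os

def resolve_credential(explicit, provider_env_vars, env=None):
    """Sketch of the documented resolution order (hypothetical helper)."""
    env = os.environ if env is None else env
    if explicit:                                   # 1. explicit config/CLI credential
        return explicit
    for var in provider_env_vars:                  # 2. provider-specific env var(s), in order
        if env.get(var):
            return env[var]
    for var in ("ZEROBUILD_API_KEY", "API_KEY"):   # 3. generic fallbacks, in order
        if env.get(var):
            return env[var]
    return None
```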
For resilient fallback chains (reliability.fallback_providers), each fallback
provider resolves credentials independently. The primary provider's explicit
credential is not reused for fallback providers.
When you run `zerobuild onboard`, the wizard suggests fallback providers based on your primary provider selection. This improves reliability by switching to an alternative provider automatically if the primary fails.
Example fallback chains:
- kimi-code → moonshot → openrouter → anthropic
- anthropic → openrouter → openai → gemini
- openrouter → anthropic → openai → gemini
- ollama (local) → openrouter → anthropic → groq
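Walking such a chain amounts to a simple loop in which every provider resolves its own credentials. This is a sketch with illustrative names, not zerobuild's implementation.

```python
def pick_provider(chain, resolve):
    """Return the first provider in the chain with usable credentials.

    Each fallback provider resolves credentials independently; the
    primary's explicit credential is never reused downstream.
    """
    for provider in chain:
        cred = resolve(provider)
        if cred is not None:
            return provider, cred
    raise RuntimeError("no provider in the fallback chain has credentials")
```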
Configuration in `config.toml`:

```toml
[reliability]
provider_retries = 2
provider_backoff_ms = 500
fallback_providers = ["openrouter", "anthropic"]
```

| Canonical ID | Aliases | Local | Provider-specific env var(s) |
|---|---|---|---|
| openrouter | — | No | OPENROUTER_API_KEY |
| anthropic | — | No | ANTHROPIC_OAUTH_TOKEN, ANTHROPIC_API_KEY |
| openai | — | No | OPENAI_API_KEY |
| ollama | — | Yes | OLLAMA_API_KEY (optional) |
| gemini | google, google-gemini | No | GEMINI_API_KEY, GOOGLE_API_KEY |
| venice | — | No | VENICE_API_KEY |
| vercel | vercel-ai | No | VERCEL_API_KEY |
| cloudflare | cloudflare-ai | No | CLOUDFLARE_API_KEY |
| moonshot | kimi | No | MOONSHOT_API_KEY |
| kimi-code | kimi_coding, kimi_for_coding | No | KIMI_CODE_API_KEY, MOONSHOT_API_KEY |
| synthetic | — | No | SYNTHETIC_API_KEY |
| opencode | opencode-zen | No | OPENCODE_API_KEY |
| zai | z.ai | No | ZAI_API_KEY |
| glm | zhipu | No | GLM_API_KEY |
| minimax | minimax-intl, minimax-io, minimax-global, minimax-cn, minimaxi, minimax-oauth, minimax-oauth-cn, minimax-portal, minimax-portal-cn | No | MINIMAX_OAUTH_TOKEN, MINIMAX_API_KEY |
| bedrock | aws-bedrock | No | AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY (optional: AWS_REGION) |
| qianfan | baidu | No | QIANFAN_API_KEY |
| doubao | volcengine, ark, doubao-cn | No | ARK_API_KEY, DOUBAO_API_KEY |
| qwen | dashscope, qwen-intl, dashscope-intl, qwen-us, dashscope-us, qwen-code, qwen-oauth, qwen_oauth | No | QWEN_OAUTH_TOKEN, DASHSCOPE_API_KEY |
| groq | — | No | GROQ_API_KEY |
| mistral | — | No | MISTRAL_API_KEY |
| xai | grok | No | XAI_API_KEY |
| deepseek | — | No | DEEPSEEK_API_KEY |
| together | together-ai | No | TOGETHER_API_KEY |
| fireworks | fireworks-ai | No | FIREWORKS_API_KEY |
| perplexity | — | No | PERPLEXITY_API_KEY |
| cohere | — | No | COHERE_API_KEY |
| copilot | github-copilot | No | (use config/API_KEY fallback with GitHub token) |
| lmstudio | lm-studio | Yes | (optional; local by default) |
| llamacpp | llama.cpp | Yes | LLAMACPP_API_KEY (optional; only if server auth is enabled) |
| sglang | — | Yes | SGLANG_API_KEY (optional) |
| vllm | — | Yes | VLLM_API_KEY (optional) |
| osaurus | — | Yes | OSAURUS_API_KEY (optional; defaults to "osaurus") |
| nvidia | nvidia-nim, build.nvidia.com | No | NVIDIA_API_KEY |
- Provider ID: `vercel` (alias: `vercel-ai`)
- Base API URL: `https://ai-gateway.vercel.sh/v1`
- Authentication: `VERCEL_API_KEY`
- Vercel AI Gateway usage does not require a project deployment.
- If you see `DEPLOYMENT_NOT_FOUND`, verify the provider is targeting the gateway endpoint above instead of `https://api.vercel.ai`.
- Provider ID: `gemini` (aliases: `google`, `google-gemini`)
- Auth can come from `GEMINI_API_KEY`, `GOOGLE_API_KEY`, or the Gemini CLI OAuth cache (`~/.gemini/oauth_creds.json`)
- API key requests use `generativelanguage.googleapis.com/v1beta`
- Gemini CLI OAuth requests use `cloudcode-pa.googleapis.com/v1internal` with Code Assist request envelope semantics
- Thinking models (e.g. `gemini-3-pro-preview`) are supported; internal reasoning parts are automatically filtered from the response
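The filtering step can be pictured as dropping the reasoning-flagged parts from a candidate's content. This is a sketch that assumes parts are dicts carrying a boolean `thought` flag (as in Gemini API responses); it is not zerobuild's actual code.

```python
def strip_reasoning_parts(parts):
    """Keep only user-visible parts, dropping internal 'thought' parts."""
    return [p for p in parts if not p.get("thought")]
```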
- Provider ID: `ollama`
- Vision input is supported through user message image markers: `[IMAGE:<source>]`.
- After multimodal normalization, ZeroBuild sends image payloads through Ollama's native `messages[].images` field.
- If a non-vision provider is selected, ZeroBuild returns a structured capability error instead of silently ignoring images.
- Use the `:cloud` model suffix only with a remote Ollama endpoint.
- The remote endpoint should be set in `api_url` (example: `https://ollama.com`).
- ZeroBuild normalizes a trailing `/api` in `api_url` automatically.
- If `default_model` ends with `:cloud` while `api_url` is local or unset, config validation fails early with an actionable error.
- Local Ollama model discovery intentionally excludes `:cloud` entries to avoid selecting cloud-only models in local mode.
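The URL normalization and the `:cloud` validation described above can be sketched as follows. Function names and the locality heuristic are assumptions for illustration, not zerobuild's real implementation.

```python
def normalize_api_url(api_url):
    """Strip a trailing /api segment, mirroring the documented normalization."""
    url = api_url.rstrip("/")
    return url[: -len("/api")] if url.endswith("/api") else url

def validate_cloud_model(default_model, api_url):
    """Fail early when a :cloud model is paired with a local/unset endpoint."""
    is_local = not api_url or "localhost" in api_url or "127.0.0.1" in api_url
    if default_model.endswith(":cloud") and is_local:
        raise ValueError(
            ":cloud models require a remote api_url (e.g. https://ollama.com)")
```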
- Provider ID: `llamacpp` (alias: `llama.cpp`)
- Default endpoint: `http://localhost:8080/v1`
- API key is optional by default; set `LLAMACPP_API_KEY` only when `llama-server` is started with `--api-key`.
- Model discovery: `zerobuild models refresh --provider llamacpp`
- Provider ID: `sglang`
- Default endpoint: `http://localhost:30000/v1`
- API key is optional by default; set `SGLANG_API_KEY` only when the server requires authentication.
- Tool calling requires launching SGLang with `--tool-call-parser` (e.g. `hermes`, `llama3`, `qwen25`).
- Model discovery: `zerobuild models refresh --provider sglang`
- Provider ID: `vllm`
- Default endpoint: `http://localhost:8000/v1`
- API key is optional by default; set `VLLM_API_KEY` only when the server requires authentication.
- Model discovery: `zerobuild models refresh --provider vllm`
- Provider ID: `osaurus`
- Default endpoint: `http://localhost:1337/v1`
- API key defaults to `"osaurus"` but is optional; set `OSAURUS_API_KEY` to override, or leave it unset for keyless access.
- Model discovery: `zerobuild models refresh --provider osaurus`
- Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that combines local MLX inference with cloud provider proxying through a single endpoint.
- Supports multiple API formats simultaneously: OpenAI-compatible (`/v1/chat/completions`), Anthropic (`/messages`), Ollama (`/chat`), and Open Responses (`/v1/responses`).
- Built-in MCP (Model Context Protocol) support for tool and context server connectivity.
- Local models run via MLX (Llama, Qwen, Gemma, GLM, Phi, Nemotron, and others); cloud models are proxied transparently.
- Provider ID: `bedrock` (alias: `aws-bedrock`)
- API: Converse API
- Authentication: AWS AKSK (not a single API key). Set the `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` environment variables.
- Optional: `AWS_SESSION_TOKEN` for temporary/STS credentials; `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`).
- Default onboarding model: `anthropic.claude-sonnet-4-5-20250929-v1:0`
- Supports native tool calling and prompt caching (`cachePoint`).
- Cross-region inference profiles are supported (e.g., `us.anthropic.claude-*`).
- Model IDs use Bedrock format: `anthropic.claude-sonnet-4-6`, `anthropic.claude-opus-4-6-v1`, etc.
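Because Bedrock uses an AKSK pair rather than a single API key, its credential check differs from the other providers. A minimal sketch, assuming the env vars listed above (not zerobuild's actual code):

```python
def bedrock_credentials(env):
    """Collect the AWS AKSK pair plus optional session token and region."""
    ak = env.get("AWS_ACCESS_KEY_ID")
    sk = env.get("AWS_SECRET_ACCESS_KEY")
    if not (ak and sk):
        raise KeyError("set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY")
    return {
        "access_key": ak,
        "secret_key": sk,
        "session_token": env.get("AWS_SESSION_TOKEN"),  # optional, for STS creds
        "region": env.get("AWS_REGION")
                  or env.get("AWS_DEFAULT_REGION")
                  or "us-east-1",                       # documented default
    }
```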
You can control Ollama reasoning/thinking behavior from `config.toml`:

```toml
[runtime]
reasoning_enabled = false
```

Behavior:
- `false`: sends `think: false` to Ollama `/api/chat` requests.
- `true`: sends `think: true`.
- Unset: omits `think` and keeps Ollama/model defaults.
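The tri-state behavior above can be sketched as a tiny payload builder (an illustrative helper, not zerobuild's code), where `None` models the unset case:

```python
def think_payload(reasoning_enabled):
    """Map config tri-state to the Ollama request field: unset omits 'think'."""
    if reasoning_enabled is None:
        return {}
    return {"think": bool(reasoning_enabled)}
```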
- Provider ID: `kimi-code`
- Endpoint: `https://api.kimi.com/coding/v1`
- Default onboarding model: `kimi-for-coding` (alternative: `kimi-k2.5`)
- The runtime auto-adds `User-Agent: KimiCLI/0.77` for compatibility.
- Canonical provider ID: `nvidia`
- Aliases: `nvidia-nim`, `build.nvidia.com`
- Base API URL: `https://integrate.api.nvidia.com/v1`
- Model discovery: `zerobuild models refresh --provider nvidia`

Recommended starter model IDs (verified against the NVIDIA API catalog on February 18, 2026):
- `meta/llama-3.3-70b-instruct`
- `deepseek-ai/deepseek-v3.2`
- `nvidia/llama-3.3-nemotron-super-49b-v1.5`
- `nvidia/llama-3.1-nemotron-ultra-253b-v1`
- OpenAI-compatible endpoint: `default_provider = "custom:https://your-api.example.com"`
- Anthropic-compatible endpoint: `default_provider = "anthropic-custom:https://your-api.example.com"`

Set the MiniMax provider and OAuth placeholder in config:

```toml
default_provider = "minimax-oauth"
api_key = "minimax-oauth"
```

Then provide one of the following credentials via environment variables:
- `MINIMAX_OAUTH_TOKEN` (preferred; direct access token)
- `MINIMAX_API_KEY` (legacy/static token)
- `MINIMAX_OAUTH_REFRESH_TOKEN` (auto-refreshes the access token at startup)

Optional:
- `MINIMAX_OAUTH_REGION=global` or `cn` (defaults by provider alias)
- `MINIMAX_OAUTH_CLIENT_ID` to override the default OAuth client id
Channel compatibility note:
- For MiniMax-backed channel conversations, runtime history is normalized to keep a valid `user`/`assistant` turn order.
- Channel-specific delivery guidance (for example, Telegram attachment markers) is merged into the leading system prompt instead of being appended as a trailing `system` turn.
Set Qwen Code OAuth mode in config:

```toml
default_provider = "qwen-code"
api_key = "qwen-oauth"
```

Credential resolution for qwen-code:
- Explicit `api_key` value (if not the placeholder `qwen-oauth`)
- `QWEN_OAUTH_TOKEN`
- `~/.qwen/oauth_creds.json` (reuses Qwen Code cached OAuth credentials)
- Optional refresh via `QWEN_OAUTH_REFRESH_TOKEN` (or the cached refresh token)
- If no OAuth placeholder is used, `DASHSCOPE_API_KEY` can still be used as a fallback

Optional endpoint override:
- `QWEN_OAUTH_RESOURCE_URL` (normalized to `https://.../v1` if needed)
- If unset, the `resource_url` from cached OAuth credentials is used when available
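The qwen-code resolution order can be sketched as below. This is a hypothetical helper; `cached` stands in for the parsed contents of `~/.qwen/oauth_creds.json`, and the refresh step is omitted for brevity.

```python
def resolve_qwen_credential(api_key, env, cached=None):
    """Sketch of the documented qwen-code credential order."""
    if api_key and api_key != "qwen-oauth":        # explicit, non-placeholder key
        return api_key
    if env.get("QWEN_OAUTH_TOKEN"):
        return env["QWEN_OAUTH_TOKEN"]
    if cached and cached.get("access_token"):      # reuse Qwen Code OAuth cache
        return cached["access_token"]
    return env.get("DASHSCOPE_API_KEY")            # static-key fallback
```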
You can route model calls by hint using `[[model_routes]]`:

```toml
[[model_routes]]
hint = "reasoning"
provider = "openrouter"
model = "anthropic/claude-opus-4-20250514"

[[model_routes]]
hint = "fast"
provider = "groq"
model = "llama-3.3-70b-versatile"
```

Then call with a hint model name (for example, from tool or integration paths): `hint:reasoning`
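Conceptually, routing is a lookup from the hint prefix to a `(provider, model)` pair. A sketch under that assumption (illustrative only, not zerobuild's routing code):

```python
def resolve_route(model_name, routes):
    """Resolve a 'hint:<name>' model via [[model_routes]]-style entries.

    Non-hint model names pass through unrouted (returns None).
    """
    if not model_name.startswith("hint:"):
        return None
    hint = model_name[len("hint:"):]
    for route in routes:
        if route["hint"] == hint:
            return route["provider"], route["model"]
    raise KeyError(f"no [[model_routes]] entry for hint '{hint}'")
```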
You can route embedding calls with the same hint pattern using `[[embedding_routes]]`.
Set `[memory].embedding_model` to a `hint:<name>` value to activate routing:

```toml
[memory]
embedding_model = "hint:semantic"

[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536

[[embedding_routes]]
hint = "archive"
provider = "custom:https://embed.example.com/v1"
model = "your-embedding-model-id"
dimensions = 1024
```

Supported embedding providers:
- `none`
- `openai`
- `custom:<url>` (OpenAI-compatible embeddings endpoint)
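Parsing the provider spec splits it into a kind plus an optional base URL. A minimal sketch (the function name and return shape are assumptions):

```python
def parse_embedding_provider(spec):
    """Split an embedding provider spec into (kind, base_url)."""
    if spec in ("none", "openai"):
        return (spec, None)
    if spec.startswith("custom:"):
        # Remainder is an OpenAI-compatible embeddings endpoint URL.
        return ("custom", spec[len("custom:"):])
    raise ValueError(f"unsupported embedding provider: {spec}")
```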
Optional per-route key override:
```toml
[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
api_key = "sk-route-specific"
```

Use stable hints and update only route targets when providers deprecate model IDs.
Recommended workflow:
- Keep call sites stable (`hint:reasoning`, `hint:semantic`).
- Change only the target model under `[[model_routes]]` or `[[embedding_routes]]`.
- Run `zerobuild doctor` and `zerobuild status`.
- Smoke test one representative flow (chat + memory retrieval) before rollout.

This minimizes breakage because integrations and prompts do not need to change when model IDs are upgraded.