This document maps provider IDs, aliases, and credential environment variables.
Last verified: February 21, 2026.
Runtime credential resolution order for zerobuild providers is:
- Explicit credential from config/CLI
- Provider-specific env var(s)
- Generic fallback env vars: `ZEROBUILD_API_KEY`, then `API_KEY`
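The order above can be sketched as a small lookup function. This is an illustrative helper, not zerobuild's actual internal API; the function name and signature are assumptions.

```python
import os

def resolve_credential(explicit, provider_env_vars, env=None):
    """Sketch of the documented resolution order (hypothetical helper)."""
    env = os.environ if env is None else env
    if explicit:                                   # 1. explicit config/CLI credential
        return explicit
    for var in provider_env_vars:                  # 2. provider-specific env var(s), in order
        if env.get(var):
            return env[var]
    for var in ("ZEROBUILD_API_KEY", "API_KEY"):   # 3. generic fallbacks, in order
        if env.get(var):
            return env[var]
    return None
```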
For resilient fallback chains (reliability.fallback_providers), each fallback
provider resolves credentials independently. The primary provider's explicit
credential is not reused for fallback providers.
When you run `zerobuild onboard`, the wizard suggests fallback providers based on your primary provider selection. This improves reliability by switching to an alternative provider automatically if the primary fails.
Example fallback chains:
- kimi-code → moonshot → openrouter → anthropic
- anthropic → openrouter → openai → gemini
- openrouter → anthropic → openai → gemini
- ollama (local) → openrouter → anthropic → groq
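Walking such a chain amounts to a simple loop in which every provider resolves its own credentials. This is a sketch with illustrative names, not zerobuild's implementation.

```python
def pick_provider(chain, resolve):
    """Return the first provider in the chain with usable credentials.

    Each fallback provider resolves credentials independently; the
    primary's explicit credential is never reused downstream.
    """
    for provider in chain:
        cred = resolve(provider)
        if cred is not None:
            return provider, cred
    raise RuntimeError("no provider in the fallback chain has credentials")
```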
Configuration in `config.toml`:

```toml
[reliability]
provider_retries = 2
provider_backoff_ms = 500
fallback_providers = ["openrouter", "anthropic"]
```

| Canonical ID | Aliases | Local | Provider-specific env var(s) |
|---|---|---|---|
| openrouter | — | No | OPENROUTER_API_KEY |
| anthropic | — | No | ANTHROPIC_OAUTH_TOKEN, ANTHROPIC_API_KEY |
| openai | — | No | OPENAI_API_KEY |
| ollama | — | Yes | OLLAMA_API_KEY (optional) |
| gemini | google, google-gemini | No | GEMINI_API_KEY, GOOGLE_API_KEY |
| venice | — | No | VENICE_API_KEY |
| vercel | vercel-ai | No | VERCEL_API_KEY |
| cloudflare | cloudflare-ai | No | CLOUDFLARE_API_KEY |
| moonshot | kimi | No | MOONSHOT_API_KEY |
| kimi-code | kimi_coding, kimi_for_coding | No | KIMI_CODE_API_KEY, MOONSHOT_API_KEY |
| synthetic | — | No | SYNTHETIC_API_KEY |
| opencode | opencode-zen | No | OPENCODE_API_KEY |
| zai | z.ai | No | ZAI_API_KEY |
| glm | zhipu | No | GLM_API_KEY |
| minimax | minimax-intl, minimax-io, minimax-global, minimax-cn, minimaxi, minimax-oauth, minimax-oauth-cn, minimax-portal, minimax-portal-cn | No | MINIMAX_OAUTH_TOKEN, MINIMAX_API_KEY |
| bedrock | aws-bedrock | No | AWS_ACCESS_KEY_ID + AWS_SECRET_ACCESS_KEY (optional: AWS_REGION) |
| qianfan | baidu | No | QIANFAN_API_KEY |
| doubao | volcengine, ark, doubao-cn | No | ARK_API_KEY, DOUBAO_API_KEY |
| qwen | dashscope, qwen-intl, dashscope-intl, qwen-us, dashscope-us, qwen-code, qwen-oauth, qwen_oauth | No | QWEN_OAUTH_TOKEN, DASHSCOPE_API_KEY |
| groq | — | No | GROQ_API_KEY |
| mistral | — | No | MISTRAL_API_KEY |
| xai | grok | No | XAI_API_KEY |
| deepseek | — | No | DEEPSEEK_API_KEY |
| together | together-ai | No | TOGETHER_API_KEY |
| fireworks | fireworks-ai | No | FIREWORKS_API_KEY |
| perplexity | — | No | PERPLEXITY_API_KEY |
| cohere | — | No | COHERE_API_KEY |
| copilot | github-copilot | No | (use config/API_KEY fallback with GitHub token) |
| lmstudio | lm-studio | Yes | (optional; local by default) |
| llamacpp | llama.cpp | Yes | LLAMACPP_API_KEY (optional; only if server auth is enabled) |
| sglang | — | Yes | SGLANG_API_KEY (optional) |
| vllm | — | Yes | VLLM_API_KEY (optional) |
| osaurus | — | Yes | OSAURUS_API_KEY (optional; defaults to "osaurus") |
| nvidia | nvidia-nim, build.nvidia.com | No | NVIDIA_API_KEY |
- Provider ID: `vercel` (alias: `vercel-ai`)
- Base API URL: `https://ai-gateway.vercel.sh/v1`
- Authentication: `VERCEL_API_KEY`
- Vercel AI Gateway usage does not require a project deployment.
- If you see `DEPLOYMENT_NOT_FOUND`, verify the provider is targeting the gateway endpoint above instead of `https://api.vercel.ai`.
- Provider ID: `gemini` (aliases: `google`, `google-gemini`)
- Auth can come from `GEMINI_API_KEY`, `GOOGLE_API_KEY`, or the Gemini CLI OAuth cache (`~/.gemini/oauth_creds.json`)
- API key requests use `generativelanguage.googleapis.com/v1beta`
- Gemini CLI OAuth requests use `cloudcode-pa.googleapis.com/v1internal` with Code Assist request envelope semantics
- Thinking models (e.g. `gemini-3-pro-preview`) are supported; internal reasoning parts are automatically filtered from the response
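The filtering step can be pictured as dropping the reasoning-flagged parts from a candidate's content. This is a sketch that assumes parts are dicts carrying a boolean `thought` flag (as in Gemini API responses); it is not zerobuild's actual code.

```python
def strip_reasoning_parts(parts):
    """Keep only user-visible parts, dropping internal 'thought' parts."""
    return [p for p in parts if not p.get("thought")]
```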
- Provider ID: `ollama`
- Vision input is supported through user message image markers: `[IMAGE:<source>]`.
- After multimodal normalization, ZeroBuild sends image payloads through Ollama's native `messages[].images` field.
- If a non-vision provider is selected, ZeroBuild returns a structured capability error instead of silently ignoring images.
- Use the `:cloud` model suffix only with a remote Ollama endpoint.
- The remote endpoint should be set in `api_url` (example: `https://ollama.com`).
- ZeroBuild normalizes a trailing `/api` in `api_url` automatically.
- If `default_model` ends with `:cloud` while `api_url` is local or unset, config validation fails early with an actionable error.
- Local Ollama model discovery intentionally excludes `:cloud` entries to avoid selecting cloud-only models in local mode.
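The URL normalization and the `:cloud` validation described above can be sketched as follows. Function names and the locality heuristic are assumptions for illustration, not zerobuild's real implementation.

```python
def normalize_api_url(api_url):
    """Strip a trailing /api segment, mirroring the documented normalization."""
    url = api_url.rstrip("/")
    return url[: -len("/api")] if url.endswith("/api") else url

def validate_cloud_model(default_model, api_url):
    """Fail early when a :cloud model is paired with a local/unset endpoint."""
    is_local = not api_url or "localhost" in api_url or "127.0.0.1" in api_url
    if default_model.endswith(":cloud") and is_local:
        raise ValueError(
            ":cloud models require a remote api_url (e.g. https://ollama.com)")
```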
- Provider ID: `llamacpp` (alias: `llama.cpp`)
- Default endpoint: `http://localhost:8080/v1`
- API key is optional by default; set `LLAMACPP_API_KEY` only when `llama-server` is started with `--api-key`.
- Model discovery: `zerobuild models refresh --provider llamacpp`
- Provider ID: `sglang`
- Default endpoint: `http://localhost:30000/v1`
- API key is optional by default; set `SGLANG_API_KEY` only when the server requires authentication.
- Tool calling requires launching SGLang with `--tool-call-parser` (e.g. `hermes`, `llama3`, `qwen25`).
- Model discovery: `zerobuild models refresh --provider sglang`
- Provider ID: `vllm`
- Default endpoint: `http://localhost:8000/v1`
- API key is optional by default; set `VLLM_API_KEY` only when the server requires authentication.
- Model discovery: `zerobuild models refresh --provider vllm`
- Provider ID: `osaurus`
- Default endpoint: `http://localhost:1337/v1`
- API key defaults to `"osaurus"` but is optional; set `OSAURUS_API_KEY` to override, or leave it unset for keyless access.
- Model discovery: `zerobuild models refresh --provider osaurus`
- Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that combines local MLX inference with cloud provider proxying through a single endpoint.
- Supports multiple API formats simultaneously: OpenAI-compatible (`/v1/chat/completions`), Anthropic (`/messages`), Ollama (`/chat`), and Open Responses (`/v1/responses`).
- Built-in MCP (Model Context Protocol) support for tool and context server connectivity.
- Local models run via MLX (Llama, Qwen, Gemma, GLM, Phi, Nemotron, and others); cloud models are proxied transparently.
- Provider ID: `bedrock` (alias: `aws-bedrock`)
- API: Converse API
- Authentication: AWS AKSK (not a single API key). Set the `AWS_ACCESS_KEY_ID` + `AWS_SECRET_ACCESS_KEY` environment variables.
- Optional: `AWS_SESSION_TOKEN` for temporary/STS credentials; `AWS_REGION` or `AWS_DEFAULT_REGION` (default: `us-east-1`).
- Default onboarding model: `anthropic.claude-sonnet-4-5-20250929-v1:0`
- Supports native tool calling and prompt caching (`cachePoint`).
- Cross-region inference profiles are supported (e.g., `us.anthropic.claude-*`).
- Model IDs use Bedrock format: `anthropic.claude-sonnet-4-6`, `anthropic.claude-opus-4-6-v1`, etc.
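Because Bedrock uses an AKSK pair rather than a single API key, its credential check differs from the other providers. A minimal sketch, assuming the env vars listed above (not zerobuild's actual code):

```python
def bedrock_credentials(env):
    """Collect the AWS AKSK pair plus optional session token and region."""
    ak = env.get("AWS_ACCESS_KEY_ID")
    sk = env.get("AWS_SECRET_ACCESS_KEY")
    if not (ak and sk):
        raise KeyError("set AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY")
    return {
        "access_key": ak,
        "secret_key": sk,
        "session_token": env.get("AWS_SESSION_TOKEN"),  # optional, for STS creds
        "region": env.get("AWS_REGION")
                  or env.get("AWS_DEFAULT_REGION")
                  or "us-east-1",                       # documented default
    }
```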
You can control Ollama reasoning/thinking behavior from `config.toml`:

```toml
[runtime]
reasoning_enabled = false
```

Behavior:
- `false`: sends `think: false` to Ollama `/api/chat` requests.
- `true`: sends `think: true`.
- Unset: omits `think` and keeps Ollama/model defaults.
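The tri-state behavior above can be sketched as a tiny payload builder (an illustrative helper, not zerobuild's code), where `None` models the unset case:

```python
def think_payload(reasoning_enabled):
    """Map config tri-state to the Ollama request field: unset omits 'think'."""
    if reasoning_enabled is None:
        return {}
    return {"think": bool(reasoning_enabled)}
```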
- Provider ID: `kimi-code`
- Endpoint: `https://api.kimi.com/coding/v1`
- Default onboarding model: `kimi-for-coding` (alternative: `kimi-k2.5`)
- The runtime auto-adds `User-Agent: KimiCLI/0.77` for compatibility.
- Canonical provider ID: `nvidia`
- Aliases: `nvidia-nim`, `build.nvidia.com`
- Base API URL: `https://integrate.api.nvidia.com/v1`
- Model discovery: `zerobuild models refresh --provider nvidia`

Recommended starter model IDs (verified against the NVIDIA API catalog on February 18, 2026):
- `meta/llama-3.3-70b-instruct`
- `deepseek-ai/deepseek-v3.2`
- `nvidia/llama-3.3-nemotron-super-49b-v1.5`
- `nvidia/llama-3.1-nemotron-ultra-253b-v1`
- OpenAI-compatible endpoint: `default_provider = "custom:https://your-api.example.com"`
- Anthropic-compatible endpoint: `default_provider = "anthropic-custom:https://your-api.example.com"`

Set the MiniMax provider and OAuth placeholder in config:

```toml
default_provider = "minimax-oauth"
api_key = "minimax-oauth"
```

Then provide one of the following credentials via environment variables:
- `MINIMAX_OAUTH_TOKEN` (preferred; direct access token)
- `MINIMAX_API_KEY` (legacy/static token)
- `MINIMAX_OAUTH_REFRESH_TOKEN` (auto-refreshes the access token at startup)

Optional:
- `MINIMAX_OAUTH_REGION=global` or `cn` (defaults by provider alias)
- `MINIMAX_OAUTH_CLIENT_ID` to override the default OAuth client id
Channel compatibility note:
- For MiniMax-backed channel conversations, runtime history is normalized to keep a valid `user`/`assistant` turn order.
- Channel-specific delivery guidance (for example, Telegram attachment markers) is merged into the leading system prompt instead of being appended as a trailing `system` turn.
Set Qwen Code OAuth mode in config:

```toml
default_provider = "qwen-code"
api_key = "qwen-oauth"
```

Credential resolution for qwen-code:
- Explicit `api_key` value (if not the placeholder `qwen-oauth`)
- `QWEN_OAUTH_TOKEN`
- `~/.qwen/oauth_creds.json` (reuses Qwen Code cached OAuth credentials)
- Optional refresh via `QWEN_OAUTH_REFRESH_TOKEN` (or the cached refresh token)
- If no OAuth placeholder is used, `DASHSCOPE_API_KEY` can still be used as a fallback

Optional endpoint override:
- `QWEN_OAUTH_RESOURCE_URL` (normalized to `https://.../v1` if needed)
- If unset, the `resource_url` from cached OAuth credentials is used when available
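The qwen-code resolution order can be sketched as below. This is a hypothetical helper; `cached` stands in for the parsed contents of `~/.qwen/oauth_creds.json`, and the refresh step is omitted for brevity.

```python
def resolve_qwen_credential(api_key, env, cached=None):
    """Sketch of the documented qwen-code credential order."""
    if api_key and api_key != "qwen-oauth":        # explicit, non-placeholder key
        return api_key
    if env.get("QWEN_OAUTH_TOKEN"):
        return env["QWEN_OAUTH_TOKEN"]
    if cached and cached.get("access_token"):      # reuse Qwen Code OAuth cache
        return cached["access_token"]
    return env.get("DASHSCOPE_API_KEY")            # static-key fallback
```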
You can route model calls by hint using `[[model_routes]]`:

```toml
[[model_routes]]
hint = "reasoning"
provider = "openrouter"
model = "anthropic/claude-opus-4-20250514"

[[model_routes]]
hint = "fast"
provider = "groq"
model = "llama-3.3-70b-versatile"
```

Then call with a hint model name (for example, from tool or integration paths): `hint:reasoning`
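Conceptually, routing is a lookup from the hint prefix to a `(provider, model)` pair. A sketch under that assumption (illustrative only, not zerobuild's routing code):

```python
def resolve_route(model_name, routes):
    """Resolve a 'hint:<name>' model via [[model_routes]]-style entries.

    Non-hint model names pass through unrouted (returns None).
    """
    if not model_name.startswith("hint:"):
        return None
    hint = model_name[len("hint:"):]
    for route in routes:
        if route["hint"] == hint:
            return route["provider"], route["model"]
    raise KeyError(f"no [[model_routes]] entry for hint '{hint}'")
```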
You can route embedding calls with the same hint pattern using `[[embedding_routes]]`.
Set `[memory].embedding_model` to a `hint:<name>` value to activate routing:

```toml
[memory]
embedding_model = "hint:semantic"

[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
dimensions = 1536

[[embedding_routes]]
hint = "archive"
provider = "custom:https://embed.example.com/v1"
model = "your-embedding-model-id"
dimensions = 1024
```

Supported embedding providers:
- `none`
- `openai`
- `custom:<url>` (OpenAI-compatible embeddings endpoint)
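Parsing the provider spec splits it into a kind plus an optional base URL. A minimal sketch (the function name and return shape are assumptions):

```python
def parse_embedding_provider(spec):
    """Split an embedding provider spec into (kind, base_url)."""
    if spec in ("none", "openai"):
        return (spec, None)
    if spec.startswith("custom:"):
        # Remainder is an OpenAI-compatible embeddings endpoint URL.
        return ("custom", spec[len("custom:"):])
    raise ValueError(f"unsupported embedding provider: {spec}")
```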
Optional per-route key override:
```toml
[[embedding_routes]]
hint = "semantic"
provider = "openai"
model = "text-embedding-3-small"
api_key = "sk-route-specific"
```

Use stable hints and update only route targets when providers deprecate model IDs.
Recommended workflow:
- Keep call sites stable (`hint:reasoning`, `hint:semantic`).
- Change only the target model under `[[model_routes]]` or `[[embedding_routes]]`.
- Run `zerobuild doctor` and `zerobuild status`.
- Smoke test one representative flow (chat + memory retrieval) before rollout.

This minimizes breakage because integrations and prompts do not need to change when model IDs are upgraded.