This file is no longer the primary documentation source.
The official and actively maintained documentation has been moved to the GitHub Wiki:
👉 https://github.com/codefox-lab/CodeFox-CLI/wiki
Please refer to the Wiki for:
- installation guides
- configuration reference
- CLI commands
- GitHub Actions integration
- examples and usage guides
The Wiki lets the documentation be updated independently of the main repository and provides better navigation across pages.
- 📖 Documentation Home: https://github.com/codefox-lab/CodeFox-CLI/wiki
- ⚙️ Configuration Guide: https://github.com/codefox-lab/CodeFox-CLI/wiki/Configurations
- 💻 CLI Commands: https://github.com/codefox-lab/CodeFox-CLI/wiki/CLI-commands
- 🚀 GitHub Actions: https://github.com/codefox-lab/CodeFox-CLI/wiki/GitHub-Action
## Configuration: .codefox.yml

The `.codefox.yml` file configures the analysis behavior, model selection, review format, and the AI provider (Gemini, Ollama, OpenRouter).

All parameters are optional except `model` and `model.name`.
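Since only `model.name` is required and the provider defaults to Gemini, a minimal configuration can be as small as (the model name is taken from the examples below):

```yaml
# Minimal .codefox.yml: provider defaults to gemini
model:
  name: gemini-3-flash-preview
```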
### provider

Type: `string`
Default: `gemini`
Description: The AI provider used for code analysis and (for some providers) embeddings.
Supported values:
| Value | Description |
|---|---|
| `gemini` | Google Gemini API (default). Requires API key in `.codefoxenv`. |
| `ollama` | Local or remote Ollama server. RAG uses the configured embedding model. |
| `openrouter` | OpenRouter API (many models). Uses embeddings for RAG context. |
Example:

```yaml
provider: ollama
```

## model

Settings for the LLM being used.
```yaml
model:
  name: gemini-3-flash-preview
  temperature: 0.2
  max_tokens: 4000
  max_completion_tokens: null
  timeout: 600
  # Provider-specific (Ollama / OpenRouter):
  base_url: null
  embedding: null
```

### model.name

Type: `string`
Description: The name of the model used for code analysis.
Examples:

```yaml
name: gemini-3-flash-preview
name: gemini-3-pro-preview
name: gemini-3-pro
```

### model.temperature

Type: `number`
Default: 0.2
Controls the creativity level of the model.
Recommendations:
- `0.0 – 0.2` -> deterministic and stable analysis ✅
- `0.3 – 0.7` -> more “conversational” suggestions
- `> 0.7` -> ❌ not recommended for code review
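For reproducible review results, a low temperature in the recommended range can be set explicitly:

```yaml
model:
  temperature: 0.1  # deterministic, stable analysis
```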
### model.max_tokens

Type: `number | null`
Default: 4000
Limits the maximum size of the model’s response.
- `null` -> the model’s default limit is used
- number -> hard limit
Example:

```yaml
max_tokens: 3000
```

### model.max_completion_tokens

Type: `number | null`
Default: null
Separate limit for completion tokens (used by OpenRouter and some APIs). If null, the provider default or max_tokens is used.
### model.timeout

Type: `number`
Default: 600
Request timeout in seconds for API calls.
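If you run large local models that respond slowly, you might raise the timeout (the value below is illustrative, not a documented recommendation):

```yaml
model:
  timeout: 900  # illustrative: allow 15 minutes for slow local inference
```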
### model.base_url

Type: `string | null`
Relevant for: ollama, openrouter
Overrides the default API base URL.
- Ollama: default `https://ollama.com`. Set to your Ollama host, e.g. `http://localhost:11434` for local use.
- OpenRouter: default `https://openrouter.ai/api/v1`. Override only if using a proxy or custom endpoint.
Example (Ollama local):

```yaml
provider: ollama
model:
  name: gemma3:12b
  base_url: http://localhost:11434
```

### model.embedding

Type: `string | null`
Relevant for: gemini, ollama, openrouter
Model used for embeddings (RAG context search). Required when using RAG (when review.diff_only is false).
Per-provider defaults: Ollama uses `BAAI/bge-small-en-v1.5` (via fastembed); OpenRouter uses `text-embedding-3-small` (see the provider sections below).
Example:

```yaml
provider: openrouter
model:
  name: openai/gpt-4o
  embedding: BAAI/bge-small-en
```

## RAG parameters

The parameters in the `model` section control the loading of the embedding model, splitting code into chunks, building the index, and the volume of context passed to the LLM. All parameters are optional.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `model.embedding` | string | null | See above. |
| `model.max_rag_chars` | number | 4096 | Maximum number of RAG context characters injected into the prompt. Increasing this provides more code in the context but increases token consumption. |
| `model.max_rag_matching_chunks` | number | 12 | Maximum number of matching chunks retrieved from the index per search. |
| `model.max_diff_chars` | number | 500000 | Diff size truncation: if the diff is larger than this value, it is truncated and a truncation notice is appended at the end. |
| `model.rag_max_query_chars` | number | 2000 | Maximum length of the RAG query (when searching for relevant chunks). An overly long query is truncated. |
| `model.rag_chunk_size` | number | 1000 | Chunk size in characters when splitting files. Code is split along function/class boundaries (tree-sitter) or by sentences. |
| `model.rag_chunk_overlap` | number | 200 | Overlap between adjacent chunks (in characters). Must be strictly less than `rag_chunk_size`. |
| `model.rag_embed_batch_size` | number | 64 | Batch size when computing embeddings. A higher value speeds up indexing if there is sufficient RAM. |
| `model.rag_threads_embedding` | number | null | Number of threads used when computing embeddings (`null` = library default). |
| `model.rag_lazy_load` | boolean | false | Lazy loading of model weights: loading is deferred until the first RAG request, saving memory at startup. |
| `model.rag_index_dir` | string | .codefox/rag_index/ | Directory where the FAISS index, chunks, and metadata are stored. Changing the directory creates a separate index. |
| `model.rag_max_chunks` | number | null | Limit on the number of chunks when building the index. Useful for quick tests or limiting the index size. |
| `model.rag_max_files` | number | null | Limit on the number of files used to build the index. |
| `model.rag_min_score` | number | null | Minimum RRF score threshold during hybrid search (FAISS + BM25). Chunks with a lower score are filtered out. |
Recommendations:

- For large repositories: decrease `rag_chunk_size` (e.g., 600–800) or set `rag_max_files` / `rag_max_chunks` to speed up indexing and reduce memory usage.
- For more precise context: increase `max_rag_chars` (e.g., 6000–8000) if the model supports a long context window.
- When memory is tight: enable `rag_lazy_load: true` or decrease `rag_embed_batch_size`.
Example configuration with RAG fine-tuning:
```yaml
provider: gemini
model:
  name: gemini-2.0-flash
  embedding: BAAI/bge-small-en-v1.5
  max_rag_chars: 6000
  max_diff_chars: 300000
  rag_chunk_size: 800
  rag_chunk_overlap: 150
  rag_embed_batch_size: 32
  rag_index_dir: .codefox/rag_index/
  rag_max_files: 500
review:
  diff_only: false
```
## review

Analysis logic settings.
```yaml
review:
  severity: false
  max_issues: null
  suggest_fixes: true
  diff_only: false
```

### review.severity

Type: `string | false`
Filter by severity level.
Possible values:
```yaml
severity: low
severity: medium
severity: high
severity: critical
severity: false  # disable the filter
```

If set, only issues of the specified level and above are shown.
### review.max_issues

Type: `number | null`
Limits the number of detected issues.
Useful for:
- reducing noise
- CI mode
```yaml
max_issues: 10
```

### review.suggest_fixes

Type: `boolean`
Default: true
Enables generation of auto-fix patches.
If `false`, only comments are shown, without auto-fix patches.
### review.diff_only

Type: `boolean`
Default: false
Analysis mode:
- `true` -> only the `git diff` is analyzed
- `false` -> all files in scope are analyzed

`diff_only: true` is recommended for CI and PRs.
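A typical CI-oriented review block (the thresholds here are illustrative, not defaults) might look like:

```yaml
review:
  diff_only: true   # analyze only the diff
  severity: high    # hide low-severity noise
  max_issues: 10    # cap the report size
```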
## baseline

Technical debt management.

```yaml
baseline:
  enable: true
```

### baseline.enable

Type: `boolean`
If enabled:
- existing issues are ignored
- only new ones are shown
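Combined with diff-only review, this gives a setup that surfaces only newly introduced issues:

```yaml
baseline:
  enable: true
review:
  diff_only: true
```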
## ruler

Enables or disables analysis categories.

```yaml
ruler:
  security: true
  performance: true
  style: true
```

### ruler.security
Searches for:
- vulnerabilities
- secret leaks
- unsafe practices
### ruler.performance

Searches for:
- inefficient algorithms
- redundant operations
- memory / query issues
### ruler.style

Checks:
- readability
- best practices
- code smells
## prompt

Model behavior customization.

```yaml
prompt:
  system: null
  extra: null
  hard_mode: false
  short_mode: false
  strict_facts: false
```

### prompt.system

Type: `string | null`
Completely overrides the system prompt.
Used for:
- strict internal rules
- corporate standards
Example:

```yaml
system: |
  You are a strict senior reviewer.
  Reject any unsafe code.
```

### prompt.extra

Type: `string | null`
Additional instructions on top of the default prompt.
Example:

```yaml
extra: |
  Follow our internal architecture guidelines.
  Ignore legacy modules.
```

### prompt.hard_mode

Type: `boolean`
Default: false
Enables stricter audit rules: anti-hallucination, business logic and regression checks, concrete language and output guards, self-check. Use for high-stakes or compliance-oriented reviews.
### prompt.short_mode

Type: `boolean`
Default: false
Requests shorter, more concise audit output.
### prompt.strict_facts

Type: `boolean`
Default: false
Reduces hallucinations for smaller or weaker models. Enforces: use only names that appear literally in the diff, do not invent class/API/file names, do not speculate; every claim must be traceable to a line in the diff. Prefer short, direct answers and avoid filler. Recommended when using small/local models (e.g. 7B–8B).
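For a small local model, these flags might be combined as follows (the model name is the Ollama default mentioned below; the combination itself is a suggestion, not a documented preset):

```yaml
provider: ollama
model:
  name: gemma3:12b
prompt:
  strict_facts: true  # keep claims traceable to the diff
  short_mode: true    # concise output from a small model
```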
## Ollama

Runs analysis and RAG against a local or remote Ollama server.
- API key: optional. Set `CODEFOX_API_KEY` in `.codefoxenv` if your Ollama instance requires auth.
- Models: use `codefox --command list` to see models available on your server.
- Default model: `gemma3:12b`
- RAG: uploaded files are chunked and embedded using `model.embedding` (default `BAAI/bge-small-en-v1.5`). Ensure the embedding model is pulled in Ollama.
Minimal `.codefox.yml`:

```yaml
provider: ollama
model:
  name: gemma3:12b
  base_url: http://localhost:11434  # optional, default https://ollama.com
  embedding: BAAI/bge-small-en-v1.5  # optional, for RAG
review:
  diff_only: true
```

## OpenRouter

Uses OpenRouter to run many third-party models (OpenAI, Anthropic, Qwen, etc.) with a single API key.
- API key: required. Set `CODEFOX_API_KEY` in `.codefoxenv` to your OpenRouter API key.
- Models: use `codefox --command list` to see available models.
- Default model: `qwen/qwen3-vl-30b-a3b-thinking`
- RAG: embeddings use `model.embedding` (default `text-embedding-3-small`). File upload and context search are supported.
Minimal `.codefox.yml`:

```yaml
provider: openrouter
model:
  name: openai/gpt-4o
  embedding: text-embedding-3-small  # optional
  max_completion_tokens: 4096  # optional
review:
  diff_only: true
```

## Full examples

Gemini:

```yaml
provider: gemini
model:
  name: gemini-3-pro
  temperature: 0.1
  max_tokens: 4000
  timeout: 600
review:
  severity: high
  max_issues: 15
  suggest_fixes: true
  diff_only: true
baseline:
  enable: true
ruler:
  security: true
  performance: false
  style: false
prompt:
  hard_mode: false
  short_mode: false
  extra: |
    Use our NestJS architecture rules.
```

Ollama (local):
```yaml
provider: ollama
model:
  name: gemma3:12b
  base_url: http://localhost:11434
  embedding: BAAI/bge-small-en-v1.5
  temperature: 0.2
  max_tokens: 4000
  timeout: 600
review:
  diff_only: true
  severity: high
```

OpenRouter:
```yaml
provider: openrouter
model:
  name: openai/gpt-4o
  embedding: text-embedding-3-small
  temperature: 0.2
  max_tokens: 4000
  max_completion_tokens: 4096
  timeout: 600
review:
  diff_only: true
  severity: high
```

```yaml
review:
  diff_only: true
  severity: high
  max_issues: 10
```

```yaml
baseline:
  enable: true
review:
  diff_only: true
```