Use the FREE local Apple Intelligence LLM on your Mac - your model, your machine, your way.
No API keys. No cloud. No subscriptions. No per-token billing. The AI is already on your computer - apfel lets you use it.
Every Mac with Apple Silicon has a built-in LLM - Apple's on-device foundation model, shipped as part of Apple Intelligence. Apple provides the FoundationModels framework (macOS 26+) to access it, but only exposes it through Siri and system features. apfel wraps it in a CLI and an HTTP server - so you can actually use it. All inference runs on-device, no network calls.
- UNIX tool -
echo "summarize this" | apfel- pipe-friendly, file attachments, JSON output, exit codes - OpenAI-compatible server -
apfel --serve- drop-in replacement atlocalhost:11434, works with any OpenAI SDK - Tool calling - function calling with schema conversion, full round-trip support
- Zero cost - no API keys, no cloud, no subscriptions, 4096-token context window
- Apple Silicon Mac, macOS 26 Tahoe or newer, Apple Intelligence enabled
- Building from source requires Command Line Tools with macOS 26.4 SDK (ships Swift 6.3). No Xcode required.
Homebrew (recommended):
brew tap Arthur-Ficial/tap
brew install Arthur-Ficial/tap/apfel
brew upgrade apfel # update to latestBuild from source:
git clone https://github.com/Arthur-Ficial/apfel.git
cd apfel
make installTroubleshooting: docs/install.md
Shell note: if your prompt contains !, prefer single quotes in zsh/bash so history expansion does not break copy-paste. Example: apfel 'Hello, Mac!'
# Single prompt
apfel "What is the capital of Austria?"
# Stream output
apfel --stream "Write a haiku about code"
# Pipe input
echo "Summarize: $(cat README.md)" | apfel
# Attach file content to prompt
apfel -f README.md "Summarize this project"
# Attach multiple files
apfel -f old.swift -f new.swift "What changed between these two files?"
# Combine files with piped input
git diff HEAD~1 | apfel -f CONVENTIONS.md "Review this diff against our conventions"
# JSON output for scripting
apfel -o json "Translate to German: hello" | jq .content
# System prompt
apfel -s "You are a pirate" "What is recursion?"
# System prompt from file
apfel --system-file persona.txt "Explain TCP/IP"
# Quiet mode for shell scripts
result=$(apfel -q "Capital of France? One word.")# Start server
apfel --serve
# In another terminal:
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model":"apple-foundationmodel","messages":[{"role":"user","content":"Hello"}]}'Works with the official Python client:
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
resp = client.chat.completions.create(
model="apple-foundationmodel",
messages=[{"role": "user", "content": "What is 1+1?"}],
)
print(resp.choices[0].message.content)apfel --chat
apfel --chat -s "You are a helpful coding assistant"Context window is managed automatically with configurable strategies:
apfel --chat --context-strategy newest-first # default: keep recent turns
apfel --chat --context-strategy oldest-first # keep earliest turns
apfel --chat --context-strategy sliding-window --context-max-turns 6
apfel --chat --context-strategy summarize # compress old turns via on-device model
apfel --chat --context-strategy strict # error on overflow, no trimming
apfel --chat --context-output-reserve 256 # custom output token reserveapfel --update # check for updates and upgrade via Homebrew (v0.7.7+)
brew upgrade apfel # or update directly via HomebrewMeasure internal overhead on the installed release binary:
apfel --benchmark -o json
make benchmarkAttach one or more files to any prompt with -f. File contents are prepended to your prompt text. The flag is repeatable - use it as many times as you need.
# Explain a file
apfel -f main.swift "Explain what this code does"
# Compare two files
apfel -f before.txt -f after.txt "What are the differences?"
# Code review a git diff
jj diff | apfel -f CONVENTIONS.md "Review this diff against our coding conventions"
git diff HEAD~1 | apfel -f style-guide.md "Any style violations in this diff?"
# Summarize a commit
git show HEAD | apfel -f CHANGELOG.md "Write a changelog entry for this commit"
# Combine with other flags
apfel -f data.csv -o json "Extract the top 5 rows" | jq .content
apfel -f code.py -s "You are a senior code reviewer" "Find bugs"
apfel -f spec.md --stream "Generate test cases for this spec"Files, stdin, and prompt arguments all compose naturally:
# File only (file content becomes the prompt)
apfel -f poem.txt
# File + prompt argument
apfel -f poem.txt "Translate this to German"
# Stdin + prompt argument
echo "some text" | apfel "Summarize this"
# File + stdin + prompt argument (all three combined)
echo "extra context" | apfel -f code.swift "Explain this code with the context above"Attach MCP tool servers with --mcp. apfel discovers tools, executes them automatically, and returns the final answer. No glue code needed.
# Give the model a calculator
apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
# tool: multiply({"a": 15, "b": 27}) = 405
# 15 times 27 is 405.
# Multiple MCP servers (bring your own)
apfel --mcp ./server_a.py --mcp ./server_b.py "Use both tools"
# Server mode - tools auto-available to all clients
apfel --serve --mcp ./mcp/calculator/server.py
# No --mcp = exactly as before. Zero overhead.
apfel "What is 2+2?"Ships with a calculator MCP server at mcp/calculator/. See MCP docs for details and how to build your own.
See demo/ for real-world shell scripts powered by apfel.
cmd - natural language to shell command:
demo/cmd "find all .log files modified today"
# $ find . -name "*.log" -type f -mtime -1
demo/cmd -x "show disk usage sorted by size" # -x = execute after confirm
demo/cmd -c "list open ports" # -c = copy to clipboardShell function version - add to your .zshrc and use cmd from anywhere:
# cmd - natural language to shell command (apfel). Add to .zshrc:
cmd(){ local x c r a; while [[ $1 == -* ]]; do case $1 in -x)x=1;shift;; -c)c=1;shift;; *)break;; esac; done; r=$(apfel -q -s 'Output only a shell command.' "$*" | sed '/^```/d;/^#/d;s/\x1b\[[0-9;]*[a-zA-Z]//g;s/^[[:space:]]*//;/^$/d' | head -1); [[ $r ]] || { echo "no command generated"; return 1; }; printf '\e[32m$\e[0m %s\n' "$r"; [[ $c ]] && printf %s "$r" | pbcopy && echo "(copied)"; [[ $x ]] && { printf 'Run? [y/N] '; read -r a; [[ $a == y ]] && eval "$r"; }; return 0; }cmd find all swift files larger than 1MB # shows: $ find . -name "*.swift" -size +1M
cmd -c show disk usage sorted by size # shows command + copies to clipboard
cmd -x what process is using port 3000 # shows command + asks to run it
cmd list all git branches merged into main
cmd count lines of code by languageoneliner - complex pipe chains from plain English:
demo/oneliner "sum the third column of a CSV"
# $ awk -F',' '{sum += $3} END {print sum}' file.csv
demo/oneliner "count unique IPs in access.log"
# $ awk '{print $1}' access.log | sort | uniq -c | sort -rnmac-narrator - your Mac's inner monologue:
demo/mac-narrator # one-shot: what's happening right now?
demo/mac-narrator --watch # continuous narration every 60sAlso in demo/:
- wtd - "what's this directory?" - instant project orientation
- explain - explain a command, error, or code snippet
- naming - naming suggestions for functions, variables, files
- port - what's using this port?
- gitsum - summarize recent git activity
apfel-gui - native macOS SwiftUI app for chatting with Apple Intelligence and inspecting every request/response.
# Install (requires apfel):
brew install Arthur-Ficial/tap/apfel # if you don't have apfel yet
git clone https://github.com/Arthur-Ficial/apfel-gui.git
cd apfel-gui && make install
# Run:
apfel-guiChat, debug inspector, request logs, context settings, speech-to-text, text-to-speech - all on-device. See the apfel-gui repo for details.
Base URL: http://localhost:11434/v1
| Feature | Status | Notes |
|---|---|---|
POST /v1/chat/completions |
Supported | Streaming + non-streaming |
GET /v1/models |
Supported | Returns apple-foundationmodel |
GET /health |
Supported | Model availability, context window, languages |
GET /v1/logs |
Debug only | Available only when server starts with --debug |
GET /v1/logs/stats |
Debug only | Available only when server starts with --debug |
| Tool calling | Supported | Native Transcript.ToolDefinition + JSON detection. See Tool Calling Guide |
response_format: json_object |
Supported | Via system prompt injection |
temperature, max_tokens, seed |
Supported | Mapped to GenerationOptions |
stream: true |
Supported | SSE with usage stats in final chunk |
finish_reason |
Supported | stop, tool_calls, length |
| Context strategies | Supported | x_context_strategy, x_context_max_turns, x_context_output_reserve extension fields |
| CORS | Supported | Enable with --cors |
POST /v1/completions |
501 | Legacy text completions not supported |
POST /v1/embeddings |
501 | Embeddings not available on-device |
logprobs=true, n>1, stop, presence_penalty, frequency_penalty |
400 | Rejected explicitly. n=1 and logprobs=false are accepted as no-ops |
| Multi-modal (images) | 400 | Rejected with clear error |
Authorization header |
Supported | Required when --token is set. See Server Security |
Full API spec: openai/openai-openapi
| Constraint | Detail |
|---|---|
| Context window | 4096 tokens (input + output combined). ~3000 English words. |
| Platform | macOS 26+, Apple Silicon only |
| Model | One model (apple-foundationmodel), not configurable |
| Guardrails | Apple's safety system may block benign prompts (false positives exist) |
| Speed | On-device inference, not cloud-scale - expect a few seconds per response |
| No embeddings | Apple's model doesn't support vector embeddings |
| No vision | Image/multi-modal input not supported |
MODES
apfel <prompt> Single prompt (default)
apfel --stream <prompt> Stream response tokens
apfel --chat Interactive conversation
apfel --serve Start OpenAI-compatible server
apfel --benchmark Run internal performance benchmarks
INPUT
apfel -f, --file <path> <prompt> Attach file content (repeatable)
apfel -s, --system <text> <prompt> Set system prompt
apfel --system-file <path> <prompt> Read system prompt from file
apfel --mcp <server.py> <prompt> Attach MCP tool server (repeatable)
OUTPUT
-o, --output <fmt> Output format: plain, json
-q, --quiet Suppress non-essential output
--no-color Disable ANSI colors
MODEL
--temperature <n> Sampling temperature (e.g., 0.7)
--seed <n> Random seed for reproducibility
--max-tokens <n> Maximum response tokens
--permissive Use permissive content guardrails
CONTEXT (--chat)
--context-strategy <s> newest-first, oldest-first, sliding-window, summarize, strict
--context-max-turns <n> Max history turns (sliding-window only)
--context-output-reserve <n> Tokens reserved for output (default: 512)
SERVER (--serve)
--port <n> Server port (default: 11434)
--host <addr> Bind address (default: 127.0.0.1)
--cors Enable CORS headers
--allowed-origins <origins> Comma-separated allowed origins
--no-origin-check Disable origin checking
--token <secret> Require Bearer token auth
--token-auto Generate random Bearer token
--public-health Keep /health unauthenticated
--footgun Disable all protections
--max-concurrent <n> Max concurrent requests (default: 5)
--debug Verbose logging + /v1/logs endpoints
META
-v, --version Print version
-h, --help Show help
--release Detailed build info
--model-info Print model capabilities
--update Check for updates via Homebrew
Examples by flag:
# -f, --file — attach file content to prompt (repeatable)
apfel -f main.swift "Explain this code"
apfel -f before.txt -f after.txt "What changed?"
# -s, --system — set a system prompt
apfel -s "You are a pirate" "What is recursion?"
apfel -s "Reply in JSON only" "List 3 colors"
# --system-file — read system prompt from a file
apfel --system-file persona.txt "Introduce yourself"
# --mcp — attach MCP tool servers (repeatable)
apfel --mcp ./mcp/calculator/server.py "What is 15 times 27?"
apfel --mcp ./calc.py --mcp ./weather.py "Use both tools"
# -o, --output — output format: plain (default) or json
apfel -o json "Translate to German: hello" | jq .content
apfel -o json "List 3 facts" | jq -r .content
# -q, --quiet — suppress headers and chrome, output only
apfel -q "Give me a UUID"
apfel -q -o json "Summarize" | jq .content
# --no-color — disable ANSI color codes
apfel --no-color --help
NO_COLOR=1 apfel "Hello"
# --temperature — sampling temperature (higher = more creative)
apfel --temperature 0.0 "What is 2+2?"
apfel --temperature 1.5 "Write a wild poem"
# --seed — random seed for reproducible output
apfel --seed 42 "Tell me a joke"
apfel --seed 42 "Tell me a joke" # same output
# --max-tokens — limit response length
apfel --max-tokens 50 "Explain quantum computing"
apfel --max-tokens 10 "Say something short"
# --permissive — relax content guardrails
apfel --permissive "Write a villain monologue"
# --stream — stream response tokens as they arrive
apfel --stream "Write a haiku about code"
apfel --stream -s "Be verbose" "Explain TCP"
# --chat — interactive multi-turn conversation
apfel --chat
apfel --chat -s "You are a helpful coding assistant"
# --context-strategy — how to handle context overflow in chat
apfel --chat --context-strategy newest-first # default: keep recent turns
apfel --chat --context-strategy oldest-first # keep earliest turns
apfel --chat --context-strategy sliding-window --context-max-turns 6
apfel --chat --context-strategy summarize # compress old turns
apfel --chat --context-strategy strict # error on overflow
# --context-max-turns — max turns to keep (sliding-window only)
apfel --chat --context-strategy sliding-window --context-max-turns 4
# --context-output-reserve — tokens reserved for model response
apfel --chat --context-output-reserve 256
# --serve — start OpenAI-compatible HTTP server
apfel --serve
apfel --serve --port 3000 --host 0.0.0.0
# --port, --host — server bind address
apfel --serve --port 8080
apfel --serve --host 0.0.0.0 --port 443
# --cors — enable CORS for browser clients
apfel --serve --cors
# --allowed-origins — restrict to specific origins
apfel --serve --allowed-origins "https://myapp.com,https://staging.myapp.com"
# --no-origin-check — allow requests from any origin
apfel --serve --no-origin-check
# --token — require Bearer token authentication
apfel --serve --token "my-secret-token"
curl localhost:11434/v1/models -H "Authorization: Bearer my-secret-token"
# --token-auto — generate a random token and print it
apfel --serve --token-auto
# --public-health — keep /health open even with token auth
apfel --serve --token "secret" --host 0.0.0.0 --public-health
# --footgun — disable all protections (CORS + no origin check)
apfel --serve --footgun # only for local development!
# --max-concurrent — limit parallel model requests
apfel --serve --max-concurrent 2
# --debug — verbose logging with /v1/logs endpoints
apfel --serve --debug
# --benchmark — run internal performance benchmarks
apfel --benchmark
apfel --benchmark -o json | jq '.benchmarks[] | {name, speedup_ratio}'
# --model-info — print model capabilities
apfel --model-info
# --update — check for updates and upgrade via Homebrew
apfel --update
# --release — detailed version and build info
apfel --release
# --version
apfel --version
# --help
apfel --helpSee Server Security for detailed documentation on security options.
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Runtime error |
| 2 | Usage error (bad flags) |
| 3 | Guardrail blocked |
| 4 | Context overflow |
| 5 | Model unavailable |
| 6 | Rate limited |
| Variable | Description |
|---|---|
APFEL_SYSTEM_PROMPT |
Default system prompt |
APFEL_HOST |
Server bind address |
APFEL_PORT |
Server port |
APFEL_TOKEN |
Bearer token for server authentication |
APFEL_TEMPERATURE |
Default temperature |
APFEL_MAX_TOKENS |
Default max tokens |
APFEL_CONTEXT_STRATEGY |
Default context strategy |
APFEL_CONTEXT_MAX_TURNS |
Max turns for sliding-window |
APFEL_CONTEXT_OUTPUT_RESERVE |
Tokens reserved for output |
NO_COLOR |
Disable colors (no-color.org) |
CLI (single/stream/chat) ──┐
├─→ FoundationModels.SystemLanguageModel
HTTP Server (/v1/*) ───────┘ (100% on-device, zero network)
ContextManager → Transcript API
SchemaConverter → native ToolDefinitions
TokenCounter → real token counts (SDK 26.4)
Built with Swift 6.3 strict concurrency. Single Package.swift, three targets:
ApfelCore- pure logic library (no FoundationModels dependency, unit-testable)apfel- executable (CLI + server)apfel-tests- unit tests (pure Swift runner, no XCTest)
No Xcode required. Builds and tests with Command Line Tools only.
# Build + install (auto-bumps patch version each time)
make install # build release + install to /usr/local/bin
make build # build release only (no install)
# Version management (zero manual editing)
make version # print current version
make release-minor # bump minor: 0.6.x -> 0.7.0
make release-major # bump major: 0.x.y -> 1.0.0
# Debug build (no version bump, uses swift directly)
swift build # quick debug build
# Tests
swift run apfel-tests # pure Swift unit tests (no XCTest needed)
apfel --serve --debug & # start server for integration tests
python3 -m pytest Tests/integration/ -v # integration tests (requires server)
apfel --benchmark -o json # performance report from the installed binary
make benchmark # same benchmark via MakefileEvery make build/make install automatically:
- Bumps the patch version (
.versionfile is the single source of truth) - Updates the README version badge
- Generates build metadata (commit, date, Swift version) viewable via
apfel --release
- apfel-clip - AI-powered clipboard actions from the menu bar (fix grammar, translate, explain code, and more)
- apfel-gui - Native macOS SwiftUI debug GUI for apfel (chat, request inspector, logs, TTS/STT)
See docs/EXAMPLES.md for 50 real prompts and unedited model outputs.

