██████╗ █████╗ ███╗ ███╗██████╗ ██╗
██╔════╝ ██╔══██╗████╗ ████║██╔══██╗██║
██║ ███╗███████║██╔████╔██║██████╔╝██║
██║ ██║██╔══██║██║╚██╔╝██║██╔══██╗██║
╚██████╔╝██║ ██║██║ ╚═╝ ██║██████╔╝██║
╚═════╝ ╚═╝ ╚═╝╚═╝ ╚═╝╚═════╝ ╚═╝
- What is Gambi?
- Installation
- Quick Start
- Features
- Usage Examples
- Architecture
- Development
- Supported Providers
- Security
- Roadmap
- Contributing
- License
Gambi is a local-first LLM sharing system that allows multiple users on a network to pool their LLM resources together. Everyone can share their Ollama, LM Studio, LocalAI, or any endpoint that speaks OpenResponses or OpenAI-compatible chat/completions.
The public name Gambi is the short form of gambiarra. In Brazilian Portuguese, gambiarra here means the good kind: creative improvisation under constraints—resourceful, community-minded problem solving, not a sloppy hack. The shorter spelling is easier to say, type, and wire into CLI commands and package names in English, without losing that meaning.
If you installed the project under its previous CLI package name, see the migration guide.
If you still have the old global CLI package installed, remove it and install Gambi:
```bash
# npm
npm uninstall -g gambiarra && npm install -g gambi

# bun
bun remove -g gambiarra && bun add -g gambi
```

```typescript
import { createGambi } from "gambi-sdk";

const gambi = createGambi({ roomCode: "ABC123" });
```

- Local-First: Your data stays on your network
- Resource Sharing: Pool LLM endpoints across your team
- OpenResponses First: Prefers `v1/responses` by default and falls back to `chat/completions` when needed
- Universal Compatibility: Works with OpenResponses and OpenAI-compatible chat/completions APIs
- Vercel AI SDK Integration: Drop-in replacement for your AI SDK workflows
- Auto-Discovery: mDNS/Bonjour support for zero-config networking
- Real-time Monitoring: Beautiful TUI for tracking room activity
- Production Ready: Built with TypeScript, Bun, and modern tooling
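The OpenResponses-first behavior can be pictured as a small selection step. A hypothetical sketch (not the hub's actual code): prefer `v1/responses` when the endpoint supports it, otherwise fall back to `chat/completions`.

```typescript
type Protocol = "responses" | "chatCompletions";

// Pick the preferred protocol for a participant endpoint.
// Prefers OpenResponses and falls back to chat/completions.
function pickProtocol(supported: Protocol[]): Protocol {
  return supported.includes("responses") ? "responses" : "chatCompletions";
}

// Map the chosen protocol to its HTTP path segment.
function protocolPath(p: Protocol): string {
  return p === "responses" ? "v1/responses" : "v1/chat/completions";
}
```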
- Development Teams: Share expensive LLM endpoints across your team
- Hackathons: Pool resources for AI projects
- Research Labs: Coordinate LLM access across multiple workstations
- Home Labs: Share your gaming PC's LLM with your laptop
- Education: Classroom environments where students share compute
The CLI allows you to start hubs, create rooms, and join as a participant.
Linux / macOS (recommended - standalone binary):
```bash
curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/install.sh | bash
```

Windows (PowerShell):

```powershell
irm https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/install.ps1 | iex
```

Via npm (first-class wrapper package):

```bash
npm install -g gambi
```

The published `gambi` package installs a lightweight wrapper plus the matching platform binary for the current machine. It does not require Bun at runtime.

Via bun:

```bash
bun add -g gambi
```

`bun add -g gambi` installs the same wrapper package and matching platform binary.

Verify installation:

```bash
gambi --version
```

Uninstall:

```bash
# Linux / macOS (standalone binary)
curl -fsSL https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/uninstall.sh | bash

# Windows (PowerShell)
irm https://raw.githubusercontent.com/arthurbm/gambi/main/scripts/uninstall.ps1 | iex

# If installed via npm
npm uninstall -g gambi

# If installed via bun
bun remove -g gambi
```

The SDK provides Vercel AI SDK integration for using shared LLMs in your applications.
Via npm:
```bash
npm install gambi-sdk
```

Via bun:

```bash
bun add gambi-sdk
```

Uninstall:

```bash
# If installed via npm
npm uninstall gambi-sdk

# If installed via bun
bun remove gambi-sdk
```

Start a hub:

```bash
gambi serve
# Or with flags: gambi serve --port 3000 --mdns
```

Create a room:

```bash
gambi create
# Or with flags: gambi create --name "My Room" --config ./room-defaults.json
```

Join a room as a participant:

```bash
gambi join
# Or with flags: gambi join --code ABC123 --model llama3 --config ./participant-config.json
```

All commands support interactive mode — run without flags and you'll be guided through each option step by step. Flags still work for scripting and automation.
Example config JSON:
```json
{
  "instructions": "Always answer in Brazilian Portuguese.",
  "temperature": 0.4,
  "max_tokens": 512
}
```

Use the SDK with auto-discovery:

```typescript
import { createGambi, resolveGambiTarget } from "gambi-sdk";
import { generateText } from "ai";

const target = await resolveGambiTarget({
  roomCode: "ABC123", // optional if only one room is visible on your LAN
});

const gambi = createGambi({
  roomCode: target.roomCode,
  hubUrl: target.hubUrl,
});

const result = await generateText({
  model: gambi.any(),
  prompt: "Hello, Gambi!",
});

console.log(result.text);
```

For scripts or hosted environments where discovery is not needed, you can still pass `hubUrl` and `roomCode` directly.
```bash
# All commands support interactive mode — just run the command:
gambi serve
gambi create
gambi join
gambi list

# Or use flags for scripting:
gambi serve --mdns
gambi create --name "My Room" --config ./room-defaults.json
gambi join --code ABC123 --model llama3 --config ./participant-config.json
gambi list --json
```

Room defaults are merged at request time with precedence room defaults -> participant defaults -> runtime request (later values win). Public room/participant listings expose only a safe summary such as `hasInstructions`, not the raw instructions text.
Use runtime defaults when you want a room or participant to contribute reusable behavior without forcing every client request to repeat the same settings.
Example room defaults:
```json
{
  "instructions": "Answer in Brazilian Portuguese.",
  "temperature": 0.3,
  "max_tokens": 512
}
```

Create a room with defaults:

```bash
gambi create --name "Portuguese Room" --config ./room-defaults.json
```

Example participant defaults:

```json
{
  "instructions": "Prefer concise technical answers.",
  "temperature": 0.6
}
```

Join with participant defaults:

```bash
gambi join --code ABC123 --model llama3 --config ./participant-config.json
```

Merge behavior:
- Room defaults apply first.
- Participant defaults override room defaults.
- The request sent by the client overrides both.
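The merge behavior above can be sketched as a plain object spread, with later sources winning. This is a simplified model for illustration, not the hub's actual implementation:

```typescript
type Defaults = {
  instructions?: string;
  temperature?: number;
  max_tokens?: number;
};

// Later arguments win: room defaults < participant defaults < runtime request.
function mergeDefaults(
  room: Defaults,
  participant: Defaults,
  request: Defaults
): Defaults {
  return { ...room, ...participant, ...request };
}
```

For example, with a room default of `temperature: 0.3` and a participant default of `temperature: 0.6`, a request that sets nothing gets 0.6, while a request that sets `temperature: 0.9` wins outright.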
Public API behavior:
- Sensitive instruction text is stored by the hub but not exposed in public room or participant listings.
- Public responses expose summary fields such as `hasInstructions` instead.
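A hypothetical sketch of that sanitization step (the field names other than `hasInstructions` are illustrative):

```typescript
type StoredParticipant = {
  nickname: string;
  model: string;
  instructions?: string; // sensitive: never exposed publicly
};

type PublicParticipant = {
  nickname: string;
  model: string;
  hasInstructions: boolean;
};

// Drop the raw instructions text and expose only a boolean summary.
function toPublicListing(p: StoredParticipant): PublicParticipant {
  return {
    nickname: p.nickname,
    model: p.model,
    hasInstructions: Boolean(p.instructions),
  };
}
```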
```typescript
import { createGambi } from "gambi-sdk";
import { generateText } from "ai";

const gambi = createGambi({ roomCode: "ABC123" });

// Use any available participant
const result = await generateText({
  model: gambi.any(),
  prompt: "Explain quantum computing",
});

// Target a specific participant
const result2 = await generateText({
  model: gambi.participant("joao"),
  prompt: "Write a haiku about TypeScript",
});

// Route by model type
const result3 = await generateText({
  model: gambi.model("llama3"),
  prompt: "What is the meaning of life?",
});

// Use Chat Completions instead of the default Responses API
const legacy = createGambi({
  roomCode: "ABC123",
  defaultProtocol: "chatCompletions",
});

const result4 = await generateText({
  model: legacy.any(),
  prompt: "Hello from chat/completions",
});
```

Monitor rooms in real-time with a beautiful TUI:
```bash
cd apps/tui
bun run dev ABC123
```

```bash
# Interactive — prompts for port, host, mDNS:
gambi serve

# Or with flags:
gambi serve --port 3000 --mdns
```

```bash
# Interactive — prompts for name and password:
gambi create

# Or with flags:
gambi create --name "My Room"
gambi create --name "My Room" --config ./room-defaults.json
```

```bash
# Interactive — prompts for hub URL and output format:
gambi list

# Or with flags:
gambi list --json
```

```bash
# Interactive — select provider, model, set nickname:
gambi join

# Or with flags:
gambi join --code ABC123 --model llama3
gambi join --code ABC123 --model mistral --endpoint http://localhost:1234
gambi join --code ABC123 --model llama3 --config ./participant-config.json
```

```typescript
import { createGambi } from "gambi-sdk";
import { generateText } from "ai";

const gambi = createGambi({ roomCode: "ABC123" });

const result = await generateText({
  model: gambi.any(),
  prompt: "What is TypeScript?",
});

console.log(result.text);
```

```typescript
import { createGambi } from "gambi-sdk";
import { streamText } from "ai";

const gambi = createGambi({ roomCode: "ABC123" });

const stream = await streamText({
  model: gambi.model("llama3"),
  prompt: "Write a story about a robot",
});

for await (const chunk of stream.textStream) {
  process.stdout.write(chunk);
}
```

```typescript
const gambi = createGambi({
  roomCode: "ABC123",
  hubUrl: "http://192.168.1.100:3000",
});

const result = await generateText({
  model: gambi.any(),
  prompt: "Explain recursion",
  temperature: 0.7,
  maxTokens: 500,
});
```

```bash
cd apps/tui
bun install
bun run dev ABC123
```

The TUI provides real-time monitoring of:
- Active participants
- Current model loads
- Request history
- Participant health status
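Under the hood, the TUI subscribes to the hub's Server-Sent Events stream. As a rough illustration (not Gambi's actual parser), SSE frames are blank-line separated, with payloads on `data:` lines:

```typescript
// Parse a raw Server-Sent Events chunk into its `data:` payloads.
// Frames are separated by a blank line; payload lines start with "data:".
function parseSseData(chunk: string): string[] {
  return chunk
    .split("\n\n") // frame separator
    .map((frame) =>
      frame
        .split("\n")
        .filter((line) => line.startsWith("data:"))
        .map((line) => line.slice("data:".length).trim())
        .join("\n")
    )
    .filter((data) => data.length > 0);
}
```

The event names and payload shapes Gambi actually emits are not shown here; this only illustrates the transport framing.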
Gambi uses an HTTP + SSE architecture for simplicity and compatibility:
```
┌─────────────────────────────────────────────────────────────┐
│                      GAMBI HUB (HTTP)                       │
│                                                             │
│  Endpoints:                                                 │
│   • POST   /rooms                           (Create room)   │
│   • GET    /rooms                           (List rooms)    │
│   • POST   /rooms/:code/join                (Join room)     │
│   • POST   /rooms/:code/v1/responses        (Proxy)         │
│   • GET    /rooms/:code/v1/responses/:id                    │
│   • DELETE /rooms/:code/v1/responses/:id                    │
│   • POST   /rooms/:code/v1/responses/:id/cancel             │
│   • GET    /rooms/:code/v1/responses/:id/input_items        │
│   • POST   /rooms/:code/v1/chat/completions (Proxy)         │
│   • GET    /rooms/:code/events              (SSE updates)   │
└─────────────────────────────────────────────────────────────┘
       ▲                    ▲                     ▲
       │ HTTP               │ HTTP                │ SSE
       │                    │                     │
  ┌────┴────┐     ┌─────────┴────────┐     ┌──────┴─────┐
  │   SDK   │     │   Participants   │     │    TUI     │
  └─────────┘     └──────────────────┘     └────────────┘
```
- Hub: Central HTTP server that routes requests and manages rooms
- Participants: LLM endpoints registered in a room (Ollama, LM Studio, etc.)
- SDK: Vercel AI SDK provider that proxies to the hub
- TUI: Real-time monitoring interface using Server-Sent Events
Internally, the hub uses a protocol adapter registry. OpenResponses is the default public path, but the core stays open to additional protocol adapters instead of baking protocol-specific branching into every call site.
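The registry idea can be sketched as a map from protocol name to adapter. The shape below is illustrative; the actual interface in `@gambi/core` may differ:

```typescript
// Minimal shape of a protocol adapter: how to build the proxy path
// for a given room. Illustrative only.
interface ProtocolAdapter {
  name: string;
  proxyPath(roomCode: string): string;
}

const adapters = new Map<string, ProtocolAdapter>([
  [
    "responses",
    { name: "responses", proxyPath: (code) => `/rooms/${code}/v1/responses` },
  ],
  [
    "chatCompletions",
    {
      name: "chatCompletions",
      proxyPath: (code) => `/rooms/${code}/v1/chat/completions`,
    },
  ],
]);

// Resolve an adapter by name, defaulting to the OpenResponses path.
function resolveAdapter(name = "responses"): ProtocolAdapter {
  const adapter = adapters.get(name);
  if (!adapter) throw new Error(`Unknown protocol: ${name}`);
  return adapter;
}
```

Adding a new protocol then means registering one more adapter rather than adding branches at every call site.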
| Pattern | Example | Description |
|---|---|---|
| Participant ID | `gambi.participant("joao")` | Route to specific participant |
| Model Name | `gambi.model("llama3")` | Route to first participant with model |
| Any | `gambi.any()` | Route to random online participant |
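A simplified model of how the three patterns could resolve against a room's participant list (hypothetical; real selection concerns such as health checks and load are not shown):

```typescript
type Participant = { id: string; model: string; online: boolean };

// gambi.participant("joao"): exact participant, if online.
function byId(list: Participant[], id: string): Participant | undefined {
  return list.find((p) => p.online && p.id === id);
}

// gambi.model("llama3"): first online participant advertising the model.
function byModel(list: Participant[], model: string): Participant | undefined {
  return list.find((p) => p.online && p.model === model);
}

// gambi.any(): random online participant.
function anyOnline(list: Participant[]): Participant | undefined {
  const online = list.filter((p) => p.online);
  return online[Math.floor(Math.random() * online.length)];
}
```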
```
gambi/
├── packages/
│   ├── core/        # Core library (Hub, Room, Protocol)
│   ├── cli/         # Command-line interface
│   └── sdk/         # Vercel AI SDK integration
├── apps/
│   ├── docs/        # Documentation site (Astro Starlight)
│   └── tui/         # Terminal UI for monitoring
└── docs/            # Architecture documentation
```
| Package | Description | Version |
|---|---|---|
| `gambi` | CLI for managing hubs and participants | 0.1.0 |
| `gambi-sdk` | Vercel AI SDK provider | 0.1.0 |
| `@gambi/core` | Hub server, room management, SSE, mDNS (internal) | 0.0.1 |
For detailed architecture, see `docs/architecture.md`.
```bash
# Clone the repository
git clone https://github.com/arthurbm/gambi.git
cd gambi

# Install dependencies
bun install

# Build all packages
bun run build
```

```bash
# Run hub server in development mode
bun run dev

# Run docs app
bun run dev:docs

# Type checking
bun run check-types

# Linting and formatting (Ultracite/Biome)
bun x ultracite check
bun x ultracite fix
```

```bash
# Work on CLI
cd packages/cli
bun run dev serve --port 3000

# Work on Core
cd packages/core
bun run check-types

# Work on SDK
cd packages/sdk
bun run check-types
```

This project uses Ultracite, a zero-config preset for Biome. See CLAUDE.md for detailed code standards.
Releases are automated via GitHub Actions. The workflow updates synchronized versions, publishes the SDK, publishes the CLI binary packages first, publishes the gambi wrapper last, and then creates GitHub releases with the same binaries.
Via GitHub UI:

- Go to Actions > Release > Run workflow
- Select bump type: `patch`, `minor`, or `major`
- Click Run workflow

Via GitHub CLI:

```bash
# Release the synchronized package set
gh workflow run release.yml -f bump=patch

# Watch the workflow progress
gh run watch
```

The workflow will:
- Calculate the new version (e.g., 0.1.1 → 0.1.2 for patch)
- Pin the release to one source commit
- Update all `package.json` files
- Build the CLI distribution once and reuse it across publish and release
- Publish the CLI binary packages before the `gambi` wrapper
- Build and publish to npm
- Commit and tag the release
- Create a GitHub Release with binaries
For a deeper explanation of the release pipeline and package layout, see:
Gambi works with endpoints that expose OpenResponses or OpenAI-compatible chat/completions:
| Provider | Default Endpoint | Notes |
|---|---|---|
| Ollama | `http://localhost:11434` | Most popular local LLM server |
| LM Studio | `http://localhost:1234` | GUI-based LLM management |
| LocalAI | `http://localhost:8080` | Self-hosted OpenAI alternative |
| vLLM | `http://localhost:8000` | High-performance inference |
| text-generation-webui | `http://localhost:5000` | Gradio-based interface |
| Custom | Any URL | Any OpenAI-compatible endpoint |
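"OpenAI-compatible" here means the endpoint serves the standard API paths beneath its base URL. A small hypothetical helper for composing those URLs, tolerant of trailing and leading slashes:

```typescript
// Join a provider base URL with a standard OpenAI-style path.
function endpointUrl(base: string, path: string): string {
  return `${base.replace(/\/+$/, "")}/${path.replace(/^\/+/, "")}`;
}
```

For example, `endpointUrl("http://localhost:11434", "v1/chat/completions")` yields `http://localhost:11434/v1/chat/completions`.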
- Local Network Only: Gambi is designed for trusted local networks
- No Authentication: Currently no built-in auth (use network isolation)
- HTTP Only: Uses plain HTTP (consider reverse proxy for HTTPS)
- Participant Trust: All participants can access shared models
For production use, consider:
- Running behind a reverse proxy (Caddy, Nginx)
- Using VPN or WireGuard for remote access
- Implementing authentication at the proxy level
- Authentication & authorization
- Participant quotas and rate limiting
- Persistent room storage (SQLite/PostgreSQL)
- Load balancing across multiple participants
- Model capability negotiation
- Web UI for room management
- Docker/container support
- Metrics and observability
- Request queueing for busy participants
Contributions are welcome! This is an early-stage project and we'd love your help.
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make your changes
- Run `bun x ultracite fix` to format code
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
- Follow the code standards in CLAUDE.md
- Write type-safe TypeScript
- Add tests for new features
- Update documentation as needed
MIT License - see LICENSE for details.
Built with:
- Bun - Fast JavaScript runtime
- Turbo - High-performance build system
- Vercel AI SDK - AI integration framework
- Biome - Fast formatter and linter
- Clipanion - Type-safe CLI framework
- Bonjour - mDNS service discovery
Made with love for the local LLM community