The most feature-complete local AI workstation. No subscriptions. No cloud dependency. Just your hardware.
While everyone else ships another chat UI with fancy presets, Eloquent gives you in-house Stable Diffusion, multi-GPU inference, voice cloning, model ELO testing, a tool-calling code editor, multi-role chat, and forensic linguistics, all running locally.
Optional cloud APIs for when you want them. Your choice.
- Single application: LLM + image generation + voice + code tools + model evaluation
- Multi-GPU that works: Unified tensor splitting or dedicated GPU assignment
- More than chat: ELO testing framework, forensic linguistics, story state tracking, multi-role conversations
- Production features: Voice cloning, image upscaling, conversation summaries, agent mode
| I want to... | Do this |
|---|---|
| Chat with voice | install.bat → run.bat → load a GGUF → enable Auto-TTS |
| Generate images | Drop .safetensors in a folder → Settings → Image Gen → set path |
| Upscale images | Generate image → click Upscale → select 2x/3x/4x |
| Multi-character roleplay | Settings → enable Multi-Role → add characters to roster |
| Test models | Model Tester → import prompts → run A/B with ELO ratings |
| Edit code with AI | Load Devstral → Code Editor → set project directory |
| Play chess (AI + personality) | Chess tab in navbar (Stockfish installed automatically by install.bat) |
| Clone a voice | Settings → Audio → Chatterbox Turbo → upload reference |
Power users with NVIDIA GPUs who want a complete local AI stack instead of juggling 5 different tools.
Roleplayers & writers who need multi-character conversations, story state, portraits, and voice in one app.
Model evaluators who want ELO testing and judge orchestration without building research infrastructure.
Privacy-first users who don't want conversations leaving their machine.
- You don't have an NVIDIA GPU
- You're on Mac or Linux (Eloquent is Windows-only)
Multi-Role Conversations
- Multiple characters in one chat with automatic speaker selection
- Per-character TTS voices and talkativeness weights
- Optional narrator with customizable interjection frequency
- User profile picker for switching between personas
- Group scene context for shared settings
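Under the hood, automatic speaker selection with talkativeness weights amounts to weighted random sampling over the roster. A minimal sketch of the idea (the names and function signature are illustrative, not Eloquent's actual API):

```python
import random

def pick_speaker(roster: dict, rng: random.Random) -> str:
    """Pick the next speaker, weighting each character by talkativeness."""
    names = list(roster)
    weights = [roster[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

# A loud character (1.5) speaks far more often than a quiet one (0.3).
roster = {"Alice": 1.5, "Bob": 0.3, "Narrator": 0.5}
rng = random.Random(0)
picks = [pick_speaker(roster, rng) for _ in range(1000)]
```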
Story Management
- Story Tracker: Characters, locations, inventory, objectives injected into AI context
- Scene Summary: Persistent context that grounds the AI in current mood and situation
- Choice Generator: Contextual actions with 6 behavior modes (Dramatic, Chaotic, Romantic, etc.)
- Director Mode: Toggle between character actions and narrative beats for plot steering
- Conversation Summaries: Save summaries and load them into fresh chats for continuity
Standard Features
- Character library and creator with AI-generated portraits
- Memory & RAG with document ingestion and web search
- Author's Note for direct AI guidance
- Focus Mode and Call Mode interfaces
Multi-GPU Support
- Unified tensor splitting across 2, 3, 4+ GPUs
- Split-services mode with dedicated GPU assignments
- Purpose slots for judge models and memory agents
- Real-time VRAM monitoring
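Unified tensor splitting typically places a fraction of the model on each GPU in proportion to its VRAM, which is the idea behind llama.cpp's tensor_split option. A back-of-the-envelope sketch (not Eloquent's internal code):

```python
def tensor_split(vram_gb: list) -> list:
    """Fraction of model weights to assign to each GPU, proportional to VRAM."""
    total = sum(vram_gb)
    return [v / total for v in vram_gb]

# A 24 GB card paired with an 8 GB card takes a 75/25 split.
split = tensor_split([24.0, 8.0])
```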
Model Compatibility
- Local GGUF models via llama.cpp
- OpenAI-compatible APIs (OpenRouter, local proxies, Chub.ai)
- Simultaneous local + API model usage
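Any OpenAI-compatible endpoint accepts the same chat-completions request shape, which is why one client path covers OpenRouter, local proxies, and the rest. A minimal sketch (the base URL and model name are placeholders; this helper is illustrative, not part of Eloquent):

```python
import json
import urllib.request

def build_chat_request(model: str, prompt: str) -> dict:
    """Standard OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    }

def chat(base_url: str, model: str, prompt: str) -> dict:
    """POST the request to any OpenAI-compatible endpoint."""
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_chat_request(model, prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```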
Local Stable Diffusion
- SD 1.5, SDXL, and FLUX support (safetensors/ckpt/gguf)
- Custom ADetailer with YOLO face detection and inpainting
- "Visualize Scene" - auto-generate images from chat context
- Set generated images as chat backgrounds
Image Upscaling
- Variable upscaling: 2x, 3x, 4x with ESRGAN models
- Model selector for different upscaler weights
Cloud Fallback (Optional)
- NanoGPT API for image generation without local GPU
- Experimental video generation (pay-per-use)
TTS Engines
- Kokoro: Fast neural synthesis with multiple voices
- Chatterbox: Voice cloning from reference samples
- Chatterbox Turbo: Enhanced cloning with paralinguistic cues ([laugh], [sigh], [cough])
Features
- Chunked streaming pipeline for low latency
- Auto-TTS with one-click toggle
- Call Mode: Full-screen voice conversation with animated avatars
- Per-character voice assignment in multi-role chat
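The chunked streaming pipeline gets its low latency by synthesizing sentence-sized pieces as text arrives instead of waiting for the full reply. A simplified sketch of the chunking step (the real pipeline is more involved):

```python
import re

def tts_chunks(text: str, max_len: int = 200):
    """Split streamed text into sentence-sized chunks for low-latency TTS."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunk = ""
    for s in sentences:
        if chunk and len(chunk) + len(s) + 1 > max_len:
            yield chunk  # hand this piece to the TTS engine immediately
            chunk = s
        else:
            chunk = f"{chunk} {s}".strip() if chunk else s
    if chunk:
        yield chunk
```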
ELO Testing Framework
- Single model testing against prompt collections (MT-Bench, custom)
- A/B head-to-head comparisons with ELO updates
- Dual-judge mode with reconciliation
- Character-aware judging with custom evaluation criteria
- Parameter sweeps (temperature, top_p, top_k)
- 14 built-in analysis perspectives including 6-Year-Old Transformer Boy, Al Swearengen, Bill Burr, Alex Jones
- Import/export results with full metadata
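The ELO updates behind the A/B comparisons follow the standard chess formula: expected score E_a = 1 / (1 + 10^((R_b - R_a) / 400)), then R_a' = R_a + K * (S_a - E_a). A minimal sketch:

```python
def elo_update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    """Update two ratings after a head-to-head (score_a: 1 win, 0.5 draw, 0 loss)."""
    expected_a = 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))
    delta = k * (score_a - expected_a)
    return r_a + delta, r_b - delta

# Equal ratings, model A wins: A gains 16 points, B loses 16.
new_a, new_b = elo_update(1500, 1500, 1.0)
```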
Tool-Calling Agent
- Devstral Small 2 24B (local) or Devstral Large (OpenRouter)
- File operations with automatic .bak backups
- Shell execution (optional, sandboxed)
- Vision support via screenshots
Agent Mode Features
- Chain of Thought visualization - see reasoning before actions
- Hallucination Rescue - executes intended tools even when JSON parsing fails
- Loop detection prevents endless file reading
- File explorer with full drive navigation
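Loop detection of the kind described can be as simple as tracking recent tool calls and flagging repeats; an illustrative sketch (not the app's actual implementation):

```python
from collections import deque

class LoopDetector:
    """Flag an agent that repeats the same tool call too often in a short window."""
    def __init__(self, window: int = 6, max_repeats: int = 3):
        self.recent = deque(maxlen=window)
        self.max_repeats = max_repeats

    def record(self, tool: str, args: tuple) -> bool:
        """Record a call; return True if it looks like an endless loop."""
        call = (tool, args)
        self.recent.append(call)
        return list(self.recent).count(call) >= self.max_repeats

det = LoopDetector()
det.record("read_file", ("main.py",))
det.record("read_file", ("main.py",))
looping = det.record("read_file", ("main.py",))  # third identical call in a row
```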
Security
- Sandboxed to working directory
- Optional command execution
- Automatic backups on file writes
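The backup-on-write behavior can be sketched as "copy the current file aside before overwriting"; this illustrative helper (not the app's actual code) appends .bak to the filename:

```python
import shutil
from pathlib import Path

def safe_write(path: Path, text: str) -> None:
    """Back up any existing file to <name>.bak, then write the new content."""
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(text, encoding="utf-8")
```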
Forensic Linguistics
- Authorship analysis and stylistic comparison
- Pluggable embedding models (BGE-M3, GTE, RoBERTa, Jina, Nomic)
- Build corpora from documents or scraped text
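Stylistic comparison with embedding models ultimately reduces to comparing document vectors, most simply by cosine similarity (the embedding step itself is omitted here):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Identical style vectors score 1.0; orthogonal ones score 0.0.
```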
UI & Customization
- 5 premium themes: Claude, Messenger, WhatsApp, Cyberpunk, ChatGPT Light
- Text formatting: Quote highlighting, H1-H3 headings, paragraph controls
- Auto-save settings (directories require manual save)
Full mobile optimization for phones and tablets.
- Responsive design with touch-friendly UI throughout
- Universal access: automatic
0.0.0.0binding and IP discovery for local network connection - Native audio handling for reliable TTS on iOS and Android
- Mobile-first themes (Messenger, WhatsApp) designed for phone/tablet use
- Touch-optimized controls and adaptive layouts
Full chat with Story Tracker, Choice Generator, streaming TTS, and model control.
Voice cloning with real-time streaming playback.
AI-generated character portraits via built-in Stable Diffusion.
Professional model evaluation with dual-judge reconciliation.
- Windows 10/11 (64-bit)
- NVIDIA GPU with CUDA support
- Python 3.11 or 3.12
- Node.js v21.7.3 (recommended). Node 22 is untested; if the backend window closes when you use Browse for model/directory settings, try Node 21.7.3 or type the folder path manually.
| Use Case | Recommended VRAM |
|---|---|
| Small models (7B Q4) | 8GB |
| Medium models (13B-20B) | 12GB |
| Large models (70B+) | 24GB+ or multi-GPU |
| SD 1.5 | 4GB+ |
| SDXL/FLUX | 8GB+ |
| LLM + image gen together | 16GB+ or split across GPUs |
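The VRAM rows above follow roughly from bytes-per-parameter at each quantization level plus runtime overhead. A back-of-the-envelope estimator (the overhead constant is a guess, not a measured value):

```python
def estimate_vram_gb(params_billion: float, bits_per_weight: float,
                     overhead_gb: float = 1.5) -> float:
    """Rough VRAM for model weights plus KV-cache/runtime overhead."""
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb + overhead_gb

# 7B at Q4 (~4.5 bits/weight) lands around 5.4 GB, consistent with the 8 GB row.
est = estimate_vram_gb(7, 4.5)
```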
git clone https://github.com/boneylizard/Eloquent
cd Eloquent
install.bat # Wait for completion (5-10 minutes)
run.bat

The installer handles everything: Python venv, PyTorch with CUDA 12.1, pre-built wheels, and all dependencies.
Default ports:
- Backend: http://localhost:8000
- TTS: http://localhost:8002
- Frontend: http://localhost:5173

Port conflicts are handled automatically; the frontend discovers the actual ports.
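Automatic port handling generally means probing for a free port (or binding to port 0 and letting the OS pick one). A minimal sketch of the probe:

```python
import socket

def find_free_port() -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("127.0.0.1", 0))
        return s.getsockname()[1]

port = find_free_port()
```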
- Settings → Model Settings → set GGUF directory
- Model Selector → choose per-GPU or unified multi-GPU
- Add OpenAI-compatible API endpoints if desired
- Settings → Image Generation → set safetensors directory
- ADetailer Models → point to YOLO .pt files
- Upscaler Models → point to ESRGAN .pth files
- Settings → Audio → choose Kokoro or Chatterbox/Chatterbox Turbo
- For cloning: upload reference sample
- Enable Auto-TTS toggle in chat
- Settings → enable Multi-Role Chat
- Click roster button → add characters
- Set talkativeness weights and voices
- Optionally enable narrator
The Chess tab uses Stockfish for analysis. On fresh installs, install.bat runs the Stockfish installer automatically (it downloads the official Windows build into tools/stockfish/). If that step failed (e.g. no network), run install_stockfish.bat or python scripts/install_stockfish.py from the repo root. You can also install Stockfish manually and set STOCKFISH_PATH to the stockfish.exe path. Then open the Chess tab, set the ELO and personality, and play as White; the AI plays Black, with optional LLM commentary.
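Resolving the engine binary, preferring STOCKFISH_PATH and falling back to the bundled copy, might look like this (the helper is illustrative; only the environment variable and the tools/stockfish/ location come from the setup notes above):

```python
from pathlib import Path

def resolve_stockfish(env: dict, repo_root: Path) -> Path:
    """Prefer an explicit STOCKFISH_PATH; otherwise use the bundled engine."""
    override = env.get("STOCKFISH_PATH")
    if override:
        return Path(override)
    return repo_root / "tools" / "stockfish" / "stockfish.exe"

bundled = resolve_stockfish({}, Path("C:/Eloquent"))
```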
| Problem | Solution |
|---|---|
| Missing dependencies | Run install.bat again or pip install the missing package |
| CUDA errors | Update NVIDIA drivers, ensure CUDA 12.x |
| Model won't load | Check VRAM, try smaller quantization |
| Port conflicts | Let auto-detection handle it or check processes on 8000/8002/5173 |
| TTS not working | Verify TTS service started in console |
| Import error on startup | Read traceback, activate venv, pip install the package |
| Context too large | Settings → adjust Context Window per endpoint |
| Agent Mode JSON errors | Hallucination Rescue should auto-fix, check console |
Multi-Character Roleplay
- Enable Multi-Role in settings
- Add 3-4 characters to roster with different voices
- Set talkativeness weights (quiet character = 0.3, loud = 1.5)
- Enable narrator for scene-setting every 5 turns
- Use Story Tracker to maintain world state
- Generate portraits with built-in SD
Model Evaluation
- Import MT-Bench prompts
- Run A/B tests between two 70B models
- Enable dual-judge with Al Swearengen and Bill Burr
- Run parameter sweep on temperature
- Export results with ELO rankings
Local Cursor Alternative
- Load Devstral Small 2 24B
- Open Code Editor → set project directory
- Enable Chain of Thought to see reasoning
- Ask it to refactor a module
- Watch tool execution in real-time
Long-Form Writing
- Load character and enable Story Tracker
- Use Director Mode to steer plot beats
- Generate scene visualizations
- Save conversation summary every chapter
- Load summaries into fresh chats for continuity
| Metric | Value |
|---|---|
| Total lines of code | 50,000+ |
| Python backend files | 27 |
| React frontend files | 72 |
| Built-in TTS engines | 3 (Kokoro + Chatterbox + Chatterbox Turbo) |
| Analysis perspectives | 14 |
| Code editor tools | 7 |
| Supported SD architectures | 3 (SD 1.5, SDXL, FLUX) |
| Premium themes | 5 |
See CHANGELOG.md for detailed version history and updates.
Contributions welcome.
Licensed under GNU Affero General Public License v3.0.
- llama.cpp & llama-cpp-python
- stable-diffusion.cpp & stable-diffusion-cpp-python
- Kokoro TTS
- Chatterbox TTS
- ultralytics YOLO
- FastAPI
- React
Bernard Peter Fitzgerald (@boneylizard)
Eloquent: your GPUs deserve better.