Your local models deserve a real frontend.
Web access. Adaptive memory. Multi-user. Built on LM Studio's native API.
Main chat view — dark theme, desktop
I use local LLMs for everything — brainstorming, planning, day-to-day questions, recommendations based on what I've already told it. The kind of stuff you'd use any AI assistant for, except it's running on my own hardware. LM Studio handles inference really well, but I kept hitting the same wall: no web access. I couldn't pick up a conversation from my phone, share the server with anyone else, or have it remember context across sessions without the desktop app open in front of me.
lm-chat fills that gap. It's a web frontend that handles everything around LM Studio — browser access from any device, persistent conversations that survive model swaps, adaptive memory that learns who you are, and multi-user auth so your whole household or team can share one server.
It's the only web client built on LM Studio's native API (/api/v1/chat), so you get MCP tools, server-managed conversation history, and model-aware features that aren't available through the OpenAI compatibility layer. No re-implementation, no compatibility hacks — just a tight integration with everything LM Studio already does well.
No pip install, no npm, no build step. Just run it.
docker run -d -p 3001:3001 -v ./lm-chat-data:/app/data \
-e LMSTUDIO_URL=http://host.docker.internal:1234 \
  ghcr.io/chevron7locked/lm-chat:nightly

Multi-arch: linux/amd64 + linux/arm64 (Apple Silicon, Raspberry Pi).
git clone https://github.com/chevron7locked/lm-chat.git
cd lm-chat
python3 server.py

Open http://localhost:3001. Log in with the admin credentials printed to the console (see First Run below).
Requirements: Python 3.10+ (or Docker) and LM Studio running with at least one model loaded.
Authentication is on by default. On first launch, lm-chat creates an admin account and prints the credentials to stderr:
==================================================
Admin account created
Username: admin
Password: <random-password>
(set LM_CHAT_ADMIN_PASS to use your own)
==================================================
Copy the password from the terminal and log in at http://localhost:3001. You can change it in Settings → Security once logged in.
To set your own credentials upfront:
LM_CHAT_ADMIN_USER=myname LM_CHAT_ADMIN_PASS=mypassword python3 server.py

Or with Docker:
docker run -d -p 3001:3001 -v ./lm-chat-data:/app/data \
-e LMSTUDIO_URL=http://host.docker.internal:1234 \
-e LM_CHAT_ADMIN_USER=myname \
-e LM_CHAT_ADMIN_PASS=mypassword \
  ghcr.io/chevron7locked/lm-chat:nightly

To disable auth entirely (single-user, trusted network): `LM_CHAT_AUTH=false`.
Once logged in as admin, you can invite other users from Settings → Users.
Most third-party UIs talk to LM Studio through /v1/chat/completions — the OpenAI compatibility layer. lm-chat is built on /api/v1/chat, LM Studio's native endpoint. This matters because the native API exposes features the compatibility layer doesn't:
| Feature | Native API (`/api/v1/chat`) | OpenAI Compat (`/v1/chat/completions`) |
|---|---|---|
| MCP tool execution | LM Studio runs your MCP servers | Not available |
| Response ID chaining | Server-managed history | Client resends everything |
| Reasoning events | Real SSE events | Parse `<think>` tags yourself |
| Capability detection | Vision, tool_use flags per model | Not available |
| Loaded instance routing | Use instance alias, avoid JIT reload | Not available |
| Model metadata | Context window, quantization, format | Basic only |
Response ID chaining is the big one. LM Studio manages the full conversation history server-side. lm-chat sends only the new message + a reference to the previous response. No token waste re-sending the entire history every turn.
LM Studio's desktop app uses all of this natively. lm-chat is the first web client that does too.
- SSE streaming with live token stats (tokens/sec, time-to-first-token)
- MCP tool execution — all MCP servers configured in `~/.lmstudio/mcp.json` show up automatically and are on by default. Toggle per-conversation. Supports multi-step agentic loops
- Native reasoning display — thinking blocks from reasoning models (DeepSeek-R1, QwQ, Qwen3, etc.) in collapsible sections, with configurable depth (Off / Low / Medium / High)
- Stop, edit, resend, regenerate — full conversation control
- Conversation forking — branch from any message to explore alternatives
- Auto-generated titles via LLM
- Suggested follow-ups — optional follow-up questions after each response
- Response feedback — upvote / downvote individual responses; signals feed back into memory scoring
Live MCP tool call with streaming arguments — desktop
Two opt-in inference modes that improve response quality at the cost of extra LLM calls. Toggle globally in Settings or per-conversation in the chat settings panel.
Self-Consistency — Generates 3 independent responses, then synthesizes the most consistent answer. Reduces noise on reasoning, factual, and technical questions. Skips synthesis when the first two responses are nearly identical (>80% token overlap). ~4× token cost.
Chain of Verification — Four-step pipeline: draft → extract verification questions → answer each question independently → synthesize a corrected response. Reduces hallucinations on factual claims by 50–70%. Based on Dhuliawala et al., 2023. ~4× token cost.
Both can be enabled simultaneously: CoVe runs first, then SC synthesizes across CoVe's output.
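The near-identical skip can be approximated with a simple set-overlap check. Jaccard similarity over whitespace tokens is one plausible reading of the ">80% token overlap" rule, not necessarily the exact metric lm-chat uses:

```python
def token_overlap(a: str, b: str) -> float:
    """Fraction of shared unique tokens (Jaccard similarity) between two responses."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def needs_synthesis(r1: str, r2: str, threshold: float = 0.8) -> bool:
    """Skip the extra synthesis call when two samples already agree."""
    return token_overlap(r1, r2) < threshold
```

When the first two samples are near-duplicates, returning one of them directly saves the fourth LLM call.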
Pin your most-used chats, group related conversations into folders, and find anything instantly.
- Pinned chats — star any conversation to keep it at the top of the sidebar
- Pinned messages — pin individual assistant responses; they survive `/compact` and are searchable globally
- Folders — create named folders to organize chats by project, topic, or whatever makes sense
- Collapsible sections — folders collapse/expand with a click
- Recent section — everything else, sorted by last activity
- Text search — filter chats by title instantly
- Semantic search — press Enter to search by meaning across all messages (powered by the embedding model in LM Studio — `nomic-embed-text-v1.5` is included with every LM Studio install)
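Under the hood, searching by meaning only needs cosine similarity between the query embedding and stored message embeddings, which is why no vector store is required. A minimal sketch (function names are illustrative, not lm-chat's internals):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank(query_vec: list[float], messages: dict[str, list[float]], top_k: int = 5) -> list[str]:
    """Return message IDs ordered by similarity to the query embedding."""
    scored = sorted(messages, key=lambda mid: cosine(query_vec, messages[mid]), reverse=True)
    return scored[:top_k]
```

At the scale of personal chat history, a linear scan over SQLite-stored vectors like this is fast enough that a dedicated vector database adds nothing.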
Sidebar with pinned chats, folders, and recent conversations — desktop
Six system prompt presets, each tuned for a specific task. Switch from the settings panel or activate via slash commands:
| Command | Mode | Temperature |
|---|---|---|
| `/research` | Deep Research — multi-source synthesis | 0.4 |
| `/code` | Coding Agent — doc lookup, structured planning | 0.1 |
| `/write` | Creative Writing — craft-focused workshop | 0.9 |
| `/analyze` | Strategic Analyst — framework-driven analysis | 0.3 |
| `/architect` | Systems Architect — technical design | 0.2 |
Or choose Custom to write your own system prompt. Template variables are replaced on send: `{{current_date}}`, `{{day_of_week}}`, `{{current_time}}`, `{{model}}`, `{{memories}}`.
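Template substitution can be as simple as string replacement at send time. A sketch of the behavior described above (the helper name and date formats are illustrative):

```python
from datetime import datetime

def render_prompt(template: str, model: str, memories: str) -> str:
    """Replace template variables in a custom system prompt at send time."""
    now = datetime.now()
    values = {
        "{{current_date}}": now.strftime("%Y-%m-%d"),
        "{{day_of_week}}": now.strftime("%A"),
        "{{current_time}}": now.strftime("%H:%M"),
        "{{model}}": model,
        "{{memories}}": memories,
    }
    for key, value in values.items():
        template = template.replace(key, value)
    return template
```

Resolving the variables server-side on each send keeps the stored prompt stable while the rendered prompt stays current.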
Slash command autocomplete — desktop
Share any conversation as a read-only page. One click generates a unique URL — no login required to view.
- Full markdown rendering — code blocks, formatting, and structure preserved
- Standalone pages — minimal JavaScript, works anywhere
- Strict CSP headers — shared pages are sandboxed
- Revocable — delete the share anytime from the chat menu
Shared conversation — read-only page
Your context follows you — across conversations, across model swaps. lm-chat builds a profile of your preferences, projects, skills, and opinions without you lifting a finger.
- Auto-distillation — insights extracted from conversations in the background
- Bayesian scoring — feedback on responses propagates back to the memory insights that shaped them
- Cognitive decay — stale memories fade naturally (freshness × usage × feedback scoring)
- Category weighting — identity and skill stay; session-specific details drift
- Full user control — view, edit, delete, toggle on/off, refine (LLM-based dedup/merge)
- Zero external dependencies — SQLite-backed, no vector store
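One plausible reading of the freshness × usage × feedback formula, with illustrative constants (the half-life, log usage curve, and Laplace smoothing are assumptions for the sketch, not lm-chat's actual values):

```python
import math

def memory_score(age_days: float, uses: int, upvotes: int, downvotes: int,
                 half_life: float = 30.0) -> float:
    """Combine freshness, usage, and feedback into one relevance score."""
    freshness = 0.5 ** (age_days / half_life)              # exponential decay
    usage = math.log1p(uses)                               # diminishing returns
    feedback = (upvotes + 1) / (upvotes + downvotes + 2)   # smoothed approval rate
    return freshness * usage * feedback
```

A structure like this gives the behaviors the list describes: stale memories fade, frequently retrieved ones persist, and downvoted responses drag down the insights that shaped them.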
Memory panel — categorized insights with decay indicators
- Context gauge — live visualization of context window usage, click to compact
- `/compact` — LLM-summarized context when you need to free up space (pinned messages are preserved)
- Instruction sandwich — core instructions reinforced at the end of the system prompt for better adherence with local models
Your LM Studio MCP servers show up automatically — configured in ~/.lmstudio/mcp.json, enabled by default in the UI. Toggle any server per-conversation.
Remote MCP — Connect additional MCP endpoints by URL with optional auth headers. Per-server credentials are stored server-side and never sent to the browser.
- Hot model switching — topbar dropdown or input pill
- Capability badges — Vision and Tool Use auto-detected per model from LM Studio metadata
- Loaded / Idle status — loaded models shown first, context window from the live instance config
- Full sampling control — temperature, top_p, top_k, min_p, repeat_penalty, max output tokens
- Reasoning depth — Off / Low / Medium / High for supported thinking models
- Instance-aware routing — uses the model's instance identifier (nickname) to avoid JIT reloads on every request
- Connection monitoring — live status indicator with health polling
Model switching with capability badges — desktop
Two settings surfaces:
Full-page settings (gear icon) — global defaults and account settings:
| Tab | Contents |
|---|---|
| Chat | System prompt presets, reasoning depth, suggested follow-ups, Self-Consistency, Chain of Verification, delete all chats |
| Memory | Toggle, view, edit, add, refine, clear insights |
| Starters | Customize welcome screen shortcuts |
| Server | LM Studio URL, API key, loaded models, MCP toggles, remote MCP endpoints, debug logging |
| Profile | Display name, change password |
| Security | TOTP 2FA setup and management |
| Users | Admin-only user management and invites |
Per-chat settings panel (right panel, per-conversation overrides):
- System prompt and preset (primary)
- Temperature (always visible)
- Advanced settings expander: top_p, top_k, min_p, repeat_penalty, max output tokens, reasoning depth
- Quality checks: Self-Consistency, Chain of Verification toggles
Per-chat settings override global defaults. Advanced sampling params default to LM Studio's instance config when not set.
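The precedence rule can be sketched as a layered merge, where `None` means "not set" and falls through to the next layer (function name and values are illustrative):

```python
def effective_settings(instance: dict, global_defaults: dict, per_chat: dict) -> dict:
    """Layered precedence: per-chat > global defaults > LM Studio instance config.
    Keys set to None are treated as unset and fall through."""
    merged = dict(instance)
    for layer in (global_defaults, per_chat):
        merged.update({k: v for k, v in layer.items() if v is not None})
    return merged

settings = effective_settings(
    {"temperature": 0.8, "top_p": 0.95},  # LM Studio instance config
    {"temperature": 0.7},                  # global default
    {"temperature": None, "top_k": 40},    # per-chat: temperature left unset
)
# temperature falls through to the global default; top_p comes from the instance
```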
Unified settings — tabbed navigation
Optional (LM_CHAT_AUTH=true, enabled by default). Not bolted on — designed in from day one:
- Invite-only accounts with admin management
- TOTP 2FA — QR enrollment, works with any authenticator app (RFC 6238, stdlib-only QR generator)
- Per-user API keys — each user stores their own LM Studio auth token server-side
- Per-user data isolation — users only see their own conversations and memories
- Scrypt password hashing with timing-safe comparison
- HttpOnly session cookies with SameSite=Strict and sliding 30-day expiry
- CSRF protection via custom header validation
- Rate limiting on login (5 attempts per 15 minutes per IP)
- Strict Content Security Policy on all pages
Toggleable in Server Settings without restart. When enabled:
- Logs all requests, SSE events, memory operations, and tool calls
- Rotating log files (5 MB × 5 files = 25 MB max)
- View log file sizes directly in the settings panel
- Export as Markdown or JSON
- Keyboard shortcuts — `Cmd+N` new chat, `Cmd+Shift+S` sidebar, `Cmd+,` settings, `Cmd+Shift+E` export, `Esc` close
- PWA — install on any device's home screen
- Dark theme — tuned for extended use, matched to LM Studio's aesthetic
- Incognito mode — toggle disables history and memory for the session (ephemeral, not persisted)
- Accessibility — full keyboard navigation, focus indicators, ARIA labels, screen reader support, `prefers-reduced-motion` respected
- Mobile-responsive — collapsible sidebar, 44px touch targets, always-visible actions on touch
- Image and file attachments — drag-and-drop images (JPEG, PNG, WebP, GIF) and text files (code, markdown, CSV, JSON, etc.)
- Syntax highlighting — vendored highlight.js, no CDN dependency
- Slash command autocomplete — `/research`, `/code`, `/write`, `/analyze`, `/architect`, `/compact`, `/help`
Sidebar with pinned chats and folders — iPhone PWA
LM Studio is already great on the desktop. lm-chat extends it into a web-accessible, multi-user platform:
| Feature | LM Studio Desktop | lm-chat |
|---|---|---|
| Chat with MCP tools | Yes | Yes (via native API) |
| Web / browser access | No | Yes |
| Mobile PWA | No | Yes |
| Multi-user auth | No | Yes |
| Adaptive memory | No | Yes |
| Persistent chat history | Session-based | SQLite-backed |
| Semantic search | No | Yes |
| Pinned chats & folders | No | Yes |
| Share conversations | No | Yes |
| System prompt presets | No | Yes |
| Self-Consistency / CoVe | No | Yes |
| Remote access (Tailscale, etc.) | Requires desktop | Browser-based |
    browser ──HTTP──> server.py ──HTTP──> LM Studio
                      (port 3001)         (port 1234)
                      SQLite · Auth       MCP servers
                      Memory · Logging    Inference
- `server.py` — stdlib Python, zero dependencies. Proxies the native API, persists chats, manages auth, indexes embeddings, handles memory distillation, structured logging. ~3.7k lines.
- `qr.py` — pure-Python QR code generator for TOTP enrollment. ~345 lines.
- `index.html` — HTML shell. ~655 lines.
- `style.css` — all CSS, organized with `@layer` and native nesting. ~3.5k lines.
- `app.js` — all client-side JS. ~5.9k lines.
- `manifest.json` + `sw.js` — PWA support.
- `highlight.min.js` + `highlight.min.css` — vendored syntax highlighting, no CDN.
- `logs/` — rotating debug logs (auto-created, gitignored).
No frameworks. No transpilation. No node_modules. No build step.
| Variable | Default | Description |
|---|---|---|
| `PORT` | `3001` | Server port |
| `LMSTUDIO_URL` | `http://localhost:1234` | LM Studio API URL |
| `LMSTUDIO_TOKEN` | (empty) | Bearer token (also configurable per-user in UI) |
| `LM_CHAT_AUTH` | `true` | Authentication (`false` to disable) |
| `LM_CHAT_SECRET` | (auto-generated) | Signing key for sessions and TOTP |
| `LM_CHAT_ADMIN_USER` | `admin` | Initial admin username (first run only) |
| `LM_CHAT_ADMIN_PASS` | (auto-generated) | Initial admin password (printed to stderr if not set) |
| `LM_CHAT_DEBUG` | (off) | Start with debug logging enabled (also toggleable in UI) |
| `LM_CHAT_DB` | `./chats.db` | SQLite database path (Docker: `/app/data/chats.db`) |
| `LM_CHAT_LOGS` | `./logs` | Log directory path (Docker: `/app/data/logs`) |
| `LM_CHAT_HTTPS` | (off) | Secure cookie flag (also auto-detected via `X-Forwarded-Proto`) |
| `LMSTUDIO_MCP_JSON` | `~/.lmstudio/mcp.json` | Path to LM Studio MCP config |
# Quick start
docker run -d -p 3001:3001 -v ./lm-chat-data:/app/data ghcr.io/chevron7locked/lm-chat:nightly
# With Docker Compose
curl -O https://raw.githubusercontent.com/Chevron7Locked/lm-chat/main/docker-compose.yml
docker compose up -d
# Nightly builds (latest from main)
docker pull ghcr.io/chevron7locked/lm-chat:nightly

Platforms: linux/amd64, linux/arm64 (Apple Silicon, Raspberry Pi, AWS Graviton)
Data persistence: Mount a directory to /app/data — stores the SQLite database, logs, and signing key. Without a mount, data is lost on container restart.
Security hardening: The default docker-compose.yml runs with read_only: true, no-new-privileges, and all capabilities dropped. Only /tmp and /app/data are writable.
Connecting to LM Studio:
- Same machine (Docker Desktop): `LMSTUDIO_URL=http://host.docker.internal:1234` (default in image)
- Remote server: `LMSTUDIO_URL=http://192.168.1.x:1234`
- Docker network: `LMSTUDIO_URL=http://lmstudio:1234`
- Load a model in LM Studio
- Configure MCP servers in `~/.lmstudio/mcp.json` (docs)
- Enable "Allow calling servers from mcp.json" in LM Studio Developer Settings
- For remote MCP: enable "Allow per-request MCPs" in Developer Settings
- For semantic search: load an embedding model — `nomic-embed-text-v1.5` is bundled with LM Studio
With Docker Compose and restart: unless-stopped, the container starts automatically when Docker Desktop launches. Enable "Start Docker Desktop when you sign in" in Docker Desktop settings.
For bare Python (without Docker):
cat > ~/Library/LaunchAgents/com.lm-chat.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.lm-chat</string>
<key>ProgramArguments</key>
<array>
<string>/usr/bin/python3</string>
<string>/path/to/lm-chat/server.py</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>WorkingDirectory</key>
<string>/path/to/lm-chat</string>
</dict>
</plist>
EOF
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.lm-chat.plist

Note: If switching from launchd to Docker, unload the agent first:

launchctl bootout gui/$(id -u) ~/Library/LaunchAgents/com.lm-chat.plist

Tailscale + http://your-mac-hostname:3001. Add to home screen for the full PWA experience.
Copyright (c) 2026 chevron7locked
GNU Affero General Public License v3.0
For commercial licensing, contact dev@chevron7.io
