
Open-WebUI Service

Open-WebUI is a self-hosted AI chat interface that provides a ChatGPT-like experience with support for Ollama, OpenAI, and other LLM backends.

Overview

  • Purpose: AI chat interface with multi-model support
  • Version: 0.6.40 (Helm chart)
  • Port: 8080 (application), 443 (nginx-tls)
  • Storage: 10Gi for chat history and configurations
  • Access: VPN-only via HTTPS
  • Authentication: LDAP via FreeIPA + local users

Features

  • Multiple LLM Backends - Ollama, OpenAI, vLLM
  • Chat History - Persistent conversation storage
  • LDAP Authentication - FreeIPA integration
  • Model Management - Download and manage models
  • Ollama Integration - Bundled Ollama for local models
  • OpenAI API Compatible - Works with OpenAI API keys
  • OpenTelemetry Tracing - Distributed tracing to Tempo (gRPC on port 4317)

Architecture

┌──────────────────────────────────────────┐
│       Open-WebUI Pod                     │
├──────────────────────────────────────────┤
│  ┌──────────┐  ┌────────────────────┐    │
│  │ nginx-   │─▶│   Open-WebUI       │    │
│  │ tls      │  │   (port 8080)      │    │
│  │ (443)    │  └────────────────────┘    │
│  └──────────┘           │                │
│                         ▼                │
│              ┌──────────────────┐        │
│              │     Ollama       │        │
│              │  (local models)  │        │
│              └──────────────────┘        │
│                         │                │
│                         ▼                │
│              ┌──────────────────┐        │
│              │  FreeIPA LDAPS   │        │
│              │  Authentication  │        │
│              └──────────────────┘        │
│                                          │
│  ┌──────────┐                            │
│  │Tailscale │ VPN connectivity           │
│  └──────────┘                            │
└──────────────────────────────────────────┘

Configuration

Terraform Variables

# terraform.tfvars
open_webui_enabled           = true
open_webui_version           = "0.6.40"
open_webui_hostname          = "ai.example.com"
open_webui_storage           = "10Gi"
open_webui_tailscale_enabled = true
open_webui_ollama_enabled    = true

# Optional: OpenAI API keys (semicolon-separated)
open_webui_open_api_keys = "sk-xxxxx;sk-yyyyy"

Resource Limits

open_webui_cpu_request    = "500m"
open_webui_memory_request = "1Gi"
open_webui_cpu_limit      = "2"
open_webui_memory_limit   = "2Gi"

Access

Web UI

# Via VPN
open https://ai.example.com

First Login

Option 1: Local Admin Account

  1. Navigate to https://ai.example.com
  2. Create first admin account
  3. Additional users can use LDAP

Option 2: LDAP (FreeIPA)

  1. Navigate to https://ai.example.com
  2. Click "Sign in"
  3. Use FreeIPA credentials

LDAP Authentication

LDAP Configuration

Configured automatically via Terraform:

ENABLE_LDAP=true
LDAP_SERVER_HOST=freeipa.dev.svc.cluster.local
LDAP_SERVER_PORT=636
LDAP_USE_TLS=true
LDAP_SEARCH_BASE=cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local
LDAP_ATTRIBUTE_FOR_USERNAME=uid
LDAP_ATTRIBUTE_FOR_MAIL=mail
LDAP_SEARCH_FILTER=(uid=%s)
LDAP_APP_DN=uid=admin,cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local
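
The settings above can be spot-checked against FreeIPA with ldapsearch. A minimal sketch, assuming ldapsearch is installed in the container (it may not ship in the Open-WebUI image) and the bind password is exported as LDAP_BIND_PASSWORD, a placeholder name:

```shell
# Bind as LDAP_APP_DN and run the same filter Open-WebUI would use;
# an entry in the output confirms the base DN and search filter
kubectl exec -n dev open-webui-0 -c open-webui -- \
  ldapsearch -H ldaps://freeipa.dev.svc.cluster.local:636 \
    -D "uid=admin,cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local" \
    -w "$LDAP_BIND_PASSWORD" \
    -b "cn=users,cn=accounts,dc=dev,dc=svc,dc=cluster,dc=local" \
    "(uid=privileged_user)" uid mail
```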

User Login

FreeIPA users can log in with their credentials:

  • Username: FreeIPA uid (e.g., privileged_user)
  • Password: FreeIPA password

Certificate Trust

FreeIPA CA certificate is automatically:

  • Mounted at /ca/ca.crt
  • Installed to system trust store via init container
  • Trusted for LDAPS connections
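
One way to confirm the LDAPS certificate actually chains to the mounted CA (a sketch, assuming the openssl binary is present in the container):

```shell
# Handshake against FreeIPA, verifying the server cert with the mounted CA;
# "Verify return code: 0 (ok)" means the chain is trusted
kubectl exec -n dev open-webui-0 -c open-webui -- sh -c \
  'openssl s_client -connect freeipa.dev.svc.cluster.local:636 \
     -CAfile /ca/ca.crt </dev/null 2>/dev/null | grep "Verify return code"'
```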

OpenTelemetry Tracing

Open-WebUI automatically sends distributed traces to Tempo:

  • Protocol: OTLP gRPC
  • Endpoint: http://tempo.monitoring.svc.cluster.local:4317
  • Traces Include: API calls, model inference times, user interactions
  • Integration: View traces in Grafana with traces-to-metrics/logs correlations

Configuration:

# Configured automatically via terraform/open-webui.tf
OTEL_SERVICE_NAME: "open-webui"
OTEL_TRACES_EXPORTER: "otlp"
OTEL_EXPORTER_OTLP_ENDPOINT: "http://tempo.monitoring.svc.cluster.local:4317"
OTEL_EXPORTER_OTLP_PROTOCOL: "grpc"

Viewing Traces:

  1. Access Grafana → Explore → Tempo datasource
  2. Search for service name "open-webui"
  3. View trace spans for API calls and model inferences
  4. Correlate with Prometheus metrics and Loki logs
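
Traces can also be queried directly from Tempo's HTTP search API, bypassing Grafana. A sketch, assuming Tempo's query frontend listens on its default HTTP port 3200:

```shell
# Search Tempo for recent open-webui traces from inside the cluster
kubectl run -n monitoring tempo-query --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s "http://tempo.monitoring.svc.cluster.local:3200/api/search?tags=service.name%3Dopen-webui&limit=5"
```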

LLM Backends

Bundled Ollama

If open_webui_ollama_enabled = true:

  • Endpoint: http://localhost:11434
  • Models: Downloaded on-demand via UI
  • Storage: Shared with Open-WebUI
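
The bundled endpoint can be probed through Ollama's HTTP API from the open-webui container, which shares the pod's network namespace. A sketch, assuming python3 is on the PATH there (the image ships a Python backend):

```shell
# GET /api/tags lists installed models; an empty "models" array still
# confirms Ollama is reachable on localhost:11434
kubectl exec -n dev open-webui-0 -c open-webui -- python3 -c \
  'import urllib.request; print(urllib.request.urlopen("http://localhost:11434/api/tags").read().decode())'
```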

Popular models:

# Pull models via UI or command:
kubectl exec -n dev open-webui-0 -c ollama -- \
  ollama pull llama2

kubectl exec -n dev open-webui-0 -c ollama -- \
  ollama pull codellama

OpenAI API

Configure OpenAI keys in terraform.tfvars:

open_webui_open_api_keys = "sk-proj-xxxxx;sk-proj-yyyyy"

Or add in UI:

  1. Settings → Connections
  2. OpenAI API → Add Key

vLLM Models

If vllm_enabled = true, vLLM models are available at:

  • DeepSeek Coder: http://vllm-deepseek.dev.svc.cluster.local:8000
  • Hermes 3: http://vllm-hermes.dev.svc.cluster.local:8000

Add in UI:

  1. Settings → Connections
  2. OpenAI API Compatible → Add Server
  3. URL: http://vllm-deepseek.dev.svc.cluster.local:8000
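
Before adding the server, it can help to confirm the endpoint answers OpenAI-style requests (a sketch using a throwaway curl pod):

```shell
# vLLM serves the OpenAI-compatible /v1/models route; a non-empty model
# list confirms the backend is up and reachable
kubectl run -n dev vllm-check --rm -it --restart=Never \
  --image=curlimages/curl --command -- \
  curl -s http://vllm-deepseek.dev.svc.cluster.local:8000/v1/models
```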

Common Operations

Download Ollama Model

# Via kubectl
kubectl exec -n dev open-webui-0 -c ollama -- \
  ollama pull mistral

# Via UI
# Settings → Models → Pull Model → Enter model name

List Models

kubectl exec -n dev open-webui-0 -c ollama -- ollama list

Delete Model

kubectl exec -n dev open-webui-0 -c ollama -- ollama rm <model-name>

View Logs

# Open-WebUI logs
kubectl logs -n dev open-webui-0 -c open-webui -f

# Ollama logs (if bundled)
kubectl logs -n dev open-webui-0 -c ollama -f

Troubleshooting

LDAP Authentication Failing

Error: "User not found in the LDAP server"

Common causes:

  1. Password expired - Reset in FreeIPA
  2. User doesn't exist - Create in FreeIPA
  3. Wrong base DN - Verify matches FreeIPA domain

Debug:

# Check LDAP connectivity
kubectl exec -n dev open-webui-0 -c open-webui -- \
  nc -zv freeipa.dev.svc.cluster.local 636

# Check CA certificate
kubectl exec -n dev open-webui-0 -c open-webui -- \
  ls -la /etc/ssl/certs/freeipa-ca.pem

# View Open-WebUI logs
kubectl logs -n dev open-webui-0 -c open-webui | grep -i ldap

See LDAP Troubleshooting Guide for detailed debugging.

Ollama Models Not Loading

# Check Ollama is running
kubectl exec -n dev open-webui-0 -c ollama -- ollama list

# Check storage space
kubectl exec -n dev open-webui-0 -c ollama -- df -h

# Pull model manually
kubectl exec -n dev open-webui-0 -c ollama -- ollama pull llama2

Cannot Access Web UI

# Check pod status
kubectl get pods -n dev open-webui-0

# Check all containers running
kubectl get pod open-webui-0 -n dev -o jsonpath='{.status.containerStatuses[*].ready}'

# Check ingress
kubectl get ingress -n dev open-webui

# Verify VPN connection
tailscale status
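
If the pod is healthy but the UI is still unreachable, a port-forward straight to the application port isolates the failure to the ingress/TLS/VPN layer (a sketch):

```shell
# Forward the application port directly, bypassing nginx-tls and the ingress
kubectl port-forward -n dev open-webui-0 8080:8080 &
sleep 2

# An HTTP 200 here means the app is fine and the problem is in routing
curl -s -o /dev/null -w "%{http_code}\n" http://localhost:8080
```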

High Memory Usage

Ollama models can be memory-intensive:

# Increase memory limits
# In terraform.tfvars:
open_webui_memory_limit = "4Gi"

# Or use external Ollama instead
open_webui_ollama_enabled = false

Storage

Data Persistence

Open-WebUI stores:

  • Chat history
  • User preferences
  • Downloaded models (if Ollama bundled)
  • Configuration

Volume: 10Gi persistent volume (configurable)

Backup

# Backup persistent volume
kubectl exec -n dev open-webui-0 -c open-webui -- \
  tar czf /tmp/openwebui-backup.tar.gz /app/backend/data

# Copy to local
kubectl cp dev/open-webui-0:/tmp/openwebui-backup.tar.gz \
  ./openwebui-backup.tar.gz -c open-webui
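
Restoring is the reverse of the steps above. A sketch; the final pod restart assumes Open-WebUI re-reads its data directory on startup:

```shell
# Copy the archive back into the pod
kubectl cp ./openwebui-backup.tar.gz \
  dev/open-webui-0:/tmp/openwebui-backup.tar.gz -c open-webui

# Unpack it over /app/backend/data (the archive stores absolute paths)
kubectl exec -n dev open-webui-0 -c open-webui -- \
  tar xzf /tmp/openwebui-backup.tar.gz -C /

# Restart the pod so the restored data is picked up
kubectl delete pod -n dev open-webui-0
```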

Security

  • LDAPS encryption for authentication
  • VPN-only access via Tailscale
  • IP allowlisting (100.64.0.0/10)
  • TLS certificates via cert-manager
  • OpenAI keys stored as Kubernetes secrets
  • FreeIPA CA certificate for trust
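
The secret-backed key storage can be spot-checked without printing key material (a sketch; the secret name open-webui is an assumption, check terraform/open-webui.tf for the actual name):

```shell
# describe shows key names and byte sizes but never the values
kubectl describe secret -n dev open-webui
```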

Performance Tuning

Resource Allocation

For better performance with local models:

open_webui_cpu_limit      = "4"
open_webui_memory_limit   = "8Gi"
open_webui_storage        = "50Gi"  # More models

Use External Ollama

For GPU acceleration or distributed models:

open_webui_ollama_enabled = false

# Configure external Ollama in UI:
# Settings → Connections → Ollama API
# URL: http://ollama-external.example.com:11434
