Run ComfyUI locally with a RunPod-compatible API handler, then deploy to RunPod serverless or GPU pods.
Open Source: This project is open source and welcomes contributions! See CONTRIBUTING.md for guidelines.
```bash
docker compose up
```

First run: Initial startup downloads ComfyUI, installs dependencies with uv (10-100x faster than pip!), and configures models/nodes from config.yml. The first run takes ~2-3 minutes; subsequent runs take ~10-30 seconds thanks to the volume-first architecture and SHA-based caching.
Open your browser:
- http://localhost:8188 - ComfyUI interface (design workflows)
- http://localhost:8000 - API endpoint (test requests)
- http://localhost:8888 - Jupyter Lab (edit config, manage files)
Press Ctrl+C to stop.
✅ Lightning-fast package installation - uv package manager (10-100x faster than pip)
✅ Volume-first architecture - Minimal 8.7GB container, everything persists on volume
✅ Universal GPU support - PyTorch 2.9.0 + CUDA 12.8 (RTX 4090, RTX 5090, future GPUs)
✅ Blazing downloads - hf_transfer for HuggingFace (100-200+ MB/s), parallel chunks for others
✅ ComfyUI auto-installs on first run (updateable, persistent)
✅ Full web interface for workflow design
✅ Jupyter Lab for config editing and file management
✅ RunPod-compatible API handler
✅ SHA-based config caching - Skip reinstalls when config unchanged
✅ Works everywhere - Mac, Linux, Windows
🚀 Serverless Endpoints - Auto-scaling production APIs with scale-to-zero
🔧 GPU Pods - Interactive development with Jupyter and SSH access
Deploy ComfyUI as a serverless API endpoint with auto-scaling and scale-to-zero cost savings.
- RunPod Account - Sign up at https://runpod.io
- Payment Method - Add to your RunPod account
- Network Volume - For persistent models/nodes (recommended)
- Go to https://runpod.io/console/serverless
- Click "New Template"
- Configure:
```
Name: ComfyUI Handler
Container Image: artokun/comfyui-runpod:latest
Container Disk: 20 GB

Environment Variables (optional - all have defaults):
  RUN_MODE=endpoint                      # Skips Jupyter for fast cold starts
  # AUTO_UPDATE=false                    # (default)
  # COMFY_API_URL=http://127.0.0.1:8188  # (default)

Expose HTTP Ports:
  Leave blank (API-only), or expose 8188 to debug the ComfyUI UI
```
- Click "Save Template"
- Go to Storage → Network Volumes
- Click "New Network Volume"
- Configure:
- Name: comfyui-volume
- Region: Choose a region with GPU availability
- Size: 100 GB (recommended for models + ComfyUI)
- Click "Create"
Why volume is required:
- ComfyUI installs to /runpod-volume/ComfyUI (persistent)
- Models are stored permanently (no re-download)
- Custom nodes persist
- Faster cold starts after first run
- Go back to Serverless
- Click "New Endpoint"
- Configure:
```
Name: comfyui-production
Select Template: ComfyUI Handler
Select Network Volume: comfyui-volume

GPUs:
  ☑ RTX 4090 (or RTX 5090)

Min Workers: 0
Max Workers: 3

Advanced:
  Idle Timeout: 5 seconds
  Execution Timeout: 600 seconds
  Max Concurrent Requests: 1
```
- Click "Deploy"
```bash
export RUNPOD_ENDPOINT_ID="your-endpoint-id"
export RUNPOD_API_KEY="your-api-key"

curl -X POST "https://api.runpod.ai/v2/${RUNPOD_ENDPOINT_ID}/runsync" \
  -H "Authorization: Bearer ${RUNPOD_API_KEY}" \
  -H "Content-Type: application/json" \
  -d @examples/example_request.json
```
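The /runsync route above blocks until the workflow finishes. For long-running workflows, RunPod endpoints also expose asynchronous /run and /status routes; a minimal polling sketch reusing the same payload and environment variables:

```python
import json
import os
import time

import requests

ENDPOINT = f"https://api.runpod.ai/v2/{os.environ['RUNPOD_ENDPOINT_ID']}"
HEADERS = {"Authorization": f"Bearer {os.environ['RUNPOD_API_KEY']}"}

with open("examples/example_request.json") as f:
    payload = json.load(f)

# Submit the job, then poll until a worker finishes it.
job = requests.post(f"{ENDPOINT}/run", json=payload, headers=HEADERS).json()
while True:
    status = requests.get(f"{ENDPOINT}/status/{job['id']}", headers=HEADERS).json()
    if status["status"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)  # allow for cold starts when scaled to zero
print(json.dumps(status, indent=2))
```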
RTX 4090:
- Idle: $0.00/hour (scale to zero)
- Active: ~$0.50-0.70/hour
- Per workflow: ~$0.01-0.03
RTX 5090:
- Idle: $0.00/hour
- Active: ~$1.50-2.00/hour
- Per workflow: ~$0.01-0.07
Tips to reduce costs:
- Scale to zero (min workers: 0)
- Set short idle timeout (5 seconds)
- Batch multiple images
- Use appropriate resolution
Deploy ComfyUI to RunPod GPU Pods for interactive development with Jupyter notebook access.
Pods are traditional GPU instances with:
- ✅ SSH and Jupyter notebook access
- ✅ Direct file system access
- ✅ Interactive development
- ✅ Pay-per-hour billing (running time only)
Use Pods for: development, testing, interactive ComfyUI design, experimenting
Use Serverless for: production APIs, auto-scaling, scale-to-zero cost savings
- Go to https://runpod.io/console/storage
- Click "New Network Volume"
- Configure:
- Name: comfyui-volume
- Region: Choose a region with RTX 4090/5090 availability
- Size: 50 GB minimum
- Click "Create"
- Go to https://runpod.io/console/pods
- Click "Deploy" or "GPU Pods"
- Select GPU: RTX 4090 ($0.50-0.70/hr) or RTX 5090
- Configure:
```
Container Image: artokun/comfyui-runpod:latest
Container Disk: 50 GB
Volume Mount: comfyui-volume → /runpod-volume
Expose HTTP Ports: 8188, 8000, 8888

Environment Variables:
  RUN_MODE=production   # Enables Jupyter Lab
```
- Click "Deploy"
Once deployed:
- Connect button → Opens Jupyter notebook
- TCP Port Mappings → External URLs for ports
Access points:
- Port 8188 → ComfyUI web interface
- Port 8000 → RunPod handler API
- Port 8888 → Jupyter Lab
Pods bill per second while running:
- RTX 4090: ~$0.50-0.70/hour
- RTX 5090: ~$1.50-2.00/hour
Tip: Stop pods when not in use! Your volume data persists.
- Stop: Click "Stop" button (volume persists)
- Start: Click "Start" button (restores from volume)
- Terminate: Deletes pod (volume remains)
The config.yml file controls which models and custom nodes are installed. It's mounted as a volume, so you can edit anytime without rebuilding the Docker image!
Default config: Minimal setup with SD 1.5 + ComfyUI Manager (~4GB) for fast builds.
Advanced example:
See config.example.yml for a complete WAN Animate 2.2 setup with 11 models and 20+ nodes (~30GB).
Step 1: Access Jupyter Lab
- Local: http://localhost:8888
- RunPod Pod: Port 8888 in pod connection info
Step 2: Navigate to config.yml
- RunPod: /runpod-volume/config.yml or /workspace/config.yml
- Local: /workspace/config.yml
Step 3: Edit and save (Ctrl+S or Cmd+S)
```yaml
models:
  - url: https://huggingface.co/stabilityai/sdxl-vae/resolve/main/sdxl_vae.safetensors
    destination: vae
    optional: false

nodes:
  - url: https://github.com/Kosinkadink/ComfyUI-VideoHelperSuite.git
    version: latest
```

Step 4: Apply changes (see Applying Changes below)
```bash
nano config.yml
# or
code config.yml   # VS Code
```

Then restart the container:

```bash
docker compose restart
```

For RunPod Endpoints (serverless), set the config as an environment variable:
- Prepare your config.yml file
- Encode it at https://www.base64encode.org/ (copy the file contents, paste, ENCODE)
- Set in RunPod template settings:
CONFIG_YML=<base64-encoded-string>
The container automatically decodes and applies on startup!
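If you prefer to encode locally instead of using the website, a quick sketch with the Python standard library:

```python
import base64
import pathlib

# Produce the value for the CONFIG_YML template variable.
encoded = base64.b64encode(pathlib.Path("config.yml").read_bytes()).decode("ascii")
print(encoded)
```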
Configuration Priority:
- 🥇 CONFIG_YML env var → Writes to volume (persistent)
- 🥈 config.yml on volume → Can be edited directly
- 🥉 Baked-in default → Fallback
After editing config.yml, run the apply script to install new models/nodes without restarting:
Via Jupyter Terminal:
```bash
cd /app
chmod +x apply_config.sh
./apply_config.sh
```

Via Docker:

```bash
docker compose exec comfyui /app/apply_config.sh
```

What it does:
- ✅ Installs new custom nodes
- ✅ Downloads new models
- ✅ Skips already installed nodes/models (fast!)
Then restart ComfyUI (custom nodes require restart):
- RunPod Pods: Stop and start pod from console
- Local: `docker compose restart`
```yaml
models:
  - url: https://huggingface.co/model.safetensors
    destination: checkpoints   # Where to place the model
    optional: false            # Skip if download fails?

  - url: https://civitai.com/api/download/models/123456
    destination: loras
    optional: true
```

Supported destinations:
- checkpoints - Main model checkpoints
- vae - VAE models
- loras - LoRA models
- controlnet - ControlNet models
- clip_vision - CLIP vision models
- embeddings - Text embeddings
- upscale_models - Upscaler models
- diffusion_models - Diffusion models
- text_encoders - Text encoder models
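A minimal sketch of what a downloader in the spirit of download_models.py does with these entries (paths and error handling are illustrative assumptions, not the script's exact implementation):

```python
import pathlib
import urllib.request

import yaml  # pip install pyyaml

MODELS_ROOT = pathlib.Path("/workspace/ComfyUI/models")  # local layout from the docs

def download_models(config_path: str = "config.yml") -> None:
    config = yaml.safe_load(pathlib.Path(config_path).read_text())
    for entry in config.get("models", []):
        dest_dir = MODELS_ROOT / entry["destination"]  # e.g. models/vae
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / entry["url"].rsplit("/", 1)[-1]  # naive filename from URL
        if target.exists():
            continue  # already present -> skip (mirrors the caching behavior)
        try:
            urllib.request.urlretrieve(entry["url"], target)
        except Exception:
            if not entry.get("optional", False):
                raise  # required models must succeed; optional ones are skipped
```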
```yaml
nodes:
  - url: https://github.com/user/repo.git
    version: latest    # Latest stable release (tag)

  - url: https://github.com/user/repo.git
    version: nightly   # Latest commit (bleeding edge)

  - url: https://github.com/user/repo.git
    version: v1.2.3    # Specific tag

  - url: https://github.com/user/repo.git
    version: abc1234   # Specific commit hash

  - url: https://github.com/user/repo.git
    version: main      # Specific branch
```

Version options:
- latest - Latest stable release tag (recommended)
- nightly - Latest commit on the default branch
- v1.2.3 - Specific version tag
- abc1234 - Specific commit hash
- main - Track a specific branch
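A sketch of how these version values map onto plain git operations (install_nodes.py's actual logic may differ; the function below is hypothetical):

```python
import subprocess

NODES_DIR = "/workspace/ComfyUI/custom_nodes"  # local layout from the docs

def install_node(url: str, version: str = "latest") -> None:
    name = url.rstrip("/").removesuffix(".git").rsplit("/", 1)[-1]
    repo = f"{NODES_DIR}/{name}"
    subprocess.run(["git", "clone", url, repo], check=True)
    if version == "latest":
        # Resolve "latest" to the newest tag reachable from HEAD.
        version = subprocess.run(
            ["git", "-C", repo, "describe", "--tags", "--abbrev=0"],
            check=True, capture_output=True, text=True,
        ).stdout.strip()
    if version != "nightly":  # "nightly" stays on the default branch's HEAD
        subprocess.run(["git", "-C", repo, "checkout", version], check=True)
```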
The container uses SHA256 hashing to detect config changes:
- First run: Calculates the SHA of config.yml, installs everything, stores the SHA
- Subsequent runs: Compares the current SHA with the stored SHA
- If match: Skips all downloads/installs (seconds instead of minutes!)
- If different: Applies updates and updates SHA
SHA file locations:
- RunPod: /runpod-volume/.config-sha256
- Local: /workspace/.config-sha256
Force reinstall:
```bash
rm /runpod-volume/.config-sha256   # RunPod
rm /workspace/.config-sha256       # Local
```

This dramatically improves RunPod Endpoint cold start performance!
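The check itself is a few lines; a minimal sketch of the logic (paths from above, function name hypothetical):

```python
import hashlib
import pathlib

CONFIG = pathlib.Path("/workspace/config.yml")        # local path
SHA_FILE = pathlib.Path("/workspace/.config-sha256")  # stored SHA from above

def config_changed() -> bool:
    current = hashlib.sha256(CONFIG.read_bytes()).hexdigest()
    if SHA_FILE.exists() and SHA_FILE.read_text().strip() == current:
        return False  # SHA matches -> skip all downloads/installs
    SHA_FILE.write_text(current)  # record the new SHA once updates are applied
    return True
```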
Optional: Enable auto-updates or mount existing models.
```bash
cp .env.example .env
# Edit .env file
```

Auto-Update:

```bash
AUTO_UPDATE=true   # Update ComfyUI on startup
```

Models Directory:

```bash
MODELS_PATH=/path/to/existing/models   # Mount existing models
```

Jupyter Password:

```bash
JUPYTER_PASSWORD=your_secure_password   # Enable password protection
```

If not set, Jupyter Lab is accessible without authentication (the default for local development).
Open http://localhost:8188 and create your workflow visually.
- Enable Dev Mode: Settings → Dev Mode
- Save workflow: Save (API Format)
- Save to workflows/my_workflow.json
```bash
curl -X POST http://localhost:8000/runsync \
  -H "Content-Type: application/json" \
  -d @examples/example_request.json
```

Or with Python:

```python
import json
import requests

with open('examples/example_request.json') as f:
    response = requests.post('http://localhost:8000/runsync', json=json.load(f))
print(response.json())
```

Then deploy:

```bash
./deploy.sh
```

The same workflows work immediately in production!
```bash
# Start services (with logs)
docker compose up

# Start in background
docker compose up -d
docker compose logs -f   # View logs

# Stop services
docker compose down

# Rebuild after changes
docker compose up --build

# Test locally without Docker
python examples/test_local.py

# Build for RunPod
./build.sh

# Deploy to production
./deploy.sh
```

Container: Minimal shell (~8.7GB) with CUDA runtime + system dependencies only
Volume: All Python packages, PyTorch, ComfyUI, models, custom nodes (persistent)
Why?
- ✅ No wasted disk space from package duplication
- ✅ True persistence across container rebuilds
- ✅ Faster deployments (smaller images)
- ✅ Easy updates without image rebuilds
How it works:
- Container sets PIP_TARGET=/workspace/python-packages (on the volume)
- All packages install directly to the volume using uv (10-100x faster than pip!) - see the sketch after this list
- Container is stateless, volume holds everything important
- Rebuild container anytime, data persists!
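How a running Python process can resolve the volume-installed packages, as a minimal sketch (the PIP_TARGET path comes from the list above; treating this as exactly what the startup scripts do is an assumption):

```python
import os
import site
import sys

# Packages live on the volume, not in the container image (volume-first architecture).
target = os.environ.get("PIP_TARGET", "/workspace/python-packages")
if target not in sys.path:
    site.addsitedir(target)  # register the directory like a site-packages dir

import requests  # now importable from the volume
```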
This project uses uv (https://github.com/astral-sh/uv), a Rust-based pip replacement that's 10-100x faster:
- ✅ Parallel downloads across all your bandwidth
- ✅ Resolves 166 packages in <1 second
- ✅ Installs packages in milliseconds
- ✅ Perfect for 2Gbps+ connections
Your downloads will fly at 100-200+ MB/s instead of the old 15 MB/s with pip!
Configure via the RUN_MODE environment variable:
- development (default local) - ComfyUI + Handler + Jupyter Lab
- production (RunPod Pods) - ComfyUI + Handler + Jupyter Lab
- endpoint (RunPod Serverless) - ComfyUI + Handler only (skips Jupyter for fast cold starts)
Universal Image - One image for all modern NVIDIA GPUs:
- CUDA 12.8
- PyTorch 2.9.0+cu128
- RTX 4090 (Ada, compute 8.9) ✓
- RTX 5090 (Blackwell, compute 12.0) ✓
- Future architectures supported out-of-the-box!
No architecture-specific builds needed!
Expected generation times on RTX 4090:
- SD 1.5: ~5-10 seconds
- SDXL: ~20-30 seconds
- FLUX: ~90-120 seconds
RTX 5090 is 40-60% faster when available.
Download speeds:
- HuggingFace: 100-200+ MB/s (hf_transfer enabled)
- Civitai/Others: Parallel 8-thread chunks
- PyPI packages: Parallel downloads via uv
```json
{
  "input": {
    "workflow": { /* ComfyUI workflow in API format */ },
    "overrides": [
      {
        "node_id": "6",
        "field": "inputs.text",
        "value": "a beautiful sunset"
      },
      {
        "node_id": "3",
        "field": "inputs.seed",
        "value": 42
      }
    ]
  }
}
```
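Each override targets a node by ID plus a dotted field path. A minimal sketch of applying such overrides to a workflow dict client-side (this mirrors the request format above; handler.py's internal implementation is not shown here):

```python
def apply_overrides(workflow: dict, overrides: list[dict]) -> dict:
    """Set workflow[node_id].<dotted.field> = value for each override."""
    for o in overrides:
        node = workflow[o["node_id"]]
        *parents, leaf = o["field"].split(".")  # e.g. "inputs.text"
        for key in parents:
            node = node[key]
        node[leaf] = o["value"]
    return workflow

# Example: set the prompt text on node "6"
wf = {"6": {"inputs": {"text": "placeholder"}}}
apply_overrides(wf, [{"node_id": "6", "field": "inputs.text", "value": "a beautiful sunset"}])
```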
"status": "success",
"prompt_id": "abc-123",
"execution_time": 8.32,
"images": [
{
"url": "http://127.0.0.1:8188/view?filename=ComfyUI_00001.png",
"filename": "ComfyUI_00001.png"
}
]
}comfy-template/
├── README.md # This file
├── CLAUDE.md # Project instructions for Claude Code
├── docker-compose.yml # Run with: docker compose up
├── Dockerfile # Universal GPU support (all in one!)
├── .env.example # Configuration template
│
├── handler.py # RunPod worker logic
├── s3_upload.py # S3 upload module
├── requirements.txt # Python dependencies
├── start.sh # Container startup script
├── entrypoint.sh # Entrypoint script
├── test_input.json # Local test input for RunPod SDK
│
├── build.sh # Build production image
├── deploy.sh # Deploy to Docker Hub
├── download_models.py # Model downloader
├── install_nodes.py # Custom nodes installer
├── apply_config.sh # Apply config changes without restart
├── config.yml # Unified configuration (models + nodes)
│
├── workspace/ # Persistent workspace (local dev)
│ └── ComfyUI/ # Auto-created on first run
│ ├── main.py # ComfyUI application
│ ├── models/ # Model files
│ ├── custom_nodes/ # Custom nodes
│ └── output/ # Generated images
│
├── docs/ # Documentation
└── examples/ # Examples and testing
├── example_workflow.json
├── example_request.json
    └── test_local.py
```
Local Development:
- Docker with GPU support
- NVIDIA GPU
- NVIDIA Container Toolkit
Production:
- Docker Hub account
- RunPod account
- Network volume (recommended)
- ✅ Linux: Full GPU support
- ✅ Windows: Full GPU support (native or WSL2)
- ⚠️ Mac: Works for workflow design, no GPU acceleration (use RunPod for generation)
```bash
# Test GPU access
nvidia-smi
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi
```

```bash
# Find what's using port 8188
lsof -i :8188                    # Mac/Linux
netstat -ano | findstr "8188"    # Windows
```

```bash
# Clean build
docker compose build --no-cache
```

Check pod logs:
- Pod details → "Logs" tab
- Look for startup messages
- Ensure ports 8188, 8000, 8888 exposed
The scripts use:
- ✅ uv for Python packages (10-100x faster)
- ✅ hf_transfer for HuggingFace (100-200+ MB/s)
- ✅ 8-thread parallel chunks for Civitai
If still slow, check your network connection.
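For reference, hf_transfer is switched on with an environment variable that must be set before huggingface_hub is imported; a minimal sketch (the repo and filename reuse the VAE example from the config section, and the hf_transfer package must be installed):

```python
import os

os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"  # must precede the import below

from huggingface_hub import hf_hub_download

# Downloads in parallel chunks via the Rust-based hf_transfer backend.
path = hf_hub_download(repo_id="stabilityai/sdxl-vae", filename="sdxl_vae.safetensors")
print(path)
```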
Common causes:
- Git URL typo - Verify repository URL
- Version doesn't exist - Check repo for valid tags
- Missing dependencies - Some nodes need system packages
Check error messages for details.
See examples/ directory for:
- Sample workflows in API format
- Example API requests
- Local testing without Docker
We welcome contributions! Please see CONTRIBUTING.md for details on:
- 🐛 Reporting bugs
- ✨ Proposing features
- 🔧 Submitting pull requests
- 📝 Improving documentation
```bash
# Fork and clone the repo
git clone https://github.com/YOUR-USERNAME/comfyui-runpod-handler.git
cd comfyui-runpod-handler

# Install development dependencies
pip install -r requirements-dev.txt

# Install pre-commit hooks
pre-commit install

# Run tests
pytest

# Start development environment
docker compose up
```

See CONTRIBUTING.md for detailed instructions.
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Code of Conduct: CODE_OF_CONDUCT.md
This project is licensed under the MIT License - see the LICENSE file for details.
- Built using ComfyUI by comfyanonymous
- Package management powered by uv by Astral
- Deployment patterns inspired by WAN-ANIMATE
- RunPod serverless infrastructure by RunPod
Special thanks to all contributors who have helped improve this project!