This document describes the Ollama integration setup for the CatalystChatbox agent, which uses the llama3.3:70b model.
The CatalystChatbox agent (application/app/Agents/CatalystChatbox.php) is configured to use Ollama with the llama3.3:70b model for local AI processing. This setup provides:
- Local AI processing (no external API dependencies)
- High-performance llama3.3:70b model optimized for chat interactions
- Full control over AI processing and data privacy
- Integration with the Vizra ADK framework
The `catalystexplorer.ollama` service is configured in `docker-compose.yml`:

```yaml
catalystexplorer.ollama:
    image: ollama/ollama:0.6.2
    container_name: catalystexplorer.ollama
    ports:
        - "${FORWARD_OLLAMA_PORT:-11434}:11434"
    volumes:
        - "catalystexplorer-ollama:/root/.ollama"
        - "./docker/scripts/ollama-init.sh:/ollama-init.sh"
    environment:
        - OLLAMA_KEEP_ALIVE=24h
        - OLLAMA_HOST=0.0.0.0
    restart: unless-stopped
```

The following variables are configured in `application/.env`:
```env
# Ollama AI Configuration
OLLAMA_HOST=http://catalystexplorer.ollama:11434
OLLAMA_MODEL=llama3.3:70b
OLLAMA_TIMEOUT=300
FORWARD_OLLAMA_PORT=11434
```

The Ollama provider is configured in `application/config/vizra-adk.php`:
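Because the container port is forwarded to the host, a quick smoke test is to hit Ollama's `/api/tags` endpoint, which lists pulled models. This is a sketch: it assumes the Ollama container is already running and that `FORWARD_OLLAMA_PORT` keeps its default of 11434.

```shell
# Build the base URL from the same variable docker-compose uses;
# 11434 is the documented default.
OLLAMA_URL="http://localhost:${FORWARD_OLLAMA_PORT:-11434}"
echo "Ollama endpoint: ${OLLAMA_URL}"

# Uncomment once the service is up; /api/tags returns the pulled models as JSON:
# curl -sf "${OLLAMA_URL}/api/tags"
```

If the `curl` call returns JSON containing `llama3.3:70b`, the service is reachable and the model is pulled.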
```php
'default_provider' => env('VIZRA_ADK_DEFAULT_PROVIDER', 'ollama'),
'default_model' => env('VIZRA_ADK_DEFAULT_MODEL', 'llama3.3:70b'),

'providers' => [
    'ollama' => [
        'base_url' => env('OLLAMA_HOST', 'http://catalystexplorer.ollama:11434'),
        'timeout' => env('OLLAMA_TIMEOUT', 300),
        'keep_alive' => env('OLLAMA_KEEP_ALIVE', '24h'),
    ],
    // ... other providers
],
```

The CatalystChatbox agent is configured to use the Ollama model:
```php
protected string $model = 'ollama:llama3.3:70b';
protected string $description = 'AI assistant for Project Catalyst community, providing help with proposals, funding, and community resources.';
```

Start the Docker services, including Ollama:
```bash
make up
```

Pull the llama3.3:70b model (this may take some time depending on your internet connection):

```bash
make ollama-pull
```

Alternatively, you can initialize Ollama using the setup script:

```bash
make ollama-init
```

Check that the model is available:

```bash
make ollama-status
```

You should see llama3.3:70b in the list of available models.

Run the test script to verify everything is working:

```bash
php test_ollama_setup.php
```

The following Makefile commands are available for managing Ollama:
- `make ollama-pull` - Pull the llama3.3:70b model
- `make ollama-init` - Initialize Ollama with the model using the setup script
- `make ollama-status` - Check Ollama status and list available models
- `make ollama-shell` - Open a shell in the Ollama container
- `make ollama-logs` - Show Ollama container logs
- `make ollama-chat` - Start an interactive chat session with the model
- `make ollama-restart` - Restart the Ollama service
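These targets are thin wrappers around `docker-compose exec` commands against the Ollama container. As a rough sketch of their likely shape (assumed, not copied from the repository; check the actual Makefile for the authoritative recipes):

```makefile
# Hypothetical sketch of two of the targets listed above.
ollama-pull:
	docker-compose exec catalystexplorer.ollama ollama pull llama3.3:70b

ollama-status:
	docker-compose exec catalystexplorer.ollama ollama list
```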
Example usage of the agent in PHP:

```php
use App\Agents\CatalystChatbox;
use Vizra\VizraADK\System\AgentContext;

$agent = new CatalystChatbox();
$context = new AgentContext();
$response = $agent->prompt("How do I create a Project Catalyst proposal?", $context);
echo $response;
```

The agent can be accessed through the Vizra ADK web interface at /vizra (if enabled in configuration).
The agent is also accessible via the OpenAI-compatible API endpoint at /api/vizra-adk/chat/completions.
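As a sketch, a request to that endpoint could look like the following. The payload follows the standard OpenAI chat-completions shape; the host (`localhost`) is an assumption for a local setup, so adjust it to wherever the application is served:

```shell
# Minimal OpenAI-style chat payload for the Vizra ADK endpoint.
PAYLOAD='{"model": "ollama:llama3.3:70b", "messages": [{"role": "user", "content": "How do I create a Project Catalyst proposal?"}]}'
echo "$PAYLOAD"

# Uncomment once the application is running (host/port are assumptions):
# curl -s -X POST http://localhost/api/vizra-adk/chat/completions \
#      -H 'Content-Type: application/json' \
#      -d "$PAYLOAD"
```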
Common issues:

- **Model not found error**
    - Pull the model, then verify it is listed: `make ollama-pull`, then `make ollama-status`
- **Connection timeout**
    - Check if the Ollama service is running: `docker-compose ps`
    - Check logs: `make ollama-logs`
    - Restart the service: `make ollama-restart`
- **Out of memory errors**
    - The llama3.3:70b model requires significant RAM (recommended: 64GB+)
    - Consider using a smaller model such as `llama3.1:8b` for development
- **Slow responses**
    - Ensure adequate system resources (CPU, RAM)
    - Consider using GPU acceleration if available
    - Increase `OLLAMA_KEEP_ALIVE` to keep models in memory longer

Performance notes:

- **GPU Acceleration**: If you have a compatible GPU, Ollama will automatically use it
- **Memory Management**: The model stays in memory for 24 hours (`OLLAMA_KEEP_ALIVE=24h`)
- **Concurrent Requests**: Ollama can handle multiple concurrent requests efficiently
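To see which models are currently resident in memory, `ollama ps` can be run inside the container. The keep-alive behavior can also be overridden per request via Ollama's documented `keep_alive` parameter on `/api/generate`, where `-1` keeps the model loaded indefinitely. A sketch:

```shell
# List models currently loaded in memory (uncomment with the stack running):
# docker-compose exec catalystexplorer.ollama ollama ps

# keep_alive can be set per request; -1 pins the model in memory indefinitely:
KEEP_ALIVE_REQUEST='{"model": "llama3.3:70b", "keep_alive": -1}'
echo "$KEEP_ALIVE_REQUEST"
# curl -s http://localhost:11434/api/generate -d "$KEEP_ALIVE_REQUEST"
```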
If llama3.3:70b is too resource-intensive, consider these alternatives:

- `llama3.3:8b` - Smaller, faster model
- `llama3.1:8b` - Proven stable model
- `phi3:medium` - Efficient smaller model
- `codellama:13b` - Code-focused model
To change models, update the configuration and pull the new model:

```bash
# Update .env
OLLAMA_MODEL=llama3.1:8b

# Pull the new model
docker-compose exec catalystexplorer.ollama ollama pull llama3.1:8b
```

Then update the agent model in `CatalystChatbox.php`:

```php
protected string $model = 'ollama:llama3.1:8b';
```

- Ollama runs locally within the Docker network; no external API calls are made
- Model weights are stored in the `catalystexplorer-ollama` Docker volume
- All AI processing happens on your infrastructure
- No data is sent to external services
Minimum requirements:

- CPU: 4+ cores
- RAM: 32GB (for llama3.3:70b)
- Storage: 50GB for model weights
- Network: Good internet connection for the initial model download

Recommended:

- CPU: 8+ cores
- RAM: 64GB+
- GPU: NVIDIA GPU with 24GB+ VRAM (optional, for acceleration)
- Storage: SSD with 100GB+ free space
The Ollama service includes health checks that verify the service is responding:

```yaml
healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:11434/api/health"]
    interval: 30s
    timeout: 10s
    retries: 3
    start_period: 60s
```

Monitor Ollama logs for issues:

```bash
make ollama-logs
```

Model weights are stored in the `catalystexplorer-ollama` volume. To reset:
```bash
docker volume rm catalystexplorer-ollama
make ollama-pull  # Re-download models
```