Mastra AI agents running on Runpod Serverless CPU with Load Balancer support.
This project is a starting point for developers to build and deploy AI agents using Mastra. Fork it, modify the agents, add your own tools, and deploy to Runpod.
Note: This project uses automated CI/CD workflows for building and pushing Docker images to Docker Hub.
- Multiple AI agents with tool access (Weather Agent, Runpod Infra Agent, Web Search Agent, Docs RAG Agent)
- MCP (Model Context Protocol) integration for external tools
- Web search with Exa for AI-optimized search results
- RAG (Retrieval Augmented Generation) with LibSQL vector store
- Runpod AI SDK provider with Qwen3-32B model
- `/ping` health check endpoint for the Runpod serverless load balancer
- Optional PostgreSQL storage with PgVector for agent memory
- LibSQL file-based storage with network volume support
- Optimized Docker image
- Production-ready build
This project includes example agents. Use them as templates for your own agents.
A simple agent that fetches weather information for any location.
```bash
curl -X POST http://localhost:8080/api/agents/weatherAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "What is the weather in Berlin?"}]}'
```

An agent that manages Runpod infrastructure using the Runpod MCP Server. It can list, create, and delete pods.
```bash
# List all pods
curl -X POST http://localhost:8080/api/agents/runpodInfraAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "List all my pods"}]}'
# Create a pod
curl -X POST http://localhost:8080/api/agents/runpodInfraAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Create a pod named my-test-pod with image runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04, RTX 4090 GPU, 1 GPU count. Proceed."}]}'
# Delete a pod (replace POD_ID with actual pod ID)
curl -X POST http://localhost:8080/api/agents/runpodInfraAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Delete pod POD_ID. I confirm."}]}'
```

An agent that searches the web using Exa and provides summarized results with sources.
Required: Set the `EXA_API_KEY` environment variable (get one from the Exa dashboard)
```bash
# Search and summarize
curl -X POST http://localhost:8080/api/agents/webSearchAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "What are the latest developments in AI agents?"}]}'
# Search with specific result count
curl -X POST http://localhost:8080/api/agents/webSearchAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Search for Runpod serverless GPU news, max 3 results"}]}'
```

The agent returns:
- Key points summarizing the findings
- Sources with titles and URLs
A RAG (Retrieval Augmented Generation) agent that answers questions about Runpod documentation using semantic search over vector embeddings. This is an example of how to build a documentation assistant for your own projects.
How it works:

- Ingestion: Run `scripts/ingest-docs.ts` to clone the runpod/docs repo, chunk the markdown files, generate embeddings using OpenAI's `text-embedding-3-small`, and store them in a LibSQL vector database.
- Retrieval: When you ask a question, the agent uses `createVectorQueryTool` from `@mastra/rag` to find relevant documentation chunks via semantic similarity search.
- Generation: Retrieved chunks are passed to the LLM (Qwen3-32B) as context, and it generates an answer based on the documentation.
```
User Question
      │
      ▼
┌──────────────────┐
│   Vector Query   │ ──► LibSQL Vector DB ──► Relevant Chunks
│       Tool       │
└──────────────────┘
      │
      ▼
┌──────────────────┐
│  Qwen3-32B LLM   │ ──► Answer based on retrieved docs
└──────────────────┘
```
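A condensed sketch of the retrieval wiring, assuming the vector store is registered under the name `libSqlVector` and the index is called `docs` (the real values live in `src/mastra/agents/docs-rag-agent.ts`):

```typescript
// Sketch of the vector query tool wiring (store/index names are assumptions).
import { createVectorQueryTool } from "@mastra/rag";
import { openai } from "@ai-sdk/openai";

export const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "libSqlVector", // must match the store registered in src/mastra/index.ts
  indexName: "docs",               // index populated by scripts/ingest-docs.ts
  model: openai.embedding("text-embedding-3-small"),
});
```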
Required environment variables:
- `OPENAI_API_KEY`: For generating embeddings
- `RUNPOD_API_KEY`: For the Qwen3 model
Step 1: Run ingestion (populates the vector store)
```bash
OPENAI_API_KEY=your-key npx tsx scripts/ingest-docs.ts
```

This clones runpod/docs, processes ~80 markdown files into ~1700 chunks, and stores embeddings in `vector.db`.
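Under the hood, the script follows the standard Mastra ingestion pattern. A condensed sketch (the chunking options, `docs` index name, and `@mastra/libsql` import are assumptions; the real script processes many files and richer metadata):

```typescript
// Condensed sketch of the ingestion flow (illustrative, not the full script).
import { MDocument } from "@mastra/rag";
import { LibSQLVector } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";
import { embedMany } from "ai";

const markdown = "..."; // contents of one cloned docs file (placeholder)
const doc = MDocument.fromMarkdown(markdown);
const chunks = await doc.chunk({ strategy: "markdown" });

// text-embedding-3-small produces 1536-dimensional vectors
const { embeddings } = await embedMany({
  model: openai.embedding("text-embedding-3-small"),
  values: chunks.map((c) => c.text),
});

const store = new LibSQLVector({ connectionUrl: "file:vector.db" });
await store.createIndex({ indexName: "docs", dimension: 1536 });
await store.upsert({
  indexName: "docs",
  vectors: embeddings,
  metadata: chunks.map((c) => ({ text: c.text })),
});
```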
Step 2: Query the agent
```bash
curl -X POST http://localhost:4111/api/agents/docsRagAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "How do I create a serverless endpoint?"}]}'
```

For Runpod Serverless deployments, attach a network volume to persist the vector database across worker restarts:
- Create a network volume in the Runpod console
- Attach it to your endpoint (it mounts at `/runpod-volume`)
- Run ingestion once (stores the database at `/runpod-volume/vector.db`)
- The vector database then persists across restarts
Without a network volume, you'll need to re-run ingestion after each cold start.
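One way to make the storage path volume-aware at startup (a sketch; the constant name is illustrative):

```typescript
// Pick a LibSQL URL that survives cold starts when a network volume is mounted.
import { existsSync } from "node:fs";

const VECTOR_DB_URL = existsSync("/runpod-volume")
  ? "file:/runpod-volume/vector.db" // persists across worker restarts
  : "file:vector.db";               // local fallback, rebuilt after each cold start
```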
To adapt this to your own documentation:

- Modify the ingestion script (`scripts/ingest-docs.ts`):
  - Change `REPO_URL` to your documentation repository
  - Update `INCLUDED_DIRS` to match your folder structure
- Update the agent instructions (`src/mastra/agents/docs-rag-agent.ts`):
  - Modify the system prompt for your documentation topics
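For example (placeholder values for a hypothetical docs repo):

```typescript
// In scripts/ingest-docs.ts (placeholder values; point these at your own docs)
const REPO_URL = "https://github.com/your-org/your-docs";
const INCLUDED_DIRS = ["docs", "guides"];
```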
Key files:

| File | Purpose |
|---|---|
| `src/mastra/agents/docs-rag-agent.ts` | RAG agent with vector query tool |
| `scripts/ingest-docs.ts` | Documentation ingestion script |
| `src/mastra/index.ts` | Mastra config with vector store |
To create your own agent:

- Create a new file in `src/mastra/agents/`
- Define your agent with instructions, model, and tools
- Register it in `src/mastra/index.ts` (see the sketch after the example below)
Example structure:
```typescript
// src/mastra/agents/my-agent.ts
import { Agent } from "@mastra/core/agent";
import { createRunpod } from "@runpod/ai-sdk-provider";
const runpod = createRunpod({ apiKey: process.env.RUNPOD_API_KEY });
export const myAgent = new Agent({
name: "My Agent",
instructions: "You are a helpful assistant...",
model: runpod("qwen/qwen3-32b-awq"),
tools: { /* your tools */ },
});
```
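Then register it. A minimal sketch of the `src/mastra/index.ts` wiring (the real config also registers storage, the vector store, and the other agents):

```typescript
// src/mastra/index.ts (minimal sketch; storage, vectors, and other agents omitted)
import { Mastra } from "@mastra/core";
import { myAgent } from "./agents/my-agent";

export const mastra = new Mastra({
  agents: { myAgent },
});
```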
To build and push the Docker image:

Prerequisites:

- Docker installed and running
- Docker Hub credentials (for the `runpod` organization)
Steps:
- Log in to Docker Hub:

  ```bash
  docker login
  ```

  Enter your Docker Hub username and password (or an access token for the `runpod` org).

- Build and push using the helper script:

  ```bash
  ./build-and-push.sh latest
  ```

  Or specify a version:

  ```bash
  ./build-and-push.sh v1.0.0
  ```

- Or build and push manually:

  ```bash
  # Build
  docker build --platform linux/amd64 -t runpod/worker-mastra:latest .

  # Push
  docker push runpod/worker-mastra:latest
  ```
To build and test locally:

```bash
npm install
npm run build
docker build --platform linux/amd64 -t runpod/worker-mastra:test .
```

Environment variables:

- `RUNPOD_API_KEY`: Your Runpod API key for accessing the Qwen3 model via the AI SDK Provider
- `EXA_API_KEY`: Exa API key for the web search agent (get one from the Exa dashboard)
- `OPENAI_API_KEY`: OpenAI API key for RAG embeddings (required for the Docs RAG Agent)
- `DB_HOST`: PostgreSQL database host address
- `DB_USERNAME`: PostgreSQL database username
- `DB_NAME`: PostgreSQL database name
- `DB_PASSWORD`: PostgreSQL database password
- `DB_PORT`: PostgreSQL database port (default: `6543` for the transaction pooler)
- `PORT`: Server port (default: `80`)
- `PORT_HEALTH`: Health check port (default: same as `PORT`)
- `MASTRA_PORT`: Internal Mastra server port (default: `4111`)
This project requires a PostgreSQL database with the pgvector extension for agent memory and storage. We recommend using Supabase for easy setup.
- Create a new project at Supabase Dashboard
- Go to Settings → Database → Connection string
- Select Transaction pooler mode
- Copy the connection details and set these environment variables:
```bash
DB_HOST=db.[PROJECT_REF].supabase.co
DB_USERNAME=postgres
DB_NAME=postgres
DB_PASSWORD=[YOUR_DATABASE_PASSWORD]
DB_PORT=6543
```
Note: The transaction pooler (port 6543) is recommended for serverless deployments as it handles connection pooling efficiently. For direct connections, use port 5432.
For other PostgreSQL providers (AWS RDS, DigitalOcean, etc.), configure:
```bash
DB_HOST=[YOUR_DB_HOST]
DB_USERNAME=[YOUR_DB_USER]
DB_NAME=[YOUR_DB_NAME]
DB_PASSWORD=[YOUR_DB_PASSWORD]
DB_PORT=5432
```
Make sure your PostgreSQL database has the pgvector extension installed.
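Supabase enables pgvector from its dashboard (Database → Extensions). For a self-managed database, a one-off script like this sketch can enable it (the filename and approach are illustrative; run with `npx tsx`, and your role needs the required privileges):

```typescript
// enable-pgvector.ts (hypothetical one-off setup script; run with: npx tsx enable-pgvector.ts)
// Assumes the same DB_* environment variables used by the worker.
import { Client } from "pg";

const client = new Client({
  host: process.env.DB_HOST,
  port: Number(process.env.DB_PORT ?? 5432),
  user: process.env.DB_USERNAME,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
});

await client.connect();
await client.query("CREATE EXTENSION IF NOT EXISTS vector;");
await client.end();
```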
For local development and testing, you can run Mastra directly without building a Docker image:
- Start PostgreSQL and pgAdmin (if not already running):

  ```bash
  docker-compose up -d postgres pgadmin
  ```

  Note: If port 5432 is already in use, you can either:

  - Stop your existing PostgreSQL instance, or
  - Modify the port mapping in `docker-compose.yml` to use a different port

- Create a `.env` file in the project root with your environment variables:

  ```bash
  RUNPOD_API_KEY=your-runpod-api-key-here
  DB_HOST=localhost
  DB_USERNAME=worker_mastra_user
  DB_NAME=worker_mastra
  DB_PASSWORD=worker_mastra_password
  DB_PORT=5432
  ```

- Install dependencies (if not already done):

  ```bash
  npm install
  ```

- Start the Mastra development server:

  ```bash
  npm run dev
  ```
This will start:
- Playground UI: http://localhost:4111/ - Chat with your agents (weatherAgent, runpodInfraAgent, webSearchAgent, docsRagAgent)
- API Endpoints: http://localhost:4111/api - REST API for agents
- API Documentation: http://localhost:4111/swagger-ui - Interactive API explorer
Note: Make sure the runpod-mcp project is built and available at `../runpod-mcp/build/index.js` for the Runpod Infra Management agent to work. If you haven't built it yet:

```bash
cd ../runpod-mcp
npm install
npm run build
```

To run the full stack with Docker Compose instead, start PostgreSQL with pgvector, pgAdmin, and worker-mastra all together:
```bash
# Set your Runpod API key (optional, defaults to 'your-key-here')
export RUNPOD_API_KEY=your-actual-key

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f worker-mastra
```

Access points:
- worker-mastra API: `http://localhost:3000`
- PostgreSQL: `localhost:5432`
- pgAdmin: `http://localhost:8080` (login: `admin@example.com` / `admin`)
Stop services:

```bash
docker-compose down
```

Stop and remove volumes (clean slate):

```bash
docker-compose down -v
```

If you want to run worker-mastra separately:
```bash
# Start only PostgreSQL and pgAdmin
docker-compose up -d postgres pgadmin
# Start worker-mastra (in another terminal)
docker run -p 3000:80 \
-e RUNPOD_API_KEY=your-key \
-e DB_HOST=host.docker.internal \
-e DB_USERNAME=worker_mastra_user \
-e DB_NAME=worker_mastra \
-e DB_PASSWORD=worker_mastra_password \
runpod/worker-mastra:latest
```

Note: Use `host.docker.internal` as `DB_HOST` to connect from the Docker container to PostgreSQL on the host.
If you already have PostgreSQL running:
```bash
docker run -p 8080:80 \
-e RUNPOD_API_KEY=your-key \
-e DB_HOST=host.docker.internal \
-e DB_USERNAME=worker_mastra_user \
-e DB_NAME=worker_mastra \
-e DB_PASSWORD=worker_mastra_password \
runpod/worker-mastra:latest
```

Note: Make sure a PostgreSQL database is running and reachable at `DB_HOST` before starting the container. From inside the container, `localhost` refers to the container itself, so use `host.docker.internal` to reach a database on the host.
Test endpoints:
```bash
# Health check
curl http://localhost:8080/ping
# Weather tool
curl -X POST http://localhost:8080/api/tools/get-weather/execute \
-H "Content-Type: application/json" \
-d '{"data":{"location":"Berlin"}}'
# List agents
curl http://localhost:8080/api/agents
```

The project includes GitHub Actions workflows for automated builds:
- `dev.yml`: Builds and pushes `dev-<branch-name>` tags on PRs
- `release.yml`: Builds and pushes version tags (e.g., `v1.0.0`) on tags or manual dispatch
Required Secrets:
- `DOCKERHUB_USERNAME`: Your Docker Hub username
- `DOCKERHUB_TOKEN`: Your Docker Hub access token
Optional Variables (can override defaults):
- `DOCKERHUB_REPO`: Docker Hub repository (default: `runpod`)
- `DOCKERHUB_IMG`: Image name (default: `worker-mastra`)
Docker Image: `runpod/worker-mastra:<version>`
- Push to GitHub (the workflows will build and push the image to Docker Hub)
- In the Runpod Console, go to Serverless → New Endpoint
- Select Import from Docker Registry
- Enter the image URL: `runpod/worker-mastra:latest` (or a specific version like `runpod/worker-mastra:v1.0.0`)
- Select Endpoint Type: Load Balancer
- Configure compute (CPU, or a GPU with 16GB+ VRAM)
- Set environment variables:
  - Required:
    - `RUNPOD_API_KEY`: Your Runpod API key (for the Qwen3 model)
    - `DB_HOST`: PostgreSQL database host
    - `DB_USERNAME`: PostgreSQL database username
    - `DB_NAME`: PostgreSQL database name
    - `DB_PASSWORD`: PostgreSQL database password
  - Optional:
    - `PORT`: Server port (default: 80)
    - `PORT_HEALTH`: Health check port (default: same as `PORT`)
    - `MASTRA_PORT`: Internal Mastra server port (default: 4111)
    - `DB_PORT`: PostgreSQL database port (default: 6543 for the transaction pooler)
    - `EXA_API_KEY`: Exa API key for the web search agent
    - `OPENAI_API_KEY`: OpenAI API key for RAG embeddings
- Click Create Endpoint
Alternatively, build and push under your own Docker Hub account:

```bash
# Build image
docker build --platform linux/amd64 -t YOUR_DOCKERHUB_USERNAME/worker-mastra:latest .
# Push to Docker Hub
docker push YOUR_DOCKERHUB_USERNAME/worker-mastra:latest
# Then use YOUR_DOCKERHUB_USERNAME/worker-mastra:latest in Runpod
```

Once deployed, access the API at: https://YOUR_ENDPOINT_ID.api.runpod.ai/
Available endpoints:

- `GET /ping` - Health check (returns `{"status": "healthy"}`)
- `GET /api/agents` - List available agents
- `POST /api/agents/{agentName}/generate` - Generate a response from an agent
- `GET /api/tools` - List available tools
- `POST /api/tools/{toolName}/execute` - Execute a tool directly
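These endpoints can also be called from code. A minimal TypeScript sketch (Node 18+ has built-in `fetch`; `YOUR_ENDPOINT_ID` is a placeholder, and reading `result.text` assumes Mastra's default generate response shape):

```typescript
// Call the deployed weather agent (endpoint ID is a placeholder).
const BASE_URL = "https://YOUR_ENDPOINT_ID.api.runpod.ai";

const res = await fetch(`${BASE_URL}/api/agents/weatherAgent/generate`, {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    messages: [{ role: "user", content: "What is the weather in Berlin?" }],
  }),
});

const result = await res.json();
console.log(result.text); // assumes the response includes a `text` field
```

Or with curl: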
```bash
# Health check
curl https://YOUR_ENDPOINT_ID.api.runpod.ai/ping
# List agents
curl https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents
# Chat with weather agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/weatherAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Weather in Tokyo?"}]}'
# Chat with Runpod infra agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/runpodInfraAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "List my pods"}]}'
# Chat with web search agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/webSearchAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "Latest news on AI agents"}]}'
# Chat with docs RAG agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/docsRagAgent/generate \
-H "Content-Type: application/json" \
-d '{"messages": [{"role": "user", "content": "How do I create a serverless endpoint?"}]}'
```

The worker meets Runpod's Load Balancer requirements:

- ✅ `/ping` endpoint returning `{"status": "healthy"}` with a 200 status
- ✅ Server listens on `PORT` (defaults to 80)
- ✅ Server binds to `0.0.0.0` for external access
- ✅ All API routes exposed through the load balancer
- ✅ Health check compatible with Runpod monitoring
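For illustration, the health contract amounts to roughly this bare Node sketch (the actual worker serves these routes through its Mastra server):

```typescript
// Illustrative only: shows the /ping contract the load balancer expects.
import { createServer } from "node:http";

const port = Number(process.env.PORT ?? 80);

createServer((req, res) => {
  if (req.method === "GET" && req.url === "/ping") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ status: "healthy" }));
    return;
  }
  res.writeHead(404);
  res.end();
}).listen(port, "0.0.0.0"); // bind all interfaces so the load balancer can reach it
```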
For continuous deployment, configure a Runpod Git pipeline to build and deploy on pushes to the main branch.