
Worker Mastra

Mastra AI agents running on Runpod Serverless CPU with Load Balancer support.

This project is a starting point for developers to build and deploy AI agents using Mastra. Fork it, modify the agents, add your own tools, and deploy to Runpod.

Note: This project uses automated CI/CD workflows for building and pushing Docker images to Docker Hub.

Features

  • Multiple AI agents with tool access (Weather Agent, Runpod Infra Agent, Web Search Agent, Docs RAG Agent)
  • MCP (Model Context Protocol) integration for external tools
  • Web search with Exa for AI-optimized search results
  • RAG (Retrieval Augmented Generation) with LibSQL vector store
  • Runpod AI SDK provider with Qwen3-32B model
  • /ping health check endpoint for Runpod serverless load balancer
  • Optional PostgreSQL storage with PgVector for agent memory
  • LibSQL file-based storage with network volume support
  • Optimized Docker image
  • Production-ready build

Agents

This project includes example agents. Use them as templates for your own agents.

Weather Agent

A simple agent that fetches weather information for any location.

curl -X POST http://localhost:8080/api/agents/weatherAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What is the weather in Berlin?"}]}'

Runpod Infra Agent

An agent that manages Runpod infrastructure using the Runpod MCP Server. It can list, create, and delete pods.

# List all pods
curl -X POST http://localhost:8080/api/agents/runpodInfraAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "List all my pods"}]}'

# Create a pod
curl -X POST http://localhost:8080/api/agents/runpodInfraAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Create a pod named my-test-pod with image runpod/pytorch:2.4.0-py3.11-cuda12.4.1-devel-ubuntu22.04, RTX 4090 GPU, 1 GPU count. Proceed."}]}'

# Delete a pod (replace POD_ID with actual pod ID)
curl -X POST http://localhost:8080/api/agents/runpodInfraAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Delete pod POD_ID. I confirm."}]}'

Web Search Agent

An agent that searches the web using Exa and provides summarized results with sources.

Required: Set the EXA_API_KEY environment variable (get one from the Exa dashboard)

# Search and summarize
curl -X POST http://localhost:8080/api/agents/webSearchAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "What are the latest developments in AI agents?"}]}'

# Search with specific result count
curl -X POST http://localhost:8080/api/agents/webSearchAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Search for Runpod serverless GPU news, max 3 results"}]}'

The agent returns:

  • Key points summarizing the findings
  • Sources with titles and URLs

Docs RAG Agent

A RAG (Retrieval Augmented Generation) agent that answers questions about Runpod documentation using semantic search over vector embeddings. This is an example of how to build a documentation assistant for your own projects.

How It Works

  1. Ingestion: Run scripts/ingest-docs.ts to clone the runpod/docs repo, chunk the markdown files, generate embeddings using OpenAI's text-embedding-3-small, and store them in a LibSQL vector database.

  2. Retrieval: When you ask a question, the agent uses createVectorQueryTool from @mastra/rag to find relevant documentation chunks via semantic similarity search (a sketch of this tool follows this list).

  3. Generation: Retrieved chunks are passed to the LLM (Qwen3-32B) as context, which generates an answer based on the documentation.
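
For reference, the retrieval tool amounts to something like the minimal sketch below. The store name "libsql" and index name "docs" are illustrative assumptions; check src/mastra/index.ts and the agent file for the real ones.

// Hypothetical names: "libsql" store, "docs" index.
import { createVectorQueryTool } from "@mastra/rag";
import { openai } from "@ai-sdk/openai";

export const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "libsql", // must match the store name in the Mastra config
  indexName: "docs",         // index populated by scripts/ingest-docs.ts
  model: openai.embedding("text-embedding-3-small"), // same model as ingestion
});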

Architecture

User Question
      │
      ▼
┌─────────────────┐
│ Vector Query    │ ──► LibSQL Vector DB ──► Relevant Chunks
│ Tool            │
└─────────────────┘
      │
      ▼
┌─────────────────┐
│ Qwen3-32B LLM   │ ──► Answer based on retrieved docs
└─────────────────┘

Setup

Required environment variables:

  • OPENAI_API_KEY: For generating embeddings
  • RUNPOD_API_KEY: For the Qwen3 model

Step 1: Run ingestion (populates the vector store)

OPENAI_API_KEY=your-key npx tsx scripts/ingest-docs.ts

This clones runpod/docs, processes ~80 markdown files into ~1700 chunks, and stores embeddings in vector.db.
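
The core of such an ingestion pass might look like the sketch below, assuming @mastra/rag's MDocument chunker and @mastra/libsql's LibSQLVector; the "docs" index name and chunking options are assumptions, not necessarily what scripts/ingest-docs.ts does.

// Illustrative ingestion core; names and options are assumptions.
import { MDocument } from "@mastra/rag";
import { LibSQLVector } from "@mastra/libsql";
import { openai } from "@ai-sdk/openai";
import { embedMany } from "ai";

const store = new LibSQLVector({ connectionUrl: "file:vector.db" });

export async function ingestFile(markdown: string, source: string) {
  // Split one markdown file into chunks.
  const doc = MDocument.fromMarkdown(markdown);
  const chunks = await doc.chunk({ strategy: "markdown" });

  // Embed chunks with the same model the query tool uses at retrieval time.
  const { embeddings } = await embedMany({
    model: openai.embedding("text-embedding-3-small"),
    values: chunks.map((c) => c.text),
  });

  // Store vectors plus metadata (1536 dims for text-embedding-3-small).
  await store.createIndex({ indexName: "docs", dimension: 1536 });
  await store.upsert({
    indexName: "docs",
    vectors: embeddings,
    metadata: chunks.map((c) => ({ text: c.text, source })),
  });
}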

Step 2: Query the agent

curl -X POST http://localhost:4111/api/agents/docsRagAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "How do I create a serverless endpoint?"}]}'

Persistence with Network Volume

For Runpod Serverless deployments, attach a network volume to persist the vector database across worker restarts:

  1. Create a network volume in Runpod console
  2. Attach it to your endpoint (mounts at /runpod-volume)
  3. Run ingestion once (stores at /runpod-volume/vector.db)
  4. Vector database persists across restarts

Without a network volume, you'll need to re-run ingestion after each cold start.
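
One hedged way to express that path switch in code (the /runpod-volume mount point comes from the steps above; everything else is illustrative):

// Prefer the network volume when it is mounted, else fall back to a local file.
import { existsSync } from "node:fs";

export const vectorDbUrl = existsSync("/runpod-volume")
  ? "file:/runpod-volume/vector.db"
  : "file:vector.db";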

Customizing for Your Own Docs

  1. Modify ingestion script (scripts/ingest-docs.ts):

    • Change REPO_URL to your documentation repository
    • Update INCLUDED_DIRS to match your folder structure
  2. Update agent instructions (src/mastra/agents/docs-rag-agent.ts):

    • Modify the system prompt for your documentation topics

Key Files

  • src/mastra/agents/docs-rag-agent.ts: RAG agent with vector query tool
  • scripts/ingest-docs.ts: documentation ingestion script
  • src/mastra/index.ts: Mastra config with vector store

Creating Your Own Agent

  1. Create a new file in src/mastra/agents/
  2. Define your agent with instructions, model, and tools
  3. Register it in src/mastra/index.ts (a registration sketch follows the example below)

Example structure:

// src/mastra/agents/my-agent.ts
import { Agent } from "@mastra/core/agent";
import { createRunpod } from "@runpod/ai-sdk-provider";

const runpod = createRunpod({ apiKey: process.env.RUNPOD_API_KEY });

export const myAgent = new Agent({
  name: "My Agent",
  instructions: "You are a helpful assistant...",
  model: runpod("qwen/qwen3-32b-awq"),
  tools: { /* your tools */ },
});
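
Step 3's registration might look like the following sketch, assuming the default Mastra config shape (your src/mastra/index.ts likely already registers storage and other agents; merge rather than replace):

// src/mastra/index.ts (illustrative; merge into your existing config)
import { Mastra } from "@mastra/core";
import { myAgent } from "./agents/my-agent";

export const mastra = new Mastra({
  agents: { myAgent }, // exposed as /api/agents/myAgent/generate
});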

Build

Local Build and Push to Docker Hub

Prerequisites:

  • Docker installed and running
  • Docker Hub credentials (for runpod organization)

Steps:

  1. Login to Docker Hub:

    docker login

    Enter your Docker Hub username and password (or an access token for the runpod organization)

  2. Build and push using the helper script:

    ./build-and-push.sh latest

    Or specify a version:

    ./build-and-push.sh v1.0.0
  3. Or manually:

    # Build
    docker build --platform linux/amd64 -t runpod/worker-mastra:latest .
    
    # Push
    docker push runpod/worker-mastra:latest

Development Build (without push)

npm install
npm run build
docker build --platform linux/amd64 -t runpod/worker-mastra:test .

Environment Variables

Required

  • RUNPOD_API_KEY: Your Runpod API key for accessing Qwen3 model via AI SDK Provider

Optional (Agent-specific)

  • EXA_API_KEY: Exa API key for the web search agent (get one from the Exa dashboard)
  • OPENAI_API_KEY: OpenAI API key for RAG embeddings (required for Docs RAG Agent)

Optional (Database - for persistent storage)

  • DB_HOST: PostgreSQL database host address
  • DB_USERNAME: PostgreSQL database username
  • DB_NAME: PostgreSQL database name
  • DB_PASSWORD: PostgreSQL database password
  • DB_PORT: PostgreSQL database port (default: 6543 for transaction pooler)

Optional (Server)

  • PORT: Server port (default: 80)
  • PORT_HEALTH: Health check port (default: same as PORT)
  • MASTRA_PORT: Internal Mastra server port (default: 4111)

PostgreSQL Database Setup

This project supports a PostgreSQL database with the pgvector extension for agent memory and storage (optional; see the DB_* variables above). We recommend using Supabase for easy setup.

Using Supabase

  1. Create a new project at Supabase Dashboard
  2. Go to Settings → Database → Connection string
  3. Select Transaction pooler mode
  4. Copy the connection details and set these environment variables:
DB_HOST=db.[PROJECT_REF].supabase.co
DB_USERNAME=postgres
DB_NAME=postgres
DB_PASSWORD=[YOUR_DATABASE_PASSWORD]
DB_PORT=6543

Note: The transaction pooler (port 6543) is recommended for serverless deployments as it handles connection pooling efficiently. For direct connections, use port 5432.

Using Other PostgreSQL Providers

For other PostgreSQL providers (AWS RDS, DigitalOcean, etc.), configure:

DB_HOST=[YOUR_DB_HOST]
DB_USERNAME=[YOUR_DB_USER]
DB_NAME=[YOUR_DB_NAME]
DB_PASSWORD=[YOUR_DB_PASSWORD]
DB_PORT=5432

Make sure your PostgreSQL database has the pgvector extension installed.
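
As a non-authoritative sketch, the DB_* variables above could be wired into Mastra storage roughly like this, assuming @mastra/pg's PostgresStore (check src/mastra/index.ts for the project's actual wiring):

// Illustrative mapping of DB_* environment variables to Postgres storage.
import { PostgresStore } from "@mastra/pg";

export const storage = new PostgresStore({
  host: process.env.DB_HOST!,
  port: Number(process.env.DB_PORT ?? 6543), // transaction pooler default
  user: process.env.DB_USERNAME!,
  database: process.env.DB_NAME!,
  password: process.env.DB_PASSWORD!,
});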

Run Locally

Option 1: Local Development with Mastra Dev (Recommended for Development)

For local development and testing, you can run Mastra directly without building a Docker image:

  1. Start PostgreSQL and pgAdmin (if not already running):

    docker-compose up -d postgres pgadmin

    Note: If port 5432 is already in use, you can either:

    • Stop your existing PostgreSQL instance, or
    • Modify the port mapping in docker-compose.yml to use a different port
  2. Create a .env file in the project root with your environment variables:

    RUNPOD_API_KEY=your-runpod-api-key-here
    DB_HOST=localhost
    DB_USERNAME=worker_mastra_user
    DB_NAME=worker_mastra
    DB_PASSWORD=worker_mastra_password
    DB_PORT=5432
  3. Install dependencies (if not already done):

    npm install
  4. Start the Mastra development server:

    npm run dev

    This will start the Mastra development server at http://localhost:4111 (configurable via MASTRA_PORT).

Note: Make sure the runpod-mcp project is built and available at ../runpod-mcp/build/index.js for the Runpod Infra Agent to work. If you haven't built it yet:

cd ../runpod-mcp
npm install
npm run build
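
For orientation, attaching that MCP server in Mastra might look like the hedged sketch below, assuming @mastra/mcp's MCPClient; see the agent sources for the actual configuration:

// Hypothetical wiring of the Runpod MCP server; the build path comes from the note above.
import { MCPClient } from "@mastra/mcp";

const mcp = new MCPClient({
  servers: {
    runpod: {
      command: "node",
      args: ["../runpod-mcp/build/index.js"],
      env: { RUNPOD_API_KEY: process.env.RUNPOD_API_KEY! },
    },
  },
});

// The discovered tools can then be passed to an agent's `tools` option.
const tools = await mcp.getTools();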

Option 2: Using Docker Compose (Full Stack)

This will start PostgreSQL with pgvector, pgAdmin, and worker-mastra all together:

# Set your Runpod API key (optional, defaults to 'your-key-here')
export RUNPOD_API_KEY=your-actual-key

# Start all services
docker-compose up -d

# View logs
docker-compose logs -f worker-mastra

Access points:

  • worker-mastra API: http://localhost:3000
  • PostgreSQL: localhost:5432
  • pgAdmin: http://localhost:8080 (login: admin@example.com / admin)

Stop services:

docker-compose down

Stop and remove volumes (clean slate):

docker-compose down -v

Option 3: Docker Compose (Database Only)

If you want to run worker-mastra separately:

# Start only PostgreSQL and pgAdmin
docker-compose up -d postgres pgadmin

# Start worker-mastra (in another terminal)
docker run -p 3000:80 \
  -e RUNPOD_API_KEY=your-key \
  -e DB_HOST=host.docker.internal \
  -e DB_USERNAME=worker_mastra_user \
  -e DB_NAME=worker_mastra \
  -e DB_PASSWORD=worker_mastra_password \
  runpod/worker-mastra:latest

Note: Use host.docker.internal as DB_HOST to connect from Docker container to host PostgreSQL.

Option 4: Manual Docker Run

If you already have PostgreSQL running:

docker run -p 8080:80 \
  -e RUNPOD_API_KEY=your-key \
  -e DB_HOST=localhost \
  -e DB_USERNAME=worker_mastra_user \
  -e DB_NAME=worker_mastra \
  -e DB_PASSWORD=worker_mastra_password \
  runpod/worker-mastra:latest

Note: Make sure you have a PostgreSQL database running and accessible at DB_HOST before starting the container.

Test endpoints:

# Health check
curl http://localhost:8080/ping

# Weather tool
curl -X POST http://localhost:8080/api/tools/get-weather/execute \
  -H "Content-Type: application/json" \
  -d '{"data":{"location":"Berlin"}}'

# List agents
curl http://localhost:8080/api/agents

GitHub Workflows

The project includes GitHub Actions workflows for automated builds:

  • dev.yml: Builds and pushes dev-<branch-name> tags on PRs
  • release.yml: Builds and pushes version tags (e.g., v1.0.0) on tags or manual dispatch

Required Secrets:

  • DOCKERHUB_USERNAME: Your Docker Hub username
  • DOCKERHUB_TOKEN: Your Docker Hub access token

Optional Variables (can override defaults):

  • DOCKERHUB_REPO: Docker Hub repository (default: runpod)
  • DOCKERHUB_IMG: Image name (default: worker-mastra)

Docker Image: runpod/worker-mastra:<version>

Runpod Serverless Deployment

Option 1: Use Pre-built Image from Docker Hub

  1. Push to GitHub (workflows will build and push to Docker Hub)
  2. In the Runpod Console → Serverless → New Endpoint
  3. Select Import from Docker Registry
  4. Enter image URL: runpod/worker-mastra:latest (or specific version like runpod/worker-mastra:v1.0.0)
  5. Select Endpoint Type: Load Balancer
  6. Configure compute (CPU, or a GPU with at least 16 GB VRAM)
  7. Set environment variables:
    • Required:
      • RUNPOD_API_KEY: Your Runpod API key (for Qwen3 model)
      • DB_HOST: PostgreSQL database host
      • DB_USERNAME: PostgreSQL database username
      • DB_NAME: PostgreSQL database name
      • DB_PASSWORD: PostgreSQL database password
    • Optional:
      • PORT: Server port (default: 80)
      • PORT_HEALTH: Health check port (default: same as PORT)
      • MASTRA_PORT: Internal Mastra server port (default: 4111)
      • DB_PORT: PostgreSQL database port (default: 6543 for transaction pooler)
      • EXA_API_KEY: Exa API key for web search agent
      • OPENAI_API_KEY: OpenAI API key for RAG embeddings
  8. Click Create Endpoint

Option 2: Build and Push Locally

# Build image
docker build --platform linux/amd64 -t YOUR_DOCKERHUB_USERNAME/worker-mastra:latest .

# Push to Docker Hub
docker push YOUR_DOCKERHUB_USERNAME/worker-mastra:latest

# Then use YOUR_DOCKERHUB_USERNAME/worker-mastra:latest in Runpod

API Endpoints

Once deployed, access at: https://YOUR_ENDPOINT_ID.api.runpod.ai/

  • GET /ping - Health check (returns {"status": "healthy"})
  • GET /api/agents - List available agents
  • POST /api/agents/{agentName}/generate - Generate response from agent
  • GET /api/tools - List available tools
  • POST /api/tools/{toolName}/execute - Execute a tool directly

Example Requests

# Health check
curl https://YOUR_ENDPOINT_ID.api.runpod.ai/ping

# List agents
curl https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents

# Chat with weather agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/weatherAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Weather in Tokyo?"}]}'

# Chat with Runpod infra agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/runpodInfraAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "List my pods"}]}'

# Chat with web search agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/webSearchAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Latest news on AI agents"}]}'

# Chat with docs RAG agent
curl -X POST https://YOUR_ENDPOINT_ID.api.runpod.ai/api/agents/docsRagAgent/generate \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "How do I create a serverless endpoint?"}]}'

Requirements Met

  • ✅ /ping endpoint returning {"status": "healthy"} with 200 status
  • ✅ Server listens on PORT (defaults to 80)
  • ✅ Server binds to 0.0.0.0 for external access
  • ✅ All API routes exposed through load balancer
  • ✅ Health check compatible with Runpod monitoring
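
For illustration only, the /ping contract amounts to something like this plain-Node sketch (an assumption; the worker's real HTTP layer is part of the Mastra server):

// Minimal health-check contract: 200 + {"status":"healthy"} on /ping, bound to 0.0.0.0.
import { createServer } from "node:http";

const port = Number(process.env.PORT ?? 80);

createServer((req, res) => {
  if (req.url === "/ping") {
    res.writeHead(200, { "Content-Type": "application/json" });
    res.end(JSON.stringify({ status: "healthy" }));
    return;
  }
  res.writeHead(404);
  res.end();
}).listen(port, "0.0.0.0");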

GitOps Pipeline

Configure Runpod Git pipeline to build and deploy on push to main branch.