Claude Task Runner

Automated pipeline that receives ClickUp webhooks, runs Claude Code headlessly on a cloned repo, and creates GitHub PRs for review.

Flow: ClickUp task assigned to "Claude" user → Webhook fires → Runner clones repo → Kickoff (generates prd.json) → Ralph loop (implements stories one by one) → Creates PR → Updates ClickUp card to "in review"

Powered by the Ralph autonomous agent loop and the Buildwright plugin.

Currently supports Next.js projects. Designed to run on Railway, a Linux VPS, or a Docker container.

Quick Start

Click the Railway button above. The first deploy will fail — that's expected. Note the URL it deploys to (e.g., https://your-app.up.railway.app).

Clone this repo and install dependencies:

```
git clone https://github.com/Topflightapps/claude-task-runner.git
cd claude-task-runner
pnpm install
```

Get your ClickUp API token: Go to ClickUp Settings → ClickUp API (or navigate to Settings → Integrations & ClickApps → ClickUp API). Copy your personal API token (pk_...).
Run the setup script:
```
pnpm run setup
```
It will ask for your ClickUp API token and your Railway deployment URL, then generate a .env file with all the variables you need.
Paste the .env variables into Railway. Go to your Railway service → Variables, and add each variable from the generated .env file.
Railway should now deploy successfully.

After deployment, you'll need to authenticate Claude and add a volume if the template didn't create one automatically.

Table of Contents

Quick Start
Environment Variables
ClickUp Setup
Claude Authentication
Running Locally
Running with Docker
How It Works
Architecture
Two-Phase Execution
Ralph Loop
Task Lifecycle
Database
Prompt Generation
Error Handling
MCP Integrations
Librarian (Cross-Task Learning)
Admin Dashboard
Development
Testing
Troubleshooting

Railway Details

What you get

Pre-built Docker image with Node.js 22, git, gh CLI, Claude Code CLI, and Playwright + Chromium
Persistent volume for database, cloned repos, and Claude auth tokens
Auto-restart on failure with health checks
Public URL for ClickUp/GitHub webhooks

Adding a volume

Railway volumes persist data across redeploys. You must attach one. The template should include this already, but if you need to attach one manually:

Open your service in the Railway dashboard
Go to Settings → Volumes
Click Add Volume
Set mount path to /data
Click Create

This single volume stores:

/data/db/ — SQLite database (run history, crash recovery)
/data/repos/ — Cloned git repositories
/data/claude/ — Claude CLI auth tokens (persists claude login)

Environment Variables

All configuration is via environment variables, validated at startup with Zod. The process will fail fast if any required values are missing.

Required

Variable	Description
`CLICKUP_API_TOKEN`	ClickUp personal API token (`pk_...`). Get it from ClickUp → Settings → Apps.
`CLICKUP_TEAM_ID`	Your ClickUp team/workspace ID. Found in the URL: `app.clickup.com/{team_id}/...`
`CLICKUP_CLAUDE_USER_ID`	ClickUp user ID for the "Claude" user. Assignment to this user triggers task pickup. Find it via the ClickUp API or `pnpm run setup`.
`CLICKUP_REPO_FIELD_ID`	ID of your "GitHub Repo" custom field (URL type). The runner reads this field to know which repo to clone. Find it via the ClickUp API or `pnpm run setup`.
`WEBHOOK_SECRET`	Shared secret for verifying ClickUp webhook signatures (HMAC-SHA256). Generate one with `openssl rand -hex 32`. Must match what you register with ClickUp.
`GITHUB_TOKEN`	GitHub personal access token with `repo` scope. Used for cloning, pushing, and creating PRs. Create at github.com → Settings → Developer settings → Personal access tokens.

Optional

Variable	Default	Description
`ANTHROPIC_API_KEY`	—	Anthropic API key for pay-per-use Claude. Not needed if you use `claude login` with a Max/Pro subscription. Get one at console.anthropic.com.
`WEBHOOK_PORT`	`3000`	Port for the webhook HTTP server. Railway sets `PORT` automatically — the Dockerfile exposes 3000.
`WORK_DIR`	`/data/repos`	Directory where repos are cloned. On Railway, the Dockerfile sets this to `/data/repos`.
`DB_PATH`	`/data/db/task-runner.db`	SQLite database file path. On Railway, the Dockerfile sets this to `/data/db/task-runner.db`.
`CLAUDE_MAX_TURNS`	`10`	Max agentic turns per Claude Code run. Higher = more thorough but slower/costlier.
`FIGMA_MCP_TOKEN`	—	Figma MCP token for design-to-code tasks. Include Figma URLs in your ClickUp task description and the runner auto-detects them.
`GITHUB_PR_ASSIGNEE`	—	GitHub username to auto-assign created PRs to.
`GITHUB_USERNAME`	—	Your GitHub username. Enables the PR review pipeline when set alongside `GITHUB_WEBHOOK_SECRET`.
`GITHUB_WEBHOOK_SECRET`	—	Secret for GitHub webhook signature verification. Enables automated PR reviews when set alongside `GITHUB_USERNAME`.
`REVIEW_TIMEOUT_MS`	`900000` (15 min)	Timeout for the PR review phase.
`SLACK_BOT_TOKEN`	—	Slack bot token (`xoxb-...`) for DM notifications when tasks complete or reviews are ready.
`SLACK_USER_ID`	—	Your Slack user ID for receiving DM notifications.
`LIBRARIAN_ENABLED`	`false`	Enable the Librarian module for cross-task learning. Requires `OPENAI_API_KEY`.
`OPENAI_API_KEY`	—	OpenAI API key for generating embeddings (`text-embedding-3-small`). Required when `LIBRARIAN_ENABLED=true`. Get one at platform.openai.com.
`ADMIN_PASSWORD`	—	Password to protect the admin dashboard. Leave empty to disable the dashboard.
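
The fail-fast behavior can be illustrated without any dependencies. The sketch below is not the project's Zod schema; it is a plain TypeScript stand-in that checks the six required variables from the table above and applies two of the documented defaults. The names `REQUIRED_VARS`, `Config`, and `loadConfig` are hypothetical.

```typescript
// Hypothetical stand-in for the project's Zod schema: same fail-fast idea, no dependencies.
const REQUIRED_VARS = [
  "CLICKUP_API_TOKEN",
  "CLICKUP_TEAM_ID",
  "CLICKUP_CLAUDE_USER_ID",
  "CLICKUP_REPO_FIELD_ID",
  "WEBHOOK_SECRET",
  "GITHUB_TOKEN",
] as const;

type RequiredVar = (typeof REQUIRED_VARS)[number];

type Config = Record<RequiredVar, string> & {
  WEBHOOK_PORT: number;
  CLAUDE_MAX_TURNS: number;
};

function loadConfig(env: Record<string, string | undefined>): Config {
  // Collect every missing name so a single failed start reports all problems at once.
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }

  const required = {} as Record<RequiredVar, string>;
  for (const name of REQUIRED_VARS) {
    required[name] = env[name] as string;
  }

  return {
    ...required,
    // Defaults taken from the Optional table.
    WEBHOOK_PORT: Number(env.WEBHOOK_PORT ?? "3000"),
    CLAUDE_MAX_TURNS: Number(env.CLAUDE_MAX_TURNS ?? "10"),
  };
}
```

Calling `loadConfig(process.env)` once at startup, before the webhook server binds its port, gives the same "crash early with a readable message" behavior described above.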

ClickUp Setup

1. Create a "Claude" User (or use your existing user)

Create a dedicated ClickUp user (e.g., "Claude") that will act as the trigger. When you assign a task to this user, the webhook fires and the runner picks it up.

2. Create the "GitHub Repo" Custom Field (if one doesn't already exist)

In your ClickUp space, create a custom field:

Name: "GitHub Repo"
Type: URL
Value: Full GitHub repo URL (e.g., https://github.com/yourorg/your-next-app)

3. Register a Webhook

Point your ClickUp webhook to your Railway URL:

```
POST https://api.clickup.com/api/v2/team/{team_id}/webhook
{
  "endpoint": "https://your-app.up.railway.app/webhook",
  "events": ["taskAssigneeUpdated"],
  "secret": "your-webhook-secret"
}
```

Or run pnpm run setup locally to auto-register it.
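
Once the webhook is registered, each incoming request can be checked against `WEBHOOK_SECRET`. The following is a minimal sketch of HMAC-SHA256 verification using Node's built-in `crypto` module, not the runner's actual handler. It assumes the signature arrives as a hex string (ClickUp documents an `X-Signature` header; confirm the header name and encoding against ClickUp's webhook docs). `signBody` and `isValidSignature` are hypothetical names.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Compute the hex HMAC-SHA256 of the raw request body using the shared secret.
function signBody(rawBody: string, secret: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Compare in constant time. timingSafeEqual throws on a length mismatch,
// so the lengths are checked first.
function isValidSignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = Buffer.from(signBody(rawBody, secret), "utf8");
  const received = Buffer.from(signature, "utf8");
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The check must run on the raw body exactly as received. Parsing the JSON and re-serializing it can change whitespace or key order and produce a different digest.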

4. Task Requirements

For the runner to pick up a task:

Assigned to the "Claude" user (matching CLICKUP_CLAUDE_USER_ID)
Has a valid "GitHub Repo" URL in the custom field
Not already processed (tracked in SQLite)

Tasks are processed one at a time. Concurrent webhooks are queued.
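
The three pickup conditions above can be expressed as a single predicate. This is an illustrative sketch only: `TaskSummary` is a hypothetical, simplified view of a ClickUp task (the real API payload has more fields and different shapes), and `getRepoUrl` and `shouldPickUp` are made-up names.

```typescript
// Hypothetical, simplified view of a ClickUp task.
interface TaskSummary {
  id: string;
  assigneeIds: string[];
  customFields: { id: string; value?: string }[];
}

// Return the repo URL from the custom field, or null if it is missing or not a GitHub repo URL.
function getRepoUrl(task: TaskSummary, repoFieldId: string): string | null {
  const value = task.customFields.find((f) => f.id === repoFieldId)?.value;
  if (!value) return null;
  try {
    const url = new URL(value);
    // Require at least owner and repo path segments on github.com.
    const parts = url.pathname.split("/").filter(Boolean);
    return url.hostname === "github.com" && parts.length >= 2 ? value : null;
  } catch {
    return null; // not a parseable URL
  }
}

function shouldPickUp(
  task: TaskSummary,
  claudeUserId: string,
  repoFieldId: string,
  processedTaskIds: Set<string>,
): boolean {
  return (
    task.assigneeIds.includes(claudeUserId) &&
    getRepoUrl(task, repoFieldId) !== null &&
    !processedTaskIds.has(task.id)
  );
}
```

In the runner the "already processed" check is backed by SQLite; the in-memory `Set` here only stands in for that lookup.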

5. Task Content Tips

For best results:

Clear, descriptive task name (becomes the PR title)
Detailed description in markdown (Claude's primary instruction)
Checklists for acceptance criteria
Figma URLs in the description (auto-detected for design-to-code)

Claude Authentication

Claude Task Runner needs access to Claude Code. You have two options:

Option A: API Key (Simplest)

Set ANTHROPIC_API_KEY in your environment variables. This is pay-per-use billing through console.anthropic.com.

No SSH or manual setup required — just add the env var and deploy.

Option B: Claude Max/Pro Subscription (SSH into Railway)

If you have a Claude Max or Pro subscription and want to use claude login instead of an API key:

Deploy first — make sure the service is running on Railway

SSH into your Railway service:

```
# Install Railway CLI if you haven't
npm install -g @railway/cli

# Login to Railway
railway login

# Link to your project
railway link

# SSH into the running service
railway ssh
```

Inside the Railway shell, run:

```
# The entrypoint already creates the claude user and .claude directory
# Just switch to the claude user and login
su claude
claude login
```

Follow the OAuth prompts — Claude will give you a URL to visit in your browser. Authenticate and the tokens are saved to /data/claude/ (persisted across deploys via the volume).
Verify it worked:
```
claude --version
```
Exit and restart the service — the tokens persist in the /data volume.

Note: The entrypoint script automatically creates the claude user, sets up /home/claude/.claude, and symlinks it to /data/claude on the persistent volume. You do NOT need to manually create directories.

Running Locally

Prerequisites

Node.js 22+
pnpm 10.11+ (corepack enable && corepack prepare pnpm@10.11.0 --activate)
git and gh CLI installed and authenticated
Claude Code CLI: npm install -g @anthropic-ai/claude-code
Claude Code authenticated via claude login or ANTHROPIC_API_KEY env var

Setup

Follow the Quick Start steps above to clone, install, and run the setup script. Then:

```
# Dev mode with hot reload
pnpm dev

# Or production
pnpm build
pnpm start
```

Note: Use pnpm run setup, not pnpm setup (the latter runs pnpm's own setup command).

Running with Docker

Build and run

```
docker compose up -d
```

Claude Code Authentication (Docker)

```
# Run claude login inside the container
docker compose exec runner su claude -c "claude login"
```

Or set ANTHROPIC_API_KEY in your .env file.

Persistent volumes

Volume	Mount	Purpose
`runner-data`	`/data`	SQLite database, cloned repos, Claude auth tokens

Commands

```
# View logs
docker compose logs -f runner

# Rebuild after code changes
docker compose up -d --build
```

How It Works

Architecture

Single-process Node.js service with a webhook-driven execution model:

ClickUp Webhook → Verify Signature → Fetch Task → Clone Repo → Kickoff (prd.json) → Ralph Loop (stories) → Create PR → Update ClickUp

The runner is intentionally simple — one process, one task at a time (with queuing), SQLite for state. No message queues, no workers, no distributed coordination.
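
The "one task at a time, with queuing" model does not need a queue library. A minimal sketch, assuming an in-process promise chain is enough (the runner's real queue may be implemented differently); `SerialQueue` is a hypothetical name.

```typescript
// Minimal serial queue: each job starts only after the previous one settles.
class SerialQueue {
  private tail: Promise<void> = Promise.resolve();
  pending = 0;

  enqueue<T>(job: () => Promise<T>): Promise<T> {
    this.pending += 1;
    // Chain the job onto the tail so it waits for everything queued before it.
    const result = this.tail.then(job);
    // Keep the chain alive even if a job fails, and update the counter when it settles.
    this.tail = result.then(
      () => {
        this.pending -= 1;
      },
      () => {
        this.pending -= 1;
      },
    );
    return result;
  }
}
```

A webhook handler would call `queue.enqueue(() => runTask(task))` (with `runTask` being whatever executes the lifecycle) and respond to ClickUp immediately. A failed job rejects only its own promise; later jobs still run.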

Two-Phase Execution

Every task goes through two phases:

Phase 1 — Kickoff: A single Claude Code invocation reads the ClickUp task details, explores the codebase, and generates scripts/ralph/prd.json — a structured breakdown of the task into small, ordered user stories. Timeout: 10 minutes.

Phase 2 — Ralph Loop: The ralph.sh script runs in a loop, spawning a fresh Claude Code instance per iteration. Each iteration:

Reads prd.json and progress.txt (for cross-iteration memory)
Picks the highest-priority story where passes: false
Implements that single story
Runs quality checks (typecheck, lint, test)
Commits with message feat: [Story ID] - [Story Title]
Updates prd.json to mark the story as passes: true
Appends learnings to progress.txt

The loop exits when all stories pass or max iterations are reached. Timeout: 60 minutes.
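
The selection and exit logic of the loop can be simulated in a few lines. This is a sketch, not `ralph.sh`: it assumes a lower `priority` number means higher priority (consistent with the sample `prd.json`, but not stated explicitly), and `implement` is a hypothetical callback standing in for one Claude Code iteration.

```typescript
interface Story {
  id: string;
  title: string;
  priority: number;
  passes: boolean;
}

// Pick the not-yet-passing story with the lowest priority number.
function nextStory(stories: Story[]): Story | undefined {
  return stories
    .filter((s) => !s.passes)
    .sort((a, b) => a.priority - b.priority)[0];
}

// Run until every story passes or the iteration budget is used up.
// Returns the number of iterations actually executed.
function runLoop(
  stories: Story[],
  maxIterations: number,
  implement: (story: Story) => boolean,
): number {
  let iterations = 0;
  while (iterations < maxIterations) {
    const story = nextStory(stories);
    if (!story) break; // all stories pass
    iterations += 1;
    // Mark the story as passing only when the iteration succeeded.
    story.passes = implement(story);
  }
  return iterations;
}
```

A story that keeps failing is retried on the next iteration, which is why the iteration cap and the 60-minute timeout both matter.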

Ralph Loop

The Ralph loop files live in scripts/ralph/ and are copied into each target repo at runtime:

File	Purpose
`ralph.sh`	Bash loop that spawns fresh Claude instances per iteration
`CLAUDE.md`	Instructions piped to each Claude instance via stdin
`prd.json`	Generated by Phase 1 — structured user stories
`progress.txt`	Cross-iteration memory — learnings, patterns, gotchas
`learnings.md`	Librarian-provided context from previous tasks (optional)

Key design principle: Each Ralph iteration gets a fresh context window. Memory between iterations is maintained only through git commits, progress.txt, and prd.json.

prd.json Format

```
{
  "project": "MyApp",
  "branchName": "claude/abc123-add-login-page",
  "description": "Add login page with email/password auth",
  "userStories": [
    {
      "id": "US-001",
      "title": "Add auth schema and migration",
      "description": "As a developer, I need user auth tables in the database.",
      "acceptanceCriteria": [
        "Add users table with email, password_hash columns",
        "Generate and run migration successfully",
        "Typecheck passes"
      ],
      "priority": 1,
      "passes": false,
      "notes": ""
    }
  ]
}
```
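
Because `prd.json` is generated by a model, it is worth validating before the loop starts. The types below mirror the fields in the example above; the guard functions are a hand-rolled sketch (`isUserStory` and `parsePrd` are hypothetical names, and the project may validate differently or not at all).

```typescript
interface UserStory {
  id: string;
  title: string;
  description: string;
  acceptanceCriteria: string[];
  priority: number;
  passes: boolean;
  notes: string;
}

interface Prd {
  project: string;
  branchName: string;
  description: string;
  userStories: UserStory[];
}

function isUserStory(value: unknown): value is UserStory {
  if (typeof value !== "object" || value === null) return false;
  const s = value as Record<string, unknown>;
  return (
    typeof s.id === "string" &&
    typeof s.title === "string" &&
    typeof s.description === "string" &&
    Array.isArray(s.acceptanceCriteria) &&
    s.acceptanceCriteria.every((c) => typeof c === "string") &&
    typeof s.priority === "number" &&
    typeof s.passes === "boolean" &&
    typeof s.notes === "string"
  );
}

// Parse the file contents and throw if the shape does not match.
function parsePrd(json: string): Prd {
  const data: unknown = JSON.parse(json);
  if (typeof data === "object" && data !== null) {
    const p = data as Record<string, unknown>;
    if (
      typeof p.project === "string" &&
      typeof p.branchName === "string" &&
      typeof p.description === "string" &&
      Array.isArray(p.userStories) &&
      p.userStories.every(isUserStory)
    ) {
      return data as Prd;
    }
  }
  throw new Error("prd.json does not match the expected shape");
}
```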

Task Lifecycle

```
CLAIMED         → Updates ClickUp status to "in progress", posts a comment, inserts DB row
    |
CLONING         → Clones repo (or fetches + resets if already cloned), creates feature branch
    |
RUNNING_CLAUDE  → Phase 1: Kickoff generates prd.json (10 min timeout)
    |              Phase 2: Ralph loop implements stories (60 min timeout)
    |
CREATING_PR     → Pushes branch, creates PR via `gh pr create`
    |
DONE            → Updates ClickUp to "in review", posts PR link comment, un-assigns Claude user
```

On any error → FAILED: posts error details as ClickUp comment, updates DB
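
The lifecycle is a linear state machine with a single failure exit. A small sketch of that transition table, using the state names from the diagram above (`NEXT`, `isTerminal`, and `advance` are hypothetical names; how the runner actually encodes states is not shown in this README):

```typescript
type Status =
  | "CLAIMED"
  | "CLONING"
  | "RUNNING_CLAUDE"
  | "CREATING_PR"
  | "DONE"
  | "FAILED";

// Successor of each state on success; null marks a terminal state.
const NEXT: Record<Status, Status | null> = {
  CLAIMED: "CLONING",
  CLONING: "RUNNING_CLAUDE",
  RUNNING_CLAUDE: "CREATING_PR",
  CREATING_PR: "DONE",
  DONE: null,
  FAILED: null,
};

function isTerminal(status: Status): boolean {
  return NEXT[status] === null;
}

// Move forward on success; any error in any non-terminal state leads to FAILED.
function advance(current: Status, ok: boolean): Status {
  if (isTerminal(current)) {
    throw new Error(`Cannot advance from terminal state ${current}`);
  }
  return ok ? (NEXT[current] as Status) : "FAILED";
}
```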

Database

SQLite (better-sqlite3) with WAL mode. Tables:

task_runs — One record per task execution (status, repo, branch, PR URL, cost, timestamps)
review_runs — PR review tracking
cloned_repos — Cache of cloned repositories
learnings — Librarian knowledge base (content, embeddings, metadata, provenance)

Auto-created on first run. Used for duplicate prevention, crash recovery, and cost tracking.

Prompt Generation

The prompt builder (src/clickup/prompt-builder.ts) generates detailed prompts from ClickUp task data, including task name, description, checklists (as acceptance criteria), and auto-detected Figma URLs.
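
To make the idea concrete, here is a minimal sketch of a prompt builder. It is not the code in `src/clickup/prompt-builder.ts`: the `TaskInput` shape, the section labels, and the Figma URL pattern are all assumptions chosen for illustration.

```typescript
// Hypothetical input shape; the real builder may take a full ClickUp task object.
interface TaskInput {
  name: string;
  description: string;
  checklistItems: string[];
}

// Assumption: any https URL on figma.com counts as a design link.
const FIGMA_URL = /https:\/\/(?:www\.)?figma\.com\/[^\s)>\]]+/g;

function extractFigmaUrls(text: string): string[] {
  // match() with a global regex returns every match, or null when there are none.
  return Array.from(new Set(text.match(FIGMA_URL) ?? []));
}

function buildPrompt(task: TaskInput): string {
  const sections = [`Task: ${task.name}`, "", task.description.trim()];
  if (task.checklistItems.length > 0) {
    // Checklist items become acceptance criteria.
    sections.push("", "Acceptance criteria:", ...task.checklistItems.map((item) => `- ${item}`));
  }
  const figmaUrls = extractFigmaUrls(task.description);
  if (figmaUrls.length > 0) {
    sections.push("", "Figma designs:", ...figmaUrls.map((url) => `- ${url}`));
  }
  return sections.join("\n");
}
```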

Error Handling

Scenario	Behavior
Process crash/restart	Non-terminal DB rows marked `failed` with "Process restarted"
Kickoff timeout	10-minute timeout. Task marked `failed`.
Ralph loop timeout	60-minute timeout. Task marked `failed`.
Missing repo URL	Task skipped, comment posted to ClickUp
ClickUp API rate limits	Outgoing requests throttled to 90 req/min via `p-throttle`
Duplicate tasks	DB check prevents re-processing
Any unhandled error	Error posted as ClickUp comment, task marked `failed`
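
The crash-recovery row in the table above amounts to a single pass over unfinished runs at startup. The sketch below shows that pass over in-memory rows rather than SQLite. Only the `failed` status and the "Process restarted" message come from this README; the other status strings, `RunRow`, and `recoverInterruptedRuns` are hypothetical.

```typescript
type RunStatus =
  | "claimed"
  | "cloning"
  | "running_claude"
  | "creating_pr"
  | "done"
  | "failed";

interface RunRow {
  id: number;
  status: RunStatus;
  error: string | null;
}

const TERMINAL_STATUSES = new Set<RunStatus>(["done", "failed"]);

// Any run that was still in flight when the process died can never finish,
// so it is marked failed with a fixed reason. Terminal rows are left untouched.
function recoverInterruptedRuns(rows: RunRow[]): RunRow[] {
  return rows.map((row): RunRow =>
    TERMINAL_STATUSES.has(row.status)
      ? row
      : { ...row, status: "failed", error: "Process restarted" },
  );
}
```

Against a real database the same effect would be one `UPDATE ... WHERE status NOT IN (...)` statement run before the webhook server starts accepting requests.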

MCP Integrations

Playwright MCP — browser-based verification of UI changes (Chromium pre-installed)
Figma MCP — design-to-code tasks (set FIGMA_MCP_TOKEN, include Figma URLs in task description)

Configure MCP servers in each target repo's .mcp.json file.

Librarian (Cross-Task Learning)

The Librarian is an optional module that gives agents persistent memory across tasks. When enabled, the runner extracts learnings from each completed phase (kickoff, Ralph loop, reviews) and stores them in a vector database for future retrieval.

How it works

Before each phase — the Librarian searches for semantically relevant learnings from past tasks and injects them into the agent's prompt (or writes learnings.md for the Ralph loop)
After each phase — the agent's output is analyzed by a separate Claude call that extracts reusable learnings (patterns, gotchas, architecture decisions)
Deduplication — each new learning is compared against existing ones via embedding similarity. A Librarian decision agent decides whether to file it as new, merge it with an existing learning, replace an outdated one, or skip it

Storage

Learnings are stored in the same SQLite database as everything else. Embeddings are generated via OpenAI's text-embedding-3-small model and stored as BLOBs. Cosine similarity search runs in-process — no external vector DB needed for the expected corpus size (<10K entries).
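
In-process search over BLOB-stored embeddings is simple enough to show in full. The following is an illustrative sketch under stated assumptions, not the Librarian's code: embeddings are assumed to be packed as raw 32-bit floats, and `toBlob`, `fromBlob`, `cosineSimilarity`, `topK`, and the `Learning` shape are hypothetical names.

```typescript
// Pack a vector into bytes suitable for a SQLite BLOB column.
// The returned Buffer is a view over the same memory as `vec`.
function toBlob(vec: Float32Array): Buffer {
  return Buffer.from(vec.buffer, vec.byteOffset, vec.byteLength);
}

// Unpack a BLOB back into floats. The bytes are copied first so the result
// is 4-byte aligned regardless of where the Buffer sits in its backing pool.
function fromBlob(blob: Buffer): Float32Array {
  const copy = new Uint8Array(blob);
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}

function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("Vectors must have the same length");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface Learning {
  id: number;
  content: string;
  embedding: Buffer;
}

// Brute-force scan: score every learning and keep the k best.
function topK(query: Float32Array, learnings: Learning[], k: number): { id: number; score: number }[] {
  return learnings
    .map((l) => ({ id: l.id, score: cosineSimilarity(query, fromBlob(l.embedding)) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A linear scan like this is O(n) per query, which is why it is reasonable at the stated corpus size and why no external vector database is needed.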

Setup

```
# Add to your .env
LIBRARIAN_ENABLED=true
OPENAI_API_KEY=sk-...
```

That's it. The Librarian is a no-op when disabled, so existing behavior is unchanged.

Admin API

Endpoint	Description
`GET /api/learnings`	List learnings with optional filters (`category`, `project_type`, `source_agent`) and pagination
`GET /api/learnings/stats`	Counts by category and source agent
`DELETE /api/learnings/:id`	Remove a learning
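
As a usage sketch, the helper below builds a request URL for `GET /api/learnings` from the three documented filters. `learningsUrl` is a hypothetical helper, the filter values in the usage note are made up, and the pagination parameters are left out because this README does not name them. Authentication is also omitted, since how the API checks `ADMIN_PASSWORD` is not described here.

```typescript
type LearningFilters = {
  category?: string;
  project_type?: string;
  source_agent?: string;
};

// Build the list URL, adding only the filters that are actually set.
function learningsUrl(baseUrl: string, filters: LearningFilters = {}): string {
  const url = new URL("/api/learnings", baseUrl);
  for (const [key, value] of Object.entries(filters)) {
    if (value) url.searchParams.set(key, value);
  }
  return url.toString();
}
```

For example, `learningsUrl("https://your-app.up.railway.app", { category: "gotcha" })` produces a URL that can be passed to `fetch` or `curl`.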

Admin Dashboard

The runner includes a web-based admin dashboard for monitoring:

Queue status and active/completed runs
Real-time logs via WebSocket
Cloned repos cache management

Access it at your service URL (e.g., https://your-app.up.railway.app/). Protected by ADMIN_PASSWORD if set.

Development

Scripts

Command	Description
`pnpm dev`	Start with hot reload (tsx --watch), loads .env
`pnpm dev:admin`	Run both backend and web frontend in parallel
`pnpm build`	Compile TypeScript to dist/
`pnpm start`	Run compiled output (production)
`pnpm run setup`	Interactive .env file generator (not `pnpm setup`, which runs pnpm's own setup command)
`pnpm type-check`	TypeScript type checking (no emit)
`pnpm lint`	Run ESLint
`pnpm lint:fix`	Run ESLint with auto-fix
`pnpm format`	Format all files with Prettier
`pnpm format:check`	Check formatting without writing
`pnpm test`	Run tests in watch mode (Vitest)
`pnpm test:run`	Run tests once
`pnpm coverage`	Run tests with coverage report

Tooling

Runtime: Node.js 22 (ESM)
Package manager: pnpm 10.11
TypeScript: Strict mode, extends @tsconfig/node22
Linting: ESLint 9 with typescript-eslint
Formatting: Prettier
Testing: Vitest with v8 coverage
Pre-commit hooks: Husky + lint-staged

Dependencies

Package	Purpose
`better-sqlite3`	SQLite database for run tracking and crash recovery
`p-throttle`	Rate limiting for ClickUp API calls (90 req/min)
`pino`	Structured JSON logging
`ws`	WebSocket for real-time admin updates
`zod`	Environment variable validation

External CLIs (pre-installed in Docker/Railway):

CLI	Purpose
`git`	Repository operations
`gh`	GitHub PR creation
`claude`	Claude Code headless execution

Testing

```
# Watch mode
pnpm test

# Single run
pnpm test:run

# With coverage
pnpm coverage
```

Integration testing (manual)

Set up a test repo on GitHub
Create a ClickUp task with "GitHub Repo" field pointing to your test repo
Give it a simple description like "Add a hello world page at /hello"
Assign it to the "Claude" user
Watch the logs for the full lifecycle

Troubleshooting

"Config not loaded" error

Make sure all required environment variables are set. Run pnpm run setup locally to generate a .env file.

ClickUp API 401

Your CLICKUP_API_TOKEN is invalid or expired. Generate a new one from ClickUp → Settings → Apps.

No tasks being picked up

Check each of the following:

Task is assigned to the correct Claude user (CLICKUP_CLAUDE_USER_ID)
Webhook is registered and pointing to the correct URL
WEBHOOK_SECRET matches the registered secret
"GitHub Repo" custom field ID matches CLICKUP_REPO_FIELD_ID
Task hasn't already been processed (check admin dashboard or SQLite DB)

Kickoff fails (Phase 1)

Check logs for Claude's output
Verify ANTHROPIC_API_KEY is valid (or claude login was successful)
Ensure the ClickUp task has enough detail

Ralph loop fails or times out (Phase 2)

Check scripts/ralph/progress.txt in the cloned repo
Check scripts/ralph/prd.json for story status
Stories may be too large — the kickoff should split them smaller

gh CLI errors

The GITHUB_TOKEN needs repo scope. The entrypoint auto-authenticates gh with this token.

Railway: Claude login tokens lost after redeploy

Make sure you have a volume mounted at /data. The entrypoint symlinks /home/claude/.claude → /data/claude/ so tokens persist.

Logs

Structured JSON via Pino. Key fields: module, taskId, runId, branchName, prUrl.

```
# Railway logs
railway logs

# Docker logs
docker compose logs -f runner

# Local dev with pretty printing
pnpm dev | npx pino-pretty
```
