Batch Build — OpenAI Batch API Support

[Experimental] This feature may change in future releases.

Overview

LLM calls dominate Synix build cost. OpenAI's Batch API offers 50% cheaper inference with async processing (typically completes in minutes, 24h SLA). synix batch-build uses Batch API for eligible layers while keeping the existing synix build completely unchanged.

The only new core primitive is a BatchLLMClient — a drop-in replacement for LLMClient. Transforms don't change at all. When a transform calls client.complete(), the batch client intercepts: it queues the request, raises a control-flow exception, and the runner handles the batch lifecycle. On resume, the same transform re-executes, the client returns the cached batch result, and artifacts are created normally.

Transform contract for batch mode: Transforms are already required to be stateless during execution (see Transform docstring in core/models.py). Batch mode adds one additional requirement: execute() must be idempotent with respect to the LLM call. Specifically, given the same inputs and config, execute() must produce the same client.complete() arguments. This is already true for all built-in transforms (they derive prompts deterministically from inputs). Custom transforms that have non-deterministic prompt construction (e.g., timestamp injection, random sampling) will produce different request keys on resume, causing cache misses and duplicate batch submissions. If your transform has side effects beyond returning artifacts (e.g., writing to external systems, sending notifications), guard them with a check against the batch state or make them idempotent.

When to Use

Use batch-build when:

Your pipeline has OpenAI layers with many artifacts (episodes, rollups)
You want to cut LLM cost by 50%
You can tolerate async wait times (minutes to hours, 24h SLA)

Don't use batch-build when:

All your layers use Anthropic (not supported by OpenAI Batch API)
You need results immediately
You're iterating on prompts and need fast feedback loops

Architecture

Build Instances

Each batch-build creates a named build instance with its own ID and lifecycle. This is the on-ramp to snapshot builds — each instance tracks which layers are complete, which batches are in flight, and which results have been collected.

<build_dir>/builds/<build_id>/
    manifest.json      # BuildInstance metadata (status, layers completed)
    batch_state.json   # Requests, batches, results, errors

Artifacts live in the shared cache (existing ArtifactStore). The build instance only tracks batch lifecycle state.

Request Flow

Transform.execute()
    └── client.complete()
         ├── Result cached? → return LLMResponse (normal path)
         ├── Batch complete? → download, cache, return LLMResponse
         ├── Batch in-progress? → raise BatchInProgress
         └── New request? → queue, raise BatchCollecting

Collect phase: Runner walks the DAG. For each batchable layer, it calls split() then execute() per unit. The BatchLLMClient queues each request and raises BatchCollecting. After all units are attempted, queued requests are submitted as an OpenAI batch.
Wait phase: Either poll (--poll) or exit with resume instructions.
Resume phase: Runner re-walks the DAG. Completed layers are skipped. For in-progress layers, the batch client checks batch status, downloads results, and returns them through the normal complete() → LLMResponse path. Transforms execute identically — they don't know they're in batch mode.

Batch Mode Resolution

Each transform has an optional batch parameter:

`batch` value	Provider is OpenAI	Behavior
`None` (default)	Yes, and `split_count > 1`	Auto-batch
`None` (default)	No, or single unit	Sync
`True`	Yes	Force batch
`True`	No	Hard error (`ValueError`)
`False`	Any	Force sync

Layers that run sync in a batch-build execute identically to synix build — same code path, same runner logic.

Mixed Providers

A pipeline can mix Anthropic and OpenAI layers. Only OpenAI layers participate in batching; Anthropic layers run synchronously as normal:

episodes = EpisodeSummary("episodes", depends_on=[transcripts])  # Uses pipeline default (Anthropic) → sync
monthly = MonthlyRollup("monthly", depends_on=[episodes],
    config={"llm_config": {"provider": "openai", "model": "gpt-4o-mini"}})  # → batched

CLI Commands

All commands are under the batch-build group. Every invocation prints an experimental warning.

`synix batch-build run`

Create a build instance and submit the first batch.

uvx synix batch-build run pipeline.py              # Submit and exit
uvx synix batch-build run pipeline.py --poll        # Submit and wait
uvx synix batch-build run pipeline.py --poll --poll-interval 30

Options:

--poll: Stay alive and poll until the build completes
--poll-interval N: Seconds between status checks (default: 60)

`synix batch-build resume`

Resume an existing build instance. Checks batch status, downloads results, submits next layer's batch if needed.

uvx synix batch-build resume <build-id>                          # Check and advance
uvx synix batch-build resume <build-id> --poll                   # Check and wait
uvx synix batch-build resume <build-id> --allow-pipeline-mismatch  # Resume despite pipeline changes
uvx synix batch-build resume <build-id> --reset-state            # Restart layer after state corruption

Options:

--poll: Stay alive and poll until the build completes
--allow-pipeline-mismatch: Resume even if the pipeline fingerprint has changed since submission (dangerous — can mix artifacts from incompatible pipeline definitions)
--reset-state: Acknowledge corrupted state and restart the current layer from scratch (discards pending/in-flight requests for the corrupted layer)

`synix batch-build list`

Show all build instances and their status.

uvx synix batch-build list

Output (Rich table):

Build ID     Status      Pipeline Hash  Created          Layers Done  Current
batch-a1b2   submitted   fp:3e8a...     2026-02-16 10:30  1/4         episodes
batch-c3d4   completed   fp:3e8a...     2026-02-15 14:00  4/4         —

`synix batch-build status`

Detailed status for a specific build instance.

uvx synix batch-build status <build-id>

Output: per-layer breakdown, per-batch status, request counts, errors.

`synix batch-build plan`

Dry-run showing which layers would batch vs sync, estimated batch count.

uvx synix batch-build plan pipeline.py

Pipeline API

`batch` Parameter on Transform

from synix.ext import EpisodeSummary, MonthlyRollup, CoreSynthesis

# Auto (default) — batches if OpenAI and multiple work units
episodes = EpisodeSummary("episodes", depends_on=[transcripts])

# Force batch — error if not OpenAI
monthly = MonthlyRollup("monthly", depends_on=[episodes], batch=True)

# Force sync — never batch, even if OpenAI
core = CoreSynthesis("core", depends_on=[monthly], batch=False)

The batch parameter is None by default (auto-detect). It has no effect on synix build — only synix batch-build reads it.

Error Handling

Hard Errors (fail fast)

Condition	Error
`batch-build` with zero batchable OpenAI layers	`UsageError`
`batch=True` on non-OpenAI layer	`ValueError` at validation
Resume with unknown `build_id`	`UsageError`
Resume with changed pipeline (fingerprint mismatch)	`UsageError` (override with `--allow-pipeline-mismatch`)
`OPENAI_API_KEY` not set for batchable layers	`ValueError`
`SYNIX_CASSETTE_MODE` set to `replay`	`UsageError` (incompatible — batch client manages its own replay via `batch_state.json`)

Recoverable

Condition	Behavior
Batch expired (24h SLA)	Re-queue failed requests, submit new batch, log warning
Individual request failed in batch	Mark request as failed, report per-artifact error, continue with others (see Build Completion Semantics below)
Network error checking batch	If polling, retry next interval; if not, save state and exit
Corrupted `batch_state.json`	Quarantine corrupted file, set build to `failed`, require `--reset-state` to restart (see State Persistence)

Edge Cases

Condition	Behavior
Pipeline changed between submit/resume	Hard error by default (fingerprint mismatch). Override with `--allow-pipeline-mismatch` to resume anyway.
All layers cached	"Nothing to build", no batch submission
Empty layer (no inputs)	Skip

Pipeline fingerprint mismatch is a hard error because resuming a build with a different pipeline definition can silently mix artifacts from incompatible prompts, models, or transform logic. The --allow-pipeline-mismatch flag on resume opts into this explicitly.

Build Completion Semantics

A build instance has the following terminal states:

Status	Meaning
`completed`	All layers finished, all requests succeeded, all artifacts created
`completed_with_errors`	All layers attempted, but some requests failed (artifacts missing)
`failed`	Unrecoverable error (corrupted state, validation failure, etc.)

Per-request failure policy: When individual requests fail within a completed batch (e.g., content filter, context length exceeded):

The failed request is recorded in errors with its error code and message.
The corresponding artifact is not created — there is no silent fallback or placeholder.
The layer continues processing all other requests normally.
The build status becomes completed_with_errors (not completed).

Downstream layer behavior: When a layer has failed requests, downstream layers that depend on it receive only the successfully-produced artifacts as inputs. This means:

A 1:1 transform (e.g., EpisodeSummary) with 2 of 50 failures produces 48 episodes. Downstream MonthlyRollup processes those 48.
An N:1 transform (e.g., CoreSynthesis) receives whatever is available.

The status command shows per-layer error counts. The list command shows the terminal status. This makes partial failures visible without blocking the entire pipeline.

State Persistence

`manifest.json`

{
  "build_id": "batch-a1b2c3d4",
  "pipeline_hash": "fp:3e8a...",
  "status": "submitted",
  "created_at": 1708099200.0,
  "layers_completed": ["transcripts"],
  "current_layer": "episodes",
  "failed_requests": 0,
  "error": null
}

Valid status values: pending, collecting, submitted, completed, completed_with_errors, failed.

`batch_state.json`

{
  "pending": {
    "<request_key>": {"layer": "episodes", "body": {...}, "desc": "episode ep-conv-123"}
  },
  "batch_map": {
    "<request_key>": "<batch_id>"
  },
  "batches": {
    "<batch_id>": {"layer": "episodes", "keys": ["<key1>", "<key2>"], "status": "completed"}
  },
  "results": {
    "<request_key>": {"content": "...", "model": "gpt-4o-mini", "tokens": {"input": 500, "output": 200}}
  },
  "errors": {
    "<request_key>": {"code": "content_filter", "message": "Content filtered"}
  }
}

All writes use atomic_write() for crash safety.

Corruption recovery: If batch_state.json is corrupted (invalid JSON, truncated write), the system does not silently start from empty. Instead:

The corrupted file is quarantined to batch_state.json.corrupt with a timestamp suffix.
The build status is set to failed with an error message describing the corruption.
The resume command refuses to continue until the user either:
- Fixes or restores the state file manually, or
- Runs resume --reset-state to acknowledge data loss and restart the current layer from scratch.

This prevents orphaned in-flight batches and duplicate submissions. The manifest.json (which tracks high-level build status and is much smaller) uses the same quarantine behavior.

Request Keying

Batch request keys reuse compute_cassette_key() from the cassette system. The key is a SHA256 over the complete set of output-affecting parameters:

provider — which API backend
model — model identifier
messages — canonicalized for hashing (JSON dict keys sorted, trailing whitespace on string values stripped, \r\n → \n). Message content bytes are preserved — this is serialization normalization, not content mutation.
max_tokens — max output length
temperature — sampling temperature

Scope limitation: The current key covers all parameters that Synix's LLMClient.complete() accepts. Parameters like top_p, seed, response_format, tools, and frequency_penalty are not part of the complete() interface and therefore not in the key. If LLMClient.complete() is extended with new output-affecting parameters in the future, compute_cassette_key() must be updated to include them — both for cassette correctness and batch correctness. This is a single point of change (one function).

Properties:

Identical requests produce identical keys (deduplication)
A cassette-recorded response and a batch response for the same request have the same key
The key is deterministic across sessions

Design Decisions

Sync-only layers in batch-build

batch-build run processes all layers in DAG order. Source layers always run synchronously. Transform layers that resolve to sync (Anthropic provider, batch=False, single work unit) also run synchronously inline. The command only exits to await a batch when it encounters the first batchable layer with pending requests. This means a pipeline with sync-only layers at the top runs those immediately — there's no wasted round-trip.

If the pipeline has zero batchable layers, batch-build run fails immediately with a UsageError rather than doing a full sync build (use synix build for that).

Permanent request failures and downstream layers

When requests fail permanently (content filter, validation error), the failed artifacts are simply absent from that layer's output. Downstream layers receive only successfully-produced artifacts as inputs. This follows Synix's existing pattern — transforms already handle variable-length inputs (e.g., MonthlyRollup groups whatever episodes exist for a month).

No special "partial input" or "structured error" objects are injected into downstream transforms. The errors are recorded in batch_state.json and visible via status, but they don't propagate as data through the DAG. This keeps the transform interface clean — execute() never receives error sentinels.

Demo Template: `05-batch-build`

A self-contained demo that exercises the full batch-build lifecycle using the existing team-report pipeline sources (bios + project brief), but with an OpenAI layer for the 1:1 work-style transform. This is both a user-facing template and the source of truth for e2e cassette replay tests.

Why a Separate Template

The existing demos (01-04) all use Anthropic. This demo needs at least one OpenAI layer to exercise batch mode. Rather than retrofitting an existing template, a dedicated 05-batch-build template keeps the batch-specific concerns isolated and gives users a clear example of how to set up batch builds.

Pipeline Design

Level 0: bios [parse]              → 3 artifacts (reuses 03-team-report sources)
Level 1: work_styles [WorkStyle]   → 3 artifacts, OpenAI gpt-4o-mini, batch=True
Level 2: summary [TeamSummary]     → 1 artifact, Anthropic (sync, batch=False)

Key properties:

Mixed providers: Level 1 is OpenAI (batched), Level 2 is Anthropic (sync)
batch=True explicit: Forces batch mode on the OpenAI layer
batch=False explicit: Forces sync on the Anthropic layer
Small enough to record: 3 OpenAI batch requests + 1 Anthropic sync call

Directory Layout

templates/05-batch-build/
├── README.md              # User-facing instructions
├── pipeline.py            # Pipeline definition (mixed providers)
├── sources/
│   └── bios/              # Symlink or copy from 03-team-report
│       ├── alice.md
│       ├── bob.md
│       └── carol.md
├── cassettes/
│   ├── llm.yaml           # Anthropic cassette (sync layer)
│   └── batch_responses.json  # Recorded OpenAI Batch API responses
├── golden/
│   ├── plan.stdout.txt
│   ├── batch_run.stdout.txt
│   ├── batch_status.stdout.txt
│   ├── batch_resume.stdout.txt
│   └── list.stdout.txt
└── case.py                # Demo case definition

Cassette Recording Strategy

The batch-build flow involves two distinct kinds of LLM calls that need recording:

Sync Anthropic calls (Level 2 summary): Use the existing cassette system (SYNIX_CASSETTE_MODE=record). These are identical to how other demos work.
OpenAI Batch API calls (Level 1 work styles): The Batch API is async — you upload a JSONL file, poll for completion, then download results. We can't use the standard cassette system for this because the batch client talks to OpenAI's file/batch endpoints, not the chat completions endpoint.

Recording approach: Run batch-build once against the real OpenAI Batch API. The BatchLLMClient records:
- The JSONL request file it would upload (deterministic from inputs)
- The batch ID returned by OpenAI
- The JSONL response file downloaded on completion
These are saved to cassettes/batch_responses.json as a map from request key to response. On replay, the BatchLLMClient checks this file before hitting OpenAI — same pattern as the cassette store, just for batch results.

In practice: Since BatchLLMClient stores results in batch_state.json during a real run, and on resume it checks batch_state.get_result(key) before anything else, we can seed the cassette by copying the results section from a completed batch_state.json. The replay path never touches OpenAI at all.

Case Definition (`case.py`)

case = {
    "name": "batch_build",
    "pipeline": "pipeline.py",
    "steps": [
        # Step 1: Plan — show batch sequencing
        {"name": "note_plan", "command": ["synix", "demo", "note", "1/5 Planning batch build..."]},
        {"name": "plan", "command": ["synix", "batch-build", "plan", "PIPELINE"]},

        # Step 2: Run — submit batch (mocked: results pre-seeded)
        {"name": "note_run", "command": ["synix", "demo", "note", "2/5 Submitting batch..."]},
        {"name": "batch_run", "command": ["synix", "batch-build", "run", "PIPELINE", "--poll"]},

        # Step 3: List — show build instances
        {"name": "note_list", "command": ["synix", "demo", "note", "3/5 Listing builds..."]},
        {"name": "list", "command": ["synix", "batch-build", "list"]},

        # Step 4: Status — detailed per-batch info
        {"name": "note_status", "command": ["synix", "demo", "note", "4/5 Build status..."]},
        {"name": "batch_status", "command": ["synix", "batch-build", "status", "BATCH_ID"]},

        # Step 5: Verify artifacts
        {"name": "note_verify", "command": ["synix", "demo", "note", "5/5 Verifying artifacts..."]},
        {"name": "verify", "command": ["synix", "list"]},
    ],
    "goldens": {},
}

Note: BATCH_ID placeholder will need special handling in the demo runner (resolve from the builds/ directory), or the case.py can shell out to discover it.

E2E Test (`tests/e2e/test_batch_build.py`)

The e2e test does not use the demo runner. It directly exercises the batch-build CLI commands with mocked OpenAI:

# Mock strategy: monkeypatch openai.OpenAI to return controlled responses
# - files.create() → returns mock file ID
# - batches.create() → returns mock batch with status "completed"
# - batches.retrieve() → returns completed batch
# - files.content() → returns JSONL with pre-recorded responses

Test cases:

Full flow: run --poll → completes, artifacts created
Resume flow: run (exits with pending) → resume --poll (completes)
list and status show correct info
plan shows batch sequencing
No OpenAI layers → hard error
Mixed providers: OpenAI batches, Anthropic runs sync
batch=False layer runs sync even with OpenAI
All cached → "nothing to build"
Experimental warning displayed on every command
Fingerprint mismatch on resume → hard error without --allow-pipeline-mismatch
Fingerprint mismatch with --allow-pipeline-mismatch → proceeds with warning
Corrupted batch_state.json → quarantine file, set status to failed
Corrupted state with --reset-state → restarts current layer
Corrupted state without --reset-state → refuses to continue
Partial request failures → completed_with_errors status, downstream receives partial input
Resume is idempotent (repeated resume doesn't duplicate artifacts or batch submissions)
SYNIX_CASSETTE_MODE=replay with batch-build → hard error
SYNIX_CASSETTE_MODE=record with synix build (not batch-build) → allowed (recording workflow)
Request key correctness: distinct keys when any output-affecting param changes (model, temperature, messages, max_tokens)

Recording the Initial Cassettes

One-time manual step (requires both ANTHROPIC_API_KEY and OPENAI_API_KEY):

cd templates/05-batch-build

# Step 1: Record Anthropic cassette for the sync layer (standard synix build)
export ANTHROPIC_API_KEY=sk-ant-...
export SYNIX_CASSETTE_MODE=record
export SYNIX_CASSETTE_DIR=./cassettes
uvx synix build pipeline.py   # records llm.yaml for the Anthropic call
unset SYNIX_CASSETTE_MODE SYNIX_CASSETTE_DIR

# Step 2: Run batch-build against real OpenAI Batch API
# (cassette mode must NOT be set — batch-build rejects SYNIX_CASSETTE_MODE=replay,
#  and batch results are captured via batch_state.json, not the cassette system)
export OPENAI_API_KEY=sk-...
uvx synix batch-build run pipeline.py --poll

# Step 3: Seed batch cassette from completed build state
# Copy the "results" section from build/builds/<id>/batch_state.json
# into cassettes/batch_responses.json
python -c "
import json
from pathlib import Path
builds = sorted(Path('build/builds').iterdir())[-1]
state = json.loads((builds / 'batch_state.json').read_text())
Path('cassettes/batch_responses.json').write_text(
    json.dumps(state['results'], indent=2))
"

# Step 4: Clean build dir and generate golden outputs
rm -rf build/
uvx synix demo run . --update-goldens

# Step 5: Verify replay works (no API keys needed)
rm -rf build/
uvx synix demo run .

After this, CI runs the demo in replay mode with no API keys needed. The cassette system handles the Anthropic call; the batch client finds pre-seeded results in cassettes/batch_responses.json and never contacts OpenAI.

What Doesn't Change

synix build — completely untouched, same code path
All existing transforms — zero interface changes
ArtifactStore — shared cache, content-addressed, no changes
synix clean — already deletes entire build_dir which includes builds/
Cassette system — compute_cassette_key() reused as-is, not modified

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Batch Build — OpenAI Batch API Support

Overview

When to Use

Architecture

Build Instances

Request Flow

Batch Mode Resolution

Mixed Providers

CLI Commands

`synix batch-build run`

`synix batch-build resume`

`synix batch-build list`

`synix batch-build status`

`synix batch-build plan`

Pipeline API

`batch` Parameter on Transform

Error Handling

Hard Errors (fail fast)

Recoverable

Edge Cases

Build Completion Semantics

State Persistence

`manifest.json`

`batch_state.json`

Request Keying

Design Decisions

Sync-only layers in batch-build

Permanent request failures and downstream layers

Demo Template: `05-batch-build`

Why a Separate Template

Pipeline Design

Directory Layout

Cassette Recording Strategy

Case Definition (`case.py`)

E2E Test (`tests/e2e/test_batch_build.py`)

Recording the Initial Cassettes

What Doesn't Change

FilesExpand file tree

batch-build.md

Latest commit

History

batch-build.md

File metadata and controls

Batch Build — OpenAI Batch API Support

Overview

When to Use

Architecture

Build Instances

Request Flow

Batch Mode Resolution

Mixed Providers

CLI Commands

synix batch-build run

synix batch-build resume

synix batch-build list

synix batch-build status

synix batch-build plan

Pipeline API

batch Parameter on Transform

Error Handling

Hard Errors (fail fast)

Recoverable

Edge Cases

Build Completion Semantics

State Persistence

manifest.json

batch_state.json

Request Keying

Design Decisions

Sync-only layers in batch-build

Permanent request failures and downstream layers

Demo Template: 05-batch-build

Why a Separate Template

Pipeline Design

Directory Layout

Cassette Recording Strategy

Case Definition (case.py)

E2E Test (tests/e2e/test_batch_build.py)

Recording the Initial Cassettes

What Doesn't Change

`synix batch-build run`

`synix batch-build resume`

`synix batch-build list`

`synix batch-build status`

`synix batch-build plan`

`batch` Parameter on Transform

`manifest.json`

`batch_state.json`

Demo Template: `05-batch-build`

Case Definition (`case.py`)

E2E Test (`tests/e2e/test_batch_build.py`)