[Experimental] This feature may change in future releases.
LLM calls dominate Synix build cost. OpenAI's Batch API offers 50% cheaper inference with async processing (typically completes in minutes, 24h SLA). synix batch-build uses Batch API for eligible layers while keeping the existing synix build completely unchanged.
The only new core primitive is a BatchLLMClient — a drop-in replacement for LLMClient. Transforms don't change at all. When a transform calls client.complete(), the batch client intercepts: it queues the request, raises a control-flow exception, and the runner handles the batch lifecycle. On resume, the same transform re-executes, the client returns the cached batch result, and artifacts are created normally.
Transform contract for batch mode: Transforms are already required to be stateless during execution (see Transform docstring in core/models.py). Batch mode adds one additional requirement: execute() must be idempotent with respect to the LLM call. Specifically, given the same inputs and config, execute() must produce the same client.complete() arguments. This is already true for all built-in transforms (they derive prompts deterministically from inputs). Custom transforms that have non-deterministic prompt construction (e.g., timestamp injection, random sampling) will produce different request keys on resume, causing cache misses and duplicate batch submissions. If your transform has side effects beyond returning artifacts (e.g., writing to external systems, sending notifications), guard them with a check against the batch state or make them idempotent.
Use batch-build when:
- Your pipeline has OpenAI layers with many artifacts (episodes, rollups)
- You want to cut LLM cost by 50%
- You can tolerate async wait times (minutes to hours, 24h SLA)
Don't use batch-build when:
- All your layers use Anthropic (not supported by OpenAI Batch API)
- You need results immediately
- You're iterating on prompts and need fast feedback loops
Each batch-build creates a named build instance with its own ID and lifecycle. This is the on-ramp to snapshot builds — each instance tracks which layers are complete, which batches are in flight, and which results have been collected.
<build_dir>/builds/<build_id>/
manifest.json # BuildInstance metadata (status, layers completed)
batch_state.json # Requests, batches, results, errors
Artifacts live in the shared cache (existing ArtifactStore). The build instance only tracks batch lifecycle state.
Transform.execute()
└── client.complete()
├── Result cached? → return LLMResponse (normal path)
├── Batch complete? → download, cache, return LLMResponse
├── Batch in-progress? → raise BatchInProgress
└── New request? → queue, raise BatchCollecting
-
Collect phase: Runner walks the DAG. For each batchable layer, it calls
split()thenexecute()per unit. TheBatchLLMClientqueues each request and raisesBatchCollecting. After all units are attempted, queued requests are submitted as an OpenAI batch. -
Wait phase: Either poll (
--poll) or exit with resume instructions. -
Resume phase: Runner re-walks the DAG. Completed layers are skipped. For in-progress layers, the batch client checks batch status, downloads results, and returns them through the normal
complete()→LLMResponsepath. Transforms execute identically — they don't know they're in batch mode.
Each transform has an optional batch parameter:
batch value |
Provider is OpenAI | Behavior |
|---|---|---|
None (default) |
Yes, and split_count > 1 |
Auto-batch |
None (default) |
No, or single unit | Sync |
True |
Yes | Force batch |
True |
No | Hard error (ValueError) |
False |
Any | Force sync |
Layers that run sync in a batch-build execute identically to synix build — same code path, same runner logic.
A pipeline can mix Anthropic and OpenAI layers. Only OpenAI layers participate in batching; Anthropic layers run synchronously as normal:
episodes = EpisodeSummary("episodes", depends_on=[transcripts]) # Uses pipeline default (Anthropic) → sync
monthly = MonthlyRollup("monthly", depends_on=[episodes],
config={"llm_config": {"provider": "openai", "model": "gpt-4o-mini"}}) # → batchedAll commands are under the batch-build group. Every invocation prints an experimental warning.
Create a build instance and submit the first batch.
uvx synix batch-build run pipeline.py # Submit and exit
uvx synix batch-build run pipeline.py --poll # Submit and wait
uvx synix batch-build run pipeline.py --poll --poll-interval 30Options:
--poll: Stay alive and poll until the build completes--poll-interval N: Seconds between status checks (default: 60)
Resume an existing build instance. Checks batch status, downloads results, submits next layer's batch if needed.
uvx synix batch-build resume <build-id> # Check and advance
uvx synix batch-build resume <build-id> --poll # Check and wait
uvx synix batch-build resume <build-id> --allow-pipeline-mismatch # Resume despite pipeline changes
uvx synix batch-build resume <build-id> --reset-state # Restart layer after state corruptionOptions:
--poll: Stay alive and poll until the build completes--allow-pipeline-mismatch: Resume even if the pipeline fingerprint has changed since submission (dangerous — can mix artifacts from incompatible pipeline definitions)--reset-state: Acknowledge corrupted state and restart the current layer from scratch (discards pending/in-flight requests for the corrupted layer)
Show all build instances and their status.
uvx synix batch-build listOutput (Rich table):
Build ID Status Pipeline Hash Created Layers Done Current
batch-a1b2 submitted fp:3e8a... 2026-02-16 10:30 1/4 episodes
batch-c3d4 completed fp:3e8a... 2026-02-15 14:00 4/4 —
Detailed status for a specific build instance.
uvx synix batch-build status <build-id>Output: per-layer breakdown, per-batch status, request counts, errors.
Dry-run showing which layers would batch vs sync, estimated batch count.
uvx synix batch-build plan pipeline.pyfrom synix.ext import EpisodeSummary, MonthlyRollup, CoreSynthesis
# Auto (default) — batches if OpenAI and multiple work units
episodes = EpisodeSummary("episodes", depends_on=[transcripts])
# Force batch — error if not OpenAI
monthly = MonthlyRollup("monthly", depends_on=[episodes], batch=True)
# Force sync — never batch, even if OpenAI
core = CoreSynthesis("core", depends_on=[monthly], batch=False)The batch parameter is None by default (auto-detect). It has no effect on synix build — only synix batch-build reads it.
| Condition | Error |
|---|---|
batch-build with zero batchable OpenAI layers |
UsageError |
batch=True on non-OpenAI layer |
ValueError at validation |
Resume with unknown build_id |
UsageError |
| Resume with changed pipeline (fingerprint mismatch) | UsageError (override with --allow-pipeline-mismatch) |
OPENAI_API_KEY not set for batchable layers |
ValueError |
SYNIX_CASSETTE_MODE set to replay |
UsageError (incompatible — batch client manages its own replay via batch_state.json) |
| Condition | Behavior |
|---|---|
| Batch expired (24h SLA) | Re-queue failed requests, submit new batch, log warning |
| Individual request failed in batch | Mark request as failed, report per-artifact error, continue with others (see Build Completion Semantics below) |
| Network error checking batch | If polling, retry next interval; if not, save state and exit |
Corrupted batch_state.json |
Quarantine corrupted file, set build to failed, require --reset-state to restart (see State Persistence) |
| Condition | Behavior |
|---|---|
| Pipeline changed between submit/resume | Hard error by default (fingerprint mismatch). Override with --allow-pipeline-mismatch to resume anyway. |
| All layers cached | "Nothing to build", no batch submission |
| Empty layer (no inputs) | Skip |
Pipeline fingerprint mismatch is a hard error because resuming a build with a different pipeline definition can silently mix artifacts from incompatible prompts, models, or transform logic. The --allow-pipeline-mismatch flag on resume opts into this explicitly.
A build instance has the following terminal states:
| Status | Meaning |
|---|---|
completed |
All layers finished, all requests succeeded, all artifacts created |
completed_with_errors |
All layers attempted, but some requests failed (artifacts missing) |
failed |
Unrecoverable error (corrupted state, validation failure, etc.) |
Per-request failure policy: When individual requests fail within a completed batch (e.g., content filter, context length exceeded):
- The failed request is recorded in
errorswith its error code and message. - The corresponding artifact is not created — there is no silent fallback or placeholder.
- The layer continues processing all other requests normally.
- The build status becomes
completed_with_errors(notcompleted).
Downstream layer behavior: When a layer has failed requests, downstream layers that depend on it receive only the successfully-produced artifacts as inputs. This means:
- A 1:1 transform (e.g.,
EpisodeSummary) with 2 of 50 failures produces 48 episodes. DownstreamMonthlyRollupprocesses those 48. - An N:1 transform (e.g.,
CoreSynthesis) receives whatever is available.
The status command shows per-layer error counts. The list command shows the terminal status. This makes partial failures visible without blocking the entire pipeline.
{
"build_id": "batch-a1b2c3d4",
"pipeline_hash": "fp:3e8a...",
"status": "submitted",
"created_at": 1708099200.0,
"layers_completed": ["transcripts"],
"current_layer": "episodes",
"failed_requests": 0,
"error": null
}Valid status values: pending, collecting, submitted, completed, completed_with_errors, failed.
{
"pending": {
"<request_key>": {"layer": "episodes", "body": {...}, "desc": "episode ep-conv-123"}
},
"batch_map": {
"<request_key>": "<batch_id>"
},
"batches": {
"<batch_id>": {"layer": "episodes", "keys": ["<key1>", "<key2>"], "status": "completed"}
},
"results": {
"<request_key>": {"content": "...", "model": "gpt-4o-mini", "tokens": {"input": 500, "output": 200}}
},
"errors": {
"<request_key>": {"code": "content_filter", "message": "Content filtered"}
}
}All writes use atomic_write() for crash safety.
Corruption recovery: If batch_state.json is corrupted (invalid JSON, truncated write), the system does not silently start from empty. Instead:
- The corrupted file is quarantined to
batch_state.json.corruptwith a timestamp suffix. - The build status is set to
failedwith an error message describing the corruption. - The
resumecommand refuses to continue until the user either:- Fixes or restores the state file manually, or
- Runs
resume --reset-stateto acknowledge data loss and restart the current layer from scratch.
This prevents orphaned in-flight batches and duplicate submissions. The manifest.json (which tracks high-level build status and is much smaller) uses the same quarantine behavior.
Batch request keys reuse compute_cassette_key() from the cassette system. The key is a SHA256 over the complete set of output-affecting parameters:
provider— which API backendmodel— model identifiermessages— canonicalized for hashing (JSON dict keys sorted, trailing whitespace on string values stripped,\r\n→\n). Message content bytes are preserved — this is serialization normalization, not content mutation.max_tokens— max output lengthtemperature— sampling temperature
Scope limitation: The current key covers all parameters that Synix's LLMClient.complete() accepts. Parameters like top_p, seed, response_format, tools, and frequency_penalty are not part of the complete() interface and therefore not in the key. If LLMClient.complete() is extended with new output-affecting parameters in the future, compute_cassette_key() must be updated to include them — both for cassette correctness and batch correctness. This is a single point of change (one function).
Properties:
- Identical requests produce identical keys (deduplication)
- A cassette-recorded response and a batch response for the same request have the same key
- The key is deterministic across sessions
batch-build run processes all layers in DAG order. Source layers always run synchronously. Transform layers that resolve to sync (Anthropic provider, batch=False, single work unit) also run synchronously inline. The command only exits to await a batch when it encounters the first batchable layer with pending requests. This means a pipeline with sync-only layers at the top runs those immediately — there's no wasted round-trip.
If the pipeline has zero batchable layers, batch-build run fails immediately with a UsageError rather than doing a full sync build (use synix build for that).
When requests fail permanently (content filter, validation error), the failed artifacts are simply absent from that layer's output. Downstream layers receive only successfully-produced artifacts as inputs. This follows Synix's existing pattern — transforms already handle variable-length inputs (e.g., MonthlyRollup groups whatever episodes exist for a month).
No special "partial input" or "structured error" objects are injected into downstream transforms. The errors are recorded in batch_state.json and visible via status, but they don't propagate as data through the DAG. This keeps the transform interface clean — execute() never receives error sentinels.
A self-contained demo that exercises the full batch-build lifecycle using the existing team-report pipeline sources (bios + project brief), but with an OpenAI layer for the 1:1 work-style transform. This is both a user-facing template and the source of truth for e2e cassette replay tests.
The existing demos (01-04) all use Anthropic. This demo needs at least one OpenAI layer to exercise batch mode. Rather than retrofitting an existing template, a dedicated 05-batch-build template keeps the batch-specific concerns isolated and gives users a clear example of how to set up batch builds.
Level 0: bios [parse] → 3 artifacts (reuses 03-team-report sources)
Level 1: work_styles [WorkStyle] → 3 artifacts, OpenAI gpt-4o-mini, batch=True
Level 2: summary [TeamSummary] → 1 artifact, Anthropic (sync, batch=False)
Key properties:
- Mixed providers: Level 1 is OpenAI (batched), Level 2 is Anthropic (sync)
- batch=True explicit: Forces batch mode on the OpenAI layer
- batch=False explicit: Forces sync on the Anthropic layer
- Small enough to record: 3 OpenAI batch requests + 1 Anthropic sync call
templates/05-batch-build/
├── README.md # User-facing instructions
├── pipeline.py # Pipeline definition (mixed providers)
├── sources/
│ └── bios/ # Symlink or copy from 03-team-report
│ ├── alice.md
│ ├── bob.md
│ └── carol.md
├── cassettes/
│ ├── llm.yaml # Anthropic cassette (sync layer)
│ └── batch_responses.json # Recorded OpenAI Batch API responses
├── golden/
│ ├── plan.stdout.txt
│ ├── batch_run.stdout.txt
│ ├── batch_status.stdout.txt
│ ├── batch_resume.stdout.txt
│ └── list.stdout.txt
└── case.py # Demo case definition
The batch-build flow involves two distinct kinds of LLM calls that need recording:
-
Sync Anthropic calls (Level 2 summary): Use the existing cassette system (
SYNIX_CASSETTE_MODE=record). These are identical to how other demos work. -
OpenAI Batch API calls (Level 1 work styles): The Batch API is async — you upload a JSONL file, poll for completion, then download results. We can't use the standard cassette system for this because the batch client talks to OpenAI's file/batch endpoints, not the chat completions endpoint.
Recording approach: Run
batch-buildonce against the real OpenAI Batch API. TheBatchLLMClientrecords:- The JSONL request file it would upload (deterministic from inputs)
- The batch ID returned by OpenAI
- The JSONL response file downloaded on completion
These are saved to
cassettes/batch_responses.jsonas a map from request key to response. On replay, theBatchLLMClientchecks this file before hitting OpenAI — same pattern as the cassette store, just for batch results.In practice: Since
BatchLLMClientstores results inbatch_state.jsonduring a real run, and on resume it checksbatch_state.get_result(key)before anything else, we can seed the cassette by copying the results section from a completedbatch_state.json. The replay path never touches OpenAI at all.
case = {
"name": "batch_build",
"pipeline": "pipeline.py",
"steps": [
# Step 1: Plan — show batch sequencing
{"name": "note_plan", "command": ["synix", "demo", "note", "1/5 Planning batch build..."]},
{"name": "plan", "command": ["synix", "batch-build", "plan", "PIPELINE"]},
# Step 2: Run — submit batch (mocked: results pre-seeded)
{"name": "note_run", "command": ["synix", "demo", "note", "2/5 Submitting batch..."]},
{"name": "batch_run", "command": ["synix", "batch-build", "run", "PIPELINE", "--poll"]},
# Step 3: List — show build instances
{"name": "note_list", "command": ["synix", "demo", "note", "3/5 Listing builds..."]},
{"name": "list", "command": ["synix", "batch-build", "list"]},
# Step 4: Status — detailed per-batch info
{"name": "note_status", "command": ["synix", "demo", "note", "4/5 Build status..."]},
{"name": "batch_status", "command": ["synix", "batch-build", "status", "BATCH_ID"]},
# Step 5: Verify artifacts
{"name": "note_verify", "command": ["synix", "demo", "note", "5/5 Verifying artifacts..."]},
{"name": "verify", "command": ["synix", "list"]},
],
"goldens": {},
}Note:
BATCH_IDplaceholder will need special handling in the demo runner (resolve from thebuilds/directory), or the case.py can shell out to discover it.
The e2e test does not use the demo runner. It directly exercises the batch-build CLI commands with mocked OpenAI:
# Mock strategy: monkeypatch openai.OpenAI to return controlled responses
# - files.create() → returns mock file ID
# - batches.create() → returns mock batch with status "completed"
# - batches.retrieve() → returns completed batch
# - files.content() → returns JSONL with pre-recorded responsesTest cases:
- Full flow:
run --poll→ completes, artifacts created - Resume flow:
run(exits with pending) →resume --poll(completes) listandstatusshow correct infoplanshows batch sequencing- No OpenAI layers → hard error
- Mixed providers: OpenAI batches, Anthropic runs sync
batch=Falselayer runs sync even with OpenAI- All cached → "nothing to build"
- Experimental warning displayed on every command
- Fingerprint mismatch on resume → hard error without
--allow-pipeline-mismatch - Fingerprint mismatch with
--allow-pipeline-mismatch→ proceeds with warning - Corrupted
batch_state.json→ quarantine file, set status tofailed - Corrupted state with
--reset-state→ restarts current layer - Corrupted state without
--reset-state→ refuses to continue - Partial request failures →
completed_with_errorsstatus, downstream receives partial input - Resume is idempotent (repeated resume doesn't duplicate artifacts or batch submissions)
SYNIX_CASSETTE_MODE=replaywithbatch-build→ hard errorSYNIX_CASSETTE_MODE=recordwithsynix build(notbatch-build) → allowed (recording workflow)- Request key correctness: distinct keys when any output-affecting param changes (model, temperature, messages, max_tokens)
One-time manual step (requires both ANTHROPIC_API_KEY and OPENAI_API_KEY):
cd templates/05-batch-build
# Step 1: Record Anthropic cassette for the sync layer (standard synix build)
export ANTHROPIC_API_KEY=sk-ant-...
export SYNIX_CASSETTE_MODE=record
export SYNIX_CASSETTE_DIR=./cassettes
uvx synix build pipeline.py # records llm.yaml for the Anthropic call
unset SYNIX_CASSETTE_MODE SYNIX_CASSETTE_DIR
# Step 2: Run batch-build against real OpenAI Batch API
# (cassette mode must NOT be set — batch-build rejects SYNIX_CASSETTE_MODE=replay,
# and batch results are captured via batch_state.json, not the cassette system)
export OPENAI_API_KEY=sk-...
uvx synix batch-build run pipeline.py --poll
# Step 3: Seed batch cassette from completed build state
# Copy the "results" section from build/builds/<id>/batch_state.json
# into cassettes/batch_responses.json
python -c "
import json
from pathlib import Path
builds = sorted(Path('build/builds').iterdir())[-1]
state = json.loads((builds / 'batch_state.json').read_text())
Path('cassettes/batch_responses.json').write_text(
json.dumps(state['results'], indent=2))
"
# Step 4: Clean build dir and generate golden outputs
rm -rf build/
uvx synix demo run . --update-goldens
# Step 5: Verify replay works (no API keys needed)
rm -rf build/
uvx synix demo run .After this, CI runs the demo in replay mode with no API keys needed. The cassette system handles the Anthropic call; the batch client finds pre-seeded results in cassettes/batch_responses.json and never contacts OpenAI.
synix build— completely untouched, same code path- All existing transforms — zero interface changes
ArtifactStore— shared cache, content-addressed, no changessynix clean— already deletes entirebuild_dirwhich includesbuilds/- Cassette system —
compute_cassette_key()reused as-is, not modified