feat: Image composition/composite support (split, mirror, overlay before video) #178
Description
Problem
When building video pipelines, we often need to compose multiple generated images into a single frame BEFORE passing it to a video model (e.g. kling-v3). Currently this requires a separate ffmpeg script outside the TSX pipeline, breaking the single-file workflow.
Concrete Use Case
Eye + Ceiling Fan Composite:
1. Generate macro eye image (nano-banana-pro)
2. Generate ceiling fan image (nano-banana-pro)
3. Composite: fan mirrored on left+right sides, eye centered → single 1920x1080 image
4. Animate this composite via kling-v3
Step 3 currently requires ffmpeg:
ffmpeg -i fan.png -i eye.png -filter_complex \
"[0:v]split[f1][f2]; [f2]hflip[fm]; \
[f1]crop=700:1080[left]; [fm]crop=700:1080[right]; \
[1:v]crop=860:1080[eye]; \
color=c=black:s=1920x1080[canvas]; \
[canvas][left]overlay=0:0[c1]; \
[c1][right]overlay=W-700:0[c2]; \
[c2][eye]overlay=(W-w)/2:0" \
-frames:v 1 output.png
Proposed Solution
A Composite() or compose() function that takes multiple images and a layout, returning a single image that can be passed to Video():
const eyeComposite = Composite({
  width: 1920,
  height: 1080,
  layers: [
    { src: fanImage, crop: { width: 700 }, position: "left" },
    { src: fanImage, crop: { width: 700 }, position: "right", mirror: true },
    { src: eyeImage, crop: { width: 860 }, position: "center" },
  ],
  background: "black",
})
// Now use as input for video generation
const eyeClosing = Video({
  model: varg.videoModel("kling-v3"),
  prompt: { text: "eye slowly closes...", images: [eyeComposite] },
  duration: 5,
})
This is similar to how Speech() returns a result that can be used in <Captions src={speech}>; the composite returns an image that can be used anywhere an image is expected.
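To make the position keywords concrete, here is a minimal TypeScript sketch of how "left", "right", and "center" might resolve to horizontal pixel offsets, following the overlay expressions in the ffmpeg command (0, W-700, and (W-w)/2). The names LayerSpec, Placement, resolveX, and resolveLayout are hypothetical and not part of any existing API. Rounding down for odd remainders is an assumption, and vertical placement and scaling are left out.

```typescript
// Hypothetical types for illustration; not an existing API.
type Position = "left" | "center" | "right";

interface LayerSpec {
  cropWidth: number; // layer width after cropping, in pixels
  position: Position;
  mirror?: boolean;
}

interface Placement {
  x: number; // left edge of the layer on the canvas
  width: number;
  mirror: boolean;
}

// Horizontal offset for a layer of `layerWidth` on a canvas of `canvasWidth`.
function resolveX(position: Position, canvasWidth: number, layerWidth: number): number {
  switch (position) {
    case "left":
      return 0;
    case "right":
      return canvasWidth - layerWidth;
    case "center":
      // Assumption: round down when the remaining space is odd.
      return Math.floor((canvasWidth - layerWidth) / 2);
  }
}

// Resolve every layer in order; later layers would be drawn on top of earlier ones.
function resolveLayout(canvasWidth: number, layers: LayerSpec[]): Placement[] {
  return layers.map((layer) => ({
    x: resolveX(layer.position, canvasWidth, layer.cropWidth),
    width: layer.cropWidth,
    mirror: layer.mirror ?? false,
  }));
}

const placements = resolveLayout(1920, [
  { cropWidth: 700, position: "left" },
  { cropWidth: 700, position: "right", mirror: true },
  { cropWidth: 860, position: "center" },
]);
console.log(placements.map((p) => p.x)); // [ 0, 1220, 530 ]
```

With these widths the center layer spans x = 530 to 1390, so it overlaps both side panels (which end at 700 and start at 1220). Layer order therefore decides what stays visible, which is why the eye is listed last.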
Alternative: Simpler API
Even a basic Split or Mirror helper would solve most cases:
// Place the fan on both sides (right side mirrored) with the eye in the center
const triptych = Triptych({
  left: fanImage,
  center: eyeImage,
  right: fanImage,
  mirrorRight: true,
})
Current Workaround
A separate eye-composite.sh script runs before pipeline.tsx, generating the composite and uploading it to S3. The resulting URL is then hardcoded in the pipeline, which breaks the reproducible single-file workflow.
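Returning to the simpler Triptych alternative: one possible design is to treat it as thin sugar over the Composite() layer list, so only one compositing path needs to exist. The sketch below is an assumption about how that mapping could look. TriptychLayer, TriptychOptions, and triptychLayers are hypothetical names, and the image type is left generic because the issue does not define it.

```typescript
// Hypothetical shapes mirroring the Composite() proposal; not an existing API.
interface TriptychLayer<Img> {
  src: Img;
  position: "left" | "center" | "right";
  mirror: boolean;
}

interface TriptychOptions<Img> {
  left: Img;
  center: Img;
  right: Img;
  mirrorRight?: boolean;
}

// Expand Triptych options into an ordered layer list.
// The center layer comes last so it would be drawn on top of the side panels.
function triptychLayers<Img>(opts: TriptychOptions<Img>): TriptychLayer<Img>[] {
  return [
    { src: opts.left, position: "left", mirror: false },
    { src: opts.right, position: "right", mirror: opts.mirrorRight ?? false },
    { src: opts.center, position: "center", mirror: false },
  ];
}

// Usage with placeholder strings standing in for generated images.
const layers = triptychLayers({
  left: "fan.png",
  center: "eye.png",
  right: "fan.png",
  mirrorRight: true,
});
console.log(layers.map((l) => `${l.position}:${l.mirror}`));
// [ 'left:false', 'right:true', 'center:false' ]
```

Crop widths are omitted here; a real helper would need either defaults or per-panel options for them.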