fix: use input frame rate for v2v consumption instead of production rate #219

ryanontheinside · 2025-12-05T12:27:00Z

Summary

Fix v2v streaming choppiness when using fast VAEs (LightVAE/TAE) by basing frame consumption rate on measured input FPS rather than pipeline production throughput. This fix is required to introduce faster VAEs and likely other performance enhancements.

Problem

LightVAE and TAE produced choppy playback during v2v streaming, where the boundaries of the chunks are visible. The slower Wan VAE worked perfectly. Symptoms:

- Choppiness occurred only with fast VAEs
- Test scripts produced smooth output
- Choppiness went away after ~45 seconds or when using more denoise steps
- Choppiness sometimes went away upon updating the prompt
- LightVAE and the Wan VAE are virtually identical apart from processing speed; TAE is faster still

Root Cause

The frame consumption rate (how fast WebRTC sends frames to the client) was calculated from production throughput (how fast the GPU produces frames) rather than content temporal rate (how fast frames should be played).

For example, when a fast VAE produces 12 frames in 0.3s, the code calculated FPS=40 and sent frames to the client at 40fps. But the video content should maintain its original temporal rate for correct motion; playing it faster causes a choppy, jerky appearance.

The test scripts worked because they export with a fixed FPS value, not the production rate.
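To make the numbers in the example concrete, here is a small illustrative calculation (not code from this PR). The function name `chunk_timing` is hypothetical, and the 15 fps input rate is an assumption taken from the frontend default mentioned in the review discussion.

```python
def chunk_timing(frames: int, production_seconds: float, input_fps: float) -> dict:
    """Compare how long a chunk *should* play with how long it plays
    when the send rate is derived from production throughput."""
    production_fps = frames / production_seconds   # rate the old code sent at
    content_seconds = frames / input_fps           # intended playback duration
    playback_seconds = frames / production_fps     # actual playback duration
    return {
        "production_fps": production_fps,
        "content_seconds": content_seconds,
        "playback_seconds": playback_seconds,
        "speedup": content_seconds / playback_seconds,
    }


# 12 frames produced in 0.3s, input assumed to arrive at 15 fps
timing = chunk_timing(frames=12, production_seconds=0.3, input_fps=15.0)
print(timing)
```

Under these assumptions each chunk carries about 0.8s of motion but is played back in about 0.3s, which would be consistent with the visible chunk boundaries described in the symptoms.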

Solution

Measure the actual input video frame rate by tracking timestamps of incoming frames, then use that rate for consumption:

- Track timestamps of the last 30 incoming frames in `input_loop()`
- Calculate input FPS from frame intervals
- Use input FPS when available (>=5 samples)
- Fall back to the existing pipeline FPS calculation otherwise (for t2v mode or during warm-up)
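The steps above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `InputFpsTracker`, the helper `consumption_fps`, and their parameters are hypothetical names, and the sketch assumes that "samples" means recorded frame timestamps.

```python
import time
from collections import deque
from typing import Optional


class InputFpsTracker:
    """Estimate input FPS from the timestamps of recent incoming frames."""

    def __init__(self, max_samples: int = 30, min_samples: int = 5) -> None:
        # A bounded deque keeps only the most recent timestamps.
        self._timestamps: deque = deque(maxlen=max_samples)
        self._min_samples = min_samples

    def record(self, timestamp: Optional[float] = None) -> None:
        """Call once per incoming frame (e.g. from the input loop)."""
        self._timestamps.append(time.monotonic() if timestamp is None else timestamp)

    def fps(self) -> Optional[float]:
        """Return the measured input FPS, or None if there is too little data."""
        if len(self._timestamps) < self._min_samples:
            return None
        elapsed = self._timestamps[-1] - self._timestamps[0]
        if elapsed <= 0:
            return None
        # N timestamps span N - 1 frame intervals.
        return (len(self._timestamps) - 1) / elapsed


def consumption_fps(tracker: InputFpsTracker, pipeline_fps: float) -> float:
    """Prefer the measured input FPS; fall back to the pipeline FPS."""
    measured = tracker.fps()
    return measured if measured is not None else pipeline_fps
```

In t2v mode no input frames arrive, so `fps()` keeps returning `None` and the pipeline FPS is used, which mirrors the fallback described in the last step.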

Tested

no regression with Wan VAE

yondonfu

Jotting down my understanding:

The VAE speed should only matter in that it contributes to the overall generation speed of the pipeline, e.g. a faster VAE results in faster overall generation when all other components are held constant. We currently calculate the FPS of the pipeline based on its overall generation speed and then use that FPS to ensure that we send out frames at a constant rate.

> But the video content should maintain its original temporal rate for correct motion - playing it faster causes choppy/jerky appearance.

Is the root problem that when input FPS < pipeline FPS = output FPS, the output appears choppy/jerky? The default input FPS hardcoded in the frontend is 15, so if the VAE speed boost pushed the pipeline FPS above the input FPS, we could end up in this situation.

These changes address the root problem by ensuring that we cannot end up with output FPS > input FPS. We could increase the FPS used in the frontend, but that is a separate concern: regardless of the value used there, we would want backend logic that handles the scenario where input FPS < pipeline FPS. Also see my other comments about the actual conditional that I think we want for determining the output FPS.

Sound right?

src/scope/server/tracks.py

ryanontheinside · 2025-12-09T18:42:28Z

> Sound right?

Correct. When VAE speed increased, pipeline FPS went above input FPS, causing choppy motion because frames were being sent faster than their intended temporal rate. Presumably any performance improvement would surface this, not just a faster VAE.
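The constraint discussed in this thread (output FPS must never exceed input FPS) could be expressed as a simple clamp. This is a hedged sketch rather than the merged implementation; the function name `output_fps` and its signature are hypothetical.

```python
from typing import Optional


def output_fps(input_fps: Optional[float], pipeline_fps: float) -> float:
    """Pick the rate at which frames are sent to the client.

    input_fps is None when no input rate has been measured yet
    (for example in t2v mode or during warm-up).
    """
    if input_fps is None:
        return pipeline_fps
    # Never send faster than the content's temporal rate, and never
    # faster than the pipeline can actually produce frames.
    return min(input_fps, pipeline_fps)
```

With a 15 fps input, a pipeline running at 40 fps would send at 15 fps, while a pipeline running at 8 fps would still send at 8 fps.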

ryanontheinside · 2025-12-09T19:17:30Z

Cherry-picked and tested in #221.

yondonfu

LGTM!

ryanontheinside mentioned this pull request Dec 5, 2025

Add LightVAE support #221

Open

yondonfu requested changes Dec 9, 2025

src/scope/server/tracks.py (outdated, resolved)

src/scope/server/tracks.py (outdated, resolved)

refactor: consolidate FPS tracking in FrameProcessor and use min(inpu…

c812edf

…t, pipeline) for output rate Signed-off-by: RyanOnTheInside <7623207+ryanontheinside@users.noreply.github.com>

yondonfu approved these changes Dec 9, 2025

yondonfu merged commit e012f53 into main Dec 9, 2025
5 checks passed

yondonfu deleted the ryanontheinside/fix/consumption-rate branch December 9, 2025 19:26
