Description
Hi NVIDIA team 👋
First of all, thank you for the great work on the TensorRT integration in Stable Diffusion WebUI — the acceleration on consumer GPUs is impressive and extremely helpful for real-world deployments.
At the moment, TensorRT UNet acceleration works well for “vanilla” SD1.5 / SDXL inference, but it cannot be combined with ControlNet (or with most runtime LoRA setups). The root cause, as far as I can see, is that ControlNet hooks into intermediate UNet blocks, while the TensorRT engine replaces the entire UNet with a single static graph that exposes no injection points and no additional input bindings for the control features.
This means that:
ControlNet → still runs on PyTorch UNet hooks
TensorRT UNet → is a fully fused graph
The two execution paths diverge → device mismatch (CPU/CUDA) or missing injection points
Because ControlNet is essential in many production workflows (edge maps such as Canny, depth, scribble, pose-guided generation, etc.), having TRT acceleration with ControlNet would make a huge difference for everyone running on mid-range GPUs (20xx / 30xx), where diffusion speed is the main bottleneck.
Request
Would you consider adding official support for one of the following:
A TRT UNet with external control feature bindings (extra ONNX/TRT inputs for the ControlNet tensors)
Or a partial-graph TRT strategy (UNet core in TRT, control injection layers still in PyTorch)
Either solution would allow ControlNet + TensorRT to coexist, unlocking a massive usability improvement for the SD community.
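To make the first option concrete, here is a minimal, framework-free sketch of the dataflow I have in mind. It is a toy, not code from this extension: plain Python lists stand in for feature maps, and `encode`, `decode`, and `unet_with_control_inputs` are hypothetical names. The point is only that the control residuals become explicit inputs of one static function instead of being patched in through runtime hooks. This is similar in spirit to the `down_block_additional_residuals` / `mid_block_additional_residual` arguments that the diffusers UNet accepts.

```python
from typing import List, Sequence, Tuple

Tensor = List[float]  # stand-in for a feature map (assumption: toy data only)


def add(a: Sequence[float], b: Sequence[float]) -> Tensor:
    """Element-wise sum of two equally sized toy feature maps."""
    return [x + y for x, y in zip(a, b)]


def scale(a: Sequence[float], k: float) -> Tensor:
    """Toy 'layer': multiply every element by a constant."""
    return [x * k for x in a]


def encode(x: Tensor) -> Tuple[Tensor, List[Tensor]]:
    """Toy down path: returns the mid feature and one skip per down block."""
    skips: List[Tensor] = []
    h = list(x)
    for k in (0.5, 0.25):  # two hypothetical down blocks
        h = scale(h, k)
        skips.append(h)
    mid = scale(h, 2.0)  # hypothetical mid block
    return mid, skips


def decode(mid: Tensor, skips: List[Tensor]) -> Tensor:
    """Toy up path: consumes the skips in reverse order."""
    h = mid
    for skip in reversed(skips):
        h = add(h, skip)
    return h


def unet_with_control_inputs(
    x: Tensor,
    down_residuals: List[Tensor],
    mid_residual: Tensor,
) -> Tensor:
    """One static function: control residuals are ordinary inputs, no hooks."""
    mid, skips = encode(x)
    skips = [add(s, r) for s, r in zip(skips, down_residuals)]
    mid = add(mid, mid_residual)
    return decode(mid, skips)


if __name__ == "__main__":
    x = [1.0, 2.0]
    no_control = [[0.0, 0.0], [0.0, 0.0]]
    # Zero residuals reproduce the plain UNet path.
    print(unet_with_control_inputs(x, no_control, [0.0, 0.0]))
    # Non-zero residuals steer the output through the same static graph.
    print(unet_with_control_inputs(x, [[0.5, 0.5], [0.25, 0.25]], [1.0, 1.0]))
```

With this shape, feeding all-zero residuals gives the same result as the unmodified UNet, so a single graph could serve both the ControlNet and non-ControlNet cases. In the second option, the boundary would instead sit between `encode` and `decode`, with the residual additions staying outside the engine.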
Why this matters
ControlNet is not a “nice to have” anymore — it’s part of most workflows
Current acceleration → great for pure/stylized generation
But production users often need structure-guided generation
TRT would make those workflows fast enough for interactive use
If you could provide guidance on what form of patch or extension would be acceptable upstream (e.g. additional ONNX bindings, TRT plugin layers, or a graph-partitioning approach), I would be happy to help push this forward or contribute to testing.
Thanks again for this amazing work — TensorRT already takes SD to another level, and ControlNet support would be a huge step for real-world adoption. 🙏