IyatomiLab/TAUE

TAUE: Training-free Noise Transplant and Cultivation Diffusion Model

arXiv Huggingface Diffusers Project Page

Daichi Nagai*, Ryugo Morita*, Shunsuke Kitada, Hitoshi Iyatomi

Abstract

Despite the remarkable success of text-to-image diffusion models, their output of a single, flattened image remains a critical bottleneck for professional applications requiring layer-wise control. Existing solutions either rely on fine-tuning with large, inaccessible datasets or are training-free yet limited to generating isolated foreground elements, failing to produce a complete and coherent scene. To address this, we introduce the Training-free Noise Transplantation and Cultivation Diffusion Model (TAUE), a novel framework for zero-shot, layer-wise image generation. Our core technique, Noise Transplantation and Cultivation (NTC), extracts intermediate latent representations from both foreground and composite generation processes, transplanting them into the initial noise for subsequent layers. This ensures semantic and structural coherence across foreground, background, and composite layers, enabling consistent, multi-layered outputs without requiring fine-tuning or auxiliary datasets. Extensive experiments show that our training-free method achieves performance comparable to fine-tuned methods, enhancing layer-wise consistency while maintaining high image quality and fidelity. TAUE not only eliminates costly training and dataset requirements but also unlocks novel downstream applications, such as complex compositional editing, paving the way for more accessible and controllable generative workflows.
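The transplanting step described above can be pictured with a small sketch. It is not the authors' implementation: the function name `transplant_noise`, the latent shape, the rectangular mask, and the use of random arrays in place of real intermediate latents are all assumptions made for illustration, and the diffusion model's denoising loop is omitted entirely.

```python
import numpy as np


def transplant_noise(initial_noise, donor_latent, mask):
    """Return a copy of initial_noise with donor_latent written in where mask is nonzero.

    Hypothetical helper: it only illustrates seeding a later layer's
    starting noise with values taken from an earlier generation pass.
    """
    return np.where(mask.astype(bool), donor_latent, initial_noise)


rng = np.random.default_rng(0)
shape = (4, 8, 8)  # (channels, height, width); an assumed latent size

initial_noise = rng.standard_normal(shape)      # fresh noise for the next layer
foreground_latent = rng.standard_normal(shape)  # stand-in for an intermediate latent

# Assumed region of interest: a square in the middle of the latent grid.
mask = np.zeros((1, 8, 8))
mask[:, 2:6, 2:6] = 1.0

seeded_noise = transplant_noise(initial_noise, foreground_latent, mask)
```

Inside the masked square, `seeded_noise` carries the donor latent's values; elsewhere it keeps the fresh noise. In this simplified picture, that shared region is what would let a later layer start from structure already fixed by an earlier pass.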

taue-animation.mp4

Acknowledgement

The project page under docs/ is based on the Academic Project Page Template.

LICENSE

Apache License 2.0

About

TAUE is a training-free diffusion framework that enables zero-shot, layer-wise image generation by transplanting and cultivating noise representations across layers, ensuring semantic and structural coherence among foreground, background, and composite elements without the need for fine-tuning or large datasets.
