Skip to content

[ACM MM 2025] Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation

Notifications You must be signed in to change notification settings

JD-GenX/Uni-Layout

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

14 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Uni-Layout

Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation

[ACM MM 2025] Official PyTorch Code for "Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation"

Abstract

Layout generation plays a crucial role in enhancing both user experience and design efficiency. However, current approaches suffer from task-specific generation capabilities and perceptually misaligned evaluation metrics, leading to limited applicability and ineffective measurement. In this paper, we propose \textit{Uni-Layout}, a novel framework that achieves unified generation, human-mimicking evaluation and alignment between the two. For universal generation, we incorporate various layout tasks into a single taxonomy and develop a unified generator that handles background or element contents constrained tasks via natural language prompts. To introduce human feedback for the effective evaluation of layouts, we build \textit{Layout-HF100k}, the first large-scale human feedback dataset with 100,000 expertly annotated layouts. Based on \textit{Layout-HF100k}, we introduce a human-mimicking evaluator that integrates visual and geometric information, employing a Chain-of-Thought mechanism to conduct qualitative assessments alongside a confidence estimation module to yield quantitative measurements. For better alignment between the generator and the evaluator, we integrate them into a cohesive system by adopting Dynamic-Margin Preference Optimization (DMPO), which dynamically adjusts margins based on preference strength to better align with human judgments. Extensive experiments show that \textit{Uni-Layout} significantly outperforms both task-specific and general-purpose methods.

image

📢 News

[2025-09-02]: 🚀 CoT data has been released! You can now find it in the "Dataset for Reward Model" link.

[2025-08-04]: 🎯 Our paper is now available on arXiv! Check it out here: https://arxiv.org/abs/2508.02374.

[2025-07-04]: 🎉 Exciting news! Our paper has been accepted to ACM MM 2025! Stay tuned for more updates!

🚀 Code & Weights Notice

The implementation code and pre-trained weights are currently undergoing JD Open-Source Review Process. We are committed to open-sourcing all materials to support research reproducibility.

🧪 Evaluation

  • Script: evaluation.py

Requirements

  • Python >= 3.8 (recommend Anaconda/Miniconda)
  • PyTorch >= 2.3.1 + CUDA 11.8 (install from official wheel index)
  • Extra deps in requirements.txt

Setup

conda create -n caig python==3.8.20 -y && conda activate caig
pip install torch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

Run

python evaluation.py \
  --model_path /path/to/model \
  --input_data_path /path/to/input.json \
  --output_data_path /path/to/output.json

Notes

  • Optional: --model_base, --conv_mode, generation args (--temperature, --top_p, --num_beams, --max_new_tokens, --generate_nums), and process args (--save_interval, --batch_size).
  • Input JSON follows the dataset format below; image field is optional.

📊 Datasets

1. Dataset for Layout Generator

Download Link.

Key Fields

  • sku_id: Anonymized sample identifier.
  • image: Path to the image (optional; may be absent for text-only tasks).
  • conversations: List of two messages:
    • human: Task description, may include the <image> tag, canvas size, element types, and layout constraints.
    • gpt: Layout result; value is a string in the form Layout:{...}, where bounding boxes are [x_min, y_min, x_max, y_max].

2. Dataset for Layout Evaluator

Download Link.

Key Fields

  • image: Path to the image.
  • conversations: Single-turn QA pair:
    • human: Evaluation instruction with candidate layout and constraints; expects a binary decision (0/1).
    • gpt: The answer; value is the Ground Truth label (0 or 1).

📧 Contact for Urgent Requests

If you require early access for research collaboration or encounter urgent issues, please contact: shuolucs@gmail.com

Copyright & Licensing

© JD.COM. All rights reserved. The datasets and software provided in this repository are licensed exclusively for academic research purposes. Commercial use, reproduction, or distribution requires express written permission from JD.COM. Unauthorized commercial use constitutes a violation of these terms and is strictly prohibited.

About

[ACM MM 2025] Uni-Layout: Integrating Human Feedback in Unified Layout Generation and Evaluation

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages