I would like to know if it’s possible to add batch inference support to the Gradio app, allowing users to specify a batch size and generate multiple images simultaneously.
Additionally, I’d appreciate guidance on whether this is feasible on a 24 GB VRAM GPU. Currently, a single generation with the provided command already occupies ~16 GB of memory using the quantized FP8 model:
python app.py --offload --name flux-dev-fp8
Is batch inference technically possible in this setup, and if so, could you provide pointers or help in adding this feature?
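For context, here is a rough sketch of the kind of micro-batching I have in mind: the user picks a total batch size in the UI, and the app splits it into chunks small enough to fit in VRAM, running one forward pass per chunk. The `generate` callable and the `max_chunk` value are placeholders, not part of the actual app:

```python
def chunk_sizes(total: int, max_chunk: int) -> list[int]:
    """Split `total` requested images into chunks of at most `max_chunk`."""
    full, rem = divmod(total, max_chunk)
    return [max_chunk] * full + ([rem] if rem else [])

def batched_generate(generate, batch_size: int, max_chunk: int = 2):
    """Run `generate(n)` per chunk so peak VRAM stays bounded by `max_chunk`.

    `generate` stands in for whatever single-call inference function the
    app exposes; it is assumed to return a list of `n` images.
    """
    images = []
    for n in chunk_sizes(batch_size, max_chunk):
        images.extend(generate(n))
    return images

# Stand-in generator for illustration only:
fake = lambda n: [f"image-{i}" for i in range(n)]
print(len(batched_generate(fake, 5, max_chunk=2)))  # 5 images, in chunks of 2, 2, 1
```

Whether `max_chunk` can be larger than 1 at 24 GB presumably depends on how much of the remaining ~8 GB the extra latents and activations consume, which is part of what I'm asking about.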