2 changes: 2 additions & 0 deletions .gitignore
@@ -76,3 +76,5 @@ python-embed/

# Downloaded during CI for Windows NSIS installer
resources/vc_redist.x64.exe

reference_projects/
11 changes: 11 additions & 0 deletions .vscode/launch.json
@@ -1,6 +1,17 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Run: pnpm dev",
"type": "node",
"request": "launch",
"runtimeExecutable": "pnpm",
"runtimeArgs": [
"dev"
],
"cwd": "${workspaceFolder}",
"console": "integratedTerminal"
},
{
"name": "Attach: Electron Main",
"type": "node",
6 changes: 5 additions & 1 deletion backend/ltx2_server.py
@@ -181,9 +181,13 @@ def _resolve_force_api_generations() -> bool:
return force_api_generations


import os

FORCE_API_GENERATIONS = _resolve_force_api_generations()
_BYPASS_MODEL_CHECK = os.environ.get("LTX_BYPASS_MODEL_CHECK") == "1"

REQUIRED_MODEL_TYPES: frozenset[ModelFileType] = (
-    frozenset() if FORCE_API_GENERATIONS else DEFAULT_REQUIRED_MODEL_TYPES
+    frozenset() if (FORCE_API_GENERATIONS or _BYPASS_MODEL_CHECK) else DEFAULT_REQUIRED_MODEL_TYPES
)

CAMERA_MOTION_PROMPTS = {
4 changes: 4 additions & 0 deletions backend/runtime_config/runtime_policy.py
@@ -1,10 +1,14 @@
"""Runtime policy decisions for forced API mode."""

from __future__ import annotations
import os


def decide_force_api_generations(system: str, cuda_available: bool, vram_gb: int | None) -> bool:
"""Return whether API-only generation must be forced for this runtime."""
if os.environ.get("LTX_BYPASS_API_CHECK") == "1":
return False

if system == "Darwin":
return True

57 changes: 57 additions & 0 deletions change-requests/CR001-comfyui-integration/HLD.md
@@ -0,0 +1,57 @@
# High-Level Design (HLD): ComfyUI Integration

## 1. Overview

This High-Level Design details the integration of a ComfyUI engine into the LTX-Desktop application. As determined in the Impact Assessment, this integration follows **Option B: Proxy-Based Metadata Mapping**.

The core architectural philosophy is **Additive Isolation**: the ComfyUI integration will be built *on top* of the existing local generation capabilities without modifying their underlying logic. This minimizes merge conflicts with the upstream fork and ensures the default native GPU experience is preserved.

## 2. Core Components

### 2.1. App Settings and API Types (Additive)

* **`AppSettings`**: A new setting, `generation_backend` (Literal: `"local" | "comfyui"`), will be added to dictate the routing logic.
* **`api_types.py`**: Generation request payloads (e.g., `VideoGenerationRequest`) will be extended with an optional `workflow_params: dict[str, Any] | None` to pass dynamic proxy widget values from the UI to the backend.

### 2.2. State Management (`AppState`)

To avoid disrupting the highly tuned local `GpuSlot` management:
* A new state slot, **`ComfyUIJobSlot`**, will be introduced in `AppState`.
* The centralized `AppHandler` lock will protect this new slot exactly as it protects the `GpuSlot`.

### 2.3. ComfyUI Service Module

A new, isolated module (`backend/services/comfyui/`) will encapsulate all ComfyUI-specific logic:

1. **`WorkflowParser`**:
* Reads predefined ComfyUI JSON workflows.
* Extracts the `proxyWidgets` metadata to identify which internal node parameters are exposed to the UI.
2. **`ComfyUIClient`**:
* Handles HTTP communication with the ComfyUI server (e.g., `/prompt`, `/upload/image`, `/history`).
* Manages WebSocket connections (if required) for real-time progress updates.
3. **`ComfyUIPipelineAdapters`**:
* Implements the existing strictly-typed protocols (e.g., `FastVideoPipeline`).
* Translates the incoming `VideoGenerationRequest` (including `workflow_params`) into the final execution graph JSON.
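The `WorkflowParser`'s discovery step can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name `extract_proxy_widgets` and the exact shape of the workflow dictionary are assumptions based on the `proxyWidgets` convention described above.

```python
from typing import Any


def extract_proxy_widgets(workflow: dict[str, Any]) -> list[tuple[str, str]]:
    """Collect (node_id, widget_name) pairs exposed via proxyWidgets metadata."""
    exposed: list[tuple[str, str]] = []
    for node in workflow.get("nodes", []):
        # proxyWidgets lives in the node's properties dict, as mapping tuples.
        for node_id, widget in node.get("properties", {}).get("proxyWidgets", []):
            exposed.append((str(node_id), widget))
    return exposed


# Hypothetical workflow snippet shaped after the proxyWidgets convention.
workflow = {
    "nodes": [
        {"id": 5, "properties": {"proxyWidgets": [["31", "seed"], ["31", "steps"]]}},
        {"id": 7, "properties": {}},
    ]
}
print(extract_proxy_widgets(workflow))  # [('31', 'seed'), ('31', 'steps')]
```

The parser would run once per workflow file at startup, and the resulting list becomes the dynamic parameter schema served to the frontend.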

### 2.4. Generation Handler Routing

The `GenerationHandler` will act as a router based on the `generation_backend` setting:

* **If `"local"`**: The handler proceeds normally, acquiring the `GpuSlot` and delegating to the native `services.video_processor`.
* **If `"comfyui"`**: The handler bypasses the `GpuSlot`, acquires the `ComfyUIJobSlot`, and delegates to the `ComfyUIPipelineAdapter`.
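The routing decision reduces to a small lookup. A sketch, with slot and pipeline identifiers as illustrative strings rather than the real types:

```python
from dataclasses import dataclass


@dataclass
class Route:
    slot: str      # which AppState slot the handler must acquire
    pipeline: str  # which pipeline implementation receives the request


ROUTES = {
    "local": Route(slot="gpu", pipeline="services.video_processor"),
    "comfyui": Route(slot="comfyui_job", pipeline="ComfyUIPipelineAdapter"),
}


def route_for(generation_backend: str) -> Route:
    try:
        return ROUTES[generation_backend]
    except KeyError:
        raise ValueError(f"Unknown generation_backend: {generation_backend!r}") from None
```

Keeping the mapping table-driven means adding a third backend later touches one dictionary entry rather than the handler's control flow.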

### 2.5. Progress Translation

To ensure the frontend requires zero changes to its progress tracking logic:
* The `ComfyUIPipelineAdapter` will spawn a background polling task (using the existing `TaskRunner`).
* This task will translate ComfyUI's native execution progress into the exact `GenerationProgress` (e.g., `GenerationRunning`, `GenerationComplete`) state objects expected by `AppState`.
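The translation loop can be sketched like this. The status payload keys (`completed`, `value`, `max`) and the dictionary-shaped progress updates are simplifying assumptions; the real task would emit the typed `GenerationProgress` objects and be driven by the `TaskRunner`:

```python
import time
from typing import Any, Callable


def poll_progress(
    fetch_status: Callable[[], dict[str, Any]],
    publish: Callable[[dict[str, Any]], None],
    interval_s: float = 0.5,
) -> None:
    """Translate ComfyUI-style status payloads into GenerationProgress-shaped updates.

    fetch_status and publish are injected so the real client (or a test)
    supplies them.
    """
    while True:
        status = fetch_status()
        if status.get("completed"):
            publish({"state": "GenerationComplete"})
            return
        publish({
            "state": "GenerationRunning",
            "progress": status.get("value", 0) / max(status.get("max", 1), 1),
        })
        time.sleep(interval_s)
```

Because the poller only ever emits the existing state objects, the frontend's progress bar logic never learns a new backend exists.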

## 3. Architectural Flow (ComfyUI Active)

1. **UI Configuration**: Frontend fetches available workflows via a new endpoint (parsed by `WorkflowParser`) and dynamically renders controls for the exposed `proxyWidgets`.
2. **Submission**: User clicks generate. Frontend sends `VideoGenerationRequest` including `workflow_params`.
3. **Routing**: `GenerationHandler` sees `generation_backend == "comfyui"`.
4. **Locking**: Handler acquires lock -> sets `ComfyUIJobSlot` to running -> unlocks.
5. **Execution**: `ComfyUIPipelineAdapter` constructs the final JSON graph and sends it to the `ComfyUIClient`.
6. **Progress**: Background task polls ComfyUI, locking briefly to update `ComfyUIJobSlot` progress.
7. **Completion**: Adapter retrieves the final media from ComfyUI, saves it locally, and updates state to `GenerationComplete`.
76 changes: 76 additions & 0 deletions change-requests/CR001-comfyui-integration/IMPLEMENTATION_PLAN.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,76 @@
# Implementation Plan: ComfyUI Integration

This plan outlines the steps required to implement the ComfyUI integration described in the High-Level Design (HLD), adhering to the principles of "Additive Isolation" to minimize fork impact.

## Phase 1: Foundation (Settings & API Types)

**Goal:** Extend the core data structures to support routing and dynamic payload parameters without breaking existing schemas.

1. **Update Settings:**
* Modify `settings.json` and `backend/state/app_settings.py` to add `generation_backend` (defaulting to `"local"`).
2. **Update API Types:**
* In `backend/api_types.py`, add `workflow_params: dict[str, Any] | None = None` to generation requests (e.g., `VideoGenerationRequest`).
3. **Update State Types:**
* In `backend/state/app_state_types.py`, define a new `ComfyUIJobSlot` (tracking status, progress, current job ID).
* Add `comfyui_job: ComfyUIJobSlot | None` to the root `AppState` definition.
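The additive shape of these changes can be sketched with plain dataclasses (the real models are presumably Pydantic or similar; field names other than `generation_backend` and `workflow_params` are illustrative):

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Literal


@dataclass
class AppSettings:
    # Existing fields elided; the new field defaults to current behaviour.
    generation_backend: Literal["local", "comfyui"] = "local"


@dataclass
class VideoGenerationRequest:
    prompt: str
    seed: int = 0
    # New optional field; stays None for the local backend, so existing
    # callers and serialized payloads are unaffected.
    workflow_params: dict[str, Any] | None = None
```

Defaulting both new fields keeps every existing request and settings file valid without migration.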

## Phase 2: Core ComfyUI Services

**Goal:** Create the isolated module (`backend/services/comfyui/`) for parsing workflows and communicating with the ComfyUI server.

1. **Create Service Directory:** Initialize `backend/services/comfyui/`.
2. **Implement `WorkflowParser`:**
* Create logic to read JSON workflows from a designated directory.
* Extract `proxyWidgets` metadata from node properties.
3. **Implement `ComfyUIClient`:**
* Create an asynchronous HTTP client to communicate with the ComfyUI API (`/prompt`, `/upload/image`, `/history`, `/view`).
4. **Expose Workflows Endpoint:**
* Create a new route in `backend/_routes/` (e.g., `workflows.py`) to expose the parsed workflows and their configurable parameters to the frontend.
* Wire the route into `app_factory.py`.
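A minimal sketch of the client's surface, with the transport injected so tests can substitute a fake. The endpoint path and response shape follow the common ComfyUI `/prompt` convention, but treat them as assumptions to verify against the target server:

```python
import json
from typing import Any, Callable


class ComfyUIClient:
    """Sketch of the HTTP surface the integration needs."""

    def __init__(self, base_url: str, post: Callable[[str, bytes], bytes]) -> None:
        self.base_url = base_url.rstrip("/")
        self._post = post  # injected transport so tests can use a fake

    def queue_prompt(self, graph: dict[str, Any]) -> str:
        """POST the execution graph to /prompt and return the queued prompt id."""
        body = json.dumps({"prompt": graph}).encode()
        raw = self._post(f"{self.base_url}/prompt", body)
        return json.loads(raw)["prompt_id"]
```

The real client would be asynchronous and add `/upload/image`, `/history`, and `/view` methods following the same pattern.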

## Phase 3: Pipeline Adapters and Progress Tracking

**Goal:** Implement the adapter that bridges the strictly typed LTX-Desktop protocols with the dynamic ComfyUI engine.

1. **Implement `ComfyUIVideoPipeline`:**
* Create a class implementing the `FastVideoPipeline` (or relevant) protocol.
* Implement graph construction: merge the base JSON workflow with the incoming `workflow_params`.
2. **Implement Background Polling:**
* Use the existing `TaskRunner` to spawn a background task upon job submission.
* Poll the ComfyUI server for progress on the submitted `prompt_id`.
* Translate the progress into standard `GenerationProgress` state objects.
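The graph-construction step of the adapter can be sketched as a pure overlay function. The node-id-keyed graph layout mirrors ComfyUI's API-format JSON, and `proxy_map` is assumed to come from the `WorkflowParser`'s discovery pass:

```python
import copy
from typing import Any


def build_graph(
    base_workflow: dict[str, Any],
    proxy_map: dict[str, tuple[str, str]],
    workflow_params: dict[str, Any],
) -> dict[str, Any]:
    """Overlay user-supplied workflow_params onto a copy of the base graph.

    proxy_map maps a UI parameter name to a (node_id, input_name) pair
    discovered from proxyWidgets metadata.
    """
    graph = copy.deepcopy(base_workflow)  # never mutate the cached base workflow
    for name, value in workflow_params.items():
        node_id, input_name = proxy_map[name]
        graph[node_id]["inputs"][input_name] = value
    return graph
```

Keeping this a side-effect-free function makes it trivial to unit-test without a running ComfyUI server.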

## Phase 4: Handler Routing & Locking Integration

**Goal:** Update the centralized handler to securely route tasks based on the active backend.

1. **Update `GenerationHandler`:**
* In `backend/handlers/generation_handler.py`, read `generation_backend` from settings.
* Implement branching logic:
* If `"local"`: Use `GpuSlot` and standard pipeline logic (existing code).
* If `"comfyui"`: Acquire lock, validate `ComfyUIJobSlot` is idle, set to running, release lock, and dispatch to `ComfyUIVideoPipeline`.
2. **Ensure Lock Safety:**
* Verify the "lock -> check -> unlock -> heavy work -> lock -> update" pattern is strictly followed for the new `ComfyUIJobSlot`.
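The required pattern can be sketched as follows; `ComfyUIJobSlot` here is a minimal stand-in tracking only a status string, and `run_comfyui_job` is a hypothetical helper name:

```python
import threading


class ComfyUIJobSlot:
    """Minimal stand-in for the new state slot (status only)."""

    def __init__(self) -> None:
        self.status = "idle"


def run_comfyui_job(lock, slot: ComfyUIJobSlot, heavy_work) -> None:
    # lock -> check (fast): claim the slot or bail out
    with lock:
        if slot.status != "idle":
            raise RuntimeError("a ComfyUI job is already running")
        slot.status = "running"
    # unlock -> heavy work: the slow part runs with no lock held,
    # so other handlers can still read/update AppState
    heavy_work()
    # lock -> update (fast): publish the final state
    with lock:
        slot.status = "complete"
```

The lock is held only for the two brief state transitions, never across the generation itself, matching the existing `GpuSlot` discipline.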

## Phase 5: Frontend Integration

**Goal:** Update the React frontend to dynamically render UI elements based on the parsed ComfyUI workflows.

1. **Backend Toggle:** Add a UI toggle in the settings to switch between Local and ComfyUI backends.
2. **Fetch Workflows:** On mount (if ComfyUI is active), fetch the available workflows from the new backend endpoint.
3. **Dynamic Rendering:**
* Parse the returned proxy widget schemas.
* Dynamically render sliders, dropdowns, and text inputs based on the expected types of the proxy widgets.
4. **Submission Logic:** Update the `backendFetch` calls for generation to include the user-configured `workflow_params` dictionary.

## Phase 6: Testing and Validation

**Goal:** Ensure the integration is robust and the local pipeline remains unaffected.

1. **Backend Integration Tests:**
* Create new tests in `backend/tests/` using fakes for the `ComfyUIClient`.
* Verify routing logic works correctly based on settings.
2. **Type Checking:**
* Run `pnpm typecheck` to ensure the new dynamic dictionaries haven't violated strict mode rules elsewhere.
3. **Local Regression:**
* Run existing `backend:test` suite to guarantee standard local generation is completely isolated and functional.
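A fake for the client might look like this; the class and method names mirror the hypothetical `ComfyUIClient` sketch rather than any existing test utility:

```python
class FakeComfyUIClient:
    """In-memory double standing in for the real HTTP client in tests."""

    def __init__(self) -> None:
        self.submitted: list[dict] = []

    def queue_prompt(self, graph: dict) -> str:
        self.submitted.append(graph)  # record instead of hitting the network
        return "fake-prompt-id"


def test_submission_records_graph() -> None:
    client = FakeComfyUIClient()
    prompt_id = client.queue_prompt({"31": {"inputs": {"seed": 42}}})
    assert prompt_id == "fake-prompt-id"
    assert client.submitted == [{"31": {"inputs": {"seed": 42}}}]


test_submission_records_graph()
```

Because the adapter only depends on the client's interface, swapping in the fake exercises the full routing and graph-construction path offline.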
@@ -0,0 +1,94 @@
# Impact Assessment: Replacing Backend Generation with Configurable ComfyUI Workflows

## 1. Executive Summary

This document assesses the architectural impact of integrating a ComfyUI backend alongside the current local generative pipelines. The goal is to allow dynamic, runtime switching between the existing local GPU implementations and a configurable ComfyUI workflow engine.

Based on recent analysis of reference projects (Krita AI Diffusion and Vlo), this document focuses on clarifying the architectural options for procedural node discovery and UI mapping. The objective of this phase is to review these options prior to committing to a final architectural decision.

---

## 2. Current Architecture Overview

The backend uses a local FastAPI server where endpoints delegate business logic to a centralized `AppHandler`.

- **State Management:** A highly normalized, typed `AppState` manages limited resources (e.g., `GpuSlot`, `CpuSlot`, `DownloadingSession`).
- **Concurrency & Locking:** A single shared `RLock` protects `AppState`. Handlers follow a strict "lock -> check -> unlock -> heavy work -> lock -> update" pattern to prevent blocking the server during generation.
- **Service Boundaries:** Heavy generative tasks are isolated behind strictly typed Python Protocols in `backend/services/` (e.g., `FastVideoPipeline`).
- **Generation Lifecycle:** `GenerationHandler` tracks progress using normalized state machines (`GenerationRunning`, `GenerationComplete`, etc.).

---

## 3. Proposed Architecture Options for UI-to-Node Mapping

A core challenge is how the frontend UI (sliders, dropdowns for models/LoRAs) dynamically maps to and controls the underlying ComfyUI node graph. Two distinct architectural options have been identified based on industry reference projects.

### Option A: Functional Mapping via Dedicated Custom Nodes (The Krita Approach)

In this approach, the integration relies on abstracting the low-level node graph into a high-level Python API, tightly coupled with a dedicated suite of custom ComfyUI nodes.

* **Mechanism:**
* Relies on the ComfyUI `/object_info` API to discover available nodes and their schemas.
* The backend implements a "Builder Pattern" (`ComfyWorkflow`) that translates high-level UI requests (e.g., `generate_video(prompt, seed)`) into specific node instantiations.
* **Crucial Prerequisite:** Requires installing a dedicated ComfyUI extension with custom nodes (e.g., `LTX_LoadImage`, `LTX_InjectVideo`) designed specifically to bridge the communication gap (e.g., handling in-memory transfers or specific app logic).
* **Pros:**
* **Strong Type Safety:** The builder validates inputs/outputs against the `object_info` schema before execution.
* **Simpler UI Logic:** The UI only interacts with high-level parameters; the backend handles the complex graph construction.
* **Cons:**
* **High Maintenance:** Tightly coupled to specific custom nodes. Any changes to the node logic require updating the backend builder.
* **Less Flexible:** Harder for users to drop in arbitrary, wildly different ComfyUI workflows without backend updates.

### Option B: Proxy-Based Metadata Mapping (The Vlo Approach)

This approach is workflow-agnostic and relies on embedding UI mapping rules directly within the ComfyUI workflow JSON metadata.

* **Mechanism:**
* Relies on a custom metadata field, specifically the established ComfyUI convention `proxyWidgets`, located within the `properties` dictionary of Subgraphs (Group Nodes) or individual nodes.
* The workflow JSON explicitly defines mapping tuples (e.g., `["node_id_31", "seed"]`).
* The backend parses these JSON files at startup, discovers the exposed parameters, and serves this dynamic schema to the frontend.
* **Crucial Prerequisite:** Requires zero dedicated custom nodes. It interfaces with standard ComfyUI nodes and relies entirely on the JSON metadata structure.
* **Pros:**
* **Highly Decoupled & Workflow Agnostic:** The UI structure is defined *by* the graph. Users can drop in entirely new workflows (using standard nodes) as long as they tag the inputs with `proxyWidgets`.
* **Minimal Backend Logic:** The backend acts primarily as a pass-through and normalizer, rather than a complex graph builder.
* **Cons:**
* **Weaker Typing:** Relies on dynamic dictionaries passing through the backend, requiring careful validation logic.
* **Complex JSON Maintenance:** The burden of defining the UI shifts to whoever authors the ComfyUI JSON workflows; they must correctly set up the `proxyWidgets` arrays.
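For illustration, a Subgraph node tagged with the `proxyWidgets` convention might carry metadata like the following (node ids and widget names are invented for this example):

```json
{
  "id": 12,
  "type": "Subgraph",
  "properties": {
    "proxyWidgets": [
      ["31", "seed"],
      ["31", "steps"],
      ["14", "text"]
    ]
  }
}
```

Each tuple names a node and one of its widgets; the backend turns this list into the parameter schema the UI renders.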

---

## 4. Architectural Impact on LTX-Desktop Backend (Option B Selected)

Following review, **Option B (Proxy-Based Metadata Mapping)** has been selected. The primary directive for this integration is to **maximize out-of-the-box compatibility** while ensuring a **minimal fork update impact**.

Crucially, the ComfyUI integration must be added *on top* of existing services. Current LTX Desktop functionality (local, native GPU generation) must be fully retained and operate exactly as before when ComfyUI is not active. By isolating the new ComfyUI logic, we ensure that upstream merges from the original `LTX-Desktop` repository remain trivial.

### 4.1. Core Principle: Isolation for Minimal Merge Conflicts
To minimize merge conflicts when pulling from the upstream fork, the ComfyUI integration will avoid heavily modifying existing core files (like `app_handler.py` or complex state machines) wherever possible. Instead, it will rely on new interface implementations and isolated modules.

* **New Modules:** All ComfyUI-specific parsing, proxy mapping, and network communication will live in a strictly separated directory (e.g., `backend/services/comfyui/`).
* **Interface Implementation:** The ComfyUI engine will implement the existing pipeline interfaces (e.g., creating a `ComfyUIVideoPipeline` that adheres to the same Protocol as `FastVideoPipeline`). This allows the core `GenerationHandler` to treat it as just another backend without knowing its internal complexities.

### 4.2. AppSettings and API Types (Additive Changes)
Changes to core API files will be strictly additive, preserving all existing schemas.
* **`app_settings.py`:** Add a new, optional `generation_backend` flag (defaulting to `local`).
* **`api_types.py`:** Add a new, optional `workflow_params: dict[str, Any] | None = None` to the generation request models to handle the dynamic `proxyWidgets` inputs. Existing strictly-typed fields (prompt, seed, etc.) remain untouched and can be mapped internally if ComfyUI is the active backend.

### 4.3. State Management (`AppState` & `GpuSlot`)
The existing `GpuSlot` logic, which carefully manages local VRAM, must remain untouched to ensure the default local generation experience is not compromised.
* **Additive State:** A new, distinct slot (e.g., `ExternalSlot` or `ComfyUIJobSlot`) will be introduced to `AppState`.
* **Locking:** The `AppHandler` will use the same locking mechanism to protect this new slot, ensuring thread safety without needing to rewrite the existing local GPU locking logic. If the active backend is ComfyUI, the handler checks the `ExternalSlot` instead of the `GpuSlot`.

### 4.4. Generation Handler and Progress Tracking
The existing `GenerationHandler` relies heavily on the local process actively reporting progress.
* **Adapter Pattern:** A `ComfyUIAdapter` service will act as a bridge. When a generation task is dispatched to ComfyUI, the adapter will use the `TaskRunner` to spawn a background polling task.
* **State Translation:** This polling task will fetch ComfyUI's native progress (via its HTTP API/WebSocket) and translate it into the *exact same* `GenerationProgress` state objects currently expected by the frontend. This ensures the frontend UI requires minimal, if any, modifications to display progress bars.

---

## 5. Conclusion

By selecting **Option B (Proxy-Based Metadata Mapping)**, we achieve a highly flexible, workflow-agnostic system.

By applying a strict **"Additive Isolation"** architectural lens, we ensure that:
1. **Original functionality is preserved:** Local generation remains completely unaffected and acts as the default.
2. **Upstream merges are trivial:** By avoiding modifications to core logic loops and instead adding new interface implementations and isolated service folders, we minimize the "fork divergence," allowing the project to easily consume future updates from the original Lightricks repository.