
🚀 Feat: DataGenFlow extensibility system via dgf CLI #67

Merged
nicofretti merged 14 commits into develop from 51-feat-datagenflow-extensibility-system-via-dgf-cli
Mar 8, 2026

Conversation


@nicofretti nicofretti commented Mar 6, 2026

Description

TODO

Related Issue

Checklist

  • Code follows project style guidelines
  • Comments explain "why" not "what"
  • Documentation updated (if needed)
  • No debug code or console statements
  • make format passes
  • make pre-merge passes
  • PR update from develop branch
  • CodeRabbit review requested and addressed (comment @coderabbitai review)

Summary by CodeRabbit

  • New Features

    • Extensions system: UI, API, and CLI to browse/manage custom blocks & templates, validate/install dependencies, and create pipelines from templates.
    • Hot-reload for user extensions with on-demand dependency installation.
    • Reviewer-friendly FieldMapper workflow to surface reviewer-relevant fields as top-level keys.
  • Documentation

    • New extensibility guide, setup examples, updated env example, and reviewer guidance.
  • Tests

    • Extensive unit, integration, and E2E coverage for extensions, CLI, dependency manager, and file-watcher.
  • Chores

    • Docker/compose updates to support extensibility and hot-reload.

@nicofretti nicofretti linked an issue Mar 6, 2026 that may be closed by this pull request
@nicofretti nicofretti self-assigned this Mar 6, 2026

coderabbitai bot commented Mar 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.


Walkthrough

Adds an extensibility system: user block/template discovery, hot-reload file watcher, dependency manager and installer, Typer CLI (dgf), backend /api/extensions endpoints and Pydantic entities, frontend Extensions UI + client/types, registry/template refactors, many tests, docs, and review-flattening guidance for FieldMapper.

Changes

  • Backend: Extensions API & Entities (app.py, lib/api/extensions.py, lib/entities/extensions.py): new /api/extensions endpoints (status, blocks, templates, reload, validate, dependencies, install-deps); Pydantic models BlockInfo/TemplateInfo/DependencyInfo; app includes the extensions router and ExtensionFileWatcher lifecycle.
  • Registry & Blocks Core (lib/blocks/registry.py, lib/blocks/base.py, lib/blocks/builtin/...structure_sampler.py, lib/constants.py): registry refactor to BlockEntry with source/available/error; runtime register/reload/unregister; list_blocks → list[BlockInfo]; BaseBlock.dependencies added; StructureSampler uses private _field_deps; added DEFAULT_BLOCKS_PATH/DEFAULT_TEMPLATES_PATH.
  • Dependency Management & Installer (lib/dependency_manager.py, lib/cli/*, tests/test_dependency_manager.py): new DependencyManager (parse/check/install), DependencyError, synchronous installer run in a thread, dependency info APIs; CLI and client wiring to install/check dependencies; tests added.
  • File Watcher / Hot Reload (lib/file_watcher.py, .env.example, docker/Dockerfile, docker/docker-compose.yml): ExtensionFileWatcher with debounced handlers for blocks/templates; env-driven hot reload; Docker compose updated to mount user_blocks/user_templates and enable hot reload.
  • CLI Tooling & Client (lib/cli/main.py, lib/cli/client.py, pyproject.toml): adds the dgf Typer CLI (blocks/templates/image/config), the DataGenFlowClient HTTP wrapper, and a script entrypoint.
  • Template System (lib/templates/__init__.py, tests/...template*): TemplateRegistry now loads builtin + user templates, tracks source, supports register/unregister/reload, returns TemplateInfo; tests updated to attribute access.
  • Frontend: Extensions UI & Client (frontend/src/pages/Extensions.tsx, frontend/src/services/extensionsApi.ts, frontend/src/types/index.ts, frontend/src/components/.../BlockPalette.tsx, frontend/src/App.tsx): new Extensions page and route, extensionsApi client and types, BlockPalette shows availability/source/errors with tooltips, per-item actions (validate/install-deps/create-pipeline).
  • Tests — unit/integration/e2e/cli (tests/..., many files): large battery of new and updated tests covering registry, dependency manager, file watcher, CLI, integration API, and E2E UI/API.
  • Docs & Skills (docs/extensibility.md, .claude/skills/*/SKILL.md, .env.example): new extensibility documentation; skill docs updated to recommend FieldMapper flattening for the Review UI; .env.example documents DATAGENFLOW_* env vars.
  • Misc / Wiring (app.py, lib/workflow.py, docker/*, pyproject.toml): app lifespan starts/stops ExtensionFileWatcher; workflow uses registry.list_block_types(); Docker service rename and mounts; added runtime deps (watchdog, typer, rich).

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as Backend API
    participant Registry as BlockRegistry
    participant DepMgr as DependencyManager
    participant System as Host

    Client->>API: POST /api/extensions/blocks/{name}/validate
    API->>Registry: get_entry(name)
    Registry-->>API: BlockEntry / 404
    alt Block Found
        API->>DepMgr: check_missing(block.dependencies)
        DepMgr->>System: inspect installed packages
        System-->>DepMgr: installed versions
        DepMgr-->>API: missing list
        alt No missing deps
            API-->>Client: 200 {valid: true, block: name}
        else Missing deps
            API-->>Client: 200 {valid: false, error: "...missing deps..."}
        end
    else Not Found
        API-->>Client: 404
    end
sequenceDiagram
    participant Client
    participant API as Backend API
    participant DepMgr as DependencyManager
    participant System as Host
    participant FileWatcher
    participant FS as Filesystem

    Client->>API: POST /api/extensions/blocks/{name}/install-deps
    API->>DepMgr: install(requirements)
    DepMgr->>System: run "uv pip install ..." (subprocess)
    System-->>DepMgr: install result
    DepMgr-->>API: installed list / raises DependencyError
    API-->>Client: 200 {installed: [...]} or 500 error

    Note over FileWatcher,FS: On filesystem change (user_blocks/user_templates)
    FS->>FileWatcher: detect .py/.yaml change (debounced)
    FileWatcher->>Registry: reload()
    Registry-->>API: updated listings
    API-->>Client: subsequent GETs reflect updates

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

  • Description check (⚠️ Warning): the PR description is incomplete, with a TODO placeholder and only the Related Issue/Checklist sections filled; a detailed explanation of what the PR accomplishes is missing. Resolution: replace the TODO with a comprehensive description of the extensibility system implementation, its key features (CLI, Docker support, custom blocks/templates), and the changes made across backend/frontend/docs.
  • Out of Scope Changes check (⚠️ Warning): multiple .claude/skills/ files and LLM state files (tests/e2e/TODOS.md, llm/state-backend.md, llm/state-frontend.md) violate the CLAUDE.md guidelines prohibiting LLM-related files in commits. Resolution: remove the .claude/skills/ directory, tests/e2e/TODOS.md, llm/state-backend.md, and llm/state-frontend.md from the commit per CLAUDE.md policy.
  • Docstring Coverage (⚠️ Warning): docstring coverage is 45.19%, below the required 80.00% threshold. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check (✅ Passed): the title clearly summarizes the main change (implementing a DataGenFlow extensibility system via the dgf CLI), matching the primary objective of the PR.
  • Linked Issues check (✅ Passed): the code changes comprehensively implement the issue #51 objectives: DataGenFlow Docker image, dgf CLI for block/template registration, custom block/template support, and dependency management.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
lib/blocks/base.py (1)

14-45: ⚠️ Potential issue | 🟠 Major

Use immutable default for dependencies class attribute.

The dependencies: list[str] = [] default is shared across all subclasses that don't override it. While the current code doesn't mutate it, get_schema() returns the object by reference, creating risk if future code appends to it. Use an immutable class-level default and return a copy instead.

Proposed fix
-from typing import Any
+from typing import Any, ClassVar

 class BaseBlock(ABC):
@@
-    dependencies: list[str] = []
+    dependencies: ClassVar[tuple[str, ...]] = ()
@@
-            "dependencies": cls.dependencies,
+            "dependencies": list(cls.dependencies),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/blocks/base.py` around lines 14 - 45, The class-level mutable default
dependencies should be replaced with an immutable default and the get_schema
method should return a copy to avoid exposing internal state: change the
attribute declaration from a mutable list to an immutable type (e.g.,
dependencies: tuple[str, ...] = ()), and update get_schema (the method named
get_schema on the same class) to convert cls.dependencies to a new list when
building the schema (e.g., "dependencies": list(cls.dependencies)) so callers
get a copy and cannot mutate the class attribute.
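The hazard described above is easy to demonstrate in isolation. The classes below are hypothetical stand-ins, not the PR's BaseBlock hierarchy:

```python
class Plugin:
    # one list object, shared by every subclass that does not override it
    dependencies: list[str] = []


class ABlock(Plugin):
    pass


class BBlock(Plugin):
    pass


# mutating "ABlock's" list actually mutates the single shared object
ABlock.dependencies.append("pandas")
assert BBlock.dependencies == ["pandas"]           # leaked into the sibling
assert Plugin.dependencies is ABlock.dependencies  # same object throughout
```

With the proposed `ClassVar[tuple[str, ...]]` default, the `append` above would instead fail loudly with an AttributeError, since tuples are immutable.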
🟡 Minor comments (13)
tests/e2e/TODOS.md-1-295 (1)

1-295: ⚠️ Potential issue | 🟡 Minor

Remove this unrelated review dump from tests/e2e/.

This is a pasted branch review for 63-feat-add-datagenflow-skills, not actionable E2E TODOs for the current suite. Keeping it in-tree will confuse contributors and go stale immediately.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/TODOS.md` around lines 1 - 295, The file containing the pasted
branch review (the TODOs file that begins "We have to address a pr review (57)"
in the e2e docs) is unrelated to tests and must be removed or moved out of the
tests tree; delete this file from the tests/e2e directory (or relocate it to a
docs/reviews or project-internal notes location) and update any README or CI
references that expect only E2E artifacts there so contributors won't be
confused by stale branch-review dumps.
tests/blocks/test_base_dependencies.py-28-50 (1)

28-50: ⚠️ Potential issue | 🟡 Minor

Add a regression for dependencies aliasing.

These assertions only prove value equality. They still pass if BaseBlock.dependencies, NoDepsBlock.dependencies, and schema["dependencies"] all share the same mutable list, which is the failure mode here. Add a test that mutating one does not affect the others.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/blocks/test_base_dependencies.py` around lines 28 - 50, The tests
currently only check value equality for BaseBlock.dependencies,
NoDepsBlock.dependencies, and schema["dependencies"], which misses aliasing of
the same mutable list; add a regression test that mutates one of these lists
(e.g., append or pop on BaseBlock.dependencies or the schema returned by
WithDepsBlock.get_schema()) and assert the other lists
(NoDepsBlock.dependencies, WithDepsBlock.dependencies, and a freshly fetched
schema via get_schema()) are unchanged to ensure they are independent copies and
not aliased; reference BaseBlock, NoDepsBlock, WithDepsBlock, get_schema, and
dependencies when locating where to add this new test.
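A sketch of such a regression test, assuming the tuple-default fix from the lib/blocks/base.py comment above; the classes here are minimal stand-ins, not the repo's real blocks:

```python
class BaseBlock:
    dependencies: tuple[str, ...] = ()  # immutable class-level default

    @classmethod
    def get_schema(cls) -> dict:
        # return a fresh list so callers cannot mutate class state
        return {"dependencies": list(cls.dependencies)}


class WithDepsBlock(BaseBlock):
    dependencies = ("pandas>=2.0",)


def test_get_schema_returns_independent_copy():
    schema = WithDepsBlock.get_schema()
    schema["dependencies"].append("mutated")  # tamper with the returned copy
    # the class attribute and a freshly fetched schema must be unchanged
    assert WithDepsBlock.dependencies == ("pandas>=2.0",)
    assert WithDepsBlock.get_schema()["dependencies"] == ["pandas>=2.0"]
```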
.claude/skills/using-datagenflow-extensibility/SKILL.md-101-115 (1)

101-115: ⚠️ Potential issue | 🟡 Minor

Rename LangfuseBlock to LangfuseDatasetBlock in the table.

The actual registered block is LangfuseDatasetBlock (defined in lib/blocks/builtin/langfuse.py and validated in tests/test_registry_regression.py). The table currently documents LangfuseBlock, which doesn't exist and will break sample YAML.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.claude/skills/using-datagenflow-extensibility/SKILL.md around lines 101 -
115, Update the documentation table entry that currently lists LangfuseBlock to
use the actual registered name LangfuseDatasetBlock; locate the table row
containing "LangfuseBlock" and change the block name to "LangfuseDatasetBlock"
(this matches the registered class LangfuseDatasetBlock used in the codebase and
tests such as test_registry_regression.py).
frontend/src/components/pipeline-editor/BlockPalette.tsx-11-11 (1)

11-11: ⚠️ Potential issue | 🟡 Minor

Replace any with proper type.

Per coding guidelines: "Reject any types in frontend - use proper types or unknown instead of any."

Proposed fix
-  config_schema: Record<string, any>;
+  config_schema: Record<string, unknown>;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/components/pipeline-editor/BlockPalette.tsx` at line 11, The
field config_schema currently uses Record<string, any> in BlockPalette.tsx;
replace the any with a stricter type (e.g., Record<string, unknown> or a
dedicated ConfigSchema interface/type) and update references to validate or
narrow values where used; add a new type alias or interface (e.g., ConfigSchema)
if needed and use it in place of config_schema to satisfy the "no any" rule
while preserving existing behavior.
frontend/src/pages/Extensions.tsx-17-37 (1)

17-37: ⚠️ Potential issue | 🟡 Minor

Add mounted flag to prevent state updates after unmount.

Per coding guidelines: "Use mounted flags for async operations in frontend - prevent state updates after component unmount."

Proposed fix
   useEffect(() => {
+    let mounted = true;
-    loadAll();
+    const load = async () => {
+      try {
+        const [s, b, t] = await Promise.all([
+          extensionsApi.getStatus(),
+          extensionsApi.listBlocks(),
+          extensionsApi.listTemplates(),
+        ]);
+        if (mounted) {
+          setStatus(s);
+          setBlocks(b);
+          setTemplates(t);
+        }
+      } catch (error) {
+        if (mounted) {
+          const message = error instanceof Error ? error.message : "Unknown error";
+          toast.error(`Failed to load extensions: ${message}`);
+        }
+      } finally {
+        if (mounted) setLoading(false);
+      }
+    };
+    load();
+    return () => { mounted = false; };
   }, []);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/pages/Extensions.tsx` around lines 17 - 37, The async loadAll
should avoid updating state after the component unmounts: create a mounted flag
(e.g., const mounted = useRef(true) set to false in useEffect cleanup) and in
loadAll check mounted.current before calling setStatus, setBlocks, setTemplates
and setLoading (or bail out early if not mounted); ensure the mounted flag is
referenced in the Promise resolution/error/finally paths so no state setters run
when mounted.current is false.
app.py-29-32 (1)

29-32: ⚠️ Potential issue | 🟡 Minor

Fix import sorting - pipeline failure.

Pipeline reports: "Import block is unsorted or un-formatted (I001)". The new imports need to be sorted with the existing import block.

Proposed fix
+from lib.api.extensions import router as extensions_router
+from lib.entities.extensions import BlockInfo, TemplateInfo
 from lib.errors import BlockExecutionError, BlockNotFoundError, ValidationError
+from lib.file_watcher import ExtensionFileWatcher
 from lib.job_processor import process_job_in_thread

Remove lines 29-31 and integrate them alphabetically into the lib.* import block starting at line 32.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@app.py` around lines 29 - 32, The import block is unsorted: remove the
separate lines importing router as extensions_router, BlockInfo, TemplateInfo,
ExtensionFileWatcher, BlockExecutionError, BlockNotFoundError, and
ValidationError and merge them alphabetically into the existing lib.* import
block (the import group beginning at the current lib.* block), ensuring imports
like "from lib.api.extensions import router as extensions_router", "from
lib.entities.extensions import BlockInfo, TemplateInfo", "from lib.file_watcher
import ExtensionFileWatcher", and "from lib.errors import BlockExecutionError,
BlockNotFoundError, ValidationError" are placed in proper alphabetical order
with the other lib imports and formatted consistently.
frontend/src/pages/Extensions.tsx-286-299 (1)

286-299: ⚠️ Potential issue | 🟡 Minor

Move fetch to service layer.

Per coding guidelines: "Reject inline fetch calls in frontend - all API calls must be in service layer, not in components."

The handleCreatePipeline uses inline fetch. Move this to a service or add to extensionsApi.

Proposed fix

Add to extensionsApi.ts:

async createPipelineFromTemplate(templateId: string): Promise<{ id: number; name: string }> {
  const response = await fetch(`${API_BASE}/pipelines/from_template/${encodeURIComponent(templateId)}`, {
    method: "POST",
  });
  if (!response.ok) throw new Error(`http ${response.status}`);
  return response.json();
}

Then update TemplateCard:

   const handleCreatePipeline = async () => {
     setCreating(true);
     try {
-      const res = await fetch(`/api/pipelines/from_template/${template.id}`, { method: "POST" });
-      if (!res.ok) throw new Error("Failed to create pipeline from template");
+      await extensionsApi.createPipelineFromTemplate(template.id);
       toast.success("Pipeline created from template");
       navigate("/pipelines");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/pages/Extensions.tsx` around lines 286 - 299, The inline fetch
in handleCreatePipeline should be moved to the service layer; add a new method
createPipelineFromTemplate(templateId: string) in extensionsApi (or the existing
API service) that POSTs to /pipelines/from_template/:id, checks response.ok and
returns response.json(), then replace the inline fetch in
TemplateCard/handleCreatePipeline to call
extensionsApi.createPipelineFromTemplate(template.id), handle success
(toast.success and navigate("/pipelines")), errors (toast.error using
error.message), and ensure setCreating(true/false) and error handling remain
unchanged.
lib/cli/main.py-566-566 (1)

566-566: ⚠️ Potential issue | 🟡 Minor

f-string without placeholders — pipeline failure.

Fix
-    console.print(f"\n[bold]Build from DataGenFlow repo root:[/bold]")
+    console.print("\n[bold]Build from DataGenFlow repo root:[/bold]")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/cli/main.py` at line 566, The console.print call uses an unnecessary
f-string with no placeholders which causes the pipeline to fail; locate the
console.print invocation (the line calling console.print(f"\n[bold]Build from
DataGenFlow repo root:[/bold]")) and change it to a normal string literal
(remove the leading f) so it's console.print("\n[bold]Build from DataGenFlow
repo root:[/bold]") ensuring no f-string is used when no interpolation is
required.
lib/cli/main.py-129-129 (1)

129-129: ⚠️ Potential issue | 🟡 Minor

Line too long — pipeline failure.

Fix
-    install_deps: bool = typer.Option(False, "--install-deps", help="Install block dependencies after adding"),
+    install_deps: bool = typer.Option(
+        False, "--install-deps", help="Install block dependencies after adding"
+    ),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/cli/main.py` at line 129, The line defining install_deps using
typer.Option is too long; split the arguments onto multiple lines or use
implicit line continuation to keep line length under the limit. Locate the
install_deps variable declaration (install_deps: bool = typer.Option(...)) and
reformat the call to typer.Option so each parameter (default, flag, help) is on
its own line or the whole assignment is broken into a wrapped expression,
ensuring the final line length complies with the pipeline limit and retains the
same parameter values.
lib/cli/main.py-172-172 (1)

172-172: ⚠️ Potential issue | 🟡 Minor

Line too long — pipeline failure.

Fix
-                console.print(f"[yellow]![/yellow] Block '{name}': {result.get('error', 'unknown error')}")
+                err = result.get('error', 'unknown error')
+                console.print(f"[yellow]![/yellow] Block '{name}': {err}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/cli/main.py` at line 172, The console.print f-string in lib/cli/main.py
is exceeding the max line length; refactor the long line that uses
console.print(f"[yellow]![/yellow] Block '{name}': {result.get('error', 'unknown
error')}") by building a shorter intermediate variable (e.g., message =
f"[yellow]![/yellow] Block '{name}': {error_msg}" where error_msg =
result.get('error', 'unknown error')) or by splitting the call into multiple
shorter segments before calling console.print, keeping the same output but
ensuring no line exceeds the length limit; update references to name and
result.get(...) accordingly.
tests/cli/test_commands.py-6-6 (1)

6-6: ⚠️ Potential issue | 🟡 Minor

Remove unused import — pipeline failure.

Path is imported but never used directly; tmp_path fixture provides paths.

Fix
-from pathlib import Path
 from unittest.mock import MagicMock, patch
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_commands.py` at line 6, Remove the unused import Path from the
top of tests/cli/test_commands.py; the tmp_path pytest fixture already supplies
a Path-like object for tests, so delete the "from pathlib import Path" import
(remove the "Path" symbol) to resolve the linter/pipeline failure and keep
imports minimal.
tests/cli/test_commands.py-66-73 (1)

66-73: ⚠️ Potential issue | 🟡 Minor

Remove unused get_client import — pipeline failure.

The import is only needed for the patch target string, not the symbol itself.

Fix
 @pytest.fixture
 def cli_app(mock_client):
     """import the app with mocked client"""
-    from lib.cli.main import app, get_client
+    from lib.cli.main import app

     # patch get_client to return our mock
     with patch("lib.cli.main.get_client", return_value=mock_client):
         yield app
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cli/test_commands.py` around lines 66 - 73, The test fixture cli_app
imports get_client unnecessarily causing a pipeline failure; change the import
to only bring in app (e.g., replace "from lib.cli.main import app, get_client"
with "from lib.cli.main import app") and keep the patch target string
"lib.cli.main.get_client" unchanged in the with patch(...) call so the mock
still applies; update the cli_app fixture accordingly and remove any other
references to the get_client symbol.
lib/templates/__init__.py-106-117 (1)

106-117: ⚠️ Potential issue | 🟡 Minor

Atomicity comment is misleading — two separate assignments.

The comment claims atomic swap via GIL, but lines 116-117 are two separate reference assignments. A concurrent reader could see updated _templates with stale _sources.

Fix: assign as tuple or use a single container
     def reload(self) -> None:
-        """re-scan builtin and user template directories.
-        builds into local dicts, then swaps atomically so concurrent readers
-        never see an empty or partial registry."""
+        """re-scan builtin and user template directories."""
         templates: dict[str, dict[str, Any]] = {}
         sources: dict[str, str] = {}
         self._load_builtin_into(templates, sources)
         if self.user_templates_dir.exists():
             self._load_user_into(self.user_templates_dir, templates, sources)
-        # atomic swap — GIL guarantees reference assignment is atomic
-        self._templates = templates
-        self._sources = sources
+        # swap both refs together (still not thread-safe for writes, but readers see consistent state)
+        self._templates, self._sources = templates, sources

Note: Tuple unpacking is also two assignments at bytecode level. For true atomicity, consider a single _state object holding both dicts, or document that concurrent writes require external locking.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/templates/__init__.py` around lines 106 - 117, The comment in reload() is
misleading because assigning self._templates and self._sources on separate lines
isn’t atomic and can expose readers to mismatched state; change the
implementation to store both dicts in a single container and assign that
container in one statement (e.g., create a single _state object/value that holds
templates and sources and replace self._state = new_state atomically), update
usages to read from the new single container (or alternatively add explicit
locking around reload and readers); reference the reload method and the
attributes _templates and _sources when making this change so all code paths are
updated to the new _state container.
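One shape the suggested single `_state` container could take; this is a sketch under assumed internals, not the PR's actual TemplateRegistry:

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class _RegistryState:
    templates: dict[str, dict[str, Any]] = field(default_factory=dict)
    sources: dict[str, str] = field(default_factory=dict)


class TemplateRegistry:
    def __init__(self) -> None:
        self._state = _RegistryState()

    def reload(self, templates: dict[str, dict[str, Any]], sources: dict[str, str]) -> None:
        # build the complete new state first, then publish it with a single
        # reference assignment: readers never see templates/sources mismatched
        self._state = _RegistryState(templates, sources)

    def list_templates(self) -> list[str]:
        state = self._state  # snapshot once; stable even if reload() races
        return [f"{name} ({state.sources.get(name, '?')})" for name in state.templates]
```

Readers take one snapshot of `_state` and work from it, so they always see a templates/sources pair that was built together; concurrent writes would still need external locking.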
🧹 Nitpick comments (12)
tests/test_file_watcher.py (1)

22-30: Replace fixed sleeps with deterministic waiting.

Both assertions depend on the timer thread firing within 150ms. That is fragile on slower CI workers and will produce intermittent failures unrelated to the debounce logic. Wait on a threading.Event or poll until timeout instead.

Also applies to: 41-47

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_file_watcher.py` around lines 22 - 30, The test relies on
time.sleep to wait for DebouncedHandler's timer which is flaky; replace the
fixed sleeps in tests around DebouncedHandler._schedule_callback (lines
exercising test_path "modified") with a deterministic wait: either have the test
provide a threading.Event that the callback sets and wait on event.wait(timeout)
or poll a shared condition (e.g., call_count) in a loop with a timeout until it
reaches the expected value; apply the same change to the other occurrence (lines
~41-47) so the assertion waits deterministically instead of sleeping for 150ms.
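A deterministic version of that wait, sketched against a minimal debounce handler (a stand-in for illustration, not the code in lib/file_watcher.py):

```python
import threading


class DebouncedHandler:
    def __init__(self, callback, delay: float = 0.05) -> None:
        self._callback = callback
        self._delay = delay
        self._timer: threading.Timer | None = None

    def notify(self, path: str) -> None:
        if self._timer is not None:
            self._timer.cancel()  # coalesce a burst of events into one call
        self._timer = threading.Timer(self._delay, self._callback, args=(path,))
        self._timer.start()


def test_debounce_fires_once_deterministically():
    fired = threading.Event()
    calls: list[str] = []

    def on_change(path: str) -> None:
        calls.append(path)
        fired.set()

    handler = DebouncedHandler(on_change, delay=0.01)
    for _ in range(5):
        handler.notify("block.py")  # rapid burst, should debounce to one call
    # wait on the event with a generous timeout instead of sleeping 150ms
    assert fired.wait(timeout=2.0)
    assert calls == ["block.py"]
```

The generous `wait(timeout=2.0)` tolerates slow CI workers while still returning as soon as the callback fires, so the test is both fast and non-flaky.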
tests/test_template_regression.py (1)

53-56: Consider more descriptive test name.

Per test naming guidelines (test_<method>_<scenario>_<expected>):

-def test_template_registry_custom_dir(tmp_path):
+def test_list_templates_empty_dir_returns_empty_list(tmp_path):
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_template_regression.py` around lines 53 - 56, Rename the test
function test_template_registry_custom_dir to follow the
test_<method>_<scenario>_<expected> pattern (for example
test_list_templates_empty_dir_returns_empty_list) and update its docstring
accordingly; locate the test that instantiates
TemplateRegistry(templates_dir=tmp_path) and asserts reg.list_templates() == []
and change the function name and description so it clearly references
TemplateRegistry.list_templates, the empty custom directory scenario, and the
expected empty list result.
docs/extensibility.md (1)

59-61: Placeholder URL in quick start.

The clone URL uses your-org placeholder. Consider using the actual repository URL for copy-paste convenience:

-git clone https://github.com/your-org/DataGenFlow.git
+git clone https://github.com/nicofretti/DataGenFlow.git
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/extensibility.md` around lines 59 - 61, The quick-start uses a
placeholder clone URL "git clone https://github.com/your-org/DataGenFlow.git";
update that line in docs/extensibility.md to the real repository URL (or a
canonical example like the org/repo used by the project) so users can copy-paste
the command successfully — look for the exact string "git clone
https://github.com/your-org/DataGenFlow.git" and replace it with the correct
repository clone URL.
lib/file_watcher.py (2)

56-59: Remove redundant exception object from logging.

logger.exception() automatically includes exception info. The {e} is redundant.

Proposed fix
         try:
             self.callback(path, event_type)
         except Exception:
-            logger.exception(f"error in file watcher callback: {e}")
+            logger.exception("error in file watcher callback")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/file_watcher.py` around lines 56 - 59, The logger.exception call in the
file watcher exception handler is redundantly interpolating the exception
object; update the except block in lib/file_watcher.py where self.callback(path,
event_type) is called so that logger.exception is called with a descriptive
message only (e.g., "error in file watcher callback") and rely on
logger.exception to include the exception info automatically—leave the callback
invocation and exception handling structure unchanged.

126-128: is_running may be inaccurate if observer thread dies.

Current check only verifies _observer is not None. Consider also checking _observer.is_alive() for accuracy.

Proposed fix
     @property
     def is_running(self) -> bool:
-        return self._observer is not None
+        return self._observer is not None and self._observer.is_alive()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/file_watcher.py` around lines 126 - 128, The is_running property
currently returns True whenever self._observer is not None which can be stale if
the observer thread has died; update is_running to also check the thread
liveness by returning True only when self._observer is not None and
self._observer.is_alive() (or equivalent) so that the status reflects an
actually running observer; locate the is_running property and the _observer
attribute in the class and add the is_alive() check while guarding for cases
where _observer may be None.
frontend/src/components/pipeline-editor/BlockPalette.tsx (1)

250-256: Missing key prop on conditional wrapper.

When blockItem is wrapped in Tooltip, the key is on the Tooltip. When returned directly at line 258, the key is inside blockItem (line 195). This works but consider extracting the key to avoid the duplicate key definition.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/components/pipeline-editor/BlockPalette.tsx` around lines 250 -
256, The conditional return wraps blockItem with Tooltip which places the key on
Tooltip while blockItem itself also contains a key (duplicate/conditional
location); extract the key up one level so the same key prop (use block.type) is
applied to the outer-most returned element in both branches (i.e., ensure the
key is set where blockItem or Tooltip is returned by moving the key out of
blockItem into the surrounding JSX that returns either blockItem or <Tooltip
...>{blockItem}</Tooltip>), updating the code around blockItem and the Tooltip
conditional to reference block.type for the key.
lib/api/extensions.py (2)

96-101: Use exception chaining with raise ... from e.

Per B904: distinguishes re-raised exceptions from errors in exception handling.

Proposed fix
     try:
         installed = await dependency_manager.install(missing)
         registry.reload()
         return {"status": "ok", "installed": installed}
     except DependencyError as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/api/extensions.py` around lines 96 - 101, The except block handling
DependencyError should use exception chaining so the original exception is
preserved; update the except DependencyError as e block where
dependency_manager.install(missing) is called (and registry.reload() follows) to
re-raise the HTTPException using "raise HTTPException(status_code=500,
detail=str(e)) from e" instead of plain raise, referencing DependencyError and
the HTTPException construction to locate the change.

58-72: Add return type annotation.

The function returns different dict shapes. Consider a typed response model or at minimum annotate the return type.

Proposed fix
 `@router.post`("/blocks/{name}/validate")
-async def validate_block(name: str) -> dict:
+async def validate_block(name: str) -> dict[str, str | bool | list[str] | None]:
     """validate a block's availability and dependencies"""

Alternatively, define a proper response model for consistency with other endpoints.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/api/extensions.py` around lines 58 - 72, The validate_block function
returns multiple dict shapes; add an explicit return type or response model to
make the API contract clear: either create a Pydantic model (e.g.,
BlockValidationResponse with optional fields valid: bool, block: str, error:
Optional[str], missing_dependencies: Optional[List[str]]) and annotate the
function as async def validate_block(name: str) -> BlockValidationResponse, or
at minimum add a typing annotation like -> Dict[str, Any]; update imports and
replace bare dict returns in validate_block (referenced symbols: validate_block,
registry.get_block_class, registry.list_blocks,
dependency_manager.check_missing) to return instances matching the chosen
model/annotation.
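A lightweight way to pin down the contract without pulling in a full response model is a `TypedDict`; the field names below follow the review's suggestion (`valid`, `block`, `error`, `missing_dependencies`) and are assumptions, not the endpoint's actual schema:

```python
from typing import Optional, TypedDict


class BlockValidationResponse(TypedDict, total=False):
    # total=False: each response shape includes only the relevant keys
    valid: bool
    block: str
    error: Optional[str]
    missing_dependencies: Optional[list[str]]


def make_response(
    valid: bool, block: str, missing: Optional[list[str]] = None
) -> BlockValidationResponse:
    resp: BlockValidationResponse = {"valid": valid, "block": block}
    if missing:
        resp["missing_dependencies"] = missing
    return resp
```

A Pydantic model would additionally give runtime validation and show up in the OpenAPI schema, which is why the review prefers it for consistency with other endpoints.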
frontend/src/services/extensionsApi.ts (1)

48-51: Handle potential JSON parse failure in error path.

If the error response isn't valid JSON, response.json() will throw and mask the original HTTP error.

Proposed fix
     if (!response.ok) {
-      const error = await response.json();
-      throw new Error(error.detail || `http ${response.status}`);
+      let detail = `http ${response.status}`;
+      try {
+        const error = await response.json();
+        detail = error.detail || detail;
+      } catch {
+        // response body not JSON, use status
+      }
+      throw new Error(detail);
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/services/extensionsApi.ts` around lines 48 - 51, The error-path
currently awaits response.json() which can throw if the body is not valid JSON;
wrap the JSON parse in a try/catch (around the response.json() call that lives
in the error branch handling "if (!response.ok)"), and on failure fall back to
reading the body as text (response.text()) or using response.status to construct
the Error message; then throw a new Error that includes either the parsed
error.detail, the fallback text, or `http ${response.status}` so the original
HTTP error isn't masked by a JSON parse exception.
tests/e2e/test_extensions_e2e.py (1)

370-401: Consider adding a test-failure summary in the main block.

The manual test runner catches exceptions but doesn't track/report total pass/fail counts.

Track and report results
+    passed = 0
+    failed = 0
     for name, test_fn in tests:
         print(f"\ntest: {name}")
         try:
             test_fn()
             print("✓ passed")
+            passed += 1
         except Exception as e:
             print(f"✗ failed: {e}")
+            failed += 1

     cleanup_database()
-    print("\n✅ all extensions e2e tests completed!")
+    print(f"\n{'✅' if failed == 0 else '❌'} {passed} passed, {failed} failed")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_extensions_e2e.py` around lines 370 - 401, The main manual
runner inside the if __name__ == "__main__": block currently iterates tests (the
tests list and test_fn) and prints pass/fail but doesn't aggregate results;
modify this block to track totals (passed, failed), collect failed test names
and exception messages when catching exceptions around test_fn(), print a
concise summary after the loop (e.g., total run, passed, failed, and list
failures with errors), ensure cleanup_database() still runs, and return a
non-zero exit code (use sys.exit(1)) when there are failures so CI can detect
test failures.
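The aggregation the prompt asks for can be factored into a small helper; `run_tests`/`report` are illustrative names, not the file's existing code:

```python
def run_tests(tests):
    """run (name, fn) pairs, returning pass count and failure details."""
    passed, failures = 0, []
    for name, test_fn in tests:
        try:
            test_fn()
            passed += 1
        except Exception as e:
            failures.append((name, str(e)))
    return passed, failures


def report(passed, failures):
    """print a summary and return the exit code for the runner."""
    print(f"{passed} passed, {len(failures)} failed")
    for name, err in failures:
        print(f"  {name}: {err}")
    # the real runner would call sys.exit on this so CI sees the failure
    return 1 if failures else 0
```

The `__main__` block then becomes `sys.exit(report(*run_tests(tests)))` after `cleanup_database()`, keeping the loop itself free of bookkeeping.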
lib/dependency_manager.py (2)

85-85: Use spread operator for list construction.

Static analysis correctly flags this.

Suggested fix
-        cmd = ["uv", "pip", "install", "--quiet"] + requirements
+        cmd = ["uv", "pip", "install", "--quiet", *requirements]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/dependency_manager.py` at line 85, Replace the list concatenation with
list unpacking to satisfy static analysis: modify the construction of cmd
(currently set via ["uv", "pip", "install", "--quiet"] + requirements) to use
the spread/unpacking form so that cmd becomes ["uv", "pip", "install",
"--quiet", *requirements]; keep the same variable names (cmd and requirements)
and ensure any downstream usage remains unchanged.

88-97: Preserve exception context with raise ... from.

Chaining exceptions preserves the original traceback for debugging.

Suggested fix
         try:
             result = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
             if result.returncode != 0:
                 raise DependencyError(f"uv pip install failed: {result.stderr}")
-            logger.info(f"successfully installed: {requirements}")
-            return requirements
         except subprocess.TimeoutExpired:
-            raise DependencyError(f"installation timed out after {timeout}s")
+            raise DependencyError(f"installation timed out after {timeout}s") from None
         except FileNotFoundError:
-            raise DependencyError("uv not found")
+            raise DependencyError("uv not found") from None
+        else:
+            logger.info(f"successfully installed: {requirements}")
+            return requirements
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/dependency_manager.py` around lines 88 - 97, The subprocess failures
should preserve original exception context: switch to using subprocess.run(...,
check=True) (instead of manually checking result.returncode) so a
subprocess.CalledProcessError is raised on non-zero exits and catch it
(CalledProcessError as e) then raise DependencyError(... ) from e; also change
the existing except handlers to capture the exception objects (except
subprocess.TimeoutExpired as e and except FileNotFoundError as e) and re-raise
DependencyError using "from e". Update references around subprocess.run,
CalledProcessError, TimeoutExpired, FileNotFoundError, DependencyError, cmd and
requirements accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docker/docker-compose.yml`:
- Around line 21-22: The healthcheck in docker-compose uses curl (healthcheck:
test: ["CMD", "curl", "-f", "http://localhost:8000/health"]) but the
python:3.10-slim production image doesn't include curl; either install curl in
the production stage of the Dockerfile (add an apt-get update && apt-get install
-y curl && cleanup) or change the healthcheck to a Python-based check (e.g., run
python -c to request /health) so the healthcheck command matches what is
actually available in the image. Ensure you update the Dockerfile production
stage or replace the healthcheck test command accordingly.

In `@frontend/src/types/index.ts`:
- Around line 119-140: Change the exported DTOs to avoid using any on the API
boundary: replace BlockInfo.config_schema's type and TemplateInfo.example_seed's
type with a safer type (e.g., JsonValue or unknown) so callers get proper
type-safety; update the interfaces BlockInfo and TemplateInfo accordingly and
ensure imports or type aliases for JsonValue are added if chosen (or use unknown
if you prefer not to introduce a new type).

In `@lib/blocks/registry.py`:
- Around line 61-79: The except branch currently swallows import failures and
skips adding any registry entry; instead catch the exception and add a
BlockEntry representing the failed import so the block appears with
available=False and the failure reason is recorded. In the except block
(handling module_name), create and insert an entries[...] entry (using the same
key convention as successful entries) with block_class=None (or a sentinel),
source=source, available=False and an error/message field set to the exception
text (str(e)), and still log the warning; this ensures the registry records
failed imports for validate/install-deps and UI display.
- Around line 15-20: The _SOURCE_MAP currently uses hard-coded relative strings
("lib/blocks/builtin", "lib/blocks/custom") which fail when the process CWD is
not the repo root; update registry discovery to resolve builtin/custom paths
relative to the package file rather than the CWD: compute a base package path
using the module file location (e.g., Path(__file__).resolve().parent or
similar) and join that base with "blocks/builtin" and "blocks/custom" before
using them in the registry logic that references _SOURCE_MAP and the
builtin/custom discovery code paths (also update the corresponding code block
around the other occurrence at lines ~51-60 to use the same resolved paths).

In `@lib/cli/main.py`:
- Around line 362-367: The loop over yaml_file that currently swallows all
exceptions should instead catch Exception as e and emit a warning that includes
the filename and the error before continuing; locate the block that opens
yaml_file and calls yaml.safe_load, change "except Exception: continue" to
"except Exception as e: logger.warning(...)" (or use the module's existing
logger variable) and include yaml_file and the exception message/traceback in
the log so the failure is visible to users/ops while preserving the continue
behavior.
- Around line 161-164: Replace the silent except in the
client.reload_extensions() call with error logging: catch Exception as e in the
except block and call the module's logger (or processLogger) to log a
descriptive error message that includes the exception details (and optionally
the traceback) so reload failures are visible; update the except block around
client.reload_extensions() to log the failure rather than using "pass".

In `@tests/integration/test_extensions.py`:
- Around line 84-86: The test only covers the 404 branch for
test_install_deps_nonexistent_block_returns_404; add a complementary happy-path
test that stubs the dependency installer used by the install-deps endpoint (mock
the installer function/class the route calls), call
client.post("/api/extensions/blocks/{existing_block}/install-deps") with a real
or fixture block name, assert resp.status_code == 200 and validate the JSON
payload shape/content returned by the endpoint; reference the test function name
test_install_deps_nonexistent_block_returns_404 to place the new test nearby and
stub the exact installer symbol the route invokes so the route wiring and
response serialization are exercised.

In `@tests/test_api_regression.py`:
- Around line 21-29: The test test_blocks_endpoint_schema_shape needs to expect
the new block-level "dependencies" key: update the required_keys set (in that
function) to include "dependencies" so the check which computes missing =
required_keys - set(block.keys()) will fail when the API omits the new field;
locate the variable required_keys in test_blocks_endpoint_schema_shape and add
"dependencies" to the set of expected keys.

---

Outside diff comments:
In `@lib/blocks/base.py`:
- Around line 14-45: The class-level mutable default dependencies should be
replaced with an immutable default and the get_schema method should return a
copy to avoid exposing internal state: change the attribute declaration from a
mutable list to an immutable type (e.g., dependencies: tuple[str, ...] = ()),
and update get_schema (the method named get_schema on the same class) to convert
cls.dependencies to a new list when building the schema (e.g., "dependencies":
list(cls.dependencies)) so callers get a copy and cannot mutate the class
attribute.
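The immutable-default pattern described above can be sketched as follows; `BaseBlock`/`get_schema` mirror the names in the comment but are simplified stand-ins:

```python
class BaseBlock:
    # tuple default: shared across subclasses but cannot be mutated in place
    dependencies: tuple[str, ...] = ()

    @classmethod
    def get_schema(cls) -> dict:
        # list(...) hands callers a fresh copy, so mutating the returned
        # schema never touches the class attribute
        return {"dependencies": list(cls.dependencies)}


class WithDepsBlock(BaseBlock):
    dependencies = ("requests>=2.0",)
```

A class-level `dependencies: list = []` would instead be a single shared object: any caller appending to it (or to a schema that exposes it directly) would silently change it for every block that did not override the attribute.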

---

Minor comments:
In @.claude/skills/using-datagenflow-extensibility/SKILL.md:
- Around line 101-115: Update the documentation table entry that currently lists
LangfuseBlock to use the actual registered name LangfuseDatasetBlock; locate the
table row containing "LangfuseBlock" and change the block name to
"LangfuseDatasetBlock" (this matches the registered class LangfuseDatasetBlock
used in the codebase and tests such as test_registry_regression.py).

In `@app.py`:
- Around line 29-32: The import block is unsorted: remove the separate lines
importing router as extensions_router, BlockInfo, TemplateInfo,
ExtensionFileWatcher, BlockExecutionError, BlockNotFoundError, and
ValidationError and merge them alphabetically into the existing lib.* import
block (the import group beginning at the current lib.* block), ensuring imports
like "from lib.api.extensions import router as extensions_router", "from
lib.entities.extensions import BlockInfo, TemplateInfo", "from lib.file_watcher
import ExtensionFileWatcher", and "from lib.errors import BlockExecutionError,
BlockNotFoundError, ValidationError" are placed in proper alphabetical order
with the other lib imports and formatted consistently.

In `@frontend/src/components/pipeline-editor/BlockPalette.tsx`:
- Line 11: The field config_schema currently uses Record<string, any> in
BlockPalette.tsx; replace the any with a stricter type (e.g., Record<string,
unknown> or a dedicated ConfigSchema interface/type) and update references to
validate or narrow values where used; add a new type alias or interface (e.g.,
ConfigSchema) if needed and use it in place of config_schema to satisfy the "no
any" rule while preserving existing behavior.

In `@frontend/src/pages/Extensions.tsx`:
- Around line 17-37: The async loadAll should avoid updating state after the
component unmounts: create a mounted flag (e.g., const mounted = useRef(true)
set to false in useEffect cleanup) and in loadAll check mounted.current before
calling setStatus, setBlocks, setTemplates and setLoading (or bail out early if
not mounted); ensure the mounted flag is referenced in the Promise
resolution/error/finally paths so no state setters run when mounted.current is
false.
- Around line 286-299: The inline fetch in handleCreatePipeline should be moved
to the service layer; add a new method createPipelineFromTemplate(templateId:
string) in extensionsApi (or the existing API service) that POSTs to
/pipelines/from_template/:id, checks response.ok and returns response.json(),
then replace the inline fetch in TemplateCard/handleCreatePipeline to call
extensionsApi.createPipelineFromTemplate(template.id), handle success
(toast.success and navigate("/pipelines")), errors (toast.error using
error.message), and ensure setCreating(true/false) and error handling remain
unchanged.

In `@lib/cli/main.py`:
- Line 566: The console.print call uses an unnecessary f-string with no
placeholders which causes the pipeline to fail; locate the console.print
invocation (the line calling console.print(f"\n[bold]Build from DataGenFlow repo
root:[/bold]")) and change it to a normal string literal (remove the leading f)
so it's console.print("\n[bold]Build from DataGenFlow repo root:[/bold]")
ensuring no f-string is used when no interpolation is required.
- Line 129: The line defining install_deps using typer.Option is too long; split
the arguments onto multiple lines or use implicit line continuation to keep line
length under the limit. Locate the install_deps variable declaration
(install_deps: bool = typer.Option(...)) and reformat the call to typer.Option
so each parameter (default, flag, help) is on its own line or the whole
assignment is broken into a wrapped expression, ensuring the final line length
complies with the pipeline limit and retains the same parameter values.
- Line 172: The console.print f-string in lib/cli/main.py is exceeding the max
line length; refactor the long line that uses console.print(f"[yellow]![/yellow]
Block '{name}': {result.get('error', 'unknown error')}") by building a shorter
intermediate variable (e.g., message = f"[yellow]![/yellow] Block '{name}':
{error_msg}" where error_msg = result.get('error', 'unknown error')) or by
splitting the call into multiple shorter segments before calling console.print,
keeping the same output but ensuring no line exceeds the length limit; update
references to name and result.get(...) accordingly.

In `@lib/templates/__init__.py`:
- Around line 106-117: The comment in reload() is misleading because assigning
self._templates and self._sources on separate lines isn’t atomic and can expose
readers to mismatched state; change the implementation to store both dicts in a
single container and assign that container in one statement (e.g., create a
single _state object/value that holds templates and sources and replace
self._state = new_state atomically), update usages to read from the new single
container (or alternatively add explicit locking around reload and readers);
reference the reload method and the attributes _templates and _sources when
making this change so all code paths are updated to the new _state container.
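The single-assignment swap might look like this sketch; `_State` and the simplified `TemplateRegistry` are illustrative, not the module's actual code:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class _State:
    templates: dict = field(default_factory=dict)
    sources: dict = field(default_factory=dict)


class TemplateRegistry:
    def __init__(self) -> None:
        self._state = _State()

    def reload(self, templates: dict, sources: dict) -> None:
        # build the new state fully, then publish it in one assignment;
        # readers always observe a matched templates/sources pair
        self._state = _State(templates=dict(templates), sources=dict(sources))

    @property
    def templates(self) -> dict:
        return self._state.templates

    @property
    def sources(self) -> dict:
        return self._state.sources
```

In CPython an attribute assignment is a single bytecode-level store, so a reader can see the old state or the new state but never a templates dict from one reload paired with a sources dict from another; explicit locking would be the alternative if readers need to hold the pair across multiple calls.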

In `@tests/blocks/test_base_dependencies.py`:
- Around line 28-50: The tests currently only check value equality for
BaseBlock.dependencies, NoDepsBlock.dependencies, and schema["dependencies"],
which misses aliasing of the same mutable list; add a regression test that
mutates one of these lists (e.g., append or pop on BaseBlock.dependencies or the
schema returned by WithDepsBlock.get_schema()) and assert the other lists
(NoDepsBlock.dependencies, WithDepsBlock.dependencies, and a freshly fetched
schema via get_schema()) are unchanged to ensure they are independent copies and
not aliased; reference BaseBlock, NoDepsBlock, WithDepsBlock, get_schema, and
dependencies when locating where to add this new test.

In `@tests/cli/test_commands.py`:
- Line 6: Remove the unused import Path from the top of
tests/cli/test_commands.py; the tmp_path pytest fixture already supplies a
Path-like object for tests, so delete the "from pathlib import Path" import
(remove the "Path" symbol) to resolve the linter/pipeline failure and keep
imports minimal.
- Around line 66-73: The test fixture cli_app imports get_client unnecessarily
causing a pipeline failure; change the import to only bring in app (e.g.,
replace "from lib.cli.main import app, get_client" with "from lib.cli.main
import app") and keep the patch target string "lib.cli.main.get_client"
unchanged in the with patch(...) call so the mock still applies; update the
cli_app fixture accordingly and remove any other references to the get_client
symbol.

In `@tests/e2e/TODOS.md`:
- Around line 1-295: The file containing the pasted branch review (the TODOs
file that begins "We have to address a pr review (57)" in the e2e docs) is
unrelated to tests and must be removed or moved out of the tests tree; delete
this file from the tests/e2e directory (or relocate it to a docs/reviews or
project-internal notes location) and update any README or CI references that
expect only E2E artifacts there so contributors won't be confused by stale
branch-review dumps.

---

Nitpick comments:
In `@docs/extensibility.md`:
- Around line 59-61: The quick-start uses a placeholder clone URL "git clone
https://github.com/your-org/DataGenFlow.git"; update that line in
docs/extensibility.md to the real repository URL (or a canonical example like
the org/repo used by the project) so users can copy-paste the command
successfully — look for the exact string "git clone
https://github.com/your-org/DataGenFlow.git" and replace it with the correct
repository clone URL.

In `@frontend/src/components/pipeline-editor/BlockPalette.tsx`:
- Around line 250-256: The conditional return wraps blockItem with Tooltip which
places the key on Tooltip while blockItem itself also contains a key
(duplicate/conditional location); extract the key up one level so the same key
prop (use block.type) is applied to the outer-most returned element in both
branches (i.e., ensure the key is set where blockItem or Tooltip is returned by
moving the key out of blockItem into the surrounding JSX that returns either
blockItem or <Tooltip ...>{blockItem}</Tooltip>), updating the code around
blockItem and the Tooltip conditional to reference block.type for the key.

In `@frontend/src/services/extensionsApi.ts`:
- Around line 48-51: The error-path currently awaits response.json() which can
throw if the body is not valid JSON; wrap the JSON parse in a try/catch (around
the response.json() call that lives in the error branch handling "if
(!response.ok)"), and on failure fall back to reading the body as text
(response.text()) or using response.status to construct the Error message; then
throw a new Error that includes either the parsed error.detail, the fallback
text, or `http ${response.status}` so the original HTTP error isn't masked by a
JSON parse exception.

In `@lib/api/extensions.py`:
- Around line 96-101: The except block handling DependencyError should use
exception chaining so the original exception is preserved; update the except
DependencyError as e block where dependency_manager.install(missing) is called
(and registry.reload() follows) to re-raise the HTTPException using "raise
HTTPException(status_code=500, detail=str(e)) from e" instead of plain raise,
referencing DependencyError and the HTTPException construction to locate the
change.
- Around line 58-72: The validate_block function returns multiple dict shapes;
add an explicit return type or response model to make the API contract clear:
either create a Pydantic model (e.g., BlockValidationResponse with optional
fields valid: bool, block: str, error: Optional[str], missing_dependencies:
Optional[List[str]]) and annotate the function as async def validate_block(name:
str) -> BlockValidationResponse, or at minimum add a typing annotation like ->
Dict[str, Any]; update imports and replace bare dict returns in validate_block
(referenced symbols: validate_block, registry.get_block_class,
registry.list_blocks, dependency_manager.check_missing) to return instances
matching the chosen model/annotation.

In `@lib/dependency_manager.py`:
- Line 85: Replace the list concatenation with list unpacking to satisfy static
analysis: modify the construction of cmd (currently set via ["uv", "pip",
"install", "--quiet"] + requirements) to use the spread/unpacking form so that
cmd becomes ["uv", "pip", "install", "--quiet", *requirements]; keep the same
variable names (cmd and requirements) and ensure any downstream usage remains
unchanged.
- Around line 88-97: The subprocess failures should preserve original exception
context: switch to using subprocess.run(..., check=True) (instead of manually
checking result.returncode) so a subprocess.CalledProcessError is raised on
non-zero exits and catch it (CalledProcessError as e) then raise
DependencyError(... ) from e; also change the existing except handlers to
capture the exception objects (except subprocess.TimeoutExpired as e and except
FileNotFoundError as e) and re-raise DependencyError using "from e". Update
references around subprocess.run, CalledProcessError, TimeoutExpired,
FileNotFoundError, DependencyError, cmd and requirements accordingly.

In `@lib/file_watcher.py`:
- Around line 56-59: The logger.exception call in the file watcher exception
handler is redundantly interpolating the exception object; update the except
block in lib/file_watcher.py where self.callback(path, event_type) is called so
that logger.exception is called with a descriptive message only (e.g., "error in
file watcher callback") and rely on logger.exception to include the exception
info automatically—leave the callback invocation and exception handling
structure unchanged.
- Around line 126-128: The is_running property currently returns True whenever
self._observer is not None which can be stale if the observer thread has died;
update is_running to also check the thread liveness by returning True only when
self._observer is not None and self._observer.is_alive() (or equivalent) so that
the status reflects an actually running observer; locate the is_running property
and the _observer attribute in the class and add the is_alive() check while
guarding for cases where _observer may be None.

In `@tests/e2e/test_extensions_e2e.py`:
- Around line 370-401: The main manual runner inside the if __name__ ==
"__main__": block currently iterates tests (the tests list and test_fn) and
prints pass/fail but doesn't aggregate results; modify this block to track
totals (passed, failed), collect failed test names and exception messages when
catching exceptions around test_fn(), print a concise summary after the loop
(e.g., total run, passed, failed, and list failures with errors), ensure
cleanup_database() still runs, and return a non-zero exit code (use sys.exit(1))
when there are failures so CI can detect test failures.

In `@tests/test_file_watcher.py`:
- Around line 22-30: The test relies on time.sleep to wait for
DebouncedHandler's timer which is flaky; replace the fixed sleeps in tests
around DebouncedHandler._schedule_callback (lines exercising test_path
"modified") with a deterministic wait: either have the test provide a
threading.Event that the callback sets and wait on event.wait(timeout) or poll a
shared condition (e.g., call_count) in a loop with a timeout until it reaches
the expected value; apply the same change to the other occurrence (lines ~41-47)
so the assertion waits deterministically instead of sleeping for 150ms.

In `@tests/test_template_regression.py`:
- Around line 53-56: Rename the test function test_template_registry_custom_dir
to follow the test_<method>_<scenario>_<expected> pattern (for example
test_list_templates_empty_dir_returns_empty_list) and update its docstring
accordingly; locate the test that instantiates
TemplateRegistry(templates_dir=tmp_path) and asserts reg.list_templates() == []
and change the function name and description so it clearly references
TemplateRegistry.list_templates, the empty custom directory scenario, and the
expected empty list result.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1a05036a-671f-4ff9-8f3d-e7b10bcca54f

📥 Commits

Reviewing files that changed from the base of the PR and between 9a7a28a and b0c11f2.

📒 Files selected for processing (47)
  • .claude/skills/creating-pipeline-templates/SKILL.md
  • .claude/skills/implementing-datagenflow-blocks/SKILL.md
  • .claude/skills/testing-pipeline-templates/SKILL.md
  • .claude/skills/using-datagenflow-extensibility/SKILL.md
  • .claude/skills/writing-e2e-tests/SKILL.md
  • .env.example
  • app.py
  • docker/Dockerfile
  • docker/docker-compose.yml
  • docs/extensibility.md
  • frontend/src/App.tsx
  • frontend/src/components/pipeline-editor/BlockPalette.tsx
  • frontend/src/pages/Extensions.tsx
  • frontend/src/services/extensionsApi.ts
  • frontend/src/types/index.ts
  • lib/api/__init__.py
  • lib/api/extensions.py
  • lib/blocks/base.py
  • lib/blocks/registry.py
  • lib/cli/__init__.py
  • lib/cli/client.py
  • lib/cli/main.py
  • lib/dependency_manager.py
  • lib/entities/extensions.py
  • lib/file_watcher.py
  • lib/templates/__init__.py
  • lib/workflow.py
  • pyproject.toml
  • tests/blocks/test_base_dependencies.py
  • tests/blocks/test_registry.py
  • tests/blocks/test_registry_enhanced.py
  • tests/cli/__init__.py
  • tests/cli/test_commands.py
  • tests/e2e/TODOS.md
  • tests/e2e/run_all_tests.sh
  • tests/e2e/test_extensions_api_e2e.py
  • tests/e2e/test_extensions_e2e.py
  • tests/e2e/test_helpers.py
  • tests/integration/test_extensions.py
  • tests/test_api_regression.py
  • tests/test_dependency_manager.py
  • tests/test_extensions_api.py
  • tests/test_file_watcher.py
  • tests/test_registry_regression.py
  • tests/test_template_enhanced.py
  • tests/test_template_regression.py
  • tests/test_templates.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 6

🧹 Nitpick comments (6)
tests/test_file_watcher.py (1)

30-31: Replace fixed sleeps with a deadline-based wait.

These assertions depend on thread scheduling and can fail intermittently on busy CI runners. Poll until the expected callback count is reached, or wait on the pending timers directly, instead of sleeping a fixed 150 ms. As per coding guidelines, tests/**/*.py: Tests are focused and maintainable.

Also applies to: 47-48

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_file_watcher.py` around lines 30 - 31, Replace the brittle fixed
sleep + assert pattern by polling with a deadline: instead of time.sleep(0.15)
followed by assert call_count == 1, loop (using time.monotonic) until either
call_count reaches the expected value or a timeout elapses and then assert;
apply the same change to the other occurrence (lines referencing call_count at
47-48). Locate the spots using the call_count variable in
tests/test_file_watcher.py and replace the fixed sleep with a short-poll loop
(or wait on any pending timer objects if available) that raises/asserts only
after the deadline expires.
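A generic deadline-based helper for this pattern might look like the following; `wait_for` is a hypothetical utility, not an existing project function:

```python
import time


def wait_for(predicate, timeout: float = 2.0, interval: float = 0.01) -> bool:
    """poll predicate() until it returns True or the deadline passes."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return True
        time.sleep(interval)
    # one final check so a result landing right at the deadline still counts
    return predicate()
```

In the test, `time.sleep(0.15); assert call_count == 1` becomes `assert wait_for(lambda: call_count == 1)`: a fast debounce passes almost immediately instead of always paying 150 ms, and a slow CI runner gets the full timeout instead of failing intermittently.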
lib/dependency_manager.py (2)

100-103: Add exception chaining for better tracebacks.

Use raise ... from err to preserve the original exception context.

Proposed fix
-        except subprocess.TimeoutExpired:
-            raise DependencyError(f"installation timed out after {timeout}s")
-        except FileNotFoundError:
-            raise DependencyError("uv not found")
+        except subprocess.TimeoutExpired as err:
+            raise DependencyError(f"installation timed out after {timeout}s") from err
+        except FileNotFoundError as err:
+            raise DependencyError("uv not found") from err
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/dependency_manager.py` around lines 100 - 103, The except handlers
currently raise DependencyError without exception chaining; update the two
except blocks that catch subprocess.TimeoutExpired and FileNotFoundError (the
handlers that raise "installation timed out..." and "uv not found") to re-raise
DependencyError using "raise DependencyError(...) from err" so the original
exception (err) is preserved in the traceback; keep the same messages but attach
the original exception when raising.

91-91: Use unpacking for cleaner list construction.

Minor nit from Ruff RUF005.

Proposed fix
-        cmd = ["uv", "pip", "install", "--quiet"] + requirements
+        cmd = ["uv", "pip", "install", "--quiet", *requirements]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/dependency_manager.py` at line 91, The list construction for cmd
currently concatenates two lists; replace it with list unpacking to satisfy Ruff
RUF005 by using the starred expression to inline requirements, e.g. build cmd as
a single list literal that starts with the fixed elements and uses *requirements
(refer to the cmd variable and the requirements identifier) so the resulting
list is identical but uses unpacking instead of +.
tests/e2e/test_extensions_e2e.py (3)

306-323: Test name is misleading.

test_validate_nonexistent_block_type_in_ui actually validates an existing block (the first one). The docstring correctly describes the behavior: "verify all Validate buttons are clickable and produce a toast." Consider renaming to match the actual test behavior.

Suggested rename
-def test_validate_nonexistent_block_type_in_ui():
-    """verify all Validate buttons are clickable and produce a toast (success or error)"""
+def test_validate_button_produces_toast():
+    """verify clicking Validate button produces a toast notification"""
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_extensions_e2e.py` around lines 306 - 323, Rename the
misleading test function test_validate_nonexistent_block_type_in_ui to a name
that reflects it validates the first extension's Validate button and checks a
toast, e.g., test_validate_button_shows_toast or
test_validate_first_extension_button_shows_toast; update the function name and
its docstring to match ("verify Validate buttons are clickable and produce a
toast" or similar), and update any references to
test_validate_nonexistent_block_type_in_ui (e.g., in test suites or markers) so
_navigate_to_extensions, validate_btn, and the toast assertions remain
unchanged.

197-223: Reload preservation test has a subtle race.

After clicking reload and waiting for the toast, you call page.wait_for_load_state("networkidle") but then close the browser before asserting UI state. The API assertions happen outside the browser context, which is fine, but if the intent was to verify UI counts post-reload, that's missing.

Current implementation correctly verifies API consistency. The comment "wait for page to re-render after reload" suggests UI verification was intended but isn't performed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_extensions_e2e.py` around lines 197 - 223, The test
test_reload_preserves_block_count currently waits for the reload toast and calls
page.wait_for_load_state("networkidle") then closes the browser, but never
verifies UI counts; to fix, after the toast and networkidle wait (using
page.wait_for_load_state and page.get_by_role("button", name="Reload").click())
keep the browser/page open long enough to locate and assert the UI
block/template count elements (instead of immediately closing browser) or
explicitly add UI assertions that read the displayed counts (e.g., locate the
blocks/templates count elements) and compare them to get_extensions_status()
before calling browser.close(); ensure these UI checks occur in the same context
as page.wait_for_load_state so the test verifies both API and UI consistency.

397-403: Fragile skip detection in main block.

Checking type(e).__name__ == "Skipped" is brittle. If pytest internals change or the exception is wrapped, this breaks silently.

Proposed fix
+import pytest
+
 for name, test_fn in tests:
     print(f"\ntest: {name}")
     try:
         test_fn()
         print("✓ passed")
     except BaseException as e:
-        if type(e).__name__ == "Skipped":
+        if isinstance(e, pytest.skip.Exception):
             print(f"⊘ skipped: {e}")
         elif isinstance(e, (KeyboardInterrupt, SystemExit)):
             raise
         else:
             print(f"✗ failed: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_extensions_e2e.py` around lines 397 - 403, The except
BaseException as e block uses a brittle string comparison (type(e).__name__ ==
"Skipped"); import pytest and replace that check with a robust isinstance check
against pytest.skip.Exception (i.e., detect skip via isinstance(e,
pytest.skip.Exception)), and keep the existing KeyboardInterrupt/SystemExit
re-raise and generic failure branch; update the except block where "except
BaseException as e" is handled in tests/e2e/test_extensions_e2e.py to use
pytest.skip.Exception detection instead of comparing type names.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app.py`:
- Around line 91-92: The directory creation calls for the configured extension
paths use Path(...).mkdir(exist_ok=True) which fails for nested targets; update
the two mkdir invocations that create DATAGENFLOW_BLOCKS_PATH and
DATAGENFLOW_TEMPLATES_PATH so they pass parents=True (i.e.,
Path(os.getenv("DATAGENFLOW_BLOCKS_PATH", "user_blocks")).mkdir(parents=True,
exist_ok=True) and similarly for DATAGENFLOW_TEMPLATES_PATH) to ensure parent
directories are created recursively.

In `@docker/docker-compose.yml`:
- Line 2: Service name change from "backend" to "datagenflow" will break
existing deployments; update documentation by adding a clear migration note to
the project CHANGELOG or release notes that lists the rename (backend ->
datagenflow), any affected docker commands (e.g., docker-compose exec backend),
network aliases, and example updated commands/scripts, and include a short
remediation section showing the new service name and any steps users must take
to update their deployments and CI scripts.

In `@frontend/src/types/index.ts`:
- Around line 127-139: The BlockInfo.config_schema type is too generic; replace
Record<string, JsonValue> with a proper JSON Schema interface (e.g., JsonSchema
with optional properties: Record<string, JsonSchemaProperty>, required?:
string[], type?: string | string[], title?: string, description?: string) and
use that type in the BlockInfo declaration so PipelineEditor and
BlockConfigPanel can access .properties and .required without casting; add or
export the new JsonSchema/JsonSchemaProperty types in the same types file and
update any imports/usages (PipelineEditor.tsx and BlockConfigPanel.tsx) to
accept the new type or add narrow type guards where necessary. Ensure the new
schema types include the minimal fields your UI expects (properties and
required) and keep other fields optional to remain compatible with other
callers.

In `@lib/api/extensions.py`:
- Around line 77-85: The handler block_dependencies currently returns an empty
list when entry.block_class is None, losing declared requirements for
import-failed blocks; change it to return
dependency_manager.get_dependency_info(entry.requirements) (or equivalent field
holding declared requirements) when block_class is None, and include the
recorded import error (e.g., entry.import_error or entry.load_error) in the
response or as an error detail so the UI can surface and allow installation;
apply the same change to the related install-deps endpoint logic that checks
entry.block_class (the handlers referenced around lines 88-97) so both endpoints
use entry.requirements and/or entry.import_error instead of treating the
dependency list as empty.
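
A minimal sketch of that branch, assuming a registry entry carries `requirements` and `import_error` fields as the prompt describes (the `Entry` dataclass here is illustrative, not the project's actual type):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Entry:
    # illustrative stand-in for a registry entry
    block_class: Optional[type] = None
    requirements: list = field(default_factory=list)
    import_error: Optional[str] = None

def block_dependencies(entry: Entry) -> dict:
    # report declared requirements even when the import failed,
    # so the UI can surface the error and still offer installation
    return {
        "requirements": entry.requirements,
        "error": entry.import_error if entry.block_class is None else None,
    }
```
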

In `@lib/cli/main.py`:
- Around line 203-206: When ast.parse(py_file.read_text()) raises SyntaxError
the code currently silently continues; update the except to capture the
exception (except SyntaxError as e) and log a clear message including the file
identifier py_file and the exception e before continuing. Locate the try/except
around ast.parse in lib/cli/main.py and use the existing project logger (or
logging.getLogger) to emit a warning or error like "Skipping file {py_file}:
{e}" so users know which files were skipped.
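
The logged-skip pattern can be sketched like this (`parse_or_skip` is a hypothetical helper; the real code iterates `py_file` paths in a loop):

```python
import ast
import logging

logger = logging.getLogger(__name__)

def parse_or_skip(source: str, name: str):
    """parse python source, logging a warning and returning None on SyntaxError"""
    try:
        return ast.parse(source)
    except SyntaxError as e:
        # surface which file was skipped instead of failing silently
        logger.warning("Skipping file %s: %s", name, e)
        return None
```
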

In `@lib/file_watcher.py`:
- Around line 44-50: Pending threading.Timer instances created in the debounce
logic can fire after shutdown and call registry.reload(); mark each timer as
daemon and cancel pending timers in stop() to avoid callbacks during teardown.
Before timer.start() in the debounce path (where threading.Timer is constructed
with self.debounce_ms and _execute_callback and stored in self._pending[key]),
set timer.daemon = True and ensure the same change is applied to the other
identical timer creation site; then modify stop() to iterate over self._pending
values, call cancel() on each Timer, clear the _pending dict, and only then join
the observer so no timers can fire after shutdown.
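
Both changes together look roughly like this (a self-contained debounce sketch, not the project's actual FileWatcher):

```python
import threading

class Debouncer:
    """coalesce rapid events per key; timers are daemons and cancellable on stop()"""

    def __init__(self, debounce_ms: int, callback):
        self.debounce_ms = debounce_ms
        self.callback = callback
        self._pending: dict = {}

    def trigger(self, key: str) -> None:
        existing = self._pending.pop(key, None)
        if existing is not None:
            existing.cancel()
        timer = threading.Timer(self.debounce_ms / 1000, self.callback, args=(key,))
        timer.daemon = True  # never blocks interpreter exit
        self._pending[key] = timer
        timer.start()

    def stop(self) -> None:
        # cancel pending timers first so no callback fires during teardown
        for timer in self._pending.values():
            timer.cancel()
        self._pending.clear()
```

Calling stop() cancels everything before any observer join would happen, matching the ordering the prompt asks for.
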

---

Nitpick comments:
In `@lib/dependency_manager.py`:
- Around line 100-103: The except handlers currently raise DependencyError
without exception chaining; update the two except blocks that catch
subprocess.TimeoutExpired and FileNotFoundError (the handlers that raise
"installation timed out..." and "uv not found") to re-raise DependencyError
using "raise DependencyError(...) from err" so the original exception (err) is
preserved in the traceback; keep the same messages but attach the original
exception when raising.
- Line 91: The list construction for cmd currently concatenates two lists;
replace it with list unpacking to satisfy Ruff RUF005 by using the starred
expression to inline requirements, e.g. build cmd as a single list literal that
starts with the fixed elements and uses *requirements (refer to the cmd variable
and the requirements identifier) so the resulting list is identical but uses
unpacking instead of +.

In `@tests/e2e/test_extensions_e2e.py`:
- Around line 306-323: Rename the misleading test function
test_validate_nonexistent_block_type_in_ui to a name that reflects it validates
the first extension's Validate button and checks a toast, e.g.,
test_validate_button_shows_toast or
test_validate_first_extension_button_shows_toast; update the function name and
its docstring to match ("verify Validate buttons are clickable and produce a
toast" or similar), and update any references to
test_validate_nonexistent_block_type_in_ui (e.g., in test suites or markers) so
_navigate_to_extensions, validate_btn, and the toast assertions remain
unchanged.
- Around line 197-223: The test test_reload_preserves_block_count currently
waits for the reload toast and calls page.wait_for_load_state("networkidle")
then closes the browser, but never verifies UI counts; to fix, after the toast
and networkidle wait (using page.wait_for_load_state and
page.get_by_role("button", name="Reload").click()) keep the browser/page open
long enough to locate and assert the UI block/template count elements (instead
of immediately closing browser) or explicitly add UI assertions that read the
displayed counts (e.g., locate the blocks/templates count elements) and compare
them to get_extensions_status() before calling browser.close(); ensure these UI
checks occur in the same context as page.wait_for_load_state so the test
verifies both API and UI consistency.
- Around line 397-403: The except BaseException as e block uses a brittle string
comparison (type(e).__name__ == "Skipped"); import pytest and replace that check
with a robust isinstance check against pytest.skip.Exception (i.e., detect skip
via isinstance(e, pytest.skip.Exception)), and keep the existing
KeyboardInterrupt/SystemExit re-raise and generic failure branch; update the
except block where "except BaseException as e" is handled in
tests/e2e/test_extensions_e2e.py to use pytest.skip.Exception detection instead
of comparing type names.

In `@tests/test_file_watcher.py`:
- Around line 30-31: Replace the brittle fixed sleep + assert pattern by polling
with a deadline: instead of time.sleep(0.15) followed by assert call_count == 1,
loop (using time.monotonic) until either call_count reaches the expected value
or a timeout elapses and then assert; apply the same change to the other
occurrence (lines referencing call_count at 47-48). Locate the spots using the
call_count variable in tests/test_file_watcher.py and replace the fixed sleep
with a short-poll loop (or wait on any pending timer objects if available) that
raises/asserts only after the deadline expires.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 659f58ec-504b-426b-8cc4-a3c52a3925e7

📥 Commits

Reviewing files that changed from the base of the PR and between b0c11f2 and 1e2bc9e.

📒 Files selected for processing (27)
  • app.py
  • docker/docker-compose.yml
  • frontend/src/components/pipeline-editor/BlockPalette.tsx
  • frontend/src/pages/Extensions.tsx
  • frontend/src/services/extensionsApi.ts
  • frontend/src/types/index.ts
  • lib/api/extensions.py
  • lib/blocks/builtin/structure_sampler.py
  • lib/blocks/registry.py
  • lib/cli/client.py
  • lib/cli/main.py
  • lib/dependency_manager.py
  • lib/file_watcher.py
  • pyproject.toml
  • tests/blocks/test_base_dependencies.py
  • tests/blocks/test_registry_enhanced.py
  • tests/blocks/test_structure_sampler.py
  • tests/cli/test_commands.py
  • tests/e2e/test_extensions_api_e2e.py
  • tests/e2e/test_extensions_e2e.py
  • tests/integration/test_extensions.py
  • tests/test_api_regression.py
  • tests/test_dependency_manager.py
  • tests/test_file_watcher.py
  • tests/test_registry_regression.py
  • tests/test_template_enhanced.py
  • tests/test_template_regression.py
🚧 Files skipped from review as they are similar to previous changes (7)
  • frontend/src/pages/Extensions.tsx
  • pyproject.toml
  • tests/test_registry_regression.py
  • lib/cli/client.py
  • tests/test_template_enhanced.py
  • frontend/src/components/pipeline-editor/BlockPalette.tsx
  • tests/e2e/test_extensions_api_e2e.py


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (5)
lib/file_watcher.py (2)

64-65: Remove redundant exception object from logger.exception.

logger.exception() already captures and logs the exception; passing e duplicates it in the message.

Proposed fix
-            logger.exception(f"error in file watcher callback: {e}")
+            logger.exception("error in file watcher callback")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/file_watcher.py` around lines 64 - 65, The except block in
file_watcher.py redundantly passes the exception object to logger.exception,
causing duplicated exception output; modify the except clause handling in the
callback by changing "except Exception as e:" to still catch the exception but
call logger.exception with a message only (e.g., logger.exception("error in file
watcher callback")) and remove the appended "{e}" so logger.exception itself
records the traceback; ensure the except block still uses the same exception
binding if needed elsewhere or drop the "as e" binding if unused.

44-50: Consider marking timers as daemon threads.

If the process terminates unexpectedly (without calling stop()), non-daemon timers could prevent clean shutdown. While cancel_pending() handles graceful shutdown, marking timers as daemon adds a safety net.

Proposed fix
             timer = threading.Timer(
                 self.debounce_ms / 1000,
                 self._execute_callback,
                 args=(path, event_type),
             )
+            timer.daemon = True
             self._pending[key] = timer
             timer.start()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/file_watcher.py` around lines 44 - 50, The timers created in the file
watcher are non-daemon which can block process exit; in the place where
threading.Timer(...) is constructed (the code that uses self.debounce_ms,
self._execute_callback, args=(path, event_type) and assigns to
self._pending[key] and calls timer.start()), set the timer thread as a daemon
before starting it (e.g., timer.daemon = True) so timers won't prevent shutdown
if stop() isn't called.
lib/api/extensions.py (1)

106-111: Chain the exception to preserve context.

Use raise ... from e to maintain the exception chain for better debugging.

Proposed fix
     try:
         installed = await dependency_manager.install(missing)
         registry.reload()
         return {"status": "ok", "installed": installed}
     except DependencyError as e:
-        raise HTTPException(status_code=500, detail=str(e))
+        raise HTTPException(status_code=500, detail=str(e)) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/api/extensions.py` around lines 106 - 111, The except block that catches
DependencyError and raises HTTPException should preserve the original exception
chain; modify the handler around dependency_manager.install where it currently
does "except DependencyError as e: raise HTTPException(...)" to use "raise
HTTPException(status_code=500, detail=str(e)) from e" so the original
DependencyError is chained into the HTTPException for better diagnostics (look
for the try/except surrounding dependency_manager.install and the except
DependencyError as e block).
frontend/src/types/index.ts (1)

127-130: JsonSchemaObject is simplified but functional.

The current definition with properties?: Record<string, JsonValue> works for basic access patterns. A more complete JSON Schema type would include type, description, enum, format, etc., but this adds complexity. Consider expanding if more schema features are needed downstream.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/types/index.ts` around lines 127 - 130, JsonSchemaObject is
currently too minimal (properties?: Record<string, JsonValue>) — expand the
interface to better model common JSON Schema features: add optional fields like
type?: string | string[], description?: string, title?: string, enum?:
JsonValue[], format?: string, default?: JsonValue, items?: JsonSchemaObject |
JsonSchemaObject[], additionalProperties?: boolean | JsonSchemaObject, and allow
nested schemas for properties values (i.e., properties?: Record<string,
JsonSchemaObject>); update any usages expecting properties values as JsonValue
to accept JsonSchemaObject instead and keep JsonValue for literal values.
lib/cli/main.py (1)

584-592: Optional: Add Docker tag format validation for better UX.

The tag parameter passed to docker build -t lacks format validation. While the subprocess call is safe from injection (list-based invocation), Docker will reject malformed tags at runtime. Validating tag format early (lowercase/uppercase letters, digits, underscores, periods, dashes; no leading period/dash; max 128 chars) would give users immediate feedback instead of a generic "Build failed" message.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/cli/main.py` around lines 584 - 592, Validate the Docker image tag string
(the variable tag used in the docker build command and before subprocess.run)
before constructing cmd: check length <= 128, ensure it contains only allowed
characters (letters, digits, underscores, periods, dashes), and that it does not
start with a period or dash (use a regex or explicit checks); if validation
fails, print a clear error with console.print and exit with typer.Exit(1)
instead of invoking subprocess.run so users get immediate, specific feedback
about malformed tags.
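
The early check could be a single regex; this sketch validates only the tag portion of an image reference, per the rules stated above (repository-name rules are stricter and not covered here):

```python
import re

# first char: letter, digit, or underscore; then up to 127 of [A-Za-z0-9_.-]
_TAG_RE = re.compile(r"^[A-Za-z0-9_][A-Za-z0-9_.\-]{0,127}$")

def is_valid_docker_tag(tag: str) -> bool:
    """True if tag is <= 128 chars, uses allowed chars, and has no leading '.' or '-'"""
    return _TAG_RE.fullmatch(tag) is not None
```
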
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@app.py`:
- Around line 91-92: The second Path(...) call exceeds the 100-char limit;
shorten the line by splitting the long call into variables or two statements:
assign os.getenv("DATAGENFLOW_TEMPLATES_PATH", "user_templates") to a local
variable (e.g., templates_path) and then call Path(templates_path).mkdir(...).
Do the same or keep the existing shorter first line using
DATAGENFLOW_BLOCKS_PATH/Path to ensure both Path(...).mkdir calls stay under 100
characters while preserving the same behavior.

---

Nitpick comments:
In `@frontend/src/types/index.ts`:
- Around line 127-130: JsonSchemaObject is currently too minimal (properties?:
Record<string, JsonValue>) — expand the interface to better model common JSON
Schema features: add optional fields like type?: string | string[],
description?: string, title?: string, enum?: JsonValue[], format?: string,
default?: JsonValue, items?: JsonSchemaObject | JsonSchemaObject[],
additionalProperties?: boolean | JsonSchemaObject, and allow nested schemas for
properties values (i.e., properties?: Record<string, JsonSchemaObject>); update
any usages expecting properties values as JsonValue to accept JsonSchemaObject
instead and keep JsonValue for literal values.

In `@lib/api/extensions.py`:
- Around line 106-111: The except block that catches DependencyError and raises
HTTPException should preserve the original exception chain; modify the handler
around dependency_manager.install where it currently does "except
DependencyError as e: raise HTTPException(...)" to use "raise
HTTPException(status_code=500, detail=str(e)) from e" so the original
DependencyError is chained into the HTTPException for better diagnostics (look
for the try/except surrounding dependency_manager.install and the except
DependencyError as e block).

In `@lib/cli/main.py`:
- Around line 584-592: Validate the Docker image tag string (the variable tag
used in the docker build command and before subprocess.run) before constructing
cmd: check length <= 128, ensure it contains only allowed characters (letters,
digits, underscores, periods, dashes), and that it does not start with a period
or dash (use a regex or explicit checks); if validation fails, print a clear
error with console.print and exit with typer.Exit(1) instead of invoking
subprocess.run so users get immediate, specific feedback about malformed tags.

In `@lib/file_watcher.py`:
- Around line 64-65: The except block in file_watcher.py redundantly passes the
exception object to logger.exception, causing duplicated exception output;
modify the except clause handling in the callback by changing "except Exception
as e:" to still catch the exception but call logger.exception with a message
only (e.g., logger.exception("error in file watcher callback")) and remove the
appended "{e}" so logger.exception itself records the traceback; ensure the
except block still uses the same exception binding if needed elsewhere or drop
the "as e" binding if unused.
- Around line 44-50: The timers created in the file watcher are non-daemon which
can block process exit; in the place where threading.Timer(...) is constructed
(the code that uses self.debounce_ms, self._execute_callback, args=(path,
event_type) and assigns to self._pending[key] and calls timer.start()), set the
timer thread as a daemon before starting it (e.g., timer.daemon = True) so
timers won't prevent shutdown if stop() isn't called.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 57330d76-f318-494e-b93f-8e80d5a2081c

📥 Commits

Reviewing files that changed from the base of the PR and between 1e2bc9e and 9e82a9b.

📒 Files selected for processing (6)
  • app.py
  • frontend/src/types/index.ts
  • lib/api/extensions.py
  • lib/cli/main.py
  • lib/file_watcher.py
  • tests/integration/test_extensions.py

@nicofretti
Owner Author

Code review

Found 1 issue:

  1. Claude skill files committed to the repository, violating CLAUDE.md: "never include 'claude' or any llm-related files in commits"

The following .claude/skills/ files are included in this PR's diff and should not be committed:

  • .claude/skills/using-datagenflow-extensibility/SKILL.md (new)
  • .claude/skills/creating-pipeline-templates/SKILL.md
  • .claude/skills/implementing-datagenflow-blocks/SKILL.md
  • .claude/skills/testing-pipeline-templates/SKILL.md
  • .claude/skills/writing-e2e-tests/SKILL.md

---
name: using-datagenflow-extensibility
description: Use when creating data generation pipelines with DataGenFlow's extensibility system — user_templates, user_blocks, docker-compose setup, and dgf CLI. Use for any task involving generating synthetic data, building custom pipelines, or extending DataGenFlow from an external project without modifying its source.
---

🤖 Generated with Claude Code

- If this code review was useful, please react with 👍. Otherwise, react with 👎.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (3)
lib/file_watcher.py (1)

66-67: Redundant exception in logging.exception.

logger.exception() already captures the exception; no need to format it again.

Proposed fix
-            logger.exception(f"error in file watcher callback: {e}")
+            logger.exception("error in file watcher callback")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/file_watcher.py` around lines 66 - 67, In file_watcher.py replace the
call that passes the formatted exception string to logger.exception so it
doesn't duplicate exception info: inside the except block where
logger.exception(f"error in file watcher callback: {e}") is used, change it to
call logger.exception with a simple message (e.g., "error in file watcher
callback") so the exception is captured once by logger.exception; keep the
except Exception as e block as-is to preserve handling.
llm/state-backend.md (1)

40-98: Missing: Extensions API endpoints.

Per coding guidelines, new API endpoints should be documented here. The PR adds /api/extensions/* endpoints (status, blocks, templates, reload, validate, dependencies, install-deps) that are not listed.

Suggested addition after line 98
### extensions
- `GET /api/extensions/status` - block/template counts and availability
- `GET /api/extensions/blocks` - list all blocks with source and availability
- `GET /api/extensions/templates` - list all templates with source
- `POST /api/extensions/reload` - trigger manual extension reload
- `POST /api/extensions/blocks/{name}/validate` - validate block availability
- `GET /api/extensions/blocks/{name}/dependencies` - get block dependencies
- `POST /api/extensions/blocks/{name}/install-deps` - install missing dependencies

As per coding guidelines: "Update llm/state-backend.md when new API endpoints are added or changed"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@llm/state-backend.md` around lines 40 - 98, The documentation is missing the
new /api/extensions endpoints added by the PR; update llm/state-backend.md to
add an "extensions" section listing the new endpoints (e.g., GET
/api/extensions/status, GET /api/extensions/blocks, GET
/api/extensions/templates, POST /api/extensions/reload, POST
/api/extensions/blocks/{name}/validate, GET
/api/extensions/blocks/{name}/dependencies, POST
/api/extensions/blocks/{name}/install-deps) with short descriptions matching the
suggested addition so docs reflect the implemented API.
frontend/src/services/extensionsApi.ts (1)

49-55: createPipelineFromTemplate doesn't consume response body.

The backend returns {"id": ..., "name": ..., "template_id": ...} but this method ignores it. If the caller needs the created pipeline ID (e.g., to navigate directly to it), consider returning the response.

Proposed fix
-  async createPipelineFromTemplate(templateId: string): Promise<void> {
+  async createPipelineFromTemplate(templateId: string): Promise<{ id: number; name: string; template_id: string }> {
     const response = await fetch(
       `${API_BASE}/pipelines/from_template/${encodeURIComponent(templateId)}`,
       { method: "POST" }
     );
     if (!response.ok) throw new Error("Failed to create pipeline from template");
+    return response.json();
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/services/extensionsApi.ts` around lines 49 - 55, The
createPipelineFromTemplate function currently throws on non-OK responses but
discards the successful response body; update createPipelineFromTemplate to
parse and return the JSON payload from the backend (e.g., await response.json())
so callers can access the created pipeline's id/name/template_id, and adjust the
function's return type accordingly (replace Promise<void> with the appropriate
shape or Promise<any>) while keeping the existing error handling for
!response.ok.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@lib/api/extensions.py`:
- Around line 116-121: The success return is currently inside the try block and
the exception is not chained; update the flow in the function that calls
dependency_manager.install so that the try only wraps the call to
dependency_manager.install (and registry.reload), then add an else block for the
success path to return {"status": "ok", "installed": installed}; in the except
DependencyError as e handler re-raise the HTTPException using raise
HTTPException(status_code=500, detail=str(e)) from e to preserve exception
chaining and ensure the normal return is outside the exception scope; reference
dependency_manager.install, registry.reload, DependencyError and HTTPException
when applying the change.
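
The restructured handler reads roughly like this (a synchronous sketch with stand-in types; the real endpoint is async and raises FastAPI's HTTPException):

```python
class DependencyError(Exception):
    pass

class HTTPError(Exception):  # stand-in for fastapi.HTTPException
    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code

def install_deps(install, reload):
    try:
        installed = install()  # only the failure-prone calls live in the try
        reload()
    except DependencyError as e:
        raise HTTPError(500, str(e)) from e  # chain so the cause survives
    else:
        return {"status": "ok", "installed": installed}
```
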

In `@lib/file_watcher.py`:
- Around line 126-129: The two assignments for self.blocks_path and
self.templates_path exceed the 100-char limit; update these in the FileWatcher
initializer (the lines assigning self.blocks_path and self.templates_path) to
wrap arguments across multiple lines or split the getenv and Path calls so each
line stays under 100 chars—for example, call os.getenv(...) on its own line or
pass the default on a new line when creating Path(...) for DEFAULT_BLOCKS_PATH
and DEFAULT_TEMPLATES_PATH and the
DATAGENFLOW_BLOCKS_PATH/DATAGENFLOW_TEMPLATES_PATH env vars.
- Around line 89-94: Concurrent calls to BlockRegistry.reload() and
TemplateRegistry.reload() must be serialized because importlib.reload is not
thread-safe; add a threading.Lock() member (e.g., self._reload_lock) in each
registry's __init__ and wrap the entire body of BlockRegistry.reload() and
TemplateRegistry.reload() in a with self._reload_lock: block to ensure only one
thread performs discovery/module reload at a time; also import threading where
needed so file_watcher._handle_change (which triggers registry.reload()) and API
calls use the same serialized reload logic.
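
The serialization the prompt asks for is a plain lock around the reload body; this sketch demonstrates the pattern with a dummy registry (no importlib involved):

```python
import threading
import time

class Registry:
    """serialize reload() so concurrent triggers (file watcher, API) never interleave"""

    def __init__(self):
        self._reload_lock = threading.Lock()
        self._in_reload = False
        self.overlaps = 0
        self.reloads = 0

    def reload(self):
        with self._reload_lock:
            if self._in_reload:
                self.overlaps += 1  # would mean two reloads ran at once
            self._in_reload = True
            time.sleep(0.01)  # stands in for discovery / importlib.reload work
            self.reloads += 1
            self._in_reload = False
```
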

In `@tests/e2e/test_extensions_e2e.py`:
- Line 387: The tuple references a nonexistent function
test_validate_nonexistent_block_type_in_ui which breaks the test runner; update
the tuple to reference the correct test function name
test_validate_button_produces_toast (or if the intended test is named
differently, rename that test function to match the tuple), ensuring the tuple
("validate button produces toast", test_validate_button_produces_toast) points
to an existing callable so the standalone runner can execute it.

---

Nitpick comments:
In `@frontend/src/services/extensionsApi.ts`:
- Around line 49-55: The createPipelineFromTemplate function currently throws on
non-OK responses but discards the successful response body; update
createPipelineFromTemplate to parse and return the JSON payload from the backend
(e.g., await response.json()) so callers can access the created pipeline's
id/name/template_id, and adjust the function's return type accordingly (replace
Promise<void> with the appropriate shape or Promise<any>) while keeping the
existing error handling for !response.ok.

In `@lib/file_watcher.py`:
- Around line 66-67: In file_watcher.py replace the call that passes the
formatted exception string to logger.exception so it doesn't duplicate exception
info: inside the except block where logger.exception(f"error in file watcher
callback: {e}") is used, change it to call logger.exception with a simple
message (e.g., "error in file watcher callback") so the exception is captured
once by logger.exception; keep the except Exception as e block as-is to preserve
handling.
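A minimal self-contained illustration of the difference (the logger name and callback are hypothetical):

```python
import io
import logging

logger = logging.getLogger("file_watcher_sketch")
logger.propagate = False  # keep the sketch's output in our own handler
stream = io.StringIO()
logger.addHandler(logging.StreamHandler(stream))


def run_callback(callback) -> None:
    try:
        callback()
    except Exception:
        # logger.exception appends the traceback itself, so a plain message
        # avoids logging the exception text twice
        logger.exception("error in file watcher callback")


run_callback(lambda: 1 / 0)
```
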

In `@llm/state-backend.md`:
- Around line 40-98: The documentation is missing the new /api/extensions
endpoints added by the PR; update llm/state-backend.md to add an "extensions"
section listing the new endpoints (e.g., GET /api/extensions/status, GET
/api/extensions/blocks, GET /api/extensions/templates, POST
/api/extensions/reload, POST /api/extensions/blocks/{name}/validate, GET
/api/extensions/blocks/{name}/dependencies, POST
/api/extensions/blocks/{name}/install-deps) with short descriptions matching the
suggested addition so docs reflect the implemented API.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: deb1b6c6-1820-4ecb-a037-3e927bd2f9a4

📥 Commits

Reviewing files that changed from the base of the PR and between 9e82a9b and 14946d8.

📒 Files selected for processing (13)
  • app.py
  • docs/extensibility.md
  • frontend/src/pages/Extensions.tsx
  • frontend/src/services/extensionsApi.ts
  • lib/api/extensions.py
  • lib/cli/main.py
  • lib/constants.py
  • lib/entities/extensions.py
  • lib/file_watcher.py
  • llm/state-backend.md
  • llm/state-frontend.md
  • tests/e2e/test_extensions_e2e.py
  • tests/test_extensions_api.py
✅ Files skipped from review due to trivial changes (1)
  • llm/state-frontend.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/entities/extensions.py

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

♻️ Duplicate comments (2)
lib/blocks/registry.py (1)

141-142: ⚠️ Potential issue | 🟠 Major

Race condition: list_blocks() reads _entries without lock protection.

Same issue as TemplateRegistry: reload() holds the lock when rebuilding _entries, but list_blocks() iterates without the lock. Concurrent calls can cause iteration over a swapped dict.

Proposed fix
     def list_blocks(self) -> list[BlockInfo]:
-        return [entry.to_block_info() for entry in self._entries.values()]
+        with self._lock:
+            entries = list(self._entries.values())
+        return [entry.to_block_info() for entry in entries]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/blocks/registry.py` around lines 141 - 142, list_blocks() reads
self._entries without holding the same lock used in reload(), risking iteration
over a concurrently swapped dict; modify list_blocks in the BlockRegistry class
to acquire the registry lock (the same lock used by reload()) around
reading/iterating self._entries (e.g., copy or iterate under lock) so you either
return a snapshot or build the list while holding the lock; reference the
list_blocks method, the _entries attribute, and the reload method to ensure
consistent locking.
lib/api/extensions.py (1)

87-97: ⚠️ Potential issue | 🟠 Major

Import-failed blocks cannot recover via this API.

When a block fails to import (e.g., missing dependencies), block_class is None and the declared dependencies are unknown. Both /dependencies and /install-deps return 422, leaving no path for the UI to discover and install the missing packages. Consider storing declared dependencies from block metadata (e.g., docstring or decorator) even when import fails.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/api/extensions.py` around lines 87 - 97, The block_dependencies endpoint
currently returns 422 when entry.block_class is None, preventing the UI from
discovering/installing declared deps; update block_dependencies (and similarly
the install-deps flow) to, when registry.get_entry(name) returns an entry with
block_class is None, extract declared dependency metadata from the entry itself
(e.g., entry.metadata, entry.docstring, or a decorator field) and pass those
declared dependencies to dependency_manager.get_dependency_info instead of
failing; use unique symbols entry.block_class, registry.get_entry,
dependency_manager.get_dependency_info, and any existing entry.metadata or
entry.declared_dependencies fields (or add one) so import-failed blocks can
still surface their declared dependencies to the API consumer.
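One way to keep declared dependencies available after an import failure can be sketched as follows (the `BlockEntry` fields here, including `declared_dependencies`, are assumptions about the schema, not the project's actual entities):

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BlockEntry:
    name: str
    block_class: type | None = None
    # populated from block metadata at discovery time, so it survives
    # even when the module itself fails to import
    declared_dependencies: list[str] = field(default_factory=list)


def dependencies_for(entry: BlockEntry) -> list[str]:
    # prefer the imported class's metadata, fall back to what was declared
    if entry.block_class is not None and hasattr(entry.block_class, "dependencies"):
        return list(entry.block_class.dependencies)
    return entry.declared_dependencies
```

With this fallback, the `/dependencies` and `/install-deps` endpoints can report missing packages for a block whose import failed, instead of returning 422.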
🧹 Nitpick comments (5)
lib/blocks/registry.py (1)

127-142: Read methods should also acquire the lock for consistency.

get_block_class(), get_entry(), get_block_source(), and list_block_types() read _entries without the lock. While individual dict lookups are atomic in CPython, using the lock ensures consistent behavior across implementations and with the write methods.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/blocks/registry.py` around lines 127 - 142, The read methods
get_block_class, get_entry, get_block_source, and list_block_types access the
shared _entries dict without synchronization; update each to acquire and release
the registry's _lock (the same lock used by write methods) around the read to
ensure consistent behavior: wrap the lookup/iteration of _entries in a with
self._lock: block so get_block_class returns entry.block_class, get_entry
returns the BlockEntry, get_block_source returns entry.source, and
list_block_types returns list(self._entries.keys()) while holding the lock.
tests/e2e/test_extensions_e2e.py (1)

52-68: Consider using a pytest fixture for browser lifecycle.

Each test creates and closes its own browser instance. A session/module-scoped fixture would reduce overhead and make tests faster.

Example fixture
@pytest.fixture(scope="module")
def browser():
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=get_headless_mode())
        yield browser
        browser.close()

@pytest.fixture
def page(browser):
    page = browser.new_page()
    yield page
    page.close()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_extensions_e2e.py` around lines 52 - 68, Replace per-test
browser setup/teardown with module-scoped fixtures: add a pytest fixture named
browser(scope="module") that opens sync_playwright() and launches p.chromium in
get_headless_mode(), and a page fixture that creates browser.new_page() and
closes it after yield; then update test_extensions_page_loads to accept the page
fixture (remove calls to sync_playwright(), p.chromium.launch, browser.new_page,
and browser.close) and use the provided page for _navigate_to_extensions and the
expect assertions.
frontend/src/pages/Extensions.tsx (1)

292-305: Consider navigating to the newly created pipeline directly.

The createPipelineFromTemplate API returns the new pipeline ID (see lib/api/extensions.py:821), but extensionsApi.createPipelineFromTemplate() discards it with Promise<void>. Navigating to /pipelines/${id} would provide better UX.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/pages/Extensions.tsx` around lines 292 - 305, The handler
handleCreatePipeline ignores the new pipeline ID returned by
extensionsApi.createPipelineFromTemplate; update the
extensionsApi.createPipelineFromTemplate TypeScript definition to return
Promise<string> (the new pipeline id), await and capture the returned id in
handleCreatePipeline, then navigate to `/pipelines/${id}` instead of
`/pipelines`; keep the toast but consider updating its text to reference the
created pipeline and ensure navigate (from navigate) uses the captured id.
lib/templates/__init__.py (1)

68-69: Consider catching more specific exceptions.

Static analysis flags Exception as too broad. While logging mitigates silent failure, catching specific exceptions (e.g., yaml.YAMLError, OSError) would be more precise.

Also applies to: 92-93

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/templates/__init__.py` around lines 68 - 69, Replace the broad "except
Exception as e" handlers around the builtin template loading/logging with
narrower exception catches: catch yaml.YAMLError and OSError (or other relevant
I/O/parsing exceptions raised by the loader) as e and log the same message via
logger.warning(f"failed to load builtin template {template_file}: {e}"); do the
same replacement for the second instance at the other handler (the occurrence
around lines 92-93). Ensure yaml is imported (or reference the correct YAML
parser exception type) and keep a final generic except Exception as e only if
you want a last-resort fallback, otherwise let unexpected exceptions propagate.
lib/api/extensions.py (1)

1-121: Documentation update needed.

Per coding guidelines, new API endpoints require updating llm/state-backend.md. The /extensions/* endpoints should be documented there.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/api/extensions.py` around lines 1 - 121, Add documentation for the new
/extensions endpoints in the API docs (state-backend.md): describe each endpoint
(GET /extensions -> extensions_status, GET /extensions/blocks ->
extensions_blocks, GET /extensions/templates -> extensions_templates, POST
/extensions/reload -> reload_extensions, POST /extensions/blocks/{name}/validate
-> validate_block, GET /extensions/blocks/{name}/dependencies ->
block_dependencies, POST /extensions/blocks/{name}/install-deps ->
install_block_deps), include brief descriptions, expected request/response
shapes, possible HTTP errors (404/422/500) and payload fields (e.g.,
missing_dependencies, installed, error), and note side-effects like
registry.reload() and dependency_manager.install() for the install endpoint.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@frontend/src/pages/Extensions.tsx`:
- Around line 278-283: The JSX assumes block.dependencies exists and uses
.length and .join, which will throw if undefined; update the conditional and
usage in Extensions.tsx to defensively handle malformed data by checking that
block.dependencies is an array (e.g., Array.isArray(block.dependencies) &&
block.dependencies.length > 0) or using a nullish fallback (block.dependencies
?? []) before calling .join, and ensure the same guarded expression is used for
both the conditional and when rendering the joined string so .join never runs on
undefined.
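The guard described above can be factored into a small helper (the `BlockInfo` shape is inferred from the review comment, not the project's actual type):

```typescript
// sketch: never call .join on a value that might be undefined or null
interface BlockInfo {
  name: string;
  dependencies?: string[] | null;
}

function renderDeps(block: BlockInfo): string | null {
  const deps = Array.isArray(block.dependencies) ? block.dependencies : [];
  return deps.length > 0 ? deps.join(", ") : null;
}
```

Using the same guarded expression for both the conditional and the rendered string keeps the JSX from branching on one shape and rendering another.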

In `@lib/blocks/registry.py`:
- Around line 104-116: register() and unregister() mutate the shared _entries
dict without synchronization and can race with reload(); wrap the mutations in
the same lock used by reload() (e.g., self._lock or the registry lock field) so
updates to self._entries are performed under a with self._lock: block (apply the
same pattern in both register and unregister to acquire the lock, perform the
assignment or deletion on self._entries, then release).

In `@lib/file_watcher.py`:
- Around line 126-131: The current initialization of self.blocks_path and
self.templates_path uses Path(...) without resolving, causing inconsistent
relative vs resolved paths compared to how builtin/custom blocks are created in
lib/blocks/registry.py; update lib/blocks/registry.py to use
Path("user_blocks").resolve() for the default user blocks path and change the
file_watcher initialization that sets self.blocks_path and self.templates_path
to call .resolve() on the constructed Path objects (i.e., Path(...).resolve())
so both watcher and registry use fully resolved absolute paths (refer to the
symbols self.blocks_path, self.templates_path and the default user blocks
creation in registry).

In `@lib/templates/__init__.py`:
- Around line 95-106: The register() and unregister() methods mutate shared
state (_templates and _sources) without synchronization; wrap their mutations in
the same lock used by reload() to avoid races—acquire the instance lock (e.g.,
self._lock or the existing threading.RLock used by reload) using a context
manager before assigning or popping from self._templates and self._sources so
the updates are atomic with respect to reload().
- Around line 120-131: list_templates() reads shared dictionaries _templates and
_sources without synchronization, causing a race with reload() that swaps them
under self._lock; fix by acquiring self._lock in list_templates(), copy the
needed data (e.g., local_templates = dict(self._templates); local_sources =
dict(self._sources)) while holding the lock, release the lock and then iterate
over the locals to build the list of TemplateInfo objects, ensuring you
reference the same symbols (list_templates, reload, self._lock, _templates,
_sources) so the iteration never sees a mid-swap state.

---

Duplicate comments:
In `@lib/api/extensions.py`:
- Around line 87-97: The block_dependencies endpoint currently returns 422 when
entry.block_class is None, preventing the UI from discovering/installing
declared deps; update block_dependencies (and similarly the install-deps flow)
to, when registry.get_entry(name) returns an entry with block_class is None,
extract declared dependency metadata from the entry itself (e.g.,
entry.metadata, entry.docstring, or a decorator field) and pass those declared
dependencies to dependency_manager.get_dependency_info instead of failing; use
unique symbols entry.block_class, registry.get_entry,
dependency_manager.get_dependency_info, and any existing entry.metadata or
entry.declared_dependencies fields (or add one) so import-failed blocks can
still surface their declared dependencies to the API consumer.

In `@lib/blocks/registry.py`:
- Around line 141-142: list_blocks() reads self._entries without holding the
same lock used in reload(), risking iteration over a concurrently swapped dict;
modify list_blocks in the BlockRegistry class to acquire the registry lock (the
same lock used by reload()) around reading/iterating self._entries (e.g., copy
or iterate under lock) so you either return a snapshot or build the list while
holding the lock; reference the list_blocks method, the _entries attribute, and
the reload method to ensure consistent locking.

---

Nitpick comments:
In `@frontend/src/pages/Extensions.tsx`:
- Around line 292-305: The handler handleCreatePipeline ignores the new pipeline
ID returned by extensionsApi.createPipelineFromTemplate; update the
extensionsApi.createPipelineFromTemplate TypeScript definition to return
Promise<string> (the new pipeline id), await and capture the returned id in
handleCreatePipeline, then navigate to `/pipelines/${id}` instead of
`/pipelines`; keep the toast but consider updating its text to reference the
created pipeline and ensure navigate (from navigate) uses the captured id.

In `@lib/api/extensions.py`:
- Around line 1-121: Add documentation for the new /extensions endpoints in the
API docs (state-backend.md): describe each endpoint (GET /extensions ->
extensions_status, GET /extensions/blocks -> extensions_blocks, GET
/extensions/templates -> extensions_templates, POST /extensions/reload ->
reload_extensions, POST /extensions/blocks/{name}/validate -> validate_block,
GET /extensions/blocks/{name}/dependencies -> block_dependencies, POST
/extensions/blocks/{name}/install-deps -> install_block_deps), include brief
descriptions, expected request/response shapes, possible HTTP errors
(404/422/500) and payload fields (e.g., missing_dependencies, installed, error),
and note side-effects like registry.reload() and dependency_manager.install()
for the install endpoint.

In `@lib/blocks/registry.py`:
- Around line 127-142: The read methods get_block_class, get_entry,
get_block_source, and list_block_types access the shared _entries dict without
synchronization; update each to acquire and release the registry's _lock (the
same lock used by write methods) around the read to ensure consistent behavior:
wrap the lookup/iteration of _entries in a with self._lock: block so
get_block_class returns entry.block_class, get_entry returns the BlockEntry,
get_block_source returns entry.source, and list_block_types returns
list(self._entries.keys()) while holding the lock.

In `@lib/templates/__init__.py`:
- Around line 68-69: Replace the broad "except Exception as e" handlers around
the builtin template loading/logging with narrower exception catches: catch
yaml.YAMLError and OSError (or other relevant I/O/parsing exceptions raised by
the loader) as e and log the same message via logger.warning(f"failed to load
builtin template {template_file}: {e}"); do the same replacement for the second
instance at the other handler (the occurrence around lines 92-93). Ensure yaml
is imported (or reference the correct YAML parser exception type) and keep a
final generic except Exception as e only if you want a last-resort fallback,
otherwise let unexpected exceptions propagate.

In `@tests/e2e/test_extensions_e2e.py`:
- Around line 52-68: Replace per-test browser setup/teardown with module-scoped
fixtures: add a pytest fixture named browser(scope="module") that opens
sync_playwright() and launches p.chromium in get_headless_mode(), and a page
fixture that creates browser.new_page() and closes it after yield; then update
test_extensions_page_loads to accept the page fixture (remove calls to
sync_playwright(), p.chromium.launch, browser.new_page, and browser.close) and
use the provided page for _navigate_to_extensions and the expect assertions.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2e38a504-e69e-4ea8-9838-e85b894850bd

📥 Commits

Reviewing files that changed from the base of the PR and between 14946d8 and 5d5920e.

📒 Files selected for processing (8)
  • app.py
  • frontend/src/pages/Extensions.tsx
  • frontend/src/services/extensionsApi.ts
  • lib/api/extensions.py
  • lib/blocks/registry.py
  • lib/file_watcher.py
  • lib/templates/__init__.py
  • tests/e2e/test_extensions_e2e.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • frontend/src/services/extensionsApi.ts

@nicofretti nicofretti force-pushed the 51-feat-datagenflow-extensibility-system-via-dgf-cli branch from d4c9335 to eefda22 Compare March 7, 2026 16:50
@nicofretti
Owner Author

@CodeRabbit review to address the guidelines

@coderabbitai

coderabbitai bot commented Mar 7, 2026

@nicofretti, I'll re-review the PR with a focus on the CLAUDE.md guidelines, including the committed .claude/skills/ files and any other LLM-related artifacts that should not be part of the diff.

Beyond the .claude/skills/ files already identified, please also double-check these files that appear in the diff and may violate the same policy:

  • tests/e2e/TODOS.md — contains a Claude Code-generated PR review transcript
  • llm/state-backend.md and llm/state-frontend.md — LLM state files
✅ Actions performed

Full review triggered.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 16

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/blocks/test_structure_sampler.py (1)

69-79: ⚠️ Potential issue | 🟠 Major

Cover the actual regression, not just the private rename.

This update follows the _field_deps rename, but it still wouldn’t catch the bug this change is fixing: field dependencies overwriting the block dependency metadata exposed to the extensibility system. Add a regression assertion on the public dependency path (StructureSampler.get_schema() / block info dependency data) after execute() so this can’t silently come back.

As per coding guidelines, "Block bug fixes without regression tests - all bug fixes must include test cases covering the fixed scenario."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/blocks/test_structure_sampler.py` around lines 69 - 79, The test
currently only sets the private _field_deps and doesn't assert the public-facing
metadata; after calling StructureSampler.execute() in
test_conditional_probabilities, call the public accessor
StructureSampler.get_schema() (or the block info dependency data exposed by the
block) and add an assertion that the dependencies given in the constructor
(e.g., {"role": ["plan"]}) remain present and unchanged — this ensures
field-level _field_deps didn't overwrite the block's public dependency metadata
after execute().
🧹 Nitpick comments (10)
lib/blocks/builtin/structure_sampler.py (1)

77-77: Add a brief why-comment for _field_deps.

_field_deps otherwise reads like a cosmetic rename. A short note that this intentionally avoids colliding with BaseBlock.dependencies/extension package metadata would make the constraint obvious and reduce accidental regressions.

As per coding guidelines, "Add comments explaining why for complex logic in code - comments should explain reasoning, not repeat what code does."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/blocks/builtin/structure_sampler.py` at line 77, Add a short "why"
comment above the _field_deps attribute in class StructureSampler explaining its
purpose: that it stores per-field dependency lists and is named/kept separate to
intentionally avoid colliding with BaseBlock.dependencies or extension package
metadata; reference the attribute _field_deps and BaseBlock.dependencies in the
comment so future readers understand this is a deliberate separation to prevent
accidental metadata collisions.
tests/e2e/test_helpers.py (1)

73-77: Consolidate BASE_URL usage.

BASE_URL is defined here, but cleanup_database() at line 19 and other existing functions still use hardcoded "http://localhost:8000". Consolidate to use the constant throughout the file.

♻️ Proposed fix
+BASE_URL = "http://localhost:8000"
+
+
 def get_headless_mode():
     """get headless mode from environment variable"""
     return os.getenv("E2E_HEADLESS", "true").lower() in ("true", "1", "yes")


 def cleanup_database():
     """delete all pipelines, jobs, and records from the database"""
-    base_url = "http://localhost:8000"

     try:
         # delete all records
-        resp = httpx.delete(f"{base_url}/api/records", timeout=10.0)
+        resp = httpx.delete(f"{BASE_URL}/api/records", timeout=10.0)

And similarly update other occurrences in cleanup_database, wait_for_server, and get_pipeline_count.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_helpers.py` around lines 73 - 77, Replace hardcoded
"http://localhost:8000" occurrences in this file with the BASE_URL constant;
update functions cleanup_database, wait_for_server, get_pipeline_count (and any
other helpers) to reference BASE_URL instead of the literal string so all HTTP
requests use the single defined constant. Ensure string concatenation or
f-strings use BASE_URL (e.g., BASE_URL + "/path" or f"{BASE_URL}/path") and run
tests to confirm no remaining hardcoded URLs remain.
lib/dependency_manager.py (1)

91-91: Use spread operator for cleaner list construction.

Per static analysis suggestion.

♻️ Proposed fix
-        cmd = ["uv", "pip", "install", "--quiet"] + requirements
+        cmd = ["uv", "pip", "install", "--quiet", *requirements]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/dependency_manager.py` at line 91, Replace the list concatenation used to
build the pip command (the variable cmd) with Python's list unpacking to make
construction cleaner: change the current ["uv", "pip", "install", "--quiet"] +
requirements to a single list literal that uses the unpacking operator to
include requirements (i.e., include *requirements in the list). Update the
assignment to the cmd variable accordingly so it uses ["uv", "pip", "install",
"--quiet", *requirements].
frontend/src/App.tsx (1)

17-17: Consider lazy loading the Extensions page.

Based on coding guidelines for lazy loading heavy components. If Extensions page is lightweight, this is optional.

♻️ Proposed change
-import Extensions from "./pages/Extensions";
+const Extensions = React.lazy(() => import("./pages/Extensions"));

And wrap the route with Suspense:

<Route path="/extensions" element={
  <React.Suspense fallback={<div>Loading...</div>}>
    <Extensions />
  </React.Suspense>
} />

Based on learnings: "Lazy load heavy components in frontend - use React.lazy for code splitting"

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/App.tsx` at line 17, The Extensions import is not lazy-loaded;
update the import to use React.lazy for the Extensions component and wrap its
route element with React.Suspense (fallback UI) so the /extensions route is
code-split; specifically replace the static import of Extensions with a lazy
import (React.lazy(() => import(...))) and render it inside a <React.Suspense
fallback={...}> when used in the Route for path "/extensions".
lib/cli/client.py (1)

15-20: Handle non-JSON error responses.

If the server returns non-JSON content (e.g., HTML error page), response.json() will raise a JSONDecodeError. Consider wrapping or documenting this behavior.

♻️ Proposed defensive handling
     def _request(self, method: str, path: str, **kwargs: Any) -> Any:
         url = f"{self.endpoint}{path}"
         with httpx.Client(timeout=self.timeout) as client:
             response = client.request(method, url, **kwargs)
             response.raise_for_status()
-            return response.json()
+            try:
+                return response.json()
+            except ValueError as e:
+                raise httpx.HTTPStatusError(
+                    f"invalid JSON response: {response.text[:200]}",
+                    request=response.request,
+                    response=response,
+                ) from e
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/cli/client.py` around lines 15 - 20, The _request method currently calls
response.json() unguarded which will raise JSONDecodeError for non-JSON
responses; wrap the call to response.json() in a try/except that catches
json.JSONDecodeError (or ValueError for broader compatibility), and on failure
either return response.text (or a dict like {"text": response.text,
"status_code": response.status_code}) or re-raise a clearer exception that
includes response.text and response.status_code; keep
response.raise_for_status() where it is but ensure the JSON decoding is handled
after that, and reference the _request method, response.json(), response.text
and httpx.Client when making the change.
frontend/src/pages/Extensions.tsx (1)

43-45: Consider mounted guard for async load.

The loadAll call in useEffect doesn't have cleanup. If the component unmounts during the initial fetch, state updates will be attempted on an unmounted component. This is a minor edge case but worth protecting against.

Optional fix with mounted guard
   useEffect(() => {
-    loadAll();
+    let mounted = true;
+    loadAll().finally(() => {
+      if (!mounted) return;
+    });
+    return () => { mounted = false; };
   }, [loadAll]);

Alternatively, use AbortController in loadAll for a more robust solution.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/pages/Extensions.tsx` around lines 43 - 45, The useEffect that
calls loadAll should guard against updating state after the component unmounts;
add a mounted flag (e.g., let isMounted = true) or use an AbortController and
return a cleanup from the useEffect that sets isMounted = false (or aborts the
controller), then inside loadAll check isMounted before calling any state
setters (or catch aborts and skip updates). Ensure you reference the existing
useEffect and the loadAll function so you only add the mounted/abort guard and
the cleanup without changing the loadAll API.
lib/blocks/registry.py (1)

62-66: Minor: redundant initial assignment.

Line 64 initializes _entries to an empty dict, then line 65 immediately overwrites it. This is harmless but unnecessary:

     def __init__(self) -> None:
         self._lock = threading.Lock()
-        self._entries: dict[str, BlockEntry] = {}
-        self._entries = self._discover_blocks()
+        self._entries: dict[str, BlockEntry] = self._discover_blocks()
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/blocks/registry.py` around lines 62 - 66, In __init__, remove the
redundant initialization of self._entries to an empty dict before calling
self._discover_blocks(); keep a single assignment (either set self._entries =
self._discover_blocks() or initialize to {} only if discover should not be
called), so edit the __init__ in the class using _lock, _entries, and
_discover_blocks to avoid the pointless overwrite.
lib/entities/extensions.py (1)

53-58: Consider using Literal for DependencyInfo.status.

The comment documents allowed values but the type is str. Using Literal["ok", "not_installed", "invalid"] would provide compile-time validation and better IDE support.

Proposed fix
 class DependencyInfo(BaseModel):
     requirement: str
     name: str
     installed_version: str | None = None
-    status: str  # "ok", "not_installed", "invalid"
+    status: Literal["ok", "not_installed", "invalid"]
     error: str | None = None
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@lib/entities/extensions.py` around lines 53 - 58, Change the type of
DependencyInfo.status from str to a Literal union to enforce the allowed values:
update the class DependencyInfo (field "status") to use Literal["ok",
"not_installed", "invalid"] and add the necessary import (from typing import
Literal or from typing import get_args depending on your typing style); ensure
any code constructing DependencyInfo still passes one of those exact strings or
adjust callers/tests to use the new Literal values.
frontend/src/services/extensionsApi.ts (1)

5-64: Consider adding AbortController support for cancelable requests.

Per coding guidelines, fetch calls should use AbortController to prevent race conditions and memory leaks. The current implementation could cause state updates on unmounted components if requests complete after navigation.

A common pattern is to accept an optional signal parameter:

async listBlocks(signal?: AbortSignal): Promise<BlockInfo[]> {
  const response = await fetch(`${API_BASE}/extensions/blocks`, { signal });
  // ...
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@frontend/src/services/extensionsApi.ts` around lines 5 - 64, The fetch calls
in ExtensionsApi (methods getStatus, listBlocks, listTemplates, reload,
validateBlock, createPipelineFromTemplate, installBlockDeps) should accept an
optional AbortSignal so callers can cancel requests; update each method
signature to include signal?: AbortSignal and pass that signal into the fetch
options (e.g., fetch(url, { signal, method: "POST" } or { signal } as
appropriate), keeping existing error handling and return types unchanged; ensure
encodeURIComponent usage remains for validateBlock/installBlockDeps and that you
await response.json() on error paths (installBlockDeps) while still throwing the
composed Error so canceled requests propagate correctly.
tests/e2e/test_extensions_api_e2e.py (1)

354-360: String-based exception type check is fragile.

The manual runner checks type(e).__name__ == "Skipped" instead of importing and catching pytest.skip.Exception. This could break if pytest internals change.

Proposed fix
+from _pytest.outcomes import Skipped
+
 ...
         except BaseException as e:
-            if type(e).__name__ == "Skipped":
+            if isinstance(e, Skipped):
                 print(f"⊘ skipped: {e}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/e2e/test_extensions_api_e2e.py` around lines 354 - 360, Replace the
fragile string-based check in the exception handler with real exception-type
checks: import Skipped from _pytest.outcomes at the top of the test file and
change the condition from type(e).__name__ == "Skipped" to isinstance(e,
Skipped) (pytest.skip.Exception is the public alias if you prefer to avoid the
private module) so skipped tests are detected reliably;
keep the existing KeyboardInterrupt/SystemExit re-raise and the generic failure
path for all other exceptions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.claude/skills/using-datagenflow-extensibility/SKILL.md:
- Around line 12-19: The fenced code block that currently starts with the
"your-project/" tree should include an explicit language identifier (e.g.,
```text or ```plaintext) to satisfy markdownlint MD040; update that block and
the other similar fenced block around lines 118-130 in
.claude/skills/using-datagenflow-extensibility/SKILL.md by adding a language
token after the opening triple backticks so the examples are treated as plain
text.
- Line 58: The guide incorrectly presents a repo-local workflow (the command
`docker build -f docker/Dockerfile -t datagenflow:local .` and running `dgf`
from the DataGenFlow repo) as the external extensibility path; update SKILL.md
to either (A) replace these repo-local steps with the
published-image/installable-CLI flow (e.g., provide `docker pull
datagenflow:<tag>` or the CLI installation command and example usage of `dgf`)
including the correct image/tag and CLI invocation, or (B) explicitly label the
existing build/run instructions as "Contributor-only setup" and add a short
paragraph pointing external users to the published image/CLI alternative; ensure
you update both occurrences that reference the repo-local build and `dgf` so the
document no longer implies cloning the main repo for external extensibility.

In `@docs/extensibility.md`:
- Around line 246-277: The docs say /api/extensions/status includes hot reload
status but the example JSON only shows counts; update the example response for
GET /api/extensions/status to include a hot_reload field (e.g., hot_reload: {
enabled: boolean, last_reload_at: string, status: "idle"|"reloading" } or
whatever the API actually returns) so the sample matches the documented payload,
or alternatively change the surrounding text to remove the claim that hot reload
status is included—modify references to /api/extensions/status and the example
accordingly.
- Around line 369-377: The docs currently treat "unavailable" blocks as only
missing pip deps; update the section to document the import-error recovery path:
explain that discovery can fail and store block_class=None (so GET
/api/extensions/blocks/{name}/dependencies returns [] and POST
/api/extensions/blocks/{name}/install-deps will surface the import error), and
instruct the user to fix the import error in their block source, then trigger a
reload (either via the extension reload endpoint or by ensuring
DATAGENFLOW_HOT_RELOAD=true and that user_blocks/ and user_templates/ exist at
startup so discovery can pick up the fixed class), after which dependencies will
be repopulated and install-deps can be used if needed.
- Around line 321-333: The quick-start `.env` example uses OPENAI_API_KEY but
the application and helpers use LLM_API_KEY (see config.py, lib/llm_config.py,
lib/storage.py); update the docs snippet to replace OPENAI_API_KEY=sk-... with
LLM_API_KEY=your-key (or LLM_API_KEY= if you prefer blank), and ensure the note
indicates this matches the variable names used in config.py and
lib/llm_config.py so users don’t set the wrong env var.

In `@frontend/src/services/extensionsApi.ts`:
- Around line 56-59: The error-handling branch uses await response.json() which
can throw for non-JSON bodies; wrap the JSON parse in a try-catch, attempt
response.json() inside try, and on failure fall back to await response.text()
(or a default message) so you throw a meaningful Error containing either
error.detail, the fallback text, or `http ${response.status}`; update the code
around response.ok and the response.json() call to use this safe-parse flow
(referencing response, response.json(), response.text(), and response.ok).

In `@lib/cli/main.py`:
- Around line 254-283: The scaffold code writes files with
output_path.write_text(...) without ensuring the target directory exists,
causing FileNotFoundError when users pass a new directory; update the scaffold
logic (the block that builds output_path and calls write_text) to create the
parent directory first (e.g., output_path.parent.mkdir(parents=True,
exist_ok=True)) or raise a clear, friendly error if the directory cannot be
created, and apply the same change to the other scaffold occurrence that also
uses write_text; ensure you reference output, output_path, and write_text in
your edits so the directory is prepared before writing.
- Around line 327-345: After yaml.safe_load assign to data, guard that data is a
mapping before indexing: replace direct checks like '"name" in data' and
data["blocks"] with an isinstance(data, dict) check first (e.g., if not
isinstance(data, dict): treat as invalid and append errors or set data = {}),
then validate 'name' and 'blocks' safely (use data.get or dict membership after
the type check and verify blocks is a non-empty list). Apply the same hardening
to the other command blocks that index data (the sections around the other
checks noted), and add regression tests for empty file (None), top-level list,
and scalar YAML inputs so these produce CLI validation errors instead of
TypeError/AttributeError.
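The hardening described above can be sketched as a standalone validator that takes the already-parsed result of `yaml.safe_load` (the function name `validate_template_data` and its error messages are assumptions for illustration):

```python
def validate_template_data(data):
    """Guard the result of yaml.safe_load before indexing into it."""
    errors = []
    # safe_load returns None for an empty file, and lists/scalars for
    # non-mapping YAML, so check the type before any indexing
    if not isinstance(data, dict):
        errors.append("template file must contain a YAML mapping")
        return errors
    if "name" not in data:
        errors.append("missing required field: name")
    blocks = data.get("blocks")
    if not isinstance(blocks, list) or not blocks:
        errors.append("blocks must be a non-empty list")
    return errors

# the regression cases named in the prompt: empty file (None),
# top-level list, and scalar YAML inputs
assert validate_template_data(None)
assert validate_template_data([1, 2])
assert validate_template_data("just a string")
assert not validate_template_data({"name": "t", "blocks": [{"type": "x"}]})
```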

In `@lib/dependency_manager.py`:
- Around line 100-103: The except handlers for subprocess.TimeoutExpired and
FileNotFoundError in the dependency installation code should chain the original
exceptions to preserve tracebacks: capture each exception (e.g., "except
subprocess.TimeoutExpired as err:" and "except FileNotFoundError as err:") and
re-raise the DependencyError using "raise DependencyError(... ) from err" so the
original error context is preserved; update the handlers around the install call
where DependencyError is raised to use "from err".
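A sketch of the `raise ... from err` pattern the prompt asks for (the `run_installer` helper is hypothetical; only the chaining technique is the point):

```python
import subprocess

class DependencyError(Exception):
    pass

def run_installer(cmd, timeout=60):
    # chain the original exception so the traceback shows both errors
    try:
        return subprocess.run(cmd, check=True, capture_output=True, timeout=timeout)
    except FileNotFoundError as err:
        raise DependencyError(f"installer not found: {cmd[0]}") from err
    except subprocess.TimeoutExpired as err:
        raise DependencyError(f"install timed out after {timeout}s") from err

try:
    run_installer(["definitely-not-a-real-binary"])
except DependencyError as exc:
    # __cause__ carries the original error thanks to "from err"
    assert isinstance(exc.__cause__, FileNotFoundError)
```

Without `from err`, the original `FileNotFoundError` would only appear as implicit context ("During handling of the above exception...") and `__cause__` would be `None`.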

In `@lib/templates/__init__.py`:
- Around line 138-143: get_template() and get_template_source() read shared
dictionaries without synchronisation; wrap their accesses with the same lock
used for writes (e.g. self._lock) to prevent concurrent swaps from corrupting
reads. Modify get_template(self, template_id) and get_template_source(self,
template_id) to acquire the lock (context-managed or explicit acquire/release)
before accessing self._templates and self._sources, then return the value and
release the lock so reads are protected the same way reload() and any writers
are.
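The read-side locking can be sketched like this (the `TemplateStore` class is a simplified stand-in for the real template loader):

```python
import threading

class TemplateStore:
    """Reads take the same lock as reload(), so a concurrent swap can
    never expose a half-updated pair of dicts to a reader."""

    def __init__(self):
        self._lock = threading.Lock()
        self._templates = {}
        self._sources = {}

    def reload(self, templates, sources):
        with self._lock:
            self._templates = dict(templates)
            self._sources = dict(sources)

    def get_template(self, template_id):
        with self._lock:
            return self._templates.get(template_id)

    def get_template_source(self, template_id):
        with self._lock:
            return self._sources.get(template_id)

store = TemplateStore()
store.reload({"qa": {"blocks": []}}, {"qa": "name: qa"})
assert store.get_template("qa") == {"blocks": []}
```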

In `@llm/state-backend.md`:
- Line 35: The documentation update currently only lists new constants (e.g.,
RECORD_UPDATABLE_FIELDS, DEFAULT_BLOCKS_PATH, DEFAULT_TEMPLATES_PATH in
constants.py) but must be expanded to fully document the extensibility backend:
add descriptions for the new /api/extensions/* endpoints (what each endpoint
does, request/response shapes, auth requirements), document the extension data
models defined in lib/entities/extensions.py (fields, validation, and which
fields are updatable), describe the hot-reload/file-watcher flow (what
directories are watched, reload triggers, error handling), and note the
dependency-installation path/behaviour (how and where dependencies are
installed, security/permission considerations); update llm/state-backend.md to
include these parts so the docs match the actual code changes.

In `@tests/cli/test_commands.py`:
- Around line 60-61: Add tests that exercise the CLI "blocks install-deps"
command using the existing test harness and the mocked client: call the CLI
runner to invoke the command (e.g., "blocks install-deps <block-name>") and
assert that client.install_block_deps is called and the CLI prints the expected
success output for client.install_block_deps.return_value = {"status":"ok",
"installed":[...]}; also add a failing-case test where client.install_block_deps
raises or returns an error payload and assert the CLI returns a non-zero exit
code and prints the error message. Ensure you modify the same test file
locations around the existing client.validate_block and
client.install_block_deps setup (and replicate the pattern used in lines
~89-137) so both success and error paths are covered.

In `@tests/e2e/test_extensions_e2e.py`:
- Around line 132-150: The test test_block_validate_shows_success_toast computes
available_block but always clicks the first Validate button
(page.get_by_role("button", name="Validate").first.click()), risking selecting
an unavailable block; update the test to locate the UI card for available_block
(use its unique title/name or id from available_block["name"] / ["id"]) after
_navigate_to_extensions(page) and click the Validate button scoped to that card
instead of the global first button so the clicked validate corresponds to the
API-selected available_block.
- Around line 370-406: The standalone runner currently ignores wait_for_server()
failures and swallows test exceptions, causing a zero exit even on errors;
update the main block to fail-fast: have wait_for_server() raise or return a
falsy value that you check and exit non-zero immediately (use SystemExit or
raise RuntimeError) if the server is down, run tests inside a try/finally so
cleanup_database() always runs, and on any non-Skipped test exception re-raise
or call sys.exit(nonzero) instead of only printing so the process exits with a
failure; refer to wait_for_server(), cleanup_database(), the tests list and the
for name, test_fn loop to locate where to apply these changes.

In `@tests/e2e/TODOS.md`:
- Around line 1-295: tests/e2e/TODOS.md is an AI-generated PR-review transcript
and violates CLAUDE.md guidelines and repo conventions; remove this file from
the branch (delete tests/e2e/TODOS.md and remove it from the commit), ensure no
other AI-generated or extraneous review transcripts are committed, and update
any related .gitignore/.claude rules or repository contribution docs (CLAUDE.md)
if needed to prevent accidental inclusion in future commits; also verify the PR
diff no longer contains unrelated review notes before re-submitting.

In `@tests/test_file_watcher.py`:
- Around line 23-31: Replace the fixed time.sleep(0.15) in the DebouncedHandler
tests with a synchronization-based wait: modify the test callback to set a
threading.Event (or similar) when invoked, then after calling
handler._schedule_callback(...) use event.wait(timeout=...) to block until the
callback runs (or times out) and assert that event.wait returned True and
call_count == 1; apply the same change to the other debounce test block (the one
around lines 42-48) so tests no longer rely on scheduler timing or fixed sleeps.
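The event-based synchronization can be sketched against a minimal debounce handler (this `DebouncedHandler` is a simplified stand-in for the real one, assumed here to use `threading.Timer`):

```python
import threading

class DebouncedHandler:
    """Minimal debounce: only the last call inside the window fires."""

    def __init__(self, callback, delay=0.05):
        self._callback = callback
        self._delay = delay
        self._timer = None

    def _schedule_callback(self):
        if self._timer is not None:
            self._timer.cancel()
        self._timer = threading.Timer(self._delay, self._callback)
        self._timer.start()

done = threading.Event()
calls = []

def on_fire():
    calls.append(1)
    done.set()

handler = DebouncedHandler(on_fire)
for _ in range(3):
    handler._schedule_callback()  # rapid calls collapse into one

# block on the event instead of a fixed sleep
assert done.wait(timeout=2.0)
assert len(calls) == 1
```

The `event.wait(timeout=...)` call makes the test deterministic: it returns as soon as the callback runs, and fails loudly on timeout instead of passing or failing by luck of the scheduler.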

---

Outside diff comments:
In `@tests/blocks/test_structure_sampler.py`:
- Around line 69-79: The test currently only sets the private _field_deps and
doesn't assert the public-facing metadata; after calling
StructureSampler.execute() in test_conditional_probabilities, call the public
accessor StructureSampler.get_schema() (or the block info dependency data
exposed by the block) and add an assertion that the dependencies given in the
constructor (e.g., {"role": ["plan"]}) remain present and unchanged — this
ensures field-level _field_deps didn't overwrite the block's public dependency
metadata after execute().

---

Nitpick comments:
In `@frontend/src/App.tsx`:
- Line 17: The Extensions import is not lazy-loaded; update the import to use
React.lazy for the Extensions component and wrap its route element with
React.Suspense (fallback UI) so the /extensions route is code-split;
specifically replace the static import of Extensions with a lazy import
(React.lazy(() => import(...))) and render it inside a <React.Suspense
fallback={...}> when used in the Route for path "/extensions".

In `@frontend/src/pages/Extensions.tsx`:
- Around line 43-45: The useEffect that calls loadAll should guard against
updating state after the component unmounts; add a mounted flag (e.g., let
isMounted = true) or use an AbortController and return a cleanup from the
useEffect that sets isMounted = false (or aborts the controller), then inside
loadAll check isMounted before calling any state setters (or catch aborts and
skip updates). Ensure you reference the existing useEffect and the loadAll
function so you only add the mounted/abort guard and the cleanup without
changing the loadAll API.

In `@frontend/src/services/extensionsApi.ts`:
- Around line 5-64: The fetch calls in ExtensionsApi (methods getStatus,
listBlocks, listTemplates, reload, validateBlock, createPipelineFromTemplate,
installBlockDeps) should accept an optional AbortSignal so callers can cancel
requests; update each method signature to include signal?: AbortSignal and pass
that signal into the fetch options (e.g., fetch(url, { signal, method: "POST" }
or { signal } as appropriate), keeping existing error handling and return types
unchanged; ensure encodeURIComponent usage remains for
validateBlock/installBlockDeps and that you await response.json() on error paths
(installBlockDeps) while still throwing the composed Error so canceled requests
propagate correctly.

In `@lib/blocks/builtin/structure_sampler.py`:
- Line 77: Add a short "why" comment above the _field_deps attribute in class
StructureSampler explaining its purpose: that it stores per-field dependency
lists and is named/kept separate to intentionally avoid colliding with
BaseBlock.dependencies or extension package metadata; reference the attribute
_field_deps and BaseBlock.dependencies in the comment so future readers
understand this is a deliberate separation to prevent accidental metadata
collisions.

In `@lib/blocks/registry.py`:
- Around line 62-66: In __init__, remove the redundant initialization of
self._entries to an empty dict before calling self._discover_blocks(); keep a
single assignment (either set self._entries = self._discover_blocks() or
initialize to {} only if discover should not be called), so edit the __init__ in
the class using _lock, _entries, and _discover_blocks to avoid the pointless
overwrite.

In `@lib/cli/client.py`:
- Around line 15-20: The _request method currently calls response.json()
unguarded which will raise JSONDecodeError for non-JSON responses; wrap the call
to response.json() in a try/except that catches json.JSONDecodeError (or
ValueError for broader compatibility), and on failure either return
response.text (or a dict like {"text": response.text, "status_code":
response.status_code}) or re-raise a clearer exception that includes
response.text and response.status_code; keep response.raise_for_status() where
it is but ensure the JSON decoding is handled after that, and reference the
_request method, response.json(), response.text and httpx.Client when making the
change.
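The safe-parse flow can be sketched as a small helper operating on the response body (the `decode_body` name and fallback payload shape are assumptions for illustration):

```python
import json

def decode_body(text: str, status_code: int):
    # fall back to a structured payload when the body is not JSON,
    # instead of letting the decode error escape from _request
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return {"text": text, "status_code": status_code}

assert decode_body('{"ok": true}', 200) == {"ok": True}
assert decode_body("Internal Server Error", 500) == {
    "text": "Internal Server Error",
    "status_code": 500,
}
```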

In `@lib/dependency_manager.py`:
- Line 91: Replace the list concatenation used to build the pip command (the
variable cmd) with Python's list unpacking to make construction cleaner: change
the current ["uv", "pip", "install", "--quiet"] + requirements to a single list
literal that uses the unpacking operator to include requirements (i.e., include
*requirements in the list). Update the assignment to the cmd variable
accordingly so it uses ["uv", "pip", "install", "--quiet", *requirements].
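The two forms are equivalent; the unpacked literal just builds the list in one expression:

```python
requirements = ["httpx>=0.27", "pyyaml"]

# single list literal with unpacking, instead of concatenation
cmd = ["uv", "pip", "install", "--quiet", *requirements]
assert cmd == ["uv", "pip", "install", "--quiet"] + requirements
```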

In `@lib/entities/extensions.py`:
- Around line 53-58: Change the type of DependencyInfo.status from str to a
Literal union to enforce the allowed values: update the class DependencyInfo
(field "status") to use Literal["ok", "not_installed", "invalid"] and add the
necessary import (from typing import Literal or from typing import get_args
depending on your typing style); ensure any code constructing DependencyInfo
still passes one of those exact strings or adjust callers/tests to use the new
Literal values.

In `@tests/e2e/test_extensions_api_e2e.py`:
- Around line 354-360: Replace the fragile string-based check in the exception
handler with real exception-type checks: import Skipped from _pytest.outcomes at
the top of the test file and change the condition from type(e).__name__ ==
"Skipped" to isinstance(e, Skipped) (pytest.skip.Exception is the public alias
if you prefer to avoid the private module) so skipped tests
are detected reliably; keep the existing KeyboardInterrupt/SystemExit re-raise
and the generic failure path for all other exceptions.

In `@tests/e2e/test_helpers.py`:
- Around line 73-77: Replace hardcoded "http://localhost:8000" occurrences in
this file with the BASE_URL constant; update functions cleanup_database,
wait_for_server, get_pipeline_count (and any other helpers) to reference
BASE_URL instead of the literal string so all HTTP requests use the single
defined constant. Ensure string concatenation or f-strings use BASE_URL (e.g.,
BASE_URL + "/path" or f"{BASE_URL}/path") and run tests to confirm no remaining
hardcoded URLs remain.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 70af8aac-4de0-4405-bbe8-b63fe44fb1ad

📥 Commits

Reviewing files that changed from the base of the PR and between 9a7a28a and eefda22.

📒 Files selected for processing (53)
  • .claude/skills/address-pr-review/SKILL.md
  • .claude/skills/creating-pipeline-templates/SKILL.md
  • .claude/skills/implementing-datagenflow-blocks/SKILL.md
  • .claude/skills/testing-pipeline-templates/SKILL.md
  • .claude/skills/using-datagenflow-extensibility/SKILL.md
  • .claude/skills/writing-e2e-tests/SKILL.md
  • .env.example
  • app.py
  • docker/Dockerfile
  • docker/docker-compose.yml
  • docs/extensibility.md
  • frontend/src/App.tsx
  • frontend/src/components/pipeline-editor/BlockPalette.tsx
  • frontend/src/pages/Extensions.tsx
  • frontend/src/services/extensionsApi.ts
  • frontend/src/types/index.ts
  • lib/api/__init__.py
  • lib/api/extensions.py
  • lib/blocks/base.py
  • lib/blocks/builtin/structure_sampler.py
  • lib/blocks/registry.py
  • lib/cli/__init__.py
  • lib/cli/client.py
  • lib/cli/main.py
  • lib/constants.py
  • lib/dependency_manager.py
  • lib/entities/extensions.py
  • lib/file_watcher.py
  • lib/templates/__init__.py
  • lib/workflow.py
  • llm/state-backend.md
  • llm/state-frontend.md
  • pyproject.toml
  • tests/blocks/test_base_dependencies.py
  • tests/blocks/test_registry.py
  • tests/blocks/test_registry_enhanced.py
  • tests/blocks/test_structure_sampler.py
  • tests/cli/__init__.py
  • tests/cli/test_commands.py
  • tests/e2e/TODOS.md
  • tests/e2e/run_all_tests.sh
  • tests/e2e/test_extensions_api_e2e.py
  • tests/e2e/test_extensions_e2e.py
  • tests/e2e/test_helpers.py
  • tests/integration/test_extensions.py
  • tests/test_api_regression.py
  • tests/test_dependency_manager.py
  • tests/test_extensions_api.py
  • tests/test_file_watcher.py
  • tests/test_registry_regression.py
  • tests/test_template_enhanced.py
  • tests/test_template_regression.py
  • tests/test_templates.py

@nicofretti nicofretti merged commit 3a545cc into develop Mar 8, 2026
2 checks passed