Merged
10 changes: 6 additions & 4 deletions .claude/skills/address-pr-review/SKILL.md
@@ -79,18 +79,20 @@ python .claude/skills/address-pr-review/scripts/fetch_comments.py <PR> --all

| Phase | Actions |
|-------|---------|
| **Fetch** | Run `--summary` first to see counts<br>Then `--id <ID>` for each comment to analyze<br>Exit if no unresolved comments |
| **Fetch** | Run `--summary` first to see counts<br>**Only process unresolved comments** — resolved ones are already closed, skip them<br>Then `--id <ID>` for each unresolved comment to analyze<br>Exit if no unresolved comments |
| **Per Comment** | Show: file:line, author, comment, ±10 lines context<br>Analyze: Valid/Nitpick/Disagree/Question<br>Recommend: Fix/Reply/Skip with reasoning |
| **Fix** | Minimal changes per llm/rules-*.md<br>Offer reply draft: `Fixed: [what]. [why]`<br>Show: `gh api --method POST repos/{owner}/{repo}/pulls/comments/$ID/replies -f body="..."` |
| **Reply** | Draft based on type: Question/Suggestion/Disagreement<br>Let user edit<br>Show gh command (never auto-post) |
| **Fix** | Minimal changes per llm/rules-*.md<br>Do NOT reply — just fix the code |
| **Reply** | Draft based on type: Question/Suggestion/Disagreement<br>Wait 2 minutes between each reply<br>Post with: `gh api --method POST repos/{owner}/{repo}/pulls/{PR}/comments -f body="..." -F in_reply_to=<ID>`<br>(never auto-post without user confirmation) |
| **Summary** | Processed X/N: Fixed Y, Replied Z, Skipped W<br>List: files modified, reply drafts, next steps |
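
The Reply-phase rules (the `gh api` call shape and the 2-minute wait) can be sketched as a small script. This is an illustrative sketch only: `build_reply_cmd` and `post_replies` are hypothetical helper names, it assumes the `gh` CLI is installed and authenticated, and per the never-auto-post rule it must only run after explicit user confirmation.

```python
import subprocess
import time

def build_reply_cmd(repo: str, pr: int, comment_id: int, body: str) -> list[str]:
    # Mirrors the gh api call from the Reply phase
    return [
        "gh", "api", "--method", "POST",
        f"repos/{repo}/pulls/{pr}/comments",
        "-f", f"body={body}",
        "-F", f"in_reply_to={comment_id}",
    ]

def post_replies(repo: str, pr: int, replies: list[tuple[int, str]]) -> None:
    # replies: list of (comment_id, body); wait 2 minutes between posts
    for i, (comment_id, body) in enumerate(replies):
        if i > 0:
            time.sleep(120)
        subprocess.run(build_reply_cmd(repo, pr, comment_id, body), check=True)
```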

## Critical Principles

| Principle | Violation Pattern |
|-----------|-------------------|
| **Unresolved only** | Processing already-resolved comments — the script filters to unresolved by default; never re-open resolved threads |
| **Analyze first** | Accepting all feedback as valid without critical analysis |
| **Never auto-post** | Posting replies automatically instead of showing gh command |
| **Never auto-post** | Posting replies automatically without user confirmation or skipping 2-minute wait between replies |
| **No reply on fix** | Replying to comments that were addressed with a code fix — fixes speak for themselves |
| **One at a time** | Batch processing all comments without individual analysis |
| **Show context** | Making changes without displaying ±10 lines around code |
| **Minimal changes** | Large refactors in response to small comments |
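
The "Show context" principle can be sketched as a tiny helper (hypothetical function, for illustration only):

```python
def show_context(lines: list[str], line_no: int, radius: int = 10) -> str:
    # Return the +/- radius lines around a 1-indexed line number,
    # with line numbers prefixed so the reviewer can orient themselves
    start = max(0, line_no - 1 - radius)
    end = min(len(lines), line_no + radius)
    return "\n".join(f"{i + 1:>5} | {lines[i]}" for i in range(start, end))
```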
54 changes: 54 additions & 0 deletions .claude/skills/creating-pipeline-templates/SKILL.md
@@ -71,7 +71,58 @@ StructureSampler → SemanticInfiller → DuplicateRemover

# generation + metrics
StructuredGenerator → FieldMapper → RagasMetrics

# generation + review-friendly output
StructuredGenerator → FieldMapper (flatten for review)
```

## Adding a FieldMapper for Review

The Review page displays records from the **last block's accumulated_state**. Only **first-level keys** are shown as primary/secondary fields. Nested objects (e.g. `generated.confirmed_dependencies`) appear as raw JSON strings and can't be configured as separate review fields.

**Always add a `FieldMapper` as the last block** to surface the fields reviewers need at the top level.

### Why it matters

Without a FieldMapper, the accumulated_state after a `StructuredGenerator` looks like:
```json
{
"input_field": "...",
"generated": {
"question": "...",
"answer": "...",
"contexts": ["..."]
}
}
```
The review UI sees `input_field` and `generated` (a blob). Reviewers can't configure `question` or `answer` as primary fields.
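
Conceptually, the flattening a FieldMapper performs looks like this in plain Python (a hypothetical sketch, not pipeline code; key names match the example state above):

```python
def flatten_for_review(state: dict) -> dict:
    # Pull nested generator outputs up to first-level keys so the
    # review UI can configure them as primary/secondary fields
    generated = state.get("generated", {})
    return {
        **{k: v for k, v in state.items() if k != "generated"},
        "question": generated.get("question", ""),
        "answer": generated.get("answer", ""),
        "context_count": len(generated.get("contexts", [])),
    }
```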

### How to add it

Add a `FieldMapper` as the **last block** (or last before metrics/observability blocks):

```yaml
- type: FieldMapper
config:
mappings:
# Flatten nested fields to top level
question: "{{ generated.question }}"
answer: "{{ generated.answer }}"
# tojson is safe only for structured data (IDs, numbers, short labels)
# avoid tojson on arrays/objects with free-text — newlines/quotes break JSON parsing
context_count: "{{ generated.contexts | length }}"
# Carry forward useful seed metadata
source: "{{ source_document }}"
```

### Rules

1. **Map every field the reviewer needs** — if it's not a first-level key after the last block, it won't be configurable in the review field settings
2. **Use `| tojson`** for arrays/objects — FieldMapper auto-parses JSON strings back to objects, so the review UI can display them properly. **Exception:** `tojson` on arrays/objects whose values contain unescaped quotes or newlines (e.g. free-text descriptions) will break FieldMapper JSON parsing. In that case, map only scalar summaries (counts, IDs) and let the array flow through as an existing first-level key.
3. **Use `| length`** for counts — gives reviewers a quick numeric summary without expanding lists
4. **Use `| default('')`** for optional fields — prevents Jinja2 errors when a field is missing
5. **Don't map internal/noisy fields** — skip `folder_path`, `_usage`, `_seed_samples` etc. Only map what's useful for human review
6. **Order matters** — FieldMapper outputs merge into accumulated_state, so its keys become the available fields in the Review "Configure Fields" modal
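
Rule 4 can be checked with plain Jinja2. This sketch assumes FieldMapper renders templates with strict-undefined semantics (an assumption about the implementation); with lenient undefined handling, missing fields would render as empty strings either way.

```python
from jinja2 import Environment, StrictUndefined, exceptions

env = Environment(undefined=StrictUndefined)
state = {"generated": {"question": "Q1"}}  # no "source_document" key

# Without | default, a missing field raises at render time
try:
    env.from_string("{{ source_document }}").render(state)
    raised = False
except exceptions.UndefinedError:
    raised = True

# With | default(''), the same mapping renders safely
safe = env.from_string("{{ source_document | default('') }}").render(state)
```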

## Step-by-Step Workflow

@@ -126,6 +177,8 @@ StructuredGenerator → FieldMapper → RagasMetrics
| Missing seed variable referenced in prompt | Add the variable to seed metadata |
| MarkdownMultiplierBlock not first | Multiplier blocks must always be first |
| Seed file not named `seed_<template_id>.*` | Template ID must match: `foo.yaml` → `seed_foo.json` |
| Nested fields not visible in Review UI | Add a `FieldMapper` as last block to flatten nested outputs to top-level keys |
| Review shows `generated` as a JSON blob | Map individual sub-fields: `question: "{{ generated.question }}"` |

## Checklist

@@ -135,6 +188,7 @@
- [ ] Single execution produces expected output fields
- [ ] Trace shows all blocks executed successfully
- [ ] Seed file has 2-3 diverse examples
- [ ] FieldMapper as last block flattens outputs for Review UI (all reviewer-relevant fields are top-level keys)

## Related Skills

82 changes: 82 additions & 0 deletions .claude/skills/implementing-datagenflow-blocks/SKILL.md
@@ -467,6 +467,88 @@ async def execute(self, context: BlockExecutionContext) -> dict[str, Any]:
cached_embeddings = self._embeddings_cache[trace_id]
```

## Agentic Tool-Calling Block Pattern

For blocks that need multi-turn LLM reasoning with tool use (e.g. exploring an external data source before generating output):

```python
async def execute(self, context: BlockExecutionContext) -> dict[str, Any]:
    # json, litellm, and pipeline are assumed imported at module level;
    # render_template, SYSTEM_PROMPT, TOOLS, and _execute_tool are block-module helpers
    from app import llm_config_manager

llm_config = await llm_config_manager.get_llm_model(self.model_name)
total_usage = pipeline.Usage(input_tokens=0, output_tokens=0, cached_tokens=0)

messages = [
{"role": "system", "content": SYSTEM_PROMPT},
{"role": "user", "content": render_template(self.user_prompt, context.accumulated_state)},
]

for turn in range(self.max_turns):
if turn == self.max_turns - 1:
messages.append({"role": "user", "content": "Wrap up and return final JSON now."})

llm_params = llm_config_manager.prepare_llm_call(
llm_config,
messages=messages,
temperature=self.temperature,
max_tokens=self.max_tokens,
tools=TOOLS,
tool_choice="auto",
)
llm_params["metadata"] = {"trace_id": context.trace_id, "tags": ["datagenflow"]}

response = await litellm.acompletion(**llm_params)
msg = response.choices[0].message
total_usage.input_tokens += response.usage.prompt_tokens or 0
total_usage.output_tokens += response.usage.completion_tokens or 0
total_usage.cached_tokens += getattr(response.usage, "cache_read_input_tokens", 0) or 0

if not msg.tool_calls:
# final answer — parse JSON
try:
result = json.loads(msg.content or "{}")
except json.JSONDecodeError:
result = {}
return {"my_result": result.get("my_result", []), "_usage": total_usage.model_dump()}

# append assistant message and process tool calls
messages.append({"role": "assistant", "content": None, "tool_calls": [
{"id": tc.id, "type": "function", "function": {"name": tc.function.name, "arguments": tc.function.arguments}}
for tc in msg.tool_calls
]})
for tc in msg.tool_calls:
try:
args = json.loads(tc.function.arguments)
except json.JSONDecodeError:
args = {}
# always use .get() — LLM may send malformed args
tool_result = _execute_tool(tc.function.name, args)
messages.append({"role": "tool", "tool_call_id": tc.id, "content": tool_result})

# max turns exhausted — force final answer without tools
messages.append({"role": "user", "content": "No more tool calls. Return final JSON NOW."})
llm_params = llm_config_manager.prepare_llm_call(
llm_config, messages=messages,
temperature=self.temperature, max_tokens=self.max_tokens,
)
llm_params["metadata"] = {"trace_id": context.trace_id, "tags": ["datagenflow"]}
response = await litellm.acompletion(**llm_params)
try:
result = json.loads(response.choices[0].message.content or "{}")
except json.JSONDecodeError:
result = {}
return {"my_result": result.get("my_result", []), "_usage": total_usage.model_dump()}
```

**Key rules:**
- Always nudge on last turn (`turn == max_turns - 1`) before the forced final call
- Always force a final call without tools when max_turns exhausted — otherwise you get no output
- Use `args.get("key", "")` not `args["key"]` — LLM may send malformed arguments
- If tool responses contain `"$ref"` keys, rename before sending: `output.replace('"$ref"', '"schema_ref"')` — Gemini rejects `$ref` in tool responses
- Cap tool result sizes (e.g. 50 items max) to avoid context overflow
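
The last two rules can be combined into one helper before appending a tool message. The function name is hypothetical; the 50-item cap and the `$ref` rename follow the rules above.

```python
import json

def sanitize_tool_result(items: list[dict], max_items: int = 50) -> str:
    # Cap result size to avoid context overflow
    payload = json.dumps(items[:max_items])
    # Gemini rejects "$ref" keys in tool responses; rename before sending
    return payload.replace('"$ref"', '"schema_ref"')
```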

---

## Multiplier Blocks

Blocks that generate multiple items from one input:
7 changes: 6 additions & 1 deletion .claude/skills/testing-pipeline-templates/SKILL.md
@@ -32,7 +32,12 @@ curl -s -X POST http://localhost:8000/api/pipelines/<ID>/execute \
- `trace` — each entry has `block_type`, `execution_time`, `output`
- `accumulated_state` — data flowing correctly between blocks?

**Red flags:** missing fields, metadata pollution (extra fields like `samples`, `target_count`), execution_time >30s, empty/null generator outputs.
**Check review readiness:**
- Look at the **last trace entry's `accumulated_state`** — these are the fields visible in the Review UI
- All reviewer-relevant fields should be **first-level keys** (not nested under `generated` or other objects)
- If useful fields are nested, add a `FieldMapper` as the last block to flatten them (see `creating-pipeline-templates` skill)

**Red flags:** missing fields, metadata pollution (extra fields like `samples`, `target_count`), execution_time >30s, empty/null generator outputs, reviewer-relevant data buried in nested objects.
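
The review-readiness check can be automated with a short helper (hypothetical name; it assumes the response shape described above, where the last trace entry carries the final `accumulated_state`):

```python
def nested_review_fields(accumulated_state: dict) -> list[str]:
    # First-level keys whose values are dicts/lists are not configurable
    # as review fields; flag them so a FieldMapper can flatten them
    return [
        key for key, value in accumulated_state.items()
        if isinstance(value, (dict, list)) and not key.startswith("_")
    ]
```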

## Phase 2: Small Batch
