feat: add trace visualization to display_sample_record (#396) by nabinchha · Pull Request #438 · NVIDIA-NeMo/DataDesigner

nabinchha · 2026-03-18T20:20:41Z

📋 Summary

Adds LLM conversation trace visualization to display_sample_record, allowing users to inspect the full chain-of-thought, tool calls, and tool results for
LLM-generated columns. Supports both terminal (Rich) and Jupyter notebook (HTML) rendering.

🔄 Changes

✨ Added

New TraceRenderer class in [trace_renderer.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer
-config/src/data_designer/config/utils/trace_renderer.py) with Rich terminal and Jupyter HTML rendering
is_notebook_environment() helper in misc.py for centralized notebook detection
include_traces parameter on display_sample_record and WithRecordSamplerMixin (default True)
Comprehensive test suite in [test_trace_renderer.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-desi
gner-config/tests/config/utils/test_trace_renderer.py) (338 lines, covering Rich rendering, HTML rendering, edge cases, and integration)

🔧 Changed

visualization.py: consolidated notebook detection to use shared is_notebook_environment(), added trace rendering for all LLM column types, moved record index
display to top, normalized padding defaults

🔍 Attention Areas

⚠️ Reviewers: Please pay special attention to the following:

[trace_renderer.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer-config/src/data_designer/con
fig/utils/trace_renderer.py) — New module: core rendering logic for both Rich and HTML output, typed trace message structures
[visualization.py](https://github.com/NVIDIA-NeMo/DataDesigner/blob/nmulepati/feat/396-trace-visualization/packages/data-designer-config/src/data_designer/conf
ig/utils/visualization.py) — Integration point: trace column discovery, rendering order, and padding changes

In ipynb notebook

In console

🤖 Generated with AI

Render LLM conversation traces (produced by `with_trace != TraceType.NONE`) as readable conversation flows in `display_sample_record()`. Two backends: Rich terminal panels (styled by role) and Jupyter HTML block diagrams. - New module `trace_renderer.py` with `TraceMessage` TypedDict and `TraceRenderer` class (`render_rich`, `render_notebook_html`) - `include_traces` parameter on both mixin and standalone function (defaults to True, opt out with `include_traces=False`) - Traces shown after Generated Columns table, before images - Unit tests for various trace shapes and integration tests Made-with: Cursor

Extract is_notebook_environment() utility to replace scattered get_ipython() try/except blocks. Improve Rich trace readability with better colors, separators, text folding, and dedented content. Match HTML trace font to Rich monospace output. Move index label to top of display and reduce inter-table spacing. Made-with: Cursor

greptile-apps · 2026-03-18T20:33:31Z

Greptile Summary

This PR adds LLM conversation trace visualization to display_sample_record, introducing a new TraceRenderer class (trace_renderer.py) that renders chain-of-thought, tool calls, and tool results as Rich panels in the terminal or as HTML blocks in Jupyter notebooks. It also centralises notebook detection into is_notebook_environment() in misc.py and moves the record index to the top of the output.

Key issues found:

Google Colab detection bug (misc.py:46): shell.__class__.__name__ returns "Shell" in Colab, not the full module path "google.colab._shell", so Colab is never recognised as a notebook. HTML trace rendering is silently skipped in Colab, and users get plain-text output instead.
turn_count overcounts turns for parallel tool calls (trace_renderer.py:157): turn_ids contains one entry per tool call ID, so parallel calls in a single assistant message inflate the "turns" counter. A trace with two simultaneous calls would be reported as "2 tool calls in 2 turns" instead of "2 tool calls in 1 turn".
Record index omitted from saved files in notebooks (visualization.py:379): When in_notebook is True, the record index is sent only via IPython's display() and is not appended to render_list. Any caller that also passes save_path will receive an HTML/SVG file that lacks the index — a regression from the prior behaviour.

Confidence Score: 2/5

Three independent bugs — broken Colab detection, wrong turn-count for parallel tool calls, and record index dropped from saved files in notebooks — should be fixed before merging.
The feature logic is sound and the test coverage is solid for terminal paths, but three correctness regressions were found: the Colab environment check never matches (wrong attribute used), the tool-call turn summary is wrong for parallel calls, and the record index is missing from saved files when run in a notebook. None are critical runtime crashes, but they produce silently wrong output for real users.
misc.py (Colab detection), trace_renderer.py (turn_count logic), visualization.py (record index in save_path path)

Important Files Changed

Filename	Overview
packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py	New module providing Rich terminal and Jupyter HTML trace rendering; contains a logic bug where parallel tool calls in a single turn are miscounted as multiple turns in the summary label.
packages/data-designer-config/src/data_designer/config/utils/visualization.py	Integrates trace rendering into display_sample_record; introduces a regression where the record index is not written to saved files when running in a notebook environment.
packages/data-designer-config/src/data_designer/config/utils/misc.py	Centralises notebook detection into is_notebook_environment(); the Google Colab check uses class.name instead of class.module, causing Colab to be misidentified as a non-notebook environment.
packages/data-designer-config/tests/config/utils/test_trace_renderer.py	Comprehensive test suite covering Rich rendering, HTML rendering, edge cases, and display_sample_record integration; success path of render_notebook_html (with IPython mocked in) is not directly exercised.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[display_sample_record called] --> B[is_notebook_environment]
    B -->|True| C[Display record index via IPython HTML]
    B -->|False| D[Append index Text to render_list]
    C --> E[Build render_list: seed/LLM/image tables]
    D --> E
    E --> F{include_traces?}
    F -->|Yes| G[TraceRenderer instantiated\nIterate LLM column side_effect_columns]
    G --> H{in_notebook?}
    H -->|No - terminal| I[render_rich → append Rich Panel\nto render_list]
    H -->|Yes - notebook| J[Append to traces_to_display_later]
    I --> K[Console.print render_list]
    J --> K
    F -->|No| K
    K --> L{save_path set?}
    L -->|Yes| M[Record Console → save HTML/SVG]
    L -->|No| N[Display Console print]
    M --> O{in_notebook and\ntraces_to_display_later?}
    N --> O
    O -->|Yes| P[render_notebook_html → IPython display HTML]
    O -->|No| Q[Display images via IPython if in notebook]
    P --> Q

Prompt To Fix All With AI

This is a comment left during a code review.
Path: packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py
Line: 157-160

Comment:
**`turn_count` overcounts turns for parallel tool calls**

`turn_ids` is populated with one entry per tool call ID, so `len(turn_ids)` equals the total number of unique tool calls — not the number of conversational turns. When multiple tool calls are issued in a single assistant message (parallel calls), the summary will incorrectly say "N tool calls in N turns" instead of "N tool calls in 1 turn".

For example, the `multi_turn_tool_trace` fixture has two parallel calls (`call_a`, `call_b`) in one assistant message. `turn_ids = {"call_a", "call_b"}` → `turn_count = 2`, producing "2 tool calls in 2 turns" when the correct value is "2 tool calls in 1 turn".

The fix is to track the number of assistant messages that contain tool calls, not the number of unique IDs:

```python
# Replace turn_ids: set[str] = set() with:
turn_count = 0

# Inside the `if tool_calls:` block, increment turn count once per assistant message:
if tool_calls:
    turn_count += 1
    for tc in tool_calls:
        ...
    continue

# Then at the summary:
turn_count_val = turn_count if tool_call_count > 0 else 0
turn_word = "turn" if turn_count_val == 1 else "turns"
summary = f"{tool_call_count} tool {call_word} in {turn_count_val} {turn_word}" if tool_call_count > 0 else ""
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/data-designer-config/src/data_designer/config/utils/visualization.py
Line: 379-394

Comment:
**Record index omitted from saved files in notebook environments**

When `in_notebook` is `True`, the record index is emitted via IPython's `display()` and is never added to `render_list`. This means that when a caller also passes `save_path`, the saved HTML/SVG file will not contain the record index — a regression from the previous behaviour where the index was always appended to `render_list` and therefore always appeared in the saved output.

Consider always adding the index to `render_list` (for saved output) in addition to the HTML display call in notebook mode:

```python
if record_index is not None:
    if in_notebook:
        try:
            from IPython.display import HTML, display

            display(
                HTML(
                    f"<div style='text-align:center;font-family:Menlo,Monaco,Consolas,monospace;"
                    f"font-size:12px;color:#666;margin:8px 0;'>[index: {record_index}]</div>"
                )
            )
        except ImportError:
            pass
    # Always include in render_list so it appears in saved files
    render_list.append(Text(f"[index: {record_index}]", justify="center"))
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: packages/data-designer-config/src/data_designer/config/utils/misc.py
Line: 46

Comment:
**Google Colab detection always returns `False`**

`shell.__class__.__name__` returns the bare class name (e.g. `"Shell"`), not the fully-qualified module path. In Google Colab the interactive shell class is `google.colab._shell.Shell`, so `__class__.__name__` is `"Shell"` — which is not in the tuple `("ZMQInteractiveShell", "google.colab._shell")`. As a result, `is_notebook_environment()` will return `False` in Colab even though HTML rendering works fine there, and users will get plain-text Rich output instead of the new HTML trace visualization.

```suggestion
        return shell is not None and (
            shell.__class__.__name__ == "ZMQInteractiveShell"
            or shell.__class__.__module__ == "google.colab._shell"
        )
```

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: "fix: address review ..."}

packages/data-designer-config/src/data_designer/config/utils/trace_renderer.py

packages/data-designer-config/src/data_designer/config/utils/visualization.py

packages/data-designer-config/src/data_designer/config/utils/misc.py

- Remove double html.escape() on func_name in render_notebook_html; _build_html_block already escapes the title - Guard notebook trace display with include_traces to prevent potential NameError if trace_renderer is not instantiated - Improve is_notebook_environment() to check shell class name (ZMQInteractiveShell / google.colab._shell) instead of just get_ipython() existence, avoiding false positives in IPython terminals Made-with: Cursor

nabinchha added 3 commits March 13, 2026 16:44

Merge branch 'main' into nmulepati/feat/396-trace-visualization

1eec0d5

nabinchha requested a review from a team as a code owner March 18, 2026 20:20

move required to import from type_extensions

65e346b

greptile-apps bot reviewed Mar 18, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add trace visualization to display_sample_record (#396) #438

feat: add trace visualization to display_sample_record (#396) #438
nabinchha wants to merge 5 commits intomainfrom
nmulepati/feat/396-trace-visualization

nabinchha commented Mar 18, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 18, 2026 •

edited

Loading

Confidence Score: 2/5

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

nabinchha commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

📋 Summary

🔄 Changes

✨ Added

🔧 Changed

🔍 Attention Areas

In ipynb notebook

In console

Uh oh!

greptile-apps bot commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 2/5

Important Files Changed

Flowchart

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

nabinchha commented Mar 18, 2026 •

edited

Loading

greptile-apps bot commented Mar 18, 2026 •

edited

Loading