Add skill and evals for dynamic mode usage by rostan-t · Pull Request #6271 · NVIDIA/DALI

rostan-t · 2026-03-20T15:34:32Z

Category:

Other (e.g. Documentation, Tests, Configuration)

Description:

Since dynamic mode is fairly new, AI agents are not very good at writing code that uses it. For instance, according to Anthropic, Claude Sonnet 4.6's knowledge cutoff is August 2025. Even when presented with a few examples, agents miss some dynamic-mode-specific patterns and are not much help when writing code with it.

This PR adds a Claude Code skill containing guidelines on how to use dynamic mode. It was generated with /skill-creator, which also generates evals for the skill. Here are the results of running the evals with Claude Code using Sonnet 4.6:

Eval	Task	With Skill	Without Skill
1	Image classification pipeline	10/10	1/10
2	Batch column extraction	4/4	1/4
3	Pipeline-to-dynamic conversion	7/7	0/7
4	Debugging intermittent corruption	6/6	1/6
5	Audio mel spectrogram	6/6	1/6
6	Object detection pipeline	7/7	0/7
Total	40 assertions	40/40 (100%)	4/40 (10%)
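
As a quick sanity check on the totals row, the per-eval counts in the table can be summed directly. This is only an illustrative sketch: the `RESULTS` list and the `pass_rate` helper are names made up for this example, and the numbers are copied from the table above rather than read from the eval output.

```python
# Assertion counts copied from the results table above:
# (task, total assertions, passed with skill, passed without skill)
RESULTS = [
    ("Image classification pipeline", 10, 10, 1),
    ("Batch column extraction", 4, 4, 1),
    ("Pipeline-to-dynamic conversion", 7, 7, 0),
    ("Debugging intermittent corruption", 6, 6, 1),
    ("Audio mel spectrogram", 6, 6, 1),
    ("Object detection pipeline", 7, 7, 0),
]


def pass_rate(passed, total):
    """Return the percentage of assertions that passed."""
    return 100.0 * passed / total


# Sum each column across the six evals.
total = sum(row[1] for row in RESULTS)
with_skill = sum(row[2] for row in RESULTS)
without_skill = sum(row[3] for row in RESULTS)

print(f"With skill:    {with_skill}/{total} ({pass_rate(with_skill, total):.0f}%)")
print(f"Without skill: {without_skill}/{total} ({pass_rate(without_skill, total):.0f}%)")
```

The sums reproduce the totals row: 40/40 (100%) with the skill and 4/40 (10%) without it.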

REQ IDs: N/A

JIRA TASK: N/A

.claude/skills/using-dali-dynamic-mode-workspace/evals/files/pipeline_to_convert.py

greptile-apps · 2026-03-20T15:38:39Z

Greptile Summary

This PR adds a Claude Code skill (SKILL.md) and an eval suite (evals.json + pipeline_to_convert.py) to teach AI agents how to write correct DALI dynamic-mode code. The motivation is clear and well-supported by the 10% → 100% eval improvement shown in the PR description.

All significant issues raised in prior review rounds have been addressed:

Missing imports in pipeline_to_convert.py fixed (bd30e92)
Eval ID numbering gap (was 1–4, 6–7) corrected to sequential 1–6
max_batch_size constructor usage added to the skill
Incorrect claim that batch size can vary between epochs removed (932cb74)
copy parameter defaults for Tensor.torch() vs Batch.torch() confirmed correct
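
The eval ID numbering gap mentioned above is the kind of issue a small check can catch before review. The following is a minimal sketch, not part of the PR: `find_id_gaps` and `is_sequential` are hypothetical helpers, and they operate on a plain list of integer IDs because the actual layout of evals.json is not shown here.

```python
def find_id_gaps(ids):
    """Return the IDs missing between 1 and the largest ID, in ascending order."""
    present = set(ids)
    return [i for i in range(1, max(ids) + 1) if i not in present]


def is_sequential(ids):
    """Return True if the IDs are exactly 1..N with no gaps or duplicates."""
    return sorted(ids) == list(range(1, len(ids) + 1))


# IDs as described before the fix (1-4, 6-7) and after it (1-6).
before_fix = [1, 2, 3, 4, 6, 7]
after_fix = [1, 2, 3, 4, 5, 6]

print(find_id_gaps(before_fix), is_sequential(before_fix))
print(find_id_gaps(after_fix), is_sequential(after_fix))
```

For the pre-fix numbering this reports ID 5 as missing; for the corrected numbering it reports no gaps.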

The skill content itself is accurate: device="gpu" vs "mixed" guidance, Batch.__getitem__ absence, stateful reader pattern, EvalMode context-manager syntax, thread-local RNG, and the Pipeline Mode Migration table all match the documented DALI dynamic-mode API. The eval assertions directly target the failure modes that matter most for agent-generated code.

Confidence Score: 5/5

Safe to merge — documentation/eval-only change with no production code impact.

All previously raised P1-level concerns (missing imports, batch-size variation claim, eval numbering gap, missing max_batch_size guidance) have been resolved in preceding commits. No new correctness, security, or data-integrity issues were found in this diff. Remaining P2-level observations are too trivial to block merge.

No files require special attention.

Important Files Changed

Filename	Overview
.claude/skills/using-dali-dynamic-mode/SKILL.md	Adds comprehensive AI-agent skill guide for DALI dynamic mode; content is accurate and all previously flagged issues (missing max_batch_size guidance, erroneous batch-size-variation claim, device_id confusion) have been resolved in prior commits.
.claude/skills/using-dali-dynamic-mode-workspace/evals/evals.json	Six evals with sequential IDs 1-6 (numbering gap previously fixed); assertions are well-targeted at real agent failure modes across image, audio, detection, and debugging scenarios.
.claude/skills/using-dali-dynamic-mode-workspace/evals/files/pipeline_to_convert.py	Self-contained pipeline-mode reference script (imports previously fixed); correctly demonstrates the full set of patterns that eval 3 expects agents to convert.